- A week ago: "’It’s not right!’ shouted the weeping Hilton."
- Today, re the Sopranos finale:[*] "’To go out like that was not right,’" a reader complained.
OGDC Talk Slides Online
I’ve been getting a fair number of requests for my slides from last week’s keynote at OGDC. They’re now available here.
Trip Report: April 2007 ISO C++ Standards Meeting
The ISO C++ standards committee met on April 15-20, 2007 in Oxford, UK. Despite unseasonably sunny and warm weather and good European beer (the best I tried was Budweiser, and no, I’m not kidding; it’s not the American one), we made good progress toward our goal of having a complete draft of C++09 out for public review this year.
(Aside: I delayed posting this article until the post-meeting mailing was online on the ISO C++ website, so that I could add live public links to the referenced documents, for your viewing pleasure.)
New Language Features Voted Into (Draft) C++09
The following new features progressed to the point of being voted into the draft standard at this meeting. I’ve included a link to the final paper number so that you can read up on the gory details of your favorite proposal. (Note that I’m not mentioning changes that are smaller or are just reformulations with no effect on the semantics of your code; for example, we replaced the wording about sequence points in the standard while maintaining the same effect.)
Template aliases (aka typedef templates, generalized typedefs) [N2258]
This proposal was coauthored by Gabriel Dos Reis and Bjarne Stroustrup. I’ve written before (here in an article and here in a committee proposal) about why you might want to be able to write a typedef that nails down some, but not all, template parameters. The example from the first article is:
// Not legal C++, but it would be nice…
//
template<typename T>
typedef std::map<std::string, T> Registry;

// If only we could do the above, we could then write things like:
//
Registry<Employee> employeeRoster;
Registry<void (*)(int)> callbacks;
Registry<ClassFactory*> factories;
What C++09 actually added was not only the equivalent of typedef templates, but something more general. Template aliases allow you to get the same effect as the above using the following syntax:
// Legal in C++09
//
template<typename T>
using Registry = std::map<std::string, T>;

// Now these all work as intended
//
Registry<Employee> employeeRoster;
Registry<void (*)(int)> callbacks;
Registry<ClassFactory*> factories;
An example from the second article is motivated by making policy-based-designed templates more usable. Referring to one form of Andrei’s SmartPtr template, I showed that we can write the following wrap-the-typedef-in-a-struct workaround today:
// Today’s workaround
//
template<typename T>
struct shared_ptr {
typedef Loki::SmartPtr<
T, // note that T still varies, but everything else is fixed
RefCounted, NoChecking, false, PointsToOneObject, SingleThreaded, SimplePointer<T>
>
type;
};

shared_ptr<int>::type p; // sample usage, “::type” is ugly
One of the primary reasons (but not the only one) why policy-based smart pointers didn’t make it into C++09 was usability: Although Loki::SmartPtr can express shared_ptr easily as just one of its configurations, the user has to type lots of weird stuff for the policy parameters that is hard to completely hide behind a simple typedef, and so the user would be less inclined to use the feature:
shared_ptr<int>::type p; // the best usability we could provide for a policy-based Shared Ptr
shared_ptr<int> p; // what actually won and is now in C++09
Too bad we didn’t have template aliases then, because now we could have our cake and eat it too:
// Legal in C++09
//
template<typename T>
using shared_ptr =
Loki::SmartPtr<
T,
RefCounted, NoChecking, false, PointsToOneObject, SingleThreaded, SimplePointer<T>
>;
shared_ptr<int> p; // using the alias lets us have a completely natural syntax
This aliasing feature extends beyond templates and is also essentially a generalized replacement for typedef. For example:
// Legal C++98 and C++09
//
typedef int Size;
typedef void (*handler_t)( int );

// Legal C++09
//
using Size = int;
using handler_t = void (*)( int );
Especially for function pointer types like handler_t, the new "using" form is simpler than typedef and spares programmers from playing guess-where-the-typedef-name-goes amid the weirdness of C declaration syntax, at least in this one place.
Variadic templates (aka "type varargs" for templates) [N2242; see N2087 for a more readable description and rationale]
This proposal was coauthored by Doug Gregor, Jaakko Järvi, Jens Maurer, and Jason Merrill.
You know how you can call a function like printf with a variable argument list (C’s good old varargs)? Well, there are times you’d like to do that for template argument lists, too, and now you can. For example, today the C++09 tuple type needs to be declared with some fixed number of defaulted parameters:
// Drawn from draft C++09, before variadic templates
// (when the best tool we had was to hardcode some defaulted parameters)
//
template<
class T1 = unspecified ,
class T2 = unspecified ,
… ,
class TN = unspecified
> class tuple;

template<class T1, class T2, …, class TN>
tuple<V1, V2, …, VN> make_tuple( const T1&, const T2& , …, const TN& );
These declarations get uglier as N gets bigger. And what if you want more than N types? Sorry, Charlie; the implementer just has to write enough types to cover most use cases, and the actual number in the current wording would be nonportable in practice, because every implementer has to make their own choice about how many types will be enough.
With variadic templates, we can write it much more simply and flexibly (and portably, once compilers implement the feature):
// Simplified version using variadic templates
// (now in the latest working draft of C++09; see Library section below)
//
template<class… Types> class tuple;

template<class… Types>
tuple<VTypes…> make_tuple( Types&… );
This is simpler to define, works with any number of types, and will work portably no matter how many types you use. What’s not to like?
For another example that summarizes other ways to use the feature with class types, consider this example provided by Jonathan Caves (thanks, Jon!):
// Legal in C++09
template<typename… Mixins>
class X : public Mixins… {
public:
X( const Mixins&… mixins ) : Mixins(mixins)… { }
};

class A { };
class B { };
class C { };

X<A, B, C> x;
For the type X<A,B,C>, the compiler will generate something like the following:
class X<A, B, C> : public A, public B, public C {
public:
X( A const& __a, B const& __b, C const& __c ) : A(__a), B(__b), C(__c) { }
};
Unicode characters and strings [N2249]
This proposal was written by Lawrence Crowl, based on the ISO C technical report containing similar non-normative extensions for C.
C++ has essentially airlifted in the unofficial C extensions for char16_t and char32_t, and made them available in the new header <cuchar>. Here are a few basic uses:
const char16_t* str16 = u"some 16-bit string value";
const char32_t* str32 = U"some 32-bit string value";
C++ did a few small things differently from C. In particular, C++ requires that the encoding be UTF, and C++ is making these types a normative (required) part of its language standard.
New Library Features Voted Into (Draft) C++09
As usual, as we vote in language extensions, the library working group continues systematically updating the standard library to make use of those extensions. In particular, this time the library got support for:
- Variadic templates (see above, including its application to std::tuple)
- Unicode characters (see above)
- Rvalue references (see next)
Rvalue references, aka "move construction/assignment," are a useful way to express that you’re constructing or assigning from an object that will no longer be used for anything else — including, for example, a temporary object — and so you can often get a decent performance boost by simply stealing the guts of the other object instead of making a potentially expensive deep copy. For example, move assignment for a string would simply take over the internal state of the source string using just a few pointer/integer assignments, and not perform any allocations. For more details, see this motivational paper which describes the significant performance and usability advantages of move semantics. Updating the standard library to use move semantics included notably updating the containers, iterators, and algorithms to support moveable types, as well as deprecating auto_ptr (wow!) and replacing it with a unique_ptr that expresses unique ownership semantics. (Previously, the library had already added shared_ptr and weak_ptr to express shared ownership and shared observation semantics, so reference-counted shared pointers are already covered.)
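To make the idea concrete, here is a minimal sketch of my own (not code from the draft standard or from the papers below; MyString and its members are hypothetical) of what move construction and move assignment might look like for a simple string-like class, using the new rvalue-reference (&&) syntax:

#include <cstddef>

class MyString {
public:
    MyString( MyString&& other )                 // move constructor
        : data_(other.data_), size_(other.size_)
    {
        other.data_ = 0;                         // steal the guts and leave the
        other.size_ = 0;                         // source empty; no allocation
    }

    MyString& operator=( MyString&& other ) {    // move assignment
        if( this != &other ) {
            delete [] data_;                     // release our own buffer
            data_ = other.data_;                 // just a few pointer/integer
            size_ = other.size_;                 // assignments, no deep copy
            other.data_ = 0;
            other.size_ = 0;
        }
        return *this;
    }

private:
    char* data_;
    std::size_t size_;
};

The moved-from object is left in a valid but empty state, which is exactly the license that lets code skip the deep copy when it knows the source won’t be used again.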
If you’re interested in the standardese details of the new move semantics support in the standard library, you can find details of the updates in the following papers: N1856 (basic concepts, pair, and unique_ptr), N1857 (strings), N1858 (containers), N1859 (iterators), N1860 (algorithms), N1861 (valarray), and N1862 (iostreams).
There were a few other housekeeping updates, but for template metaprogramming aficionados one highlight will be the std::enable_if type trait. Have you ever wanted to write a function template that gets completely ignored for types the template wasn’t designed to be used with (as opposed to accidentally becoming a best match and hiding some other function the user really intended to call)? If you’re a template library writer and haven’t yet seen enable_if, see here for a nice Boosty discussion, and you might get to love it too.
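For those who haven’t run into it before, here is a minimal sketch of the idiom (my own illustration; the function name twice and the particular traits are just for example, and today you would spell enable_if with its Boost equivalent). The point is that an overload whose enable_if condition is false simply drops out of overload resolution (SFINAE) instead of greedily matching and hiding a better function:

#include <type_traits>   // draft C++09 <type_traits>; Boost has equivalents today

// Participates in overload resolution only for integral types; for anything
// else, enable_if has no nested ::type, so this template silently disappears
// from the overload set rather than causing an error or a bad match.
template<typename T>
typename std::enable_if<std::is_integral<T>::value, T>::type
twice( T t ) {
    return t + t;
}

// A separate overload can be enabled for floating-point types instead.
template<typename T>
typename std::enable_if<std::is_floating_point<T>::value, T>::type
twice( T t ) {
    return t + t;
}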
Next Meetings
Here are the next meetings of the ISO C++ standards committee, with links to meeting information. The meetings are public, and if you’re in the area please feel free to drop by.
- July 15-20, 2007: Toronto, Canada
- September 30 – October 6: Kona, Hawaii, USA [N2289]
Note that the latter is the meeting at which we are hoping to vote out the first complete draft of the new C++ standard, and it runs straight through to Saturday afternoon. So let me insert a caveat for those who might be thinking, "oh yeah, that sounds like the meeting I think I’ll pick to attend, and catch some rays, ha ha" — having a standards meeting in Hawaii sounds nice, but it’s not nearly what you might think. If you come and want to participate, expect to be shut in windowless rooms all day, and frequently all evening — you’ll only actually see the local landscape if you fly in early or stay late, not during the meeting. But we’ll appreciate your help proofreading mind-numbing swathes of the standard!
Thoughts on Scott’s “Red Code / Green Code” Talk
Scott Meyers recently gave an interesting talk at NWCPP. Thanks to Kevin Frei’s recording skills and hardware, you can watch the video here.
Scott’s topic was "Red Code / Green Code: Generalizing const." Here’s the abstract:
C++ compilers allow non-const code to call const code, but going the other way requires a cast. In this talk, Scott describes an approach he’s been pursuing to generalize this notion to arbitrary criteria. For example, thread-safe code should only call other thread-safe code (unless you explicitly permit it on a per-call basis). Ditto for exception-safe code, code not "contaminated" by some open source license, or any other constraint you choose. The approach is based on template metaprogramming (TMP), and the implementation uses the Boost metaprogramming library (Boost.MPL), so constraint violations are, wherever possible, detected during compilation.
(Nit: I think the "const" analogy is a slight red herring. See the Coda at the bottom of this post for a brief discussion.)
Motivation and summary
The motivation is that it might be nice to be able to define categories of constraints on code, and have a convention to enforce that constrained code can’t implicitly call unconstrained code. Scott gives the example that you might want to prevent non-LGPL’d code from implicitly calling LGPL’d code, if doing so would virally make the non-LGPL’d code subject to the LGPL. Similarly, you might want to prevent reviewed code from implicitly calling unreviewed code; and so on. Note that these constraints overlap; you can have any combination of LGPL’d/non-LGPL’d and reviewed/unreviewed code.
The basic technique Scott describes is to define a tag type, similar to the way the C++ standard library tags iterator categories. You just define some empty types (all we’re doing is generating names):
struct ThreadSafe {};
struct LGPLed {};
struct Reviewed {};
// etc.
Then you arrange for each function to declare its constraints at compile time, for example to say that "my function is LGPL’d" or "my function is Reviewed" or any combination thereof. Scott showed how to conveniently use MPL compile-time collections to write a group of constraints. He provides this general helper template:
template<typename Constraints>
struct CallConstraints {
…
};
Then the callers and callees all traffic in CallConstraints<mpl::vector<ExceptionSafe, Reviewed>>, CallConstraints<mpl::vector<ThreadSafe, LGPLed>>, and so on. Scott also provides ways to opt out of or deliberately loosen constraints, via the helpers IgnoreConstraints and eraseVal<MyConstraints, SomeConstraint>. He also provides an includes metafunction that you can use to check whether one set of constraints is compatible with another.
But where do you put these constraints, and how do you pass and check them? Scott presented several ways (see the talk for details), but they all had serious drawbacks. Here’s a quick summary of the primary alternative he presented: The idea is to make each participating function that wants to declare constraints a template, and write its constraints inside the function. For example:
template<typename CallerConstraints>
void f( Params params ) {
typedef mpl::vector<ExceptionSafe> MyConstraints;
BOOST_MPL_ASSERT(( includes<MyConstraints, CallerConstraints> ));
…
}
That’s the basic pattern. When you call another constrained function, you pass along your constraint type, and optionally explicitly relax constraints if you want to loosen them. For example:
template<typename CallerConstraints>
void g() {
typedef mpl::vector<ExceptionSafe, Reviewed> MyConstraints;
BOOST_MPL_ASSERT(( includes<MyConstraints, CallerConstraints> ));
…
// try to call the other function
f<MyConstraints>( params ); // error, trying to call unreviewed code from reviewed code
f<eraseVal<MyConstraints,Reviewed>>( params ); // ok, ignore Reviewed constraint
f<IgnoreConstraints>( params ); // ok, explicitly ignore constraints entirely
}
This has a number of drawbacks:
- Virtual functions vs. compile-time checking: This doesn’t (directly) work for virtual functions, because templates can’t be virtual. See Scott’s talk for details about ways to trade that off against a different design, namely the NVI pattern, or a different drawback, namely run-time checking.
- Template explosion: Every participating function is required to become a template (or to add a new template parameter). That’s more than merely inconvenient; for one thing, it means we have to put every function in a header file (or else wrap it with a template that does the constraints checking and then passes through to the normal function, which is tedious duplication). The other serious problem is that the function template is now going to have to be instantiated once for each unique combination of caller constraints, including even constraints that are added in the future and have nothing to do with this function, even though each instantiation generates identical code.
- Separate checking: The constraints aren’t part of the type of the function, and so we have to compile the body of the function to determine whether the constraints are compatible. In short, we’re not leveraging the type system as much as we might wish to do.
Another minor drawback is that each user has to write the static assertions himself.
My humble contribution (sketch)
I enjoyed Scott’s motivation, and during his talk I thought of a simpler way to pass and check these constraints. I chatted with him about it afterwards, and he’s added it to his talk notes (which should go live at the above talk link over the next few days).
Here’s the idea that occurred to me: What if we make the constraints part of the function signature by passing them as an additional normal parameter, and do the check on the conversion from Constraints<Caller> to Constraints<Callee>? Here’s a sketch of how you’d use it:
void f( Params params, Constraints<ExceptionSafe> myConstraints ) {
…
}

void g( Constraints<ExceptionSafe, Reviewed> myConstraints ) {
…
// try to call the other function
f( params, myConstraints ); // error, trying to call unreviewed code from reviewed code
f( params, myConstraints::erase<Reviewed>() ); // ok, ignore Reviewed constraint
f( params, IgnoreConstraints ); // ok, explicitly ignore constraints entirely
}
This has none of the drawbacks of the other alternatives:
- It incurs zero or near-zero space and time overhead: No extra template expansions, no run-time checking, and possibly even no space overhead because a constraint’s information payload is all in its compile-time type (a constraint object itself is empty).
- It doesn’t require users to make all their functions templates: Just add a normal parameter.
- It works naturally with virtual functions (sketched just below): A derived override must have constraints compatible with the base function, which follows naturally because its signature, now including the constraints, must match the base’s. (Future-proofing note: If C++ is ever extended to allow contravariant parameters on virtual functions, that would dovetail perfectly with this technique, because derived functions can be less constrained than base functions.)
- It supports separate compilation and separate checking, by making the constraint part of the function’s type.
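To illustrate the virtual-function point, here is an equally rough sketch (Widget and Render are hypothetical names, and Constraints is the template sketched just below): because the constraint parameter is part of the signature, an override automatically carries the same constraint contract as the base function.

class Widget {
public:
    // The base class fixes the constraints as part of the signature.
    virtual void Render( Constraints<ThreadSafe> callerConstraints ) = 0;
};

class FancyWidget : public Widget {
public:
    // The override must match the base signature, constraints included,
    // so it can't silently weaken the constraint contract.
    virtual void Render( Constraints<ThreadSafe> callerConstraints ) { /* … */ }
};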
We could enable the above technique by bundling up the functionality inside just one Constraints template that would look something like this (sketch only):
// This is pseudocode.
//
class IgnoreConstraints { }; // just a helper

template<typename C1 = Unused, typename C2 = Unused, …, typename CN = Unused> // could get rid of this redundancy with C++09 variadic templates
struct Constraints {
typedef mpl::vector<C1, C2, …, CN> CSet;

// Use the conversion operation as our hook to perform the check.
//
template<typename CallerConstraints>
Constraints( const Constraints<CallerConstraints>& ){
BOOST_MPL_ASSERT(( includes<CSet, Constraints<CallerConstraints>::CSet> ));
}

// Allow the caller to explicitly ignore constraints by doing no checks
// on the conversion from the helper type.
//
Constraints( const IgnoreConstraints& ) { }

// … provide erase by duplicating Scott’s eraseVal on CSet …
};
Scott has added notes to his slides showing this approach, and intends to write a real article about this. I’ll leave it to him to write a complete implementation based on the above or some variation thereof; this is just to sketch the idea.
Thanks, Scott, for a very interesting talk!
Coda: On the "const-ness" of code
The talk description starts with:
C++ compilers allow non-const code to call const code, but going the other way requires a cast.
This reference to const-ness is intended to be a helpful analogy, in the sense that a const member function can’t directly call a non-const member function of the same class. But really that’s the only situation where you could, at a stretch, say something like "const code can’t call non-const code." The reason this analogy doesn’t really match the actual (good and useful!) topic of this talk and technique is that const is about data, not code.
In particular, an object can be const, but a function can’t be const. Not ever. Now, at this point someone may immediately object, "Herb, you fool! Of course a function can be const! What about…" and go on to scribble a code example like:
class X {
public:
void f( const Y& y, int i ) const { // "const code"?
g( y, i ); // error — "can’t call non-const code from const code" ?
}
void g( const Y& y, int i ) { } // "non-const code"?
};
"And so clearly," said someone might triumphantly conclude, "here X::f is const code, and X::g is non-const code, right?"
Alas, no. The only difference the const makes is that the implicit this parameter has type const X* in X::f as opposed to type X* in X::g. It is true that X::f can’t call X::g without a cast, but that has nothing to do with "const-ness of code" but rather the constness of data — you can’t implicitly convert a const X* to an X*. To drive the point home, note that the above code is in virtually every way the same as if we’d written f and g as non-member friends and named the parameter explicitly:
void f( const X* this_, const Y& y, int i ) { // "const code"?
g( this_, y, i ); // error: can’t convert const X* to X*
}
void g( X* this_, const Y& y, int i ) { } // "non-const code"?
Now which function is "const" and which is "non-const"? There really is no such thing as a const function — its constness or non-constness depends (as it must) on which parameter we’re talking about:
- Both functions "are const" with respect to their y parameter.
- Neither function "is const" with respect to their i parameter.
- Only f "is const" with respect to its this_ parameter.
Remember that const is always about data, not code. Const always applies to the declaration of an object of some type. A function can use const on any parameter to declare it won’t change the object you pass it in that position, but it’s still about the object, not about the code. True, for a member function you get to write const at the end if you want to, but that’s just syntactic sugar for putting it on the implicit this parameter; the fact that the const lexically gets tacked onto the function is just an artifact of the this parameter being hidden so that you can’t decorate the parameter directly.
But leaving analogies with const-ness aside, there’s real value in the idea of red code / green code which really is about distinguishing different kinds of code, whereas const is about distinguishing different views of data.
Another Spring Talk
I’m back from the East U.S. / Europe trip, and happy to report that ACCU is just as fun as ever and Lisbon is quite beautiful (for the few hours I was able to see it, mostly through windows). In the next day or three I’ll blog about the spring C++ standards meeting we just concluded last week in Oxford… stay tuned.
In the meantime, here’s a quick note about a talk I’ll be giving close to home. (Any talk that doesn’t involve getting on an airplane gets bonus points.) It’s an instance of the continuously-evolving overview of the concurrency landscape and what it means for software, with a roadmap for our next decade or so. If you’re in the Seattle area, you might be interested in swinging by.
Keynote: Software and the Concurrency Revolution
May 10, 2007
Online Game Development Conference
Seattle, Washington, USA

Although driven by the industry-wide hardware shift to multicore hardware architectures, concurrency is primarily a software revolution. We are now seeing the initial stages of the next major change in software development, as over the next few years the software industry brings concurrency pervasively into mainstream software development, just as it has done in the past for objects, garbage collection, generics and other technologies. This talk summarizes the issues involved, gives an overview of the impact, and describes what to expect over the coming decade.
Talks & Events This Spring
The spring is springing, the birds are singing, and it’s airplane time again (or, still). In the next two months I’m going to be speaking at several events in North America, Europe, and Asia. Here are two public events where I’ll be giving talks — I’m looking forward to seeing many of you there!
Machine Architecture:
Things Your Programming Language Never Told You
April 14, 2007
ACCU 2007, Oxford, United Kingdom

High-level languages insulate the programmer from the machine. That’s a wonderful thing — except when it obscures the answers to the fundamental questions of “What does the program do?” and “How much does it cost?” Programmers are consistently surprised at what simple code actually does and how expensive it can be, because of being unaware of the complexity of the machine on which the program actually runs. This talk examines the “real meanings” and “true costs” of the code we write and run especially on commodity and server systems, by delving into the performance effects of bandwidth vs. latency limitations, the ever-deepening memory hierarchy, the changing costs arising from the hardware concurrency explosion, memory model effects all the way from the compiler to the CPU to the chipset to the cache, and more — and what you can do about them.
Keynote: Software and the Concurrency Revolution
April 17, 2007
Intel EMEA Software Conference, Lisbon, Portugal

Although driven by the industry-wide hardware shift to multicore hardware architectures, concurrency is primarily a software revolution. We are now seeing the initial stages of the next major change in software development, as over the next few years the software industry brings concurrency pervasively into mainstream software development, just as it has done in the past for objects, garbage collection, generics and other technologies. This talk summarizes the issues involved, gives an overview of the impact, and describes what to expect over the coming decade.
Ratzenberger on the Manual Arts
When someone I’ve just met asks me what I do for a living and I say I work in software, I sometimes hear wistful responses like, "Oh, that’s cool and it must be a lot of fun. All I do is…" followed by carpentry, plumbing, teaching, farming, or another occupation or trade.
To me, that’s backwards: Any of those things is at least as important, and at least as worthy of respect and appreciation, as what we technologists do. I usually respond by saying so, and adding: "When the power goes out, I’m pretty much useless; worse still, everything I’ve ever made stops working. What you make still keeps working when the power is out."
I can make a pretty good argument that the non-technology skills are more valuable/applicable, but the more important thing is that all skills are deserving of respect and appreciation. Technology has done wonders, and it’s super fun to be involved in that and to have our advances and skills recognized; after all, software has helped build the tools we use to make other real things, including cars and buildings and spacecraft. But it’s unfortunate that people who focus too heavily on the so-called "higher technology" skills sometimes forget, or devalue, the people and skills that built their house, their couch, and their car. Nearly everyone has skills and personal value worth appreciating.
John Ratzenberger makes a similar point:
"The manual arts have always taken precedence over the fine arts. I realized that there is no exception to that rule. You can’t name a fine art that isn’t dependent on a manual art. Someone’s got to build a guitar before Bruce Springsteen can go to work. Someone had to build a ceiling before Michelangelo could go to work."
We live in a richly technologically enabled society, which we can and should enjoy. But when a natural disaster (or just a programming glitch) strikes, and we’re suddenly without power and Nothing Works Any More, we realize how fragile our comfort infrastructure can be.
A couple of months ago here in the northwest United States and western Canada, we had a windstorm that left 1.5 million people without power. Our house was dark and cold for just three nights; many of our friends were out for a week. It can be startling to find oneself unable to talk to anyone without physically going to them: Our cell phones didn’t work at our house, because many cell towers had no power. Many people were chagrined to discover that their landline telephones didn’t work either, because even though the phone lines were fine, the telephone (or base station for a cordless phone) that most people attach requires separate power. Skype wasn’t an option, needless to say, even while the laptop batteries held out. Thus cut off, we were reduced to walking, or to driving where trees didn’t block the roads and where we could find a gas station whose pump was still working (those pumps usually need electricity too).
Interestingly, the home phone would have been fine had we been using a retro 1950’s-era handset. We’ve now purchased one to keep around the house for next time. Sometimes simpler is better, even if you can’t see the caller ID.
After the storm, who was it who restored our comfort infrastructure, removed fallen trees, and repaired broken houses and fences? Primarily, it wasn’t us technology nerds — it was the electricians, the carpenters and the plumbers. How we do appreciate them! Fortunately, those people in return also appreciate the software, smartphones, PDAs, and other wonders that our industry produces that help them in their own work and leisure, which makes us feel good about being able to contribute something useful back, and so we all get to live in a mutual admiration society.
Thanks for the thought, John. Oh, and Cheers!
Migrant Technology Workers
Today, Slashdot is running an article on immigration. The discussion thread has some interesting notes (though, alas, the signal-to-noise seems to be noticeably lower than usual when reading at +5 — this seems to be quite a politically charged topic).
It reminds me of something that happened two years ago at the ACCU conference. I was on a panel that seemed innocuous enough, until one of the questions raised was immigration: ‘Is it a Good Thing or a Bad Thing that people come from other countries in order to do high-tech work here?’ I’d been peripherally aware that this debate was going on in America, was interested to observe that it was also going on in the UK, and was quite surprised at what a hot button it had become. There sure was a lot of discussion with fervent opinions in both directions.
I don’t have an opinion either way on the issue, but I just thought I’d share an interesting (I hope) observation about perspective, as someone who is from Canada and now lives and works in the United States and thinks well of both places and the people in them: It’s interesting to me that in America (as in Canada), I have seen concerns like this about immigration and people coming from elsewhere to perform domestic professional jobs. I can understand the feelings behind those concerns. On the other hand, during the 35 years I lived in Canada I saw equally frequent and vocal concerns about emigration and the "brain drain" of professionals leaving Canada for the United States — notably doctors and high-tech folks, but we did lose a lot of actors too. And Bill Shatner. (Just kidding; I like Shatner.)
Isn’t it interesting that when a skilled person moves, some people in the country they’re joining are worried that they’re arriving, and some people in the country that they’re leaving are equally worried that they’re going?
Just a thought. Back on Slashdot, though, my favorite comment was:
"Mr Gates did mention that 640K skilled immigrants ought to be enough for USA."
Maybe till January 18/19, 2038. :-)
Welcome to Silicon Miami: The System-On-a-Chip Evolution
A lot of people seem to have opinions about whether hardware trends are generally moving things on-chip or off-chip. I just saw another discussion about this on Slashdot today. Here’s part of the summary of that article:
"In the near future the Central Processing Unit (CPU) will not be as central anymore. AMD has announced the Torrenza platform that revives the concept op [sic] co-processors. Intel is also taking steps in this direction with the announcement of the CSI. With these technologies in the future we can put special chips (GPU’s, APU’s, etc. etc.) directly on the motherboard in a special socket. Hardware.Info has published a clear introduction to AMD Torrenza and Intel CSI and sneak peaks [sic] into the future of processors."
Sloppy spelling aside (and, sigh, a good example of why not to live on spell-check alone), is this a real trend?
Of course it is. But the exact reverse trend is also real, and I happen to think the reverse trend is more likely to dominate in the medium term. I’ll briefly explain why, and why I think the above is highlighting the wrong trend and making the wrong prediction.
Two Trends, Both Repeating Throughout (Computer) History
Those who’ve been watching, or simply using, CPUs for years have probably seen both of the following apposite [NB, this spelling is intentional] trends, sometimes at the same time for different hardware functions:
- Stuff moves off the CPU. For example, first the graphics are handled by the CPU; then they’re moved off to a separate GPU for better efficiency.
- Stuff moves onto the CPU. For example, first the FPU is a coprocessor; then it’s moved onto the CPU for better efficiency.
The truth is, the wheel turns. It can turn in different directions at the same time for different parts of the hardware. Just because we’re happening to look at a "move off the chip" moment for one set of components does not a trend make.
Consider why things move on or off the CPU:
- When the CPU is already pretty busy much of the time and doesn’t have much spare capacity, people start making noises about moving this or that off "for better efficiency," and they’re right.
- When the CPU is already pretty idle most of the time, or system cost is an issue, people start making the reverse noises "for better efficiency," and they’re right. (Indeed, if you read the Woz interview that I blogged about recently, you’ll notice how he repeatedly emphasizes his wonderful adventures in the art of the latter — namely, doing more with fewer chips. It led directly to the success of the personal computer, years before it would otherwise likely have happened. Thanks, Woz.)
Add to the mix that general-purpose CPUs by definition can’t be as efficient as special-purpose chips, even when they can do comparable work, and we can better appreciate the balanced forces in play and how they can tip one way or another at different times and for different hardware features.
What’s New or Different Now?
So now mix in the current sea change away from ever-faster uniprocessors and toward processors with many, but not as remarkably faster, cores. Will this sway the long-term trend toward on-processor designs or toward co-processor designs?
The first thing that might occur to us is that there’s still a balance of forces. Specifically, we might consider these effects that I mentioned in the Free Lunch paper:
- On the one hand, this is a force in favor of coprocessors, thus moving work off the CPU. A single core isn’t getting faster the way it used to, and we software folks are gluttons for CPU cycles and are always asking the hardware to do more stuff; after all, we hardly ever remove software features. Therefore for many programs CPU cycles are more dear, so we’ll want to use them for the program’s code as much as we can instead of frittering them away on other work. (This reasoning applies mainly to single-threaded programs and non-scaleable multi-threaded programs, of course.)
- On the other hand, this is also a force against coprocessors, for moving work onto the CPU. We’re now getting a bunch (and soon many bunches) of cores, not just one. Until software gets its act together and we start seeing more mainstream manycore-exploiting applications, we’re going to be enjoying a minor embarrassment of riches in terms of spare CPU capacity, and presumably we’ll be happy using those otherwise idle cores to do work that expensive secondary chips might otherwise do. At least until we have applications ready to soak up all those cycles.
So are the forces still in balance, as they have ever been? Are we just going to see more on-the-chip / off-the-chip cycles?
In part yes, but the above analysis is looking more at symptoms than at causes — the reasons why things are happening. The real point is more fundamental, and at the heart of why the free lunch is over:
- On the gripping hand, the fundamental reason why we’re getting so many cores on a chip is because CPU designers don’t know what to do with all those transistors. Moore’s Law is still happily handing out a doubling of transistors per chip every 18 months or so (and will keep doing that for probably at least another decade, thank you, despite recurring ‘Moore’s Law is dead!’ discussion threads on popular forums). That’s the main reason why we’re getting multicore parts: About five years ago, commodity CPU designers pretty much finished mining the "make the chip more complex to run single-threaded code faster" path that they had been mining to good effect for 30 years (there will be more gains there, but more incremental than exponential), and so we’re on the road to manycore instead.
But we’re also on the road to doing other things with all those transistors, besides just manycore. After all, manycore isn’t the only, or necessarily the best, use for all those gates. Now, I said "all" deliberately: To be sure you don’t get me wrong, let me emphasize that manycore is a wonderful new world and a great use for many of those transistors and we should be eagerly excited about that; it’s just not the only or best use for all of those transistors.
What Will Dominate Over the Next Decade? More On-CPU Than Off-CPU
It’s no coincidence that companies like AMD are buying companies like ATI. I’m certainly not going out on much of a limb to predict the following:
- Of course we’ll see some GPUs move on-chip. It’s a great way to soak up transistors and increase bandwidth between the CPU and GPU. Knowing how long CPU design/production pipelines are, don’t expect to see this in earnest for about 3-5 years. But do expect to see it.
- Of course we’ll see some NICs move on-chip. It’s a great way to soak up transistors and increase bandwidth between the CPU and NIC.
- Of course we’ll see some [crypto, security checking, etc., and probably waffle-toasting, and shirt ironing] work move on-chip.
Think "system on a chip" (SoC). By the way, I’m not claiming to make any earth-shattering observation here. All of this is based on public information and/or fairly obvious inference, and I’m sure it has been pointed out by others. Much of it already appears on various CPU vendors’ official roadmaps.
There are just too many transistors available, and located too conveniently close to the CPU cores, to not want to take advantage of them. Just think of it in real estate terms: It’s all about "location, location, location." And when you have a low-rent location (those transistors keep getting cheaper) in prime beachfront property (on-chip), of course there’ll be a mad rush to buy up the property and a construction boom to build high-rises on the beachfront (think silicon Miami) until the property values reach supply-demand equilibrium again (we get to balanced SoC chips that evenly spend those enormous transistor budgets, the same way we’ve already reached balanced traditional systems). It’s a bit like predicting that rain will fall downward. And it doesn’t really matter whether we think skyscrapers on the beach are aesthetically pleasing or not.
Yes, the on-chip/off-chip wheel will definitely keep turning. Don’t quote this five years from now and say it was wrong by pointing at some new coprocessor where some work moved off-chip; of course that will happen too. And so will the reverse. That both of those trends will continue isn’t really news, at least not to anyone who’s been working with computers for the past couple of decades. It’s just part of the normal let’s-build-a-balanced-system design cycle as software demands evolve and different hardware parts progress at different speeds.
The news lies in the balance between the trends: The one by far most likely to dominate over the next decade will be for now-separate parts to move onto the CPU, not away from it. Pundit commentary notwithstanding, the real estate is just too cheap. Miami, here we come.
Rico Mariani Interviewed on Behind the Code
Rico Mariani is a performance guru at Microsoft and a wonderful person. After a number of years on the Visual C++ team since 1.0, he went to MSN and then to CLR land where he now beats the "measure! don’t ship bad performance! measure!" drum to much good effect. It’s a pleasure to work with him. Interestingly, Rico is one of the handful of people I polled for "what should I cover" in the new Machine Architecture talk I’m currently writing for the seminar Bjarne and I are doing next month. In the current draft of that talk, I have two slides titled "Mariani’s Methodology (or, Rico’s Rules)" — I’m sure he’ll be mortified at the capitals.
Rico has just been interviewed on Channel 9. Recommended viewing. It’s sprinkled throughout with everything from useful career advice (whether you’re just starting out or looking for a next project), to interesting insights into early Visual C++ and Web product development, to how to ship at high performance and high quality in any language. Rico’s blog is also a great resource — and don’t get hung up when he talks mostly about .NET managed code, because the principles generalize to performance in any language and on any system.
How do friends congratulate the man? The pictures tell the story. Good on you, Rico!
On a sad note: One of the five people previously interviewed on the Behind the Code series is Technical Fellow and Turing Award winner Jim Gray. Jim is still missing after disappearing in the Pacific on January 28. Widely known and loved, he is greatly missed. Our thoughts are with Jim’s family.