Trip Report: June 2008 ISO C++ Standards Meeting

The ISO C++ committee met in Sophia Antipolis, France on June 6-14. You can find the minutes here (note that these cover only the whole-group sessions, not the breakout technical sessions where we spend most of the week).

Here’s a summary of what we did, with links to the relevant papers to read for more details, and information about upcoming meetings.

Highlights: Complete C++0x draft coming in September

The biggest goal entering this meeting was to make C++0x feature-complete and stay on track to publish a complete public draft of C++0x this September for international review and comment — in ISO-speak, an official Committee Draft or CD. We are going to achieve that, so the world will know the shape of C++0x in good detail this fall.

We’re also now planning to have two rounds of international comment review instead of just one, to give the world a good look at the standard and two opportunities for national bodies to give their comments, the first round starting after our September 2008 meeting and the second round probably a year later. However, the September 2008 CD is “it”, feature-complete C++0x. The only changes expected to be made between that CD and the final International Standard are bug fixes and clarifications. It’s helpful to think of a CD as a feature-complete beta.

Coming into the June meeting, we already had a nearly-complete C++0x internal working draft — most features that will be part of C++0x had already been “checked in.” Only a few were still waiting to become stable enough to vote in, including initializer lists, range-based for loops, and concepts.

Of these, concepts is the long-pole feature for C++0x, which isn’t surprising given that it’s the biggest new language feature we’re adding to Standard C++. A primary goal of the June meeting, therefore, was to make as much progress on concepts as possible, and to see if it would be possible to vote that feature into the C++0x working draft at this meeting. We almost did that, thanks to a lot of work not only in France but also at smaller meetings throughout the winter and spring: For the first time, we ended a meeting with no known issues or controversial points in the concepts standardese wording, and we expect to “check in” concepts into the working draft at the next meeting in September. That both clears the way for us to publish a complete draft then and motivated the plan to do two rounds of public review rather than one, just to make sure the standard gets enough “bake time” in its complete form.

Next, I’ll summarize some of the major features voted into the draft at the June meeting.

Initializer lists

C++0x initializer lists accomplish two main objectives:

  • Uniformity: They provide a uniform initialization syntax you can use consistently everywhere, which is especially helpful when you write templates.
  • Convenience: They provide a general-purpose way of using the C initializer syntax for all sorts of types, notably containers.

See N2215 for more on the motivation and original design, and N2672 and N2679 for the final standardese.

Example: Initializing aggregates vs. classes

It’s convenient that we can initialize aggregates like this:

struct Coordinate1 {
  int i;
  int j;
  //…
};

Coordinate1 c1 = { 1, 2 };

but the syntax is slightly different for classes with constructors:

class Coordinate2 {
public:
  Coordinate2( int i, int j );
  // …
};

Coordinate2 c2( 1, 2 );

In C++0x, you can still do all of the above, but initializer lists give us a regular way to initialize all kinds of types:

Coordinate1 c1 = { 1, 2 };
Coordinate2 c2 = { 1, 2 }; // legal in C++0x

Having a uniform initialization syntax is particularly helpful when writing template code, so that the template can easily work with a wide variety of types.
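
To make the template point concrete, here is a minimal sketch. The names PointAggregate, PointClass, and make_point are hypothetical stand-ins for the Coordinate types above, and the code assumes the brace syntax as it was eventually standardized:

```cpp
#include <cassert>

// Hypothetical aggregate: no constructors, like Coordinate1 above.
struct PointAggregate {
  int i;
  int j;
};

// Hypothetical class with a constructor, like Coordinate2 above.
class PointClass {
public:
  PointClass( int i, int j ) : i_( i ), j_( j ) {}
  int i() const { return i_; }
  int j() const { return j_; }
private:
  int i_;
  int j_;
};

// One template body serves both kinds of type: the braces perform
// aggregate initialization for PointAggregate and call the
// constructor for PointClass.
template<typename T>
T make_point( int i, int j ) {
  return T{ i, j };
}

// Self-check showing that both instantiations produce the same values.
void check_make_point() {
  PointAggregate a = make_point<PointAggregate>( 1, 2 );
  PointClass c = make_point<PointClass>( 1, 2 );
  assert( a.i == 1 && a.j == 2 );
  assert( c.i() == 1 && c.j() == 2 );
}
```

Without the brace syntax, the template would need T( i, j ) for the class and a different form for the aggregate, so it could not be written once for both.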

Example: Initializing arrays vs. containers

One place where the lack of uniform initialization has been particularly annoying (at least to me, when I write test harnesses to exercise the code I show in articles and talks) is when initializing a container with some default values. Don’t you hate it when you want to create a container initialized to some known values, and if it were an array you could just write:

string a[] = { "xyzzy", "plugh", "abracadabra" };

but if it’s a container like a vector, you have to default-construct the container and then push every entry onto it individually:

// Initialize by hand today
vector<string> v;
v.push_back( "xyzzy" );
v.push_back( "plugh" );
v.push_back( "abracadabra" );

or, even more embarrassingly, initialize an array first for convenience and then construct the vector as a copy of the array, using the vector constructor that takes a range as an iterator pair:

// Initialize via an array today
string a[] = { "xyzzy", "plugh", "abracadabra" }; // put it into an array first
vector<string> v( a, a+3 ); // then construct the vector as a copy

Arrays are weaker than containers in nearly every other way, so it’s annoying that they get this unique convenience just because of their having been built into the language since the early days of C.

The lack of convenient initialization has been even more irritating with maps:

// Initialize by hand today
map<string,string> phonebook;
phonebook[ "Bjarne Stroustrup (cell)" ] = "+1 (212) 555-1212";
phonebook[ "Tom Petty (home)" ] = "+1 (858) 555-9734";
phonebook[ "Amy Winehouse (agent)" ] = "+44 20 74851424";

In C++0x, we can initialize any container with known values as conveniently as arrays:

// Can use initializer list in C++0x
vector<string> v = { "xyzzy", "plugh", "abracadabra" };
map<string,string> phonebook =
  { { "Bjarne Stroustrup (cell)", "+1 (212) 555-1212" },
    { "Tom Petty (home)", "+1 (858) 555-9734" },
    { "Amy Winehouse (agent)", "+44 20 74851424" } };
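
The same convenience is open to your own types. The following is a sketch only: IntBag is a hypothetical class, and it assumes the std::initializer_list mechanism in the form that eventually shipped in the standard library:

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical user-defined container, for illustration only.
class IntBag {
public:
  // A constructor taking std::initializer_list lets the type be
  // initialized with a braced list, just like the standard containers.
  IntBag( std::initializer_list<int> values ) : items_( values ) {}

  std::size_t size() const { return items_.size(); }

  int sum() const {
    int total = 0;
    for( int x : items_ ) total += x;
    return total;
  }

private:
  std::vector<int> items_;
};

// Builds a bag with the array-style syntax and checks its contents.
void check_int_bag() {
  IntBag bag = { 1, 2, 99 };
  assert( bag.size() == 3 );
  assert( bag.sum() == 102 );
}
```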

As a bonus, a uniform initialization syntax even makes arrays easier to deal with. For example, the array initializer syntax didn’t support arrays that are dynamically allocated or class member arrays:

// Initialize dynamically allocated array by hand today
int* a = new int[3];
a[0] = 1;
a[1] = 2;
a[2] = 99;

// Initialize member array by hand today
class X {
  int a[3];
public:
  X() { a[0] = 1; a[1] = 2; a[2] = 99; }
};

In C++0x, the array initialization syntax is uniformly available in these cases too:

// C++0x

int* a = new int[3] { 1, 2, 99 };

class X {
  int a[3];
public:
  X() : a{ 1, 2, 99 } {}
};

More concurrency support

Last October, we already voted in a state-of-the-art memory model, atomic operations, and a threading package. That covered the major things we wanted to see in this standard, but a few more were still in progress. Here are the major changes we made this time that relate to concurrency.

Thread-local storage (N2659): To declare a variable which will be instantiated once for each thread, use the thread_local storage class. For example:

class MyClass {
  // …
private:
  thread_local static X tlsX; // a thread-local class member must also be static
};
void SomeFunc() {
  thread_local Y tlsY;
  // …
}
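
As a rough illustration of "once for each thread", here is a small sketch. The names tls_calls, bump, and count_on_new_thread are made up for this example, and it assumes the thread library and lambda syntax as eventually standardized:

```cpp
#include <cassert>
#include <thread>

// Hypothetical per-thread counter: every thread gets its own copy,
// starting from 0.
thread_local int tls_calls = 0;

// Increments the calling thread's copy and returns its new value.
int bump() { return ++tls_calls; }

// Calls bump() three times on a fresh thread and returns that
// thread's final count.
int count_on_new_thread() {
  int result = 0;
  std::thread worker( [&result] {
    bump();
    bump();
    result = bump();
  } );
  worker.join();
  return result;
}

// Shows that the worker's increments never touch the caller's copy.
void check_thread_local() {
  int before = tls_calls;
  assert( count_on_new_thread() == 3 );
  assert( tls_calls == before );
}
```

Each new thread sees a counter that starts at zero, regardless of how many times other threads have called bump().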

Dynamic initialization and destruction with concurrency (N2660) handles two major cases:

  • Static and global variables can be concurrently initialized and destroyed if you try to access them on multiple threads before main() begins. If more than one thread could initialize (or use) the variable concurrently, however, it’s up to you to synchronize access.
  • Function-local static variables will have their initialization automatically protected; while one thread is initializing the variable, any other threads that enter the function and reach the variable’s declaration will wait for the initialization to complete before they continue on. Therefore you don’t need to guard initialization races or initialization-use races, and if the variable is immutable once constructed then you don’t need to do any synchronization at all to use it safely. If the variable can be written to after construction, you do still need to make sure you synchronize those post-initialization use-use races on the variable.
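
The function-local static case can be sketched as follows. Config, get_config, and the constructions counter are hypothetical names used only to observe the behavior, and the code assumes the thread library as eventually standardized:

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Counts how many times a Config object is constructed.
std::atomic<int> constructions{ 0 };

// Hypothetical type whose constructor records each construction.
struct Config {
  Config() { ++constructions; }
  int value = 42;
};

// The local static is initialized by whichever thread gets here first;
// other threads reaching the declaration wait until it is complete.
const Config& get_config() {
  static const Config instance;
  return instance;
}

// Calls get_config() from n threads at once and returns the total
// number of constructions observed afterwards.
int construct_from_many_threads( int n ) {
  std::vector<std::thread> threads;
  for( int i = 0; i < n; ++i )
    threads.emplace_back( [] { (void)get_config().value; } );
  for( auto& t : threads ) t.join();
  return constructions.load();
}
```

Because instance is immutable after construction, the callers need no locking of their own; the count stays at one however many threads race into get_config().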

Thread safety guarantees for the standard library (N2669): The upshot is that the answer to questions like “what do I need to do to use a vector<T> v or a shared_ptr<U> sp thread-safely?” is now explicitly “same as any other object: if you know that an object like v or sp is shared, you must synchronize access to it.” Only a very few objects need to guarantee internal synchronization; the global allocator is one of those, though, and so it gets that stronger guarantee.
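
In practice, "same as any other object" means something like the following sketch, where a mutex guards every access to a vector shared by two threads. The function name and element counts are arbitrary choices for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Two threads append to one shared vector. The vector does no locking
// of its own, so every access goes through the mutex.
std::size_t fill_from_two_threads() {
  std::vector<int> values;   // shared between both threads
  std::mutex values_mutex;   // guards every access to values

  auto append_range = [&]( int first, int count ) {
    for( int i = 0; i < count; ++i ) {
      std::lock_guard<std::mutex> lock( values_mutex );
      values.push_back( first + i );
    }
  };

  std::thread a( append_range, 0, 1000 );
  std::thread b( append_range, 1000, 1000 );
  a.join();
  b.join();
  return values.size();   // both threads have finished; no lock needed
}

// With the lock in place, no element is lost.
void check_shared_vector() {
  assert( fill_from_two_threads() == 2000 );
}
```

Removing the lock_guard would make the two push_back calls race on the same object, which is exactly the case the guarantee leaves to the caller.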

The “other features” section below also includes a few smaller concurrency-related items.

More STL algorithms (N2666)

We now have the following new STL algorithms. Some of them fill holes (e.g., “why isn’t there a copy_if or a counted version of algorithm X?”) and others provide handy extensions (e.g., “why isn’t there an all_of or any_of?”).

  • all_of( first, last, pred )
    returns true iff all elements in the range satisfy pred
  • any_of( first, last, pred )
    returns true iff any element in the range satisfies pred
  • copy_if( first, last, result, pred )
    the “why isn’t this in the standard?” poster child algorithm
  • copy_n( first, n, result )
    copy for a known number of n elements
  • find_if_not( first, last, pred )
    returns an iterator to the first element that does not satisfy pred
  • iota( first, last, value )
    for each element in the range, assigns value and increments value “as if by ++value”
  • is_partitioned( first, last, pred )
    returns true iff the range is already partitioned by pred; that is, all elements that satisfy pred appear before those that don’t
  • none_of( first, last, pred )
    returns true iff none of the elements in the range satisfies pred
  • partition_copy( first, last, out_true, out_false, pred )
    copy the elements that satisfy pred to out_true, the others to out_false
  • partition_point( first, last, pred )
    assuming the range is already partitioned by pred (see is_partitioned above), returns an iterator to the first element that doesn’t satisfy pred
  • uninitialized_copy_n( first, n, result )
    uninitialized_copy for a known number of n elements
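
Here is a short sketch exercising several of these algorithms together. The helper names (is_even, one_to, evens_of, check_algorithms) are invented for the example, and the headers are those used in the final standard library, where iota lives in <numeric> and the rest in <algorithm>:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <vector>

bool is_even( int x ) { return x % 2 == 0; }

// Returns the values 1..n, filled in by iota.
std::vector<int> one_to( std::size_t n ) {
  std::vector<int> v( n );
  std::iota( v.begin(), v.end(), 1 );
  return v;
}

// Returns only the even elements of v, selected by copy_if.
std::vector<int> evens_of( const std::vector<int>& v ) {
  std::vector<int> out;
  std::copy_if( v.begin(), v.end(), std::back_inserter( out ), is_even );
  return out;
}

void check_algorithms() {
  const std::vector<int> v = one_to( 6 );   // 1 2 3 4 5 6

  assert( std::any_of( v.begin(), v.end(), is_even ) );
  assert( !std::all_of( v.begin(), v.end(), is_even ) );
  assert( std::none_of( v.begin(), v.end(), []( int x ) { return x > 6; } ) );
  assert( *std::find_if_not( v.begin(), v.end(), is_even ) == 1 );

  // Split into evens and odds in one pass.
  std::vector<int> even, odd;
  std::partition_copy( v.begin(), v.end(),
                       std::back_inserter( even ),
                       std::back_inserter( odd ), is_even );
  assert( even.size() == 3 && odd.size() == 3 );

  // A range that already has all evens before all odds.
  const std::vector<int> parted = { 2, 4, 6, 1, 3, 5 };
  assert( std::is_partitioned( parted.begin(), parted.end(), is_even ) );
  assert( *std::partition_point( parted.begin(), parted.end(), is_even ) == 1 );
}
```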

Other approved features

  • N2435 Explicit bool for smart pointers
  • N2514 Implicit conversion operators for atomics
  • N2657 Local and unnamed types as template arguments
  • N2658 Constness of lambda functions
  • N2667 Reserved namespaces for POSIX
  • N2670 Minimal support for garbage collection and reachability-based leak detection 
  • N2661 Time and duration support
  • N2674 shared_ptr atomic access
  • N2678 Error handling specification for Chapter 30 (Threads)
  • N2680 Placement insert for standard containers

Next Meeting

Although we just got back from France, the next meeting of the ISO C++ standards committee is already coming up fast:

The meetings are public, and if you’re in the area please feel free to drop by.

28 thoughts on “Trip Report: June 2008 ISO C++ Standards Meeting”

  1. GCC’s trunk got some commits regarding its implementation of initializer lists. Even libstdc++-related containers are getting initializer lists ctors, and other stuff. I presume that it’ll be in gcc-4.4.

  2. Hey, thanks for the detailed explanation Korval, much appreciated. That response clearly shows why we need it. Now it’s just back to Herb to explain when we’re going to get it in VC10! ;)

  3. On uniform initialization:

    The reason uniform initialization uses {} rather than parentheses is that it must be UNIFORM. And by “uniform”, I mean it must have the same look and behavior everywhere. C++ does not have a method of initialization that works 100% everywhere. That’s the purpose of the uniform initialization C++0x feature.

    The problem with parentheses is that they already have a very important meaning: grouping of expressions. It is possible to do the following:

    ObjectType SomeFunction()
    {
      return {50, 32};
    }

    This will initialize an ObjectType instance with the parameters (or initializer list, depending on construction) given. C/C++ is very clear and specific on the meaning of:

    return (50, 32);

    It already means something, and it’s important that C++0x not needlessly make it mean something very different.

    The expression (50, 32) is an expression containing 2 integers that are modified by the comma operator. The expression {50, 32} is an initializer list, which can be used to initialize object instances.

    For the same reason, this:

    ObjectType a({5, 3}); //#1

    is different from this:

    ObjectType a{5, 3}; //#2

    And here’s how. Let’s say ObjectType looked like:

    struct ObjectType
    {
      ObjectType(std::vector<int> &&vec);
      ObjectType(int first, int size);
    };

    These are two different constructors. Line #1 will call constructor 1 and line #2 will call constructor 2. And here’s why.

    The parentheses in line #1 specifically mean that you are constructing an ObjectType with 1 parameter. Because you are not using uniform initialization (parentheses turn off the rules of uniform initialization), it defaults to regular initialization. So basically, you have a parameter list, and the one parameter is an initializer list. C++0x’s initializer list rules say that an initializer list can be used to create a temporary of an appropriate type if no overload takes an initializer list directly. That is the case, and std::vector will be constructible from an initializer list. Therefore, line #1 will call the first constructor with a newly minted std::vector temporary that contains an array of 2 values.

    Line #2, because it uses {} notation, is a case of uniform initialization, so those rules apply. Those rules state (among other things) that if there is no explicit initializer list constructor on the type, the individual elements of the list are considered to be parameters. Therefore, it will call the second constructor; it will NOT convert the list into a std::vector.

    BTW, uniform initialization lists can be heterogeneous. If ObjectType is defined as such:

    struct ObjectType
    {
      ObjectType(int theInt, std::string &&strString, float fValue);
    };

    We can create one with this:

    ObjectType a{5, "Hi!", 56.0f};

    This list cannot be converted into an initializer list, but it can be used as parameters to the ObjectType constructor. Standard overload resolution rules apply. Additionally:

    ObjectType a{5, {'H', 'i', '!'}, 56.0f};

    This will work just as well.

    This is true because std::string will have an initializer list constructor that takes a list of characters. So either "Hi!" or {'H', 'i', '!'} will implicitly be converted into a std::string temporary.

    Pretty cool, yes? I would suggest reading Stroustrup’s (PDF) paper on the subject. http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2008/n2532.pdf

    On usefulness:

    No, there’s not much point to using something like:

    int* a = new int[3]{ 1, 2, 99 };

    That’s simply a thing you can do; it’s not necessarily a particularly useful case. One of the real powers of initializer lists and uniform initialization is this: less error.

    You have a function defined as such:

    ReturnType FuncName()
    {
      //…
      return ReturnType(5, 32);
    }

    Now, if you want to change the return type, you must track down every return statement and adjust it accordingly. With uniform initialization, you can do:

    ReturnType FuncName()
    {
      //…
      return {5, 32};
    }

    Which lets the C++ compiler determine what type to use based on the function declaration. So now, all you have to do is change it once (twice for a prototype) and everyone’s happy.

    The same goes for function calls:

    void SomeFunc(Type1 &&t, const Type2 &otherT);

    SomeFunc({4, 2}, {5, 4});

    This will automatically create the parameters ‘t’ and ‘otherT’ for the function call (as long as SomeFunc’s parameter types are known, i.e., not unspecified template parameters). Now, SomeFunc’s parameter list can change without having to change the calling code, so long as the types that are changed take the same constructors. So if Type1 were a std::vector and we decided to use a std::list, not much would change for a user who used initializer lists.

  4. I really didn’t understand why would I ever need something like:

    int* a = new int[3]({ 1, 2, 99 });

    After all, if I know the values in the compile time, I am not gonna be using dynamic creation with new. It’s just unnecessary run-time burden…
    Maybe this example is just not explaining the real motivation and benefit !

  5. “concepts” – This has always appeared to be a misnomer to me. The “concepts” that I’ve seen are all requirements on types, for example “assignable”. Are there any “concepts” that truly represent a concept that is not a requirement? If not, the name “concept” is too general; “requirement” would fit better.

  6. thread_local X tlsX; ??

    Herb, I hope you aren’t backtracking on Hungarian Notation now that you work for Microsoft. Say it ain’t so…

  7. What about the BSI position on concepts (that they are not mature enough, and that the next standard has to have full library support for them as well)?

  8. I very much agree with Tadeu that:

    int* a = new int[3]({ 1, 2, 99 });

    seems cleaner to me, but best we don’t give them any excuse to delay VC10 or C++0x any longer! :)

  9. Will C++0x core language support something like Python generators (a very nice syntax), or something like C# enumerables and LINQ (possibly with a “yield return” statement)?

    It is a fundamental feature for a 2010+ language and will merge very well with lambdas and STL algorithms! ;)

  10. Also… about the new STL algorithms, they are very nice… but will they all also have Range versions? That is, instead of taking (begin, end), they could take only one generic Range concept, e.g.:

    all_of(range, pred);
    for_each(range, pred);

    The (begin, end) syntax seems deprecated with concepts, now that a native array can also be seen as a Range. A pair of pointers would be an exception, rather than the norm, but we could also have:

    int a[5] = {0, 1, 2, 3, 4};
    for_each(a, pred);
    for_each(make_pair(&a[0], &a[5]), pred);

  11. I don’t like the new int[3] { 1, 2, 99 } syntax. Wouldn’t it be better to have parentheses, as in all initializations? E.g.:

    // C++0x

    int* a = new int[3]({ 1, 2, 99 });

    class X {
      int a[3];
    public:
      X() : a({ 1, 2, 99 }) {}
    };

    This seems better for humans to parse, and also removes the ugly {…}{…} syntax that seems like a double block or something like that…

  12. If you re-read what I said in 10 above, you will see that I was not arguing or explaining what thread_local means. I was addressing your question about what applying static to a thread_local declaration might mean, i.e. the use of static in this context is to reduce the thread_local variable’s visibility (to file scope), and NOT about controlling the object’s lifetime or thread association, so in that context it still seems meaningful to me to be applicable. :)

  13. No, thread_local would have to relate to lifetime. Every thread has its own copy of a thread_local variable. Therefore, a thread_local variable would have to be created after its associated thread is and destroyed before the thread is.

  14. I would think that thread_local and static are allowed as in this context static would likely be relating to visibility not variable lifetime.

  15. On a completely different topic, is the word “static” really necessary in declarations like the following?

    thread_local static X tlsX;

    It seems to me that “thread_local” and “static” are mutually exclusive. Static means that there is only one variable of that name ever, and “thread_local” means that there is one per thread. Therefore it makes more sense to just write

    thread_local X tlsX;

  16. “Optional GC” is a very tricky phrase here: it sounds catchy but can be very bad in practice. The author of D gave a smart example at http://www.digitalmars.com/d/2.0/cpp0x.html: “The problem with an optional garbage collector is that in order to write general purpose libraries, one must assume that there is no garbage collector”.

    I will add another case here: suppose I reuse two libraries in my project, where one of them is GC-aware and the other is GC-agnostic, and my project does not use GC at all. In this case, that “built-in but switchable” GC has to be hellishly smart.

    For me this GC chanting is a waste of time. Better to spend it on something else, like simplifying some language constructs.

    Btw, I bet you didn’t know that Python offers “optional GC” (http://docs.python.org/lib/module-gc.html), except no one is brave enough to mess with it ;)

    PS: initializer lists are excellent (excluding containers, where I find the trick with operator, much more readable :P)

  17. GC if it’s ever in the standard will be optional. I believe that has been stated here. So you won’t be forced to hire a nanny!

  18. Garbage collection removes one of the major things that I love about C++ that I can’t get in a lot of other languages – scoped destructors.

    ObjC 2.0 came out with GC (as a compiler option) and it’s a very good thing to have, don’t get me wrong. However, the apps I write in ObjC 2.0 aren’t the same class of apps I write in C++. I still reserve C++ for the enterprise level, high performance class of apps. The last thing I’m willing to give up is deterministic control over the lifecycle of my resources (within the bounds of compiler optimization, that is). Going with a GC means giving up *a lot*. It’s not just about cleaning up after a sloppy programmer. In my mind that’s not at all what it’s about and never has been.

    There are many different languages out there and most have a place in the world. C++ belongs at the core, powering other languages and operating systems, and doing the true heavy lifting of the computing world. But you won’t find me writing a web framework in it or a GUI. In those places, I’m overjoyed to have a GC.

    So if it’s going to be done, ever, then it had better have a compiler switch with zero GC residue in the object code when it’s set to “off”.

    So, Herb, just make sure the committee understands that, will ya? ;)

  19. So Nanny the Garbage Collector isn’t going to come along and mop up after poor programmers? I reckon that’s a good thing, to be honest; nanny encourages them to remain poor. There are so many features in C++ that can be misused into serious errors that I fear “fixing” one will merely encourage poor programmers not to watch out for the others. After all, the ethos of C, inherited in C++, is that it gives you “enough rope to shoot yourself in the foot” (to quote Holub); GC, to my mind, simply doesn’t fit. I don’t want poor programmers to wallow in nanny’s flannel, I want them to improve.

    I wouldn’t mind so much if such a feature had no impact on performance, but I’ve only ever seen Nanny the Garbage Collector strop like a fifty year old having a tantrum in a village pub. She interrupts everything. No, send her back to the script kiddies’ milk bar where her flannel and her tantrums fit a treat.

  20. Brian,

    Your question boils down to “what is wrong with doing something manually when the computer can do it transparently for me?”

    Nobody forces you to use features. Just ‘cos they are in a language doesn’t mean you have to use them. Although many macho coders take the presence of an esoteric feature as an opportunity to preen.

    And I think it’s funny that you regard C++ as a simple language. It’s many things, but simple ain’t one of them.

  21. I really have to disagree about the garbage collection. What is so wrong about “registering” an object to be garbage collected? I’ll take a simpler language with add ons any day over some forced upon bloat. I’m not saying c++ is the greatest thing since sliced bread, just that I’m happy they aren’t forcing unnecessary features.

  22. Sadly, we’ll have to wait till the next draft for garbage collection support. There are some c-libraries out there but nothing c++ that calls constructors and destructors. There are other gotchas as well. Garbage collection really needs to be a language feature and not just a library.
