[Updated Apr 3 to note automatic deduction of return type.]
The ISO C++ committee met in Bellevue, WA, USA from February 24 to March 1, 2008. Here’s a quick summary of what we did (with links to the relevant papers to read for more details), and information about upcoming meetings.
Lambda functions and closures (N2550)
For me, easily the biggest news of the meeting was that we voted lambda functions and closures into C++0x. I think this will make STL algorithms an order of magnitude more usable, and it will be a great boon to concurrent code where it’s important to be able to conveniently pass around a piece of code like an object, to be invoked wherever the program sees fit (e.g., on a worker thread).
C++ has always supported this via function objects, and lambdas/closures are merely syntactic sugar for writing function objects. But, though “merely” a convenience, they are an incredibly powerful convenience for many reasons, including that they can be written right at the point of use instead of somewhere far away.
Example: Write collection to console
For example, let’s say you want to write each of a collection of Widgets to the console.
// Writing a collection to cout, in today’s C++, option 1:
for( vector<Widget>::iterator i = w.begin(); i != w.end(); ++i )
    cout << *i << " ";
Or we can leverage that C++ already has a special-purpose ostream_iterator type that does what we want:
// Writing a collection to cout, in today’s C++, option 2:
copy( w.begin(), w.end(),
      ostream_iterator<const Widget>( cout, " " ) );
In C++0x, just use a lambda that writes the right function object on the fly:
// Writing a collection to cout, in C++0x:
for_each( w.begin(), w.end(),
          []( const Widget& w ) { cout << w << " "; } );
(Usability note: The lambda version was the only one I wrote correctly the first time as I tried these examples on compilers to check them. ‘Nuff said. <tease type="shameless"> Yes, that means I tried it on a compiler. No, I’m not making any product feature announcements about VC++ version 10. At least not right now. </tease>)
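To try the three variants side by side, here is a minimal sketch. The Widget type and its operator<< are hypothetical stand-ins (the post does not define them), and the helper names are made up for illustration. Each function writes to a local string stream instead of cout so the results can be compared. Because that stream is a local variable, the lambda has to capture it, whereas cout in the example above is a global and needs no capture.

```cpp
#include <algorithm>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the post's Widget: just an id that can be streamed.
struct Widget {
    int id;
};

std::ostream& operator<<( std::ostream& os, const Widget& w ) {
    return os << "Widget#" << w.id;
}

// Option 1: explicit iterator loop.
std::string write_with_loop( const std::vector<Widget>& w ) {
    std::ostringstream out;
    for( std::vector<Widget>::const_iterator i = w.begin(); i != w.end(); ++i )
        out << *i << " ";
    return out.str();
}

// Option 2: copy into an ostream_iterator.
std::string write_with_copy( const std::vector<Widget>& w ) {
    std::ostringstream out;
    std::copy( w.begin(), w.end(), std::ostream_iterator<Widget>( out, " " ) );
    return out.str();
}

// Option 3: for_each with a lambda; the local stream is captured by reference.
std::string write_with_lambda( const std::vector<Widget>& w ) {
    std::ostringstream out;
    std::for_each( w.begin(), w.end(),
                   [&out]( const Widget& x ) { out << x << " "; } );
    return out.str();
}
```

All three helpers should produce the same text for the same input; only the amount of ceremony differs.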
Example: Find element with Weight() > 100
For another example, let’s say you want to find an element of a collection of Widgets whose weight is greater than 100. Here’s what you might write today:
// Calling find_if using a functor, in today’s C++:
// outside the function, at namespace scope
class GreaterThan {
    int weight;
public:
    GreaterThan( int weight_ )
        : weight(weight_) { }
    // const: calling the predicate does not change its state
    bool operator()( const Widget& w ) const {
        return w.Weight() > weight;
    }
};
// at point of use
find_if( w.begin(), w.end(), GreaterThan(100) );
At this point some people will point out that (a) we have C++98 standard binder helpers like bind2nd or (b) we have Boost’s bind and lambda libraries. They don’t really help much here, at least not if you’re interested in having the code be readable and maintainable. If you doubt it, try and see.
In C++0x, you can just write:
// Calling find_if using a lambda, in C++0x:
find_if( w.begin(), w.end(),
[]( Widget& w ) { return w.Weight() > 100; } );
Ah. Much better.
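Here is a self-contained sketch of both versions for anyone who wants to compile them. The Widget class below is an assumption (the post only tells us it has a Weight() member), and the two first_heavy_* helpers are hypothetical names added so the result of find_if can be inspected.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical Widget with the Weight() accessor used in the post.
class Widget {
    int weight_;
public:
    explicit Widget( int weight ) : weight_( weight ) { }
    int Weight() const { return weight_; }
};

// Hand-written function object, as in the "today's C++" version.
class GreaterThan {
    int weight;
public:
    explicit GreaterThan( int weight_ ) : weight( weight_ ) { }
    bool operator()( const Widget& w ) const { return w.Weight() > weight; }
};

// Returns the index of the first widget heavier than 100, or -1 if there is none.
int first_heavy_functor( const std::vector<Widget>& w ) {
    std::vector<Widget>::const_iterator it =
        std::find_if( w.begin(), w.end(), GreaterThan( 100 ) );
    return it == w.end() ? -1 : static_cast<int>( it - w.begin() );
}

// Same search with the predicate written at the point of use.
int first_heavy_lambda( const std::vector<Widget>& w ) {
    auto it = std::find_if( w.begin(), w.end(),
                            []( const Widget& x ) { return x.Weight() > 100; } );
    return it == w.end() ? -1 : static_cast<int>( it - w.begin() );
}
```

Note that either way the caller still has to compare the returned iterator against end() before using it.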
Most algorithms are loops… hmm…
In fact, every loop-like algorithm is now usable as a loop. Quick examples using std::for_each and std::transform:
for_each( v.begin(), v.end(), []( Widget& w )
{
…
… use or modify w …
…
} );
transform( v.begin(), v.end(), output.begin(), []( Widget& w )
{
…
return SomeResultCalculatedFrom( w );
} );
Hmm. Who knows: As C++0x lambdas start to be supported in upcoming compilers, we may start getting more used to seeing “});” as the end of a loop body.
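To make the two skeletons above concrete, here is a small sketch with the loop bodies filled in. The Widget struct and both function names are hypothetical; the bodies are only placeholders for whatever "use or modify w" and SomeResultCalculatedFrom would be in real code.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical Widget: a weight that can be read and modified.
struct Widget {
    int weight;
};

// for_each as a loop: the lambda body modifies each element in place.
void double_weights( std::vector<Widget>& v ) {
    std::for_each( v.begin(), v.end(), []( Widget& w )
    {
        w.weight *= 2;
    } );
}

// transform as a loop: the lambda body computes one result per element.
std::vector<int> weights_of( const std::vector<Widget>& v ) {
    std::vector<int> output( v.size() );   // transform needs room to write into
    std::transform( v.begin(), v.end(), output.begin(), []( const Widget& w )
    {
        return w.weight;
    } );
    return output;
}
```

One thing the skeleton hides: output must already be large enough (or you pass a back_inserter), since transform only writes through the iterator it is given.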
Concurrency teaser
Finally, want to pass a piece of code to be executed on a thread pool without tediously having to define a functor class out at namespace scope? Do it directly:
// Passing work to a thread pool, in C++0x:
mypool.run( [] { cout << "Hello there (from the pool)"; } );
Gnarly.
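The mypool object above is not defined in this post, so the sketch below swaps in std::async from the C++11 standard library as a stand-in. It is not a thread pool, but it shows the same idea: a lambda handed off as a unit of work to run on another thread. The function name greet_from_worker is made up for illustration.

```cpp
#include <future>
#include <string>

// std::async stands in for the post's hypothetical mypool.run().
std::string greet_from_worker() {
    // launch::async asks for the task to run on a separate thread.
    std::future<std::string> result = std::async( std::launch::async,
        [] { return std::string( "Hello there (from the pool)" ); } );
    return result.get();   // blocks until the task has run
}
```

Returning the text through a future, rather than writing to cout from the worker, sidesteps the question of how cout behaves when several threads write at once.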
Other approved features
- N2535 Namespace associations (inline namespace)
- N2540 Inheriting constructors
- N2541 New function declarator syntax
- N2543 STL singly linked lists (forward_list)
- N2544 Unrestricted unions
- N2546 Removal of auto as a storage-class specifier
- N2551 Variadic template versions of std::min, std::max, and std::minmax
- N2554 Scoped allocator model
- N2525 Allocator-specific swap and move behavior
- N2547 Allow lock-free atomic<T> in signal handlers
- N2555 Extended variadic template template parameters
- N2559 Nesting exceptions (aka wrapped exceptions)
Next Meetings
Here are the next meetings of the ISO C++ standards committee, with links to meeting information where available.
- June 8-14, 2008: Sophia Antipolis, France
- September 14-20, 2008: San Francisco Bay area, California, USA
The meetings are public, and if you’re in the area please feel free to drop by.
“One more question, why even go for this syntactic sugar when boost lambda does already provide this functionality?”
Boost lambda is a great achievement, but it’s still limited by the lack of language support. Anything other than the simplest lambdas is unmaintainably complex.
Of course boost::lambda has the great advantage that it already works on many compilers, unlike the C++0x lambda which only gcc supports.
One more question, why even go for this syntactic sugar when boost lambda does already provide this functionality?
I think lambdas should also allow for auxiliary variables like:
[size_t line = 0] (int b) { display(b); if (++line % 10 == 0) flip_colors(); }
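For what it's worth, the proposal discussed here does not have this, but C++14 later added init-captures, which give roughly this effect with slightly different syntax. A minimal sketch, where count_flips is a hypothetical function and counting flips stands in for the display() and flip_colors() calls above:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Counts how many times a "flip" would happen while visiting the elements,
// flipping after every 10th one (stand-in for the comment's flip_colors()).
int count_flips( const std::vector<int>& values ) {
    int flips = 0;
    std::for_each( values.begin(), values.end(),
        // line is a closure-local variable created by the C++14 init-capture;
        // mutable lets the body modify it between calls.
        [line = std::size_t( 0 ), &flips]( int ) mutable {
            if( ++line % 10 == 0 ) ++flips;
        } );
    return flips;
}
```

The counter lives inside the closure object, so it survives from one call to the next without leaking a variable into the enclosing scope.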
I agree: lambdas are very cool. Soon someone will write a proof assistant in C++!!!
The more large, parallel-enabled applications I write, the more I find myself writing in a functional style. (The mypool.run() example notwithstanding: it assumes that cout, etc. also act in a reasonable way when called by many threads…)
Re Alex’s post: I would enjoy hearing a plain English explanation of N2554 as well. Maybe a few examples or something, demonstrating the problems all this additional complexity solves.
Thanks Orcmid. Yeah, I did notice some date weirdness too, and it’s the more curious because the imported posts and comments all seem to have correct/reasonable dates when viewed in the usual blog interface. Hopefully it affects only the imported material, but I’ll keep an eye on it to see whether it affects new posts.
Way off topic meta-matter: You didn’t post a note about moving the blog here, and comments are off there, so I am commenting on your most recent post on something I noticed.
I opened up a new feed, with its own folder, and I will then move posts from the folder from the old feed before I kill that one in my news reader.
In the new feed, synchronization with NewsGator on-line recovered a pile of stuff. Oddly, it is all dated 1969-12-31 which has to be some sort of clue. There is something weird about the RSS feed from the restored material.
Probably not a biggy (unless new posts show up that way too), but a definite curiosity.
Regarding noname #19:
Also, I wonder when the lookup for that decltype is done. IIRC, the semantics are defined by conversion to a struct, where w might be out of scope. (But the committee’s probably already thought of that.)
And yup, the example shows that even in a simple case, the for-loop-with-auto is shorter:
for (auto i = w.begin(); i != w.end(); ++i) {
if ( i->weight() > 100) {…}
}
Especially since the find_if will require even more code to check whether the return value is == w.end().
P.S. Yay, new blog host!
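The loop-versus-find_if comparison in this comment can be sketched as follows. The Widget struct (with a lowercase weight(), as in the comment's loop) and both function names are assumptions; each function returns a pointer to the first heavy widget so the two can be compared directly.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical Widget with the lowercase weight() used in the comment's loop.
struct Widget {
    int w;
    int weight() const { return w; }
};

// Loop version: returns a pointer to the first heavy widget, or nullptr.
const Widget* first_heavy_loop( const std::vector<Widget>& w ) {
    for( auto i = w.begin(); i != w.end(); ++i ) {
        if( i->weight() > 100 ) return &*i;
    }
    return nullptr;
}

// find_if version: the result must be compared against end() before use.
const Widget* first_heavy_find_if( const std::vector<Widget>& w ) {
    auto i = std::find_if( w.begin(), w.end(),
                           []( const Widget& x ) { return x.weight() > 100; } );
    return i != w.end() ? &*i : nullptr;
}
```

Which one reads better is a matter of taste; the find_if version names the operation, while the loop keeps the test and the action in one place.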
I’m disappointed that polymorphic lambdas didn’t make it. Having to type out map< string, pair< bool, vector<foo> > >::const_reference for the parameter type in the lambda means that I’m never going to use it when I could use iterators and auto instead ( for (auto i = m.begin(), e = m.end(); i != e; ++i) ) and type less and be more amenable to change later.
Especially since the paper suggests that the semantics are specified by conversion to a struct functor, in which templating operator() works great, especially now that there’s auto (…) -> decltype(…) { … }.
Maybe I just need to learn more about concepts, though, since the justification I read in one of the old lambda papers didn’t make much sense to me.
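The "struct functor with a templated operator()" idea from this comment can be written by hand today. The sketch below is only an illustration: CountFlagged and count_flagged are hypothetical names, and vector<int> stands in for the comment's vector<foo>. C++14 later added generic lambdas (auto parameters), which ask the compiler to generate a closure of roughly this shape.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

// A hand-written "polymorphic" function object: operator() is a template,
// so the element type is deduced at each call instead of being spelled out.
struct CountFlagged {
    int* count;   // where to accumulate; plays the role of a by-reference capture

    template<class Entry>
    void operator()( const Entry& e ) const {
        if( e.second.first ) ++*count;
    }
};

// Counts the map entries whose bool flag is set.
int count_flagged(
    const std::map< std::string, std::pair< bool, std::vector<int> > >& m ) {
    int count = 0;
    CountFlagged f = { &count };
    std::for_each( m.begin(), m.end(), f );
    return count;
}
```

The long const_reference type never has to be typed: Entry is deduced from whatever for_each passes in.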
boost::lambda is already pretty damn cool. The fact that now it (the technique, not the library) will be natively supported is just so damn cool.
The find example in today’s C++ with Boost 1.34 is actually less verbose and at least as readable and maintainable as the C++0x version you provided:
find_if( w.begin(), w.end(), boost::bind( &Widget::Weight, _1 ) > 100 );
Don’t get me wrong, I’m looking forward to having lambda functions in the core language, and there are numerous, perhaps just slightly more complex examples where a built-in lambda facility would be a definite winner over any library-based implementation. Using an ill-conceived example and overstating your case at the same time ("They don’t really help much here"? In this case apparently they can help more than the proposed core functionality) hardly contributes to the quality of the post, though.
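For readers without Boost at hand, the same predicate can be approximated with std::bind from the C++11 standard library. This is a swapped-in technique, not the Boost expression above: as far as I know std::bind does not overload operator>, so the comparison has to be spelled as a nested bind around std::greater. The Widget class and the has_heavy_* names are hypothetical.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical Widget with the Weight() accessor used in the post.
class Widget {
    int weight_;
public:
    explicit Widget( int weight ) : weight_( weight ) { }
    int Weight() const { return weight_; }
};

// Binder version: greater<int>( w.Weight(), 100 ), built by composing two binds.
bool has_heavy_bind( const std::vector<Widget>& w ) {
    using std::placeholders::_1;
    return std::find_if( w.begin(), w.end(),
        std::bind( std::greater<int>(),
                   std::bind( &Widget::Weight, _1 ), 100 ) ) != w.end();
}

// Lambda version of the same predicate.
bool has_heavy_lambda( const std::vector<Widget>& w ) {
    return std::find_if( w.begin(), w.end(),
        []( const Widget& x ) { return x.Weight() > 100; } ) != w.end();
}
```

For a one-member-call predicate like this the two are close in length; the difference tends to show up once the body needs a second statement or a local variable.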
How are local variables captured? I read the section on capture lists in the spec, but couldn’t quite get my head around it. In particular, what happens if a local variable is modified within the closure:
int numWidgets = 0;
for_each( v.begin(), v.end(), []( Widget& w )
{
++numWidgets;
…
… use or modify w …
…
} );
Will numWidgets be 0 after this, or will it contain the number of widgets in v?
What will happen in the following case:
int flag = 0;
mypool.run( [] { flag = 1; } );
cout << flag << endl;
What will be output?
…always "0"
…sometimes "0", sometimes "1", depending on thread scheduling
…sometimes "0", followed by a crash as the closure references a local variable that has gone out of scope
I agree with Aleksey. "They don’t help much here" is an overstatement. The comparison with boost::bind mostly serves to highlight the "Widget const&" and "bool" redundant boilerplate instead of casting the lambdas in the favorable light they deserve. You could’ve chosen a better example.
Re binders: Okay, I give! I’ll use a better example next time.
(no name) asked: "How are local variables captured?" You have to specify whether it’s by copy or by reference. So this example is illegal because it tries to use a local variable:
int numWidgets = 0;
for_each( v.begin(), v.end(), []( Widget& w )
{
++numWidgets; // error, numWidgets is not in scope
} );
If you want to update numWidgets directly, capture it by reference:
for_each( v.begin(), v.end(), [&numWidgets]( Widget& w )
{
++numWidgets; // increments original numWidgets
} );
// numWidgets == v.size() here
Or use the shorthand [&] to take all captured variables implicitly by reference:
for_each( v.begin(), v.end(), [&]( Widget& w )
{
++numWidgets; // increments original numWidgets
} );
// numWidgets == v.size() here
What if you want a local copy? You say to pass it by value, but for safety reasons the current proposal says you get a read-only copy that you can’t modify:
for_each( v.begin(), v.end(), [numWidgets]( Widget& w )
{
int i = numWidgets; // ok
++i;
// "++numWidgets;" would be an error
} );
// numWidgets == 0 here
Or use the shorthand [=] to take all captured variables implicitly by copy:
for_each( v.begin(), v.end(), [=]( Widget& w )
{
int i = numWidgets; // ok
++i;
// "++numWidgets;" would be an error
} );
// numWidgets == 0 here
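Putting the by-reference and by-copy cases into compilable form, as a minimal sketch (the Widget struct and the two function names are hypothetical, and each function returns numWidgets so the "here" comments above can be checked):

```cpp
#include <algorithm>
#include <vector>

struct Widget { int id; };   // hypothetical element type

// By-reference capture: the lambda increments the caller's own variable.
int count_by_reference( std::vector<Widget>& v ) {
    int numWidgets = 0;
    std::for_each( v.begin(), v.end(), [&numWidgets]( Widget& ) {
        ++numWidgets;          // increments original numWidgets
    } );
    return numWidgets;         // equals v.size()
}

// By-copy capture: the lambda sees a read-only copy made when it was created.
int count_by_copy( std::vector<Widget>& v ) {
    int numWidgets = 0;
    std::for_each( v.begin(), v.end(), [numWidgets]( Widget& ) {
        int i = numWidgets;    // ok: reads the copy
        ++i;                   // modifies only the local i
    } );
    return numWidgets;         // still 0
}
```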
Similarly, for the question: "What will happen in the following case:"
int flag = 0;
mypool.run( [] { flag = 1; } );
cout << flag << endl;
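Applying the capture rules above to this case: as written, the lambda names the local flag without capturing it, so it falls under the same "not in scope" error as the first numWidgets example. The sketch below captures flag by reference instead. How the hypothetical mypool schedules work is not stated anywhere here, so std::thread is used as a stand-in, and run_async and flag_after_join are made-up names.

```cpp
#include <thread>

// Hypothetical stand-in for mypool.run(): starts the task on another thread
// and hands back the thread so the caller decides when to wait.
template<class Task>
std::thread run_async( Task task ) {
    return std::thread( task );
}

int flag_after_join() {
    int flag = 0;
    // [] { flag = 1; } would not compile: flag is a local and is not captured.
    std::thread t = run_async( [&flag] { flag = 1; } );
    t.join();        // wait for the task; this also keeps flag alive long enough
    return flag;     // 1: the write happened before join() returned
}
```

Without the join, reading flag while the task may still be writing it would be a data race, and if flag went out of scope before the task ran, the closure would be left holding a dangling reference. Those are the second and third outcomes in the question.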
[]( auto& w ) { return w.Weight() > 100; } );
[] { return _1.Weight() > 100; } );
Let’s see if this time I’m less unnamed.
Still no luck. :-) Sorry for the comment spam.
(I was the one asking about captures.)
Ok, thanks for the explanation of captures – that makes a lot of sense.
Finally, I have to second the calls for "less boilerplate" – although I think a balance must be struck where we don’t have too much magic and too many things happening implicitly. I think the [](auto& w) has that balance – it allows you to name the input parameters (I think _1, _2 and so on may make it easier to write, but makes reading the code difficult), but avoid requiring the programmer to add redundant information.
"What if you want a local copy? You say to pass it by value, but for safety reasons the current proposal says you get a read-only copy that you can’t modify:
for_each( v.begin(), v.end(), [numWidgets]( Widget& w )
{
int i = numWidgets; // ok
++i;
// "++numWidgets;" would be an error
} );"
The numWidgets getting passed as a read-only copy and ++numWidgets giving an error seems ugly. Ok, I understand the intent of the lambda/syntax there.
Wouldn’t it be better if the name inside the [] just referred to the variable to be captured (just as an argument and a parameter can have the same name, while a parameter without & means only a copy)? In the case above, any modifications would then be only local. Does that make sense?
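For reference, the C++11 standard as eventually published supports this through the mutable keyword: a by-copy capture is read-only by default, but marking the lambda mutable lets the body modify its own copy, and those changes stay local to the closure. A minimal sketch with hypothetical function names:

```cpp
#include <algorithm>
#include <vector>

// Returns the caller's counter after the loop, to show it is untouched.
int outer_count_after_mutable_lambda( const std::vector<int>& v ) {
    int numWidgets = 0;
    std::for_each( v.begin(), v.end(), [numWidgets]( int ) mutable {
        ++numWidgets;   // allowed because of mutable; changes only the closure's copy
    } );
    return numWidgets;  // still 0
}

// The closure's own copy does change from call to call.
int inner_count_after_calls( int calls ) {
    int numWidgets = 0;
    auto counter = [numWidgets]() mutable { return ++numWidgets; };
    int last = 0;
    for( int i = 0; i < calls; ++i ) last = counter();
    return last;
}
```

So the name inside [] does refer to a copy with parameter-like semantics, but you have to opt in to modifying it.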
[]( Widget& w ) -> bool { return w.Weight() > 100; } );
[]( Widget& w ) { return w.Weight() > 100; } );
{
…
return SomeResultCalculatedFrom( w );
} );
[]( decltype(w)::value_type& w ) { return w.Weight() > 100; } );
[]( decltype(*w) w ) { return w.Weight() > 100; } );
[]( decltype(*w.begin()) w ) { return w.Weight() > 100; } );