OGDC Talk Slides Online

I’ve been getting a fair number of requests for my slides from last week’s keynote at OGDC. They’re now available here.

Trip Report: April 2007 ISO C++ Standards Meeting

The ISO C++ standards committee met on April 15-20, 2007 in Oxford, UK. Despite unseasonably sunny and warm weather and good European beer (the best I tried was Budweiser, and no, I’m not kidding because it’s not the American one), we made good progress toward our goal of having a complete draft of C++09 out for public review this year.

(Aside: I delayed posting this article until the post-meeting mailing was online on the ISO C++ website, so that I could add live public links to the referenced documents, for your viewing pleasure.)

New Language Features Voted Into (Draft) C++09

The following new features progressed to the point of being voted into the draft standard at this meeting. I’ve included a link to the final paper number so that you can read up on the gory details of your favorite proposal. (Note that I’m not mentioning changes that are smaller or are just reformulations that have no effect on the semantics of your code, such as that we replaced the wording about sequence points in the standard while maintaining the same effect.)

Template aliases (aka typedef templates, generalized typedefs) [N2258]

This proposal was coauthored by Gabriel Dos Reis and Bjarne Stroustrup. I’ve written before (here in an article and here in a committee proposal) about why you might want to be able to write a typedef that nails down some, but not all, template parameters. The example from the first article is:

// Not legal C++, but it would be nice…
//
template<typename T>
typedef std::map<std::string, T> Registry;

// If only we could do the above, we could then write things like:
//
Registry<Employee> employeeRoster;
Registry<void (*)(int)> callbacks;
Registry<ClassFactory*> factories;

What C++09 actually added was not only the equivalent of typedef templates, but something more general. Template aliases allow you to get the same effect as the above using the following syntax:

// Legal in C++09
//
template<typename T>
using Registry = std::map<std::string, T>;

// Now these all work as intended
//
Registry<Employee> employeeRoster;
Registry<void (*)(int)> callbacks;
Registry<ClassFactory*> factories;

An example from the second article is motivated by making policy-based-designed templates more usable. Referring to one form of Andrei’s SmartPtr template, I showed that we can write the following wrap-the-typedef-in-a-struct workaround today:

// Today’s workaround
//
template<typename T>
struct shared_ptr {
typedef Loki::SmartPtr<
    T, // note that T still varies, but everything else is fixed
    RefCounted, NoChecking, false, PointsToOneObject, SingleThreaded, SimplePointer<T>
>
type;
};

shared_ptr<int>::type p;   // sample usage, “::type” is ugly

One of the primary reasons (but not the only one) why policy-based smart pointers didn’t make it into C++09 was usability: Although Loki::SmartPtr can express shared_ptr easily as just one of its configurations, the user has to type lots of weird stuff for the policy parameters that is hard to completely hide behind a simple typedef, and so the user would be less inclined to use the feature:

shared_ptr<int>::type p; // the best usability we could provide for a policy-based Shared Ptr
shared_ptr<int> p; // what actually won and is now in C++09

Too bad we didn’t have template aliases then, because now we could have our cake and eat it too:

// Legal in C++09
//
template<typename T>
using shared_ptr =
    Loki::SmartPtr<
      T,
      RefCounted, NoChecking, false, PointsToOneObject, SingleThreaded, SimplePointer<T>
    >;

shared_ptr<int> p; // using the alias lets us have a completely natural syntax

This aliasing feature extends beyond templates and is also essentially a generalized replacement for typedef. For example:

// Legal C++98 and C++09
//
typedef int Size;
typedef void (*handler_t)( int );

// Legal C++09
//
using Size = int;
using handler_t = void (*)( int );

Especially for such function pointer types, the new form of "using" is simpler than typedef and lets programmers avoid playing guess-where-the-typedef-name-goes because of the C declaration syntax weirdness, at least in this place.
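To make the equivalence concrete, here is a minimal sketch that should compile with any compiler supporting the feature as it eventually shipped (in C++11). Employee is a stand-in type invented for illustration; the static_asserts simply confirm that an alias names an existing type rather than creating a new one.

```cpp
#include <map>
#include <string>
#include <type_traits>

struct Employee { std::string name; };   // hypothetical stand-in type

// Alias template: the key type is fixed, the mapped type stays open.
template<typename T>
using Registry = std::map<std::string, T>;

// An alias is another name for the same type, not a distinct type.
static_assert(std::is_same<Registry<int>, std::map<std::string, int>>::value,
              "Registry<int> names std::map<std::string, int>");

// The typedef spelling and the using spelling declare the same type.
typedef void (*handler_old_t)(int);
using handler_new_t = void (*)(int);
static_assert(std::is_same<handler_old_t, handler_new_t>::value,
              "typedef and using agree");
```

Because the alias is transparent, a Registry<Employee> can be passed anywhere a std::map<std::string, Employee> is expected.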

Variadic templates (aka "type varargs" for templates) [N2242; see N2087 for a more readable description and rationale]

This proposal was coauthored by Doug Gregor, Jaakko Järvi, Jens Maurer, and Jason Merrill.

You know how you can call a function like printf with a variable argument list (C’s good old varargs)? Well, there are times you’d like to do that for template argument lists, too, and now you can. For example, today the C++09 tuple type needs to be declared with some fixed number of defaulted parameters:

// Drawn from draft C++09, before variadic templates
// (when the best tool we had was to hardcode some defaulted parameters)
//
template<
class T1 = unspecified ,
class T2 = unspecified ,
… ,
class TN = unspecified
> class tuple;

template<class T1, class T2, …, class TN>
tuple<V1, V2, …, VN> make_tuple( const T1&, const T2& , …, const TN& );

These declarations get uglier as N gets bigger. And what if you want more than N types? Sorry, Charlie; the implementer just has to write enough types to cover most use cases, and the actual number in the current wording would be nonportable in practice as every implementer has to make their own choice about how many types will be enough.

With variadic templates, we can write it much more simply and flexibly (and portably, once compilers implement the feature):

// Simplified version using variadic templates
// (now in the latest working draft of C++09; see Library section below)
//
template<class... Types> class tuple;

template<class... Types>
tuple<VTypes...> make_tuple( Types&... );

This is simpler to define, works with any number of types, and will work portably no matter how many types you use. What’s not to like?
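For a feel of how user code can consume a parameter pack, here is a small sketch, assuming a compiler that implements the feature as it later shipped in C++11. The helper names countArgs, sum, and makeRecord are made up for this example; only std::tuple and std::make_tuple come from the library.

```cpp
#include <cstddef>
#include <string>
#include <tuple>

// sizeof...(Types) yields the number of elements in the pack.
template<class... Types>
constexpr std::size_t countArgs(const Types&...) {
    return sizeof...(Types);
}

// Recursion over a pack: a base case, then peel off one argument at a time.
constexpr long sum() { return 0; }

template<class Head, class... Tail>
constexpr long sum(Head head, Tail... tail) {
    return head + sum(tail...);
}

// One declaration covers every arity; there is no hardcoded limit N.
static_assert(countArgs() == 0, "empty pack");
static_assert(countArgs(1, 'x', 2.5) == 3, "three arguments of mixed type");
static_assert(sum(1, 2, 3, 4) == 10, "1 + 2 + 3 + 4");

// make_tuple deduces one element type per argument.
inline std::tuple<int, std::string, double> makeRecord() {
    return std::make_tuple(42, std::string("answer"), 2.5);
}
```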

For another example that summarizes other ways to use the feature with class types, consider this example provided by Jonathan Caves (thanks, Jon!):

// Legal in C++09

template<typename... Mixins>
class X : public Mixins... {
public:
X( const Mixins&... mixins ) : Mixins(mixins)... { }
};

class A { };
class B { };
class C { };

X<A, B, C> x;

For the type X<A,B,C>, the compiler will generate something like the following:

class X<A, B, C> : public A, public B, public C {
public:
X( A const& __a, B const& __b, C const& __c ) : A(__a), B(__b), C(__c) { }
};
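Here is a runnable variation on the same idea, offered as a sketch rather than anything from the proposal itself. Named and Counted are hypothetical mixins with one data member each, so that the effect of the base-class and constructor pack expansions can be observed.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical mixins, each contributing one data member.
struct Named   { std::string name; };
struct Counted { int count; };

// Same shape as X above: inherit from every type in the pack and
// forward one constructor argument to each base.
template<typename... Mixins>
class X : public Mixins... {
public:
    X(const Mixins&... mixins) : Mixins(mixins)... { }
};

// The expansion produces one public base class per template argument.
static_assert(std::is_base_of<Named, X<Named, Counted>>::value, "Named is a base");
static_assert(std::is_base_of<Counted, X<Named, Counted>>::value, "Counted is a base");

// Each constructor argument initializes the base of the matching type.
inline void checkMixins() {
    X<Named, Counted> x(Named{"widget"}, Counted{3});
    assert(x.name == "widget");
    assert(x.count == 3);
}
```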

Unicode characters and strings [N2249]

This proposal was written by Lawrence Crowl, based on the ISO C technical report containing similar non-normative extensions for C.

C++ has essentially airlifted in the unofficial C extensions for char16_t and char32_t, and made them available in the new header <cuchar>. Here are a few basic uses:

const char16_t* str16 = u"some 16-bit string value";
const char32_t* str32 = U"some 32-bit string value";

C++ did a few small things differently from C. In particular, C++ requires that the encoding be UTF, and C++ is making these types a normative (required) part of its language standard.
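The following sketch shows what those two differences look like in practice, assuming a compiler that implements the feature as it later shipped in C++11. The function name checkUtf is invented for the example; std::u16string and std::u32string are the matching library string types.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

// In C++ these are built-in, distinct types rather than typedefs.
static_assert(!std::is_same<char16_t, std::uint_least16_t>::value,
              "char16_t is its own type");
static_assert(!std::is_same<char32_t, std::uint_least32_t>::value,
              "char32_t is its own type");
static_assert(std::is_same<decltype(u'a'), char16_t>::value, "u'' is char16_t");
static_assert(std::is_same<decltype(U'a'), char32_t>::value, "U'' is char32_t");

// Because the encodings are UTF-16 and UTF-32, a character outside the
// Basic Multilingual Plane takes two char16_t units but one char32_t unit.
inline void checkUtf() {
    const std::u16string s16 = u"\U0001F600";
    const std::u32string s32 = U"\U0001F600";
    assert(s16.size() == 2);   // surrogate pair
    assert(s32.size() == 1);   // single code point
}
```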

New Library Features Voted Into (Draft) C++09

As usual, as we vote in language extensions, the library working group continues systematically updating the standard library to make use of those extensions. In particular, this time the library got support for:

Variadic templates (see above, including its application to std::tuple)
Unicode characters (see above)
Rvalue references (see next)

Rvalue references, aka "move construction/assignment," are a useful way to express that you’re constructing or assigning from an object that will no longer be used for anything else — including, for example, a temporary object — and so you can often get a decent performance boost by simply stealing the guts of the other object instead of making a potentially expensive deep copy. For example, move assignment for a string would simply take over the internal state of the source string using just a few pointer/integer assignments, and not perform any allocations. For more details, see this motivational paper which describes the significant performance and usability advantages of move semantics. Updating the standard library to use move semantics included notably updating the containers, iterators, and algorithms to support moveable types, as well as deprecating auto_ptr (wow!) and replacing it with a unique_ptr that expresses unique ownership semantics. (Previously, the library had already added shared_ptr and weak_ptr to express shared ownership and shared observation semantics, so reference-counted shared pointers are already covered.)

If you’re interested in the standardese details of the new move semantics support in the standard library, you can find details of the updates in the following papers: N1856 (basic concepts, pair, and unique_ptr), N1857 (strings), N1858 (containers), N1859 (iterators), N1860 (algorithms), N1861 (valarray), and N1862 (iostreams).
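As a rough illustration of the "steal the guts" idea, here is a sketch with a hypothetical Buffer class that counts deep copies, plus a unique_ptr ownership transfer. The names Buffer, makeBuffer, and checkMove are made up for the example, and the code assumes the feature as it later shipped in C++11.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical type that counts deep copies so the difference is visible.
class Buffer {
public:
    static int copies;
    explicit Buffer(std::size_t n) : data_(n, 'x') {}
    Buffer(const Buffer& other) : data_(other.data_) { ++copies; }        // deep copy
    Buffer(Buffer&& other) noexcept : data_(std::move(other.data_)) {}    // steal storage
    std::size_t size() const { return data_.size(); }
private:
    std::vector<char> data_;
};
int Buffer::copies = 0;

// unique_ptr cannot be copied, only moved, so ownership transfer is explicit.
inline std::unique_ptr<Buffer> makeBuffer(std::size_t n) {
    return std::unique_ptr<Buffer>(new Buffer(n));
}

inline void checkMove() {
    const int before = Buffer::copies;
    Buffer a(1000);
    Buffer b(a);                          // copy construction: one deep copy
    assert(Buffer::copies == before + 1);
    Buffer c(std::move(a));               // move construction: no deep copy
    assert(Buffer::copies == before + 1);
    assert(c.size() == 1000);
}
```

After a move, the source object is still valid but should only be assigned to or destroyed; this sketch deliberately never reads from a again.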

There were a few other housekeeping updates, but for template metaprogramming aficionados one highlight will be the std::enable_if type trait. Have you ever wanted to write a function template that gets completely ignored for types the template wasn’t designed to be used with (as opposed to accidentally becoming a best match and hiding some other function the user really intended to call)? If you’re a template library writer and haven’t yet seen enable_if, see here for a nice Boosty discussion, and you might get to love it too.
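If you haven't seen the idiom, a minimal sketch looks like this. The function name describe and the three categories are invented for the example; std::enable_if, std::is_integral, and std::is_floating_point are the library pieces.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Participates in overload resolution only when T is an integral type.
template<typename T>
typename std::enable_if<std::is_integral<T>::value, std::string>::type
describe(T) { return "integral"; }

// Participates only when T is a floating-point type.
template<typename T>
typename std::enable_if<std::is_floating_point<T>::value, std::string>::type
describe(T) { return "floating point"; }

// Fallback: chosen only when both templates above drop out.
inline std::string describe(...) { return "something else"; }

inline void checkDescribe() {
    assert(describe(42) == "integral");
    assert(describe(2.5) == "floating point");
    assert(describe("text") == "something else");   // neither template applies
}
```

Without enable_if, the first template would happily match a double or a pointer and hide the overload the caller actually wanted.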

Next Meetings

Here are the next meetings of the ISO C++ standards committee, with links to meeting information. The meetings are public, and if you’re in the area please feel free to drop by.

July 15-20, 2007: Toronto, Canada
September 30 – October 6: Kona, Hawaii, USA [N2289]

Note that the latter is the meeting at which we are hoping to vote out the first complete draft of the new C++ standard, and it runs straight through to Saturday afternoon. So let me insert a caveat for those who might be thinking, "oh yeah, that sounds like the meeting I think I’ll pick to attend, and catch some rays, ha ha" — having a standards meeting in Hawaii sounds nice, but it’s not nearly what you might think. If you come and want to participate, expect to be shut in windowless rooms all day, and frequently all evening — you’ll only actually see the local landscape if you fly in early or stay late, not during the meeting. But we’ll appreciate your help proofreading mind-numbing swathes of the standard!

Thoughts on Scott’s “Red Code / Green Code” Talk

Scott Meyers recently gave an interesting talk at NWCPP. Thanks to Kevin Frei’s recording skills and hardware, you can watch the video here.

Scott’s topic was "Red Code / Green Code: Generalizing const." Here’s the abstract:

C++ compilers allow non-const code to call const code, but going the other way requires a cast. In this talk, Scott describes an approach he’s been pursuing to generalize this notion to arbitrary criteria. For example, thread-safe code should only call other thread-safe code (unless you explicitly permit it on a per-call basis). Ditto for exception-safe code, code not "contaminated" by some open source license, or any other constraint you choose. The approach is based on template metaprogramming (TMP), and the implementation uses the Boost metaprogramming library (Boost.MPL), so constraint violations are, wherever possible, detected during compilation.

(Nit: I think the "const" analogy is a slight red herring. See the Coda at the bottom of this post for a brief discussion.)

Motivation and summary

The motivation is that it might be nice to be able to define categories of constraints on code, and have a convention to enforce that constrained code can’t implicitly call unconstrained code. Scott gives the example that you might want to prevent LGPL’d code from implicitly calling non-LGPL’d code if that would make the non-LGPL’d code subject to LGPL virally. Similarly, you might want to prevent reviewed code from implicitly calling unreviewed code; and so on. Note that these constraints overlap; you can have any combination of LGPL’d/non-LGPL’d and reviewed/unreviewed code.

The basic technique Scott describes is to define a tag type, which is similar to the technique used in the C++ standard library to mark iterator categories. You just define some empty types (all we’re doing is generating names):

struct ThreadSafe {};
struct LGPLed {};
struct Reviewed {};
// etc.
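For readers who haven't met the iterator-category idiom that these tags resemble, here is a short sketch of it. The function names kind and iteratorKind are hypothetical; the tag types and std::iterator_traits are the standard library's own.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>
#include <vector>

// Overloads selected purely by an empty tag type; the tag carries no data.
inline std::string kind(std::random_access_iterator_tag) { return "random access"; }
inline std::string kind(std::bidirectional_iterator_tag) { return "bidirectional"; }
inline std::string kind(std::input_iterator_tag)         { return "input"; }

// Looks up the iterator's category tag and dispatches on it at compile time.
template<typename It>
std::string iteratorKind(It) {
    return kind(typename std::iterator_traits<It>::iterator_category());
}

inline void checkTags() {
    std::vector<int> v;
    std::list<int> l;
    assert(iteratorKind(v.begin()) == "random access");
    assert(iteratorKind(l.begin()) == "bidirectional");
}
```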

Then you arrange for each function to declare its constraints at compile time, for example to say that "my function is LGPL’d" or "my function is Reviewed" or any combination thereof. Scott showed how to conveniently use MPL compile-time collections to write a group of constraints. He provides this general helper template:

template<typename Constraints>
struct CallConstraints {
…
};

Then the callers and callees all traffic in CallConstraints<mpl::vector<ExceptionSafe,Reviewed>>, CallConstraints<mpl::vector<ThreadSafe, LGPLed>>, and so on. Scott also provides ways to opt out or deliberately loosen constraints, via helpers IgnoreConstraints and eraseVal<MyConstraints, SomeConstraint>. He also provides an includes metafunction that you can use to see if one set of constraints is compatible with another set.

But where do you put these constraints, and how do you pass and check them? Scott presented several ways (see the talk for details), but they all had serious drawbacks. Here’s a quick summary of the primary alternative he presented: The idea is to make each participating function that wants to declare constraints a template, and write its constraints inside the function. For example:

template<typename CallerConstraints>
void f( Params params ) {
typedef mpl::vector<ExceptionSafe> MyConstraints;
BOOST_MPL_ASSERT(( includes<MyConstraints, CallerConstraints> ) );
…
}

That’s the basic pattern. When you call another constrained function, you pass along your constraint type, and optionally explicitly relax constraints if you want to loosen them. For example:

template<typename CallerConstraints>
void g() {
typedef mpl::vector<ExceptionSafe, Reviewed> MyConstraints;
BOOST_MPL_ASSERT(( includes<MyConstraints, CallerConstraints> ));
…
// try to call the other function
f<MyConstraints>( params ); // error, trying to call unreviewed code from reviewed code
f<eraseVal<MyConstraints,Reviewed>>( params ); // ok, ignore Reviewed constraint
f<IgnoreConstraints>( params ); // ok, explicitly ignore constraints entirely
}

This has a number of drawbacks:

Virtual functions vs. compile-time checking: This doesn’t (directly) work for virtual functions, because templates can’t be virtual. See Scott’s talk for details about ways to trade that off against a different design, namely the NVI pattern, or a different drawback, namely run-time checking.
Template explosion: Every participating function is required to become a template (or to add a new template parameter). That’s more than merely inconvenient; for one thing, it means we have to put every function in a header file (or else wrap it with a template that does the constraints checking and then passes through to the normal function, which is tedious duplication). The other serious problem is that the function template is now going to have to be instantiated once for each unique combination of caller constraints, including even constraints that are added in the future and have nothing to do with this function, even though each instantiation generates identical code.
Separate checking: The constraints aren’t part of the type of the function, and so we have to compile the body of the function to determine whether the constraints are compatible. In short, we’re not leveraging the type system as much as we might wish to do.

Another minor drawback is that each user has to write the static assertions himself.

My humble contribution (sketch)

I enjoyed Scott’s motivation, and during his talk I thought of a simpler way to pass and check these constraints. I chatted with him about it afterwards, and he’s added it to his talk notes (which should go live at the above talk link over the next few days).

Here’s the idea that occurred to me: What if we make the constraints part of the function signature by passing them as an additional normal parameter, and do the check on the conversion from Constraints<Caller> to Constraints<Callee>? Here’s a sketch of how you’d use it:

void f( Params params, Constraints<ExceptionSafe> myConstraints ) {
…
}

void g( Constraints<ExceptionSafe, Reviewed> myConstraints ) {
…
// try to call the other function
f( params, myConstraints ); // error, trying to call unreviewed code from reviewed code
f( params, myConstraints.erase<Reviewed>() ); // ok, ignore Reviewed constraint
f( params, IgnoreConstraints ); // ok, explicitly ignore constraints entirely
}

This has none of the drawbacks of the other alternatives:

It incurs zero or near-zero space and time overhead: No extra template expansions, no run-time checking, and possibly even no space overhead because a constraint’s information payload is all in its compile-time type (a constraint object itself is empty).
It doesn’t require users to make all their functions templates: Just add a normal parameter.
It works naturally with virtual functions: A derived override must have constraints compatible with the base function, and that follows naturally in that it must match the signature, now including the constraints, of the base. (Future-proofing note: If C++ is ever extended to allow contravariant parameters on virtual functions, that would dovetail perfectly with this technique because derived functions can be less constrained than base functions).
It supports separate compilation and separate checking, by making the constraint part of the function’s type.

We could enable the above technique by bundling up the functionality inside just one Constraints template that would look something like this (sketch only):

// This is pseudocode.
//
class IgnoreConstraints { }; // just a helper

template<C1 = Unused, C2 = Unused, … CN = Unused> // could get rid of this redundancy with C++09 variadic templates
struct Constraints {
typedef mpl::vector<C1, C2, …, CN> CSet;

// Use the conversion operation as our hook to perform the check.
//
template<typename CallerConstraints>
Constraints( const Constraints<CallerConstraints>& ){
BOOST_MPL_ASSERT(( includes<CSet, Constraints<CallerConstraints>::CSet> ));
}

// Allow the caller to explicitly ignore constraints by doing no checks
// on the conversion from the helper type.
//
Constraints( const IgnoreConstraints& ) { }

// … provide erase by duplicating Scott’s eraseVal on CSet…
};
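To show that the conversion hook can work, here is one compilable variation, offered purely as a sketch and not as Scott's implementation. It assumes variadic templates as they later shipped in C++11, swaps Boost.MPL and BOOST_MPL_ASSERT for hand-rolled Contains and Includes metafunctions plus static_assert, and leaves out erase. The functions f, g, and k and their bodies are invented for illustration.

```cpp
#include <cassert>
#include <type_traits>

struct ExceptionSafe {};
struct Reviewed {};
struct ThreadSafe {};
struct IgnoreConstraints {};   // explicit opt-out helper

// Contains<T, Cs...>::value is true if T appears among Cs.
template<typename T, typename... Cs>
struct Contains : std::false_type {};

template<typename T, typename Head, typename... Tail>
struct Contains<T, Head, Tail...>
    : std::integral_constant<bool, std::is_same<T, Head>::value ||
                                   Contains<T, Tail...>::value> {};

template<typename... Cs> struct Constraints;

// Includes<Callee, Caller>::value is true if the callee declares every
// constraint that the caller carries.
template<typename Callee, typename Caller>
struct Includes;

template<typename... CalleeCs>
struct Includes<Constraints<CalleeCs...>, Constraints<>> : std::true_type {};

template<typename... CalleeCs, typename Head, typename... Tail>
struct Includes<Constraints<CalleeCs...>, Constraints<Head, Tail...>>
    : std::integral_constant<bool,
          Contains<Head, CalleeCs...>::value &&
          Includes<Constraints<CalleeCs...>, Constraints<Tail...>>::value> {};

template<typename... Cs>
struct Constraints {
    Constraints() {}

    // The conversion from the caller's constraints is the hook for the check.
    template<typename... CallerCs>
    Constraints(const Constraints<CallerCs...>&) {
        static_assert(Includes<Constraints<Cs...>, Constraints<CallerCs...>>::value,
                      "callee lacks a constraint the caller requires");
    }

    // Converting from the helper type performs no check at all.
    Constraints(const IgnoreConstraints&) {}
};

inline int f(int x, Constraints<ExceptionSafe>) { return x + 1; }
inline int k(int x, Constraints<ExceptionSafe, Reviewed, ThreadSafe>) { return x * 2; }

inline int g(int x, Constraints<ExceptionSafe, Reviewed> mine) {
    // f(x, mine);   // would not compile: f is not Reviewed
    return f(x, IgnoreConstraints())   // ok: explicit opt-out
         + k(x, mine);                 // ok: k declares everything g requires
}

inline void checkCalls() {
    assert(g(1, Constraints<ExceptionSafe, Reviewed>()) == 4);
}
```

The constraint objects are empty, so an optimizer is free to make the extra parameter cost nothing at run time; that part is an expectation about typical compilers, not something this sketch measures.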

Scott has added notes to his slides showing this approach, and intends to write a real article about this. I’ll leave it to him to write a complete implementation based on the above or some variation thereof; this is just to sketch the idea.

Thanks, Scott, for a very interesting talk!

Coda: On the "const-ness" of code

The talk description starts with:

C++ compilers allow non-const code to call const code, but going the other way requires a cast.

This reference to const-ness is intended to be a helpful analogy in the sense that a const member function can’t directly call a non-const member function of the same class. But really that’s the only situation where you could at a stretch say something like "const code can’t call non-const code." The reason this analogy doesn’t really match the actual (good and useful!) topic of this talk and technique is that const is about data, not code.

In particular, an object can be const, but a function can’t be const. Not ever. Now, at this point someone may immediately object, "Herb, you fool! Of course a function can be const! What about…" and go on to scribble a code example like:

class X {
public:
void f( const Y& y, int i ) const { // "const code"?
g( y, i ); // error — "can’t call non-const code from const code" ?
}
void g( const Y& y, int i ) { } // "non-const code"?
};

"And so clearly," that someone might triumphantly conclude, "here X::f is const code, and X::g is non-const code, right?"

Alas, no. The only difference the const makes is that the implicit this parameter has type const X* in X::f as opposed to type X* in X::g. It is true that X::f can’t call X::g without a cast, but that has nothing to do with "const-ness of code" but rather the constness of data — you can’t implicitly convert a const X* to an X*. To drive the point home, note that the above code is in virtually every way the same as if we’d written f and g as non-member friends and named the parameter explicitly:

void f( const X* this_, const Y& y, int i ) { // "const code"?
g( this_, y, i ); // error: can’t convert const X* to X*
}
void g( X* this_, const Y& y, int i ) { } // "non-const code"?

Now which function is "const" and which is "non-const"? There really is no such thing as a const function — its constness or non-constness depends (as it must) on which parameter we’re talking about:

Both functions "are const" with respect to their y parameter.
Neither function "is const" with respect to its i parameter.
Only f "is const" with respect to its this_ parameter.
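You can ask the compiler to confirm this. The sketch below, which assumes C++11 for decltype and static_assert, checks the type of this inside each member function and the pointer conversions involved. X and Y here are the same placeholder classes as above.

```cpp
#include <type_traits>

struct Y {};

class X {
public:
    void f(const Y&, int) const {
        // The trailing const only changes the type of the hidden parameter.
        static_assert(std::is_same<decltype(this), const X*>::value,
                      "in f, this is const X*");
    }
    void g(const Y&, int) {
        static_assert(std::is_same<decltype(this), X*>::value,
                      "in g, this is X*");
    }
};

// The call from f to g fails for an ordinary data-conversion reason:
// const X* does not implicitly convert to X*, but X* converts to const X*.
static_assert(!std::is_convertible<const X*, X*>::value, "cannot drop const implicitly");
static_assert(std::is_convertible<X*, const X*>::value, "adding const is fine");
```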

Remember that const is always about data, not code. Const always applies to the declaration of an object of some type. A function can use const on any parameter to declare it won’t change the object you pass it in that position, but it’s still about the object, not about the code. True, for a member function you get to write const at the end if you want to, but that’s just syntactic sugar for putting it on the implicit this parameter; the fact that the const lexically gets tacked onto the function is just an artifact of the this parameter being hidden so that you can’t decorate the parameter directly.

But leaving analogies with const-ness aside, there’s real value in the idea of red code / green code which really is about distinguishing different kinds of code, whereas const is about distinguishing different views of data.