Last night’s talk video is online: Quantifying C++’s accidental complexity, and what we really can do about it

The ISO C++ committee is here in Prague this week to finish C++20, and the meeting hosts Avast Software also arranged a great C++ meetup last night where over 300 people came out to see Bjarne Stroustrup, Tony Van Eerd, and me give talks. The videos are already online, see below — they’re really high quality, and it was a great event. Thank you again to everyone who came out to see us! You made us feel very welcome in your beautiful city and your enthusiasm for C++ is contagious (in a good way!).

Mine was a brand-new talk with material I’ve never presented on-camera before. (I gave a beta version at ACCU Autumn last November in Belfast.) I’m really excited about this upcoming work that I’m planning to bring to the committee in the near future, and I hope you enjoy it.

Thanks again to Hana Dusíková for her hard work organizing this meetup and this entire week-long record-shattering C++ standards meeting, by far the largest in history with 250 attendees. But there’s no rest for the competent — she still has to chair SG7 (reflection and compile-time programming) all day tomorrow. :) I’ll be there!

Trip report: Autumn ISO C++ standards meeting (Belfast)

A few minutes ago, the ISO C++ committee completed its autumn meeting in Belfast, Northern Ireland, hosted with thanks by clearpool.io, Archer-Yates, Microsoft, C++ Alliance, MCS Group, Instil, and the Standard C++ Foundation. As usual, we met for six days Monday through Saturday, and we had about 200 attendees. We now have 23 active subgroups, most of which met in nine parallel tracks all week long; some groups ran all week, and others ran for a few days or a part of a day, depending on their workloads.

See also the Reddit trip report, which was collaboratively edited by many committee members and has lots of excellent detail. You can find a brief summary of ISO procedures here.

C++20 is on schedule to be finalized in February

Per our official C++20 schedule, at our previous meeting in July we reached feature-freeze for C++20 and sent out the C++20 draft for its international comment ballot (Committee Draft, or CD) which ran over the summer and generated 378 comments from national bodies.

At this meeting and the next one (February in Prague), our main job was to work through these comments as well as other fit-and-finish work for C++20. To make sure we were in good shape to finish in Prague, our goal was to make sure we resolved at least half the national body comments at this meeting. Thanks to a lot of hard work across all the subgroups, and especially the subgroup chairs who leveraged our ability to do work in parallel in our nine tracks and domain-specific subgroups, this week we resolved 73% of the national body comments, and made good progress on most of the rest. Here’s a snapshot of national body comment status, breaking out the number that we were able to close even before the end of the week, and the number of CWG (core language) and LWG (standard library) comments whose final resolutions we adopted today:

This means we are in good shape to ship the final text of the C++20 standard at high quality and on time, at the end of the next meeting in February in Prague.

Because we are in feature freeze for C++20, no new major proposals were added into C++20 at this meeting, though we did adopt a few minor design fixes. Most of the changes made at this meeting were bug-fix level improvements, mostly to the “wording implementation details” to make sure features were specified correctly and clearly in the formal specification wording to implement the approved design.

Other progress

In addition to C++20 work, we also had time to make progress on a number of post-C++20 proposals, including:

  • the new SG21 (Contracts) study group’s first meeting;
  • the newly reopened SG4 (Networking) study group including an evening session on networking security;
  • an evening session on executors;
  • further progress on reflection and compile-time programming proposals;
  • progress on pattern matching in the main language evolution design group;
  • and much more.

Thank you again to the approximately 200 experts who attended this meeting, and the many more who participate in standardization through their national bodies! Our next step is to finish the final text of C++20 three months from now in February (Prague, Czech Republic) and then send final C++20 out for its approval ballot.

GotW-ish Solution: The ‘clonable’ pattern

This is the solution to GotW-ish: The ‘clonable’ pattern.

In summary, a distinguished C++ ISO C++ committee expert emailed me to ask:

[To avoid slicing], for each derived class, [I could] write something like

class D: public B {
public:
   shared_ptr<B> clone() const {
       return make_shared<D>(*this);   // not make_shared<B>
   }
   // ...
};

and then I can write

shared_ptr<B> b1 = /* as before */;
shared_ptr<B> b2 = b1->clone();

and b2 will now point to a shared_ptr<B> that is bound to an object with the same dynamic type as *b1.

However, this technique requires me to insert a member function into every class derived from B, with ugly bugs resulting from failure to do so.

So my question is whether there some way of accomplishing this automatically that I’ve missed?

Let’s take a look.

JG Question

  1. Describe as many approaches as you can think of that could let us semi- or fully-automate this pattern, over just writing it by hand every time as recommended in C++ Core Guidelines #C.130. What are each approach’s advantages and drawbacks?

There are two basic approaches in today’s C++: the Curiously Recurring Template Pattern (a.k.a. "CRTP"), and macros (a.k.a. "ick").

But first let’s consider a class of alternatives that is similar, even though it doesn’t answer this specific question or achieve the basic goal.

Nonintrusive solutions

There are nonintrusive solutions such as using type erasure, which don’t require the class to actually have a clone function. One example currently in the standardization proposal pipeline is P0201: A polymorphic value-type for C++. P0201 leads to code like this:

// The class hierarchy is unaffected

class B {
};

class C : public B {
};

class D : public C {
};

// Wrappers enable writing code that's similar to the question...

polymorphic_value<B> b1(D());           // similar to the target use case
polymorphic_value<B> b2 = poly;

The nonintrusive approaches are interesting too, but they don’t satisfy this particular question about how to automate the intrusive clone pattern. They also generally don’t satisfy the original motivation of the question which was to prevent slicing, because with nonintrusive approaches users can still create objects of the types directly and slice them:

D d;
B b = d;                                // oops, still works

Only an intrusive solution can make the copy constructor nonpublic or suppressed as part of automating the clone pattern, and all of the intrusive solutions can be extended to do this, with varying degrees of robustness and usability.

So, how can we automate the pattern in the question?

CRTP: The Curiously Recurring Template Pattern

Since C++98, the main recommended method is to use a variation of CRTP, the Curiously Recurring Template Pattern. The idea is that we instantiate a base class with our own type, and the base class provides the boilerplate we want. CRTP leads to code like this (live example — note that all the live examples use reflection to show the code that gets generated):

// User code: using the library to write our own types (many times)

class B : public clonable_base<B> {
};

class C : public clonable<B,B,C> {
};

class D : public clonable<B,C,D> {
};

shared_ptr<B> b1 = make_shared<D>();    // target use case works
shared_ptr<B> b2 = b1->clone();

The implementation typically looks something like this:

// Library code: implementing the CRTP helpers (once)

template <typename Self>
class clonable_base {
public:
    virtual std::unique_ptr<Self> clone() const {
        return std::make_unique<Self>(static_cast<const Self&>(*this));
    }
};

template <typename Base, typename Intermediate, typename Self>
class clonable : public Intermediate {
public:
    std::unique_ptr<Base> clone() const override {
        return std::make_unique<Self>(static_cast<const Self&>(*this));
    }
};

Advantages include:

  • It’s standard C++: Works on all compilers.
  • It semi-automates the pattern.
  • It’s extensible: It can be directly extended to require nonpublic copying.

Drawbacks include:

  • It’s incomplete and repetitive: It requires cooperation from the code that uses it to supply the right types. It also violates the DRY principle (don’t repeat yourself). If we have to repeat the types, we can get them wrong, and I did make that mistake while writing the samples.
  • It makes it harder to diagnose mistakes: If the supplied types are wrong, the error messages can be subtle. For example, as I was writing the live example, sometimes I wrote the template arguments incorrectly (because cut-and-paste), and it took me longer than I’d like to admit to diagnose the bug because the error message was related to the static_cast downcast inside the clonable implementation which wasn’t the root cause.

Macros

And there are, well, macros. They lead to code like this (live example):

// User code: using the macros to write our own types (many times)

class B {
    CLONABLE_BASE(B);
};

class C : public B {
    CLONABLE(B);
};

class D : public C {
    CLONABLE(B);
};

shared_ptr<B> b1 = make_shared<D>();    // target use case works
shared_ptr<B> b2 = b1->clone();

The implementation typically looks something like this:

// Library code: implementing the macros (once)

#define CLONABLE_BASE(Base) \
    virtual std::unique_ptr<Base> clone() const { \
        return std::unique_ptr<Base>(new Base(*this)); \
    }

#define CLONABLE(Base) \
    std::unique_ptr<Base> clone() const override { \
        using Self = std::remove_cv_t<std::remove_reference_t<decltype(*this)>>; \
        return std::unique_ptr<Self>(new Self(*this));  \
    }

Advantages include:

  • It’s standard C++: Works on all compilers.
  • It semi-automates the pattern: Though less so than CRTP did.
  • It’s extensible: It can be directly extended to require nonpublic copying.
  • It’s easier than CRTP to diagnose mistakes, if you have a modern compiler: If the supplied types are wrong, the error messages are more obvious, at least with a compiler that has good diagnostics for macros.

Drawbacks include:

  • It’s brittle: Macros are outside the language and can also alter other code in the same file. We hates macroses. Sneaky little macroses. Wicked. Tricksy. False.
  • It’s incomplete and repetitive: Like CRTP, we have to supply information and repeat things, but a little less than with CRTP.

Summary so far

You can find more examples and variations of these proposed by a number of people on the original post’s comments and on the Reddit thread.

Both CRTP and macros have drawbacks. And perhaps the most fundamental is this point from the original question (emphasis added):

However, [writing clone manually] requires me to insert a member function into every class derived from B, with ugly bugs resulting from failure to do so.

Can we do better?

Guru Question

  1. Show a working Godbolt.org link that shows how class authors can write as close as possible to this code with the minimum possible additional boilerplate code:
class B {
};

class C : public B {
};

class D : public C {
};

and that still permits the class’ users to write exactly the following:

shared_ptr<B> b1 = make_shared<D>();
shared_ptr<B> b2 = b1->clone();

Reflection and metaclasses: Basic "starter" solution

Future compile-time reflection will give us an opportunity to do better. The following is based on the active reflection proposals currently in the standardization proposal pipeline, and the syntactic sugar of writing a compile-time consteval metaclass function I am proposing in P0707. Note that draft C++20 already contains part of the first round of reflection-related work to land in the standard: consteval functions that are guaranteed to run at compile time, which came from the reflection work and are designed specifically to be used to manipulate reflection information.

The idea is that we use reflection to actually look at the class and compute and generate what we need. Three common things it lets us do are to express:

  • Requirements: We can check for mistakes in the users’ code, and report them with clean and readable compile-time diagnostics.
  • Defaults: We can apply defaults, such as to make member functions public by default.
  • Generated functions: We can generate functions, such as clone.

Let’s start with a simple direct example that does just answers the immediate question, and leads to code like this live example):

// User code: using the library to write our own types (many times)

class(clonable) B {
};

class(clonable) C : public B {
};

class(clonable) D : public C {
};

shared_ptr<B> b1 = make_shared<D>();    // target use case works
shared_ptr<B> b2 = b1->clone();

The implementation is a compile-time consteval function that takes the reflection of the class and inspects it:

consteval void clonable(meta::info source) {
    using namespace meta;

    // 1. Repeat bases and members

    for (auto mem : base_spec_range(source)) -> mem;
    for (auto mem : member_range(source)) -> mem;

    // 2. Now apply the clonable-specific default/requirements/generations:

    auto clone_type = type_of(source);          // if no base has a clone() we'll use our own type
    bool base_has_clone = false;                // remember whether we found a base clone already

    // For each base class...
    for (auto mem : base_spec_range(source)) {  
        // Compute clone() return type: Traverse this base class's member
        //  functions to find any clone() and remember its return type.
        //  If more than one is found, make sure the return types agree.
        for (auto base_mem : member_fn_range(mem)) {
            if (strcmp(name_of(base_mem), "clone") == 0) {
                compiler.require(!base_has_clone || clone_type == return_type_of(base_mem),
                    "incompatible clone() types found: if more than one base class introduces "
                    "a clone() function, they must have the same return type");
                clone_type = return_type_of(base_mem);
                base_has_clone = true;
            }
        }
    }

    // Apply generated function: provide polymorphic clone() function using computed clone_type
    if (base_has_clone) {   // then inject a virtual overrider
        -> __fragment struct Z {
            typename(clone_type) clone() const override {
                return std::unique_ptr<Z>(new Z(*this));  // invoke nonpublic copy ctor
            }
        };
    }
    else {                  // else inject a new virtual function
        -> __fragment struct Z {
            virtual std::unique_ptr<Z> clone() const {
                return std::unique_ptr<Z>(new Z(*this));  // invoke nonpublic copy ctor
            }
        };
    }
};

Advantages include:

  • It fully automates the pattern.
  • It’s extensible: It can be directly extended to require nonpublic copying. (See next section.)
  • It’s complete and nonrepetitive: The code that uses clonable only has to say that one word. It doesn’t have to supply the right types or repeat names; reflection lets the metafunction discover and compute exactly what it needs, accurately every time.
  • It’s easy to diagnose mistakes: We can’t make the mistakes we made with CRTP and macros, plus we get as many additional new high-quality diagnostics we might want. In this example, we already get a clear compile-time error if we create a class hierarchy that introduces clone() twice with two different types.

Drawbacks include:

  • It’s not yet standard C++: The reflection proposals are progressing not but yet ready to be adopted.

But wait… all of the solutions so far are flawed

It turns out that by focusing on clone and showing empty-class examples, we have missed a set of usability and correctness problems. Fortunately, we will solve those too in just a moment.

Consider this slightly more complete example of the above code to show what it’s like to write a non-empty class, and a print test function that lets us make sure we really are doing a deep clone:

class(clonable) B {
public:
    virtual void print() const { std::cout << "B"; }
private:
    int bdata;
};

class(clonable) C : public B {
public:
    void print() const override { std::cout << "C"; }
private:
    int cdata;
};

class(clonable) D : public C {
public:
    void print() const override { std::cout << "D"; }
private:
    int ddata;
};

This "works" fine. But did you notice it has pitfalls?

Take a moment to think about it: If you encountered this code in a code review, would you approve it?


OK, for starters, all of these classes are polymorphic, but all of them have public non-virtual destructors and public copy constructors and copy assignment operators. That’s not good. Remember one of the problems of a nonintrusive solution was that it doesn’t actually prevent slicing because you can still write this:

D d;
B b = d;                                // oops, still works

So what we should actually be writing using all of the solutions so far is something like this:

class(clonable) B {
public:
    virtual void print() const { std::cout << "B"; }
    virtual ~B() noexcept { }
    B() = default;
protected:
    B(const B &) = default;
    B& operator=(const B&) = delete;
private:
    int bdata;
};

class(clonable) C : public B {
public:
    void print() const override { std::cout << "C"; }
    ~C() noexcept override { }
    C() = default;
protected:
    C(const C &) = default;
    C& operator=(const C&) = delete;
private:
    int cdata;
};

class(clonable) D : public C {
public:
    void print() const override { std::cout << "D"; }
    ~D() noexcept override { }
    D() = default;
protected:
    D(const D &) = default;
    D& operator=(const D&) = delete;
private:
    int ddata;
};

That’s a lot of boilerplate.

In fact, it turns out that even though the original question was about the boilerplate code of the clone function, most of the boilerplate is in other functions assumed and needed by clone pattern that weren’t even mentioned in the original question, but come up as soon as you try to use the proposed patterns in even simple real code.

Metaclasses: Fuller "real" solution

Fortunately, as I hinted above, we can do even better. The metaclass function can take care of all of this for us:

  • Apply default accessibilities and qualifiers: Make base classes and member functions public by default, data members private by default, and the destructor virtual by default.
  • Apply requirements: Check and enforce that a polymorphic type doesn’t have a public copy/move constructor, doesn’t have assignment operators, and that the destructor is either public and virtual or protected and nonvirtual. Note that these are accurate compile-time errors, the best kind.
  • Generate functions: Generate a public virtual destructor if the user doesn’t provide one. Generate a protected copy constructor if the user doesn’t provide one. Generate a default constructor if all bases and members are default constructible.

Now the same user code is:

  • Simple and clean. As far as I can tell, it literally could not be significantly shorter — we have encapsulated a whole set of opt-in defaults, requirements, and generated functions under the single word "clonable" library name that a class author can opt into by uttering that single Word of Power.
  • Correct by default.
  • Great error messages if the user writes a mistake.

Live example

class(clonable) B {
    virtual void print() const { std::cout << "B"; }
    int bdata;
};

class(clonable) C : B {
    void print() const override { std::cout << "C"; }
    int cdata;
};

class(clonable) D : C {
    void print() const override { std::cout << "D"; }
    int ddata;
};

That’s it. (And, I’ll add: This is "simplifying C++.")

How did we do it?

In my consteval library, I added the following polymorphic metaclass function, which is invoked by clonable (i.e., a clonable is-a polymorphic). I made it a separate function for just the usual good code factoring reasons: polymorphic offers nicely reusable behavior even for non-clonable types. Here is the code, in addition to the above cloneable which adds the computed clone at the end — and remember, we only need to write the following library code once, and then class authors can enjoy the above simplicity forever:

// Library code: implementing the metaclass functions (once)

consteval void polymorphic(meta::info source) {
    using namespace meta;

    // For each base class...
    bool base_has_virtual_dtor = false;
    for (auto mem : base_spec_range(source)) {

        // Remember whether we found a virtual destructor in a base class
        for (auto base_mem : member_fn_range(mem))
            if (is_destructor(base_mem) && is_virtual(base_mem)) {
                base_has_virtual_dtor = true;
                break;
            }

        // Apply default: base classes are public by default
        if (has_default_access(mem))
            make_public(mem);

        // And inject it
        -> mem;
    }

    // For each data member...
    for (auto mem : data_member_range(source)) {

        // Apply default: data is private by default
        if (has_default_access(mem))
            make_private(mem);

        // Apply requirement: and the programmer must not have made it explicitly public
        compiler.require(!is_public(mem),
            "polymorphic classes' data members must be nonpublic");

        // And inject it
        -> mem;
    }

    // Remember whether the user declared these SMFs we will otherwise generate
    bool has_dtor         = false;
    bool has_default_ctor = false;
    bool has_copy_ctor    = false;

    // For each member function...
    for (auto mem : member_fn_range(source)) {
        has_default_ctor |= is_default_constructor(mem);

        // If this is a copy or move constructor...
        if ((has_copy_ctor |= is_copy_constructor(mem)) || is_move_constructor(mem)) {
            // Apply default: copy/move construction is protected by default in polymorphic types
            if (has_default_access(mem))
                make_protected(mem);

            // Apply requirement: and the programmer must not have made it explicitly public
            compiler.require(!is_public(mem),
                "polymorphic classes' copy/move constructors must be nonpublic");
        }

        // Apply requirement: polymorphic types must not have assignment
        compiler.require(!is_copy_assignment_operator(mem) && !is_move_assignment_operator(mem),
            "polymorphic classes must not have assignment operators");

        // Apply default: other functions are public by default
        if (has_default_access(mem))
            make_public(mem);

        // Apply requirement: polymorphic class destructors must be
        // either public and virtual, or protected and nonvirtual
        if (is_destructor(mem)) {
            has_dtor = true;
            compiler.require((is_protected(mem) && !is_virtual(mem)) ||
                             (is_public(mem) && is_virtual(mem)),
                "polymorphic classes' destructors must be public and virtual, or protected and nonvirtual");
        }

        // And inject it
        -> mem;
    }

    // Apply generated function: provide default for destructor if the user did not
    if (!has_dtor) {
        if (base_has_virtual_dtor)
            -> __fragment class Z { public: ~Z() noexcept override { } };
        else
            -> __fragment class Z { public: virtual ~Z() noexcept { } };
    }

    // Apply generated function: provide defaults for constructors if the user did not
    if (!has_default_ctor)
         -> __fragment class Z { public: Z() =default; };
    if (!has_copy_ctor)
         -> __fragment class Z { protected: Z(const Z&) =default; };

}

GotW-ish: The ‘clonable’ pattern

Yesterday, I received this question from a distinguished C++ expert who served on the ISO C++ committee for many years. The email poses a decades-old question that still has the classic roll-your-own answer in C++ Core Guidelines #C.130, and basically asks whether we’ve made significant progress toward automating this pattern in modern C++ compared to what we had back in the 1990s and 2000s.

Before I present my own answer, I thought I would share just the question and give all of you readers the opportunity to propose your own candidate answers first — kind of retro GotW-style, except that I didn’t write the question myself. Here is the email, unedited except to fix one typo…


In trying to wrap my mind around all the new stuff for C++20, I see that there is one idea that has been around for quite a while but I still don’t see implemented. I’m wondering whether I’m missing something obvious, or whether it’s still not there.

Suppose B is a base class and I have a shared_ptr<B> that might point to a B or to any class that might be derived from B. Assuming that B and all of its derived classes are CopyConstructible, I would like to create a new shared_ptr<B> bound to a newly created object that is a copy of the object to which my original shared_ptr points. I cannot write this:

shared_ptr<B> b1 = /* whatever */;
shared_ptr<B> b2 = make_shared<B>(*b1);    //NO

because that will bind b2 to a copy of the B part of *b1.

I think I can make this work by putting a member function in B along the following lines:

class B {
    shared_ptr<B> clone() const { return make_shared<B>(*this); }
    // ...
};

and then for each derived class, write something like

class D: public B {
public:
    shared_ptr<B> clone() const {
        return make_shared<D>(*this);   // not make_shared<B>
    }
    // ...
};

and then I can write

shared_ptr<B> b1 = /* as before */;
shared_ptr<B> b2 = b1->clone();

and b2 will now point to a shared_ptr<B> that is bound to an object with the same dynamic type as *b1.

However, this technique requires me to insert a member function into every class derived from B, with ugly bugs resulting from failure to do so.

So my question is whether there some way of accomplishing this automatically that I’ve missed?


So here’s your challenge:

JG Question

1. Describe as many approaches as you can think of that could let us semi- or fully-automate this pattern, over just writing it by hand every time as recommended in C++ Core Guidelines #C.130. What are each approach’s advantages and drawbacks?

Guru Question

2. Show a working Godbolt.org link that shows how class authors can write as close as possible to this code with the minimum possible additional boilerplate code:

class B {
};

class C : public B {
};

class D : public C {
};

and that still permits the class’ users to write exactly the following:

shared_ptr<B> b1 = make_shared<D>();
shared_ptr<B> b2 = b1->clone();

Hints: Feel free to use any/all mechanism that some production or experimental compiler supports (if it’s somewhere on Godbolt, it’s legal to try), including but not limited to:

or any combination thereof.

Q&A: Does string::data() return a pointer valid for size() elements, or capacity() elements?

A reader asked:

In C++17, for std::string::data(), is the returned buffer valid for the range [data(), data() + size()), or is it valid for [data(), data + capacity())?

The latter seems more intuitive and what I think most people would expect reserve() to create given the non-const version of data() since C++17.

… and then helpfully included the answer, but in fairness clearly they were wondering whether cppreference.com was correct:

Relevant quote from cppreference.com: … “Returns a pointer to the underlying array serving as character storage. The pointer is such that the range [data(); data() + size()) is valid and the values in it correspond to the values stored in the string.”

Yes, cppreference.com is correct. Here’s the quote from the current draft standard:

  • 2 A specialization of basic_string is a contiguous container (22.2.1).
  • 3 In all cases, [data(), data() + size()] is a valid range, data() + size() points at an object with value charT() (a “null terminator”), and size() <= capacity() is true.

Regarding this potential alternative:

or is it valid for [data(), data + capacity())?

No, that would be strange, because it would mean intentionally supporting reading uninitialized characters in any extra raw memory at the end of the string’s current memory block.

Note that the first part of the above quote from the standard hints at the consistency issue: A string is a container, and we want containers to be consistent. We certainly wouldn’t want vector<widget>::data() to behave that way and let callers see raw memory with unconstructed objects.

The latter [… is …] what I think most people would expect reserve() to create

c/reserve/resize/ and I’ll agree :)

Any container’s size()/resize() is about the data you stored in it and that it’s holding for you. Any container’s capacity()/reserve() is about the underlying raw memory buffer just to let you help the container optimize its raw memory management, but it isn’t intended to give you access to the allocated-but-unused memory.

My favorite work-week of 2019

I just can’t get enough of this short video, combining interviews shot at last year’s CppCon with shots of our new “home” location that we’ll be enjoying for the first time two weeks from now.

Please enjoy it — this is an excellent representation of what CppCon is like.

Those of you who know me personally know that I’m a homebody, and that two of my least-liked things are airplanes and hotel rooms. But even so, I wish it was already the afternoon of 15 September, and that I was sitting in a cramped plane seat on the way to Colorado.

If you’ve been waffling on whether to register for this year, you know what to do. I’m looking forward to seeing many of you there soon.

Survey results: Your “top five” ISO C++ feature proposals

Today I collated and analyzed the results of the survey I posted two weeks ago.

I presented you with a daunting unsorted list of ~300 eye-numbing paper titles, and still 289 of you responded with ~1,200 total votes (not everyone picked five things) many of which contained thoughtful “how I would use it” verbatims. Thank you for your time and interest!

In addition to summing your votes per-paper, I also spent several hours manually assigning categories to the individual proposals so that we could see votes per-feature area. For example, “pattern matching” had multiple papers, so I wanted to generate a subtotal also for votes for “all pattern-matching papers” as well as for each of the individual ones. You might assign categories differently than I did, but I think these are a good start to see basic groupings and patterns.

Here’s a 35-page PDF of the full results. The following are some highlights.


The top-voted primary topic categories correlate well with the results we saw in the 2019 global C++ developer survey open-ended write-in questions for ‘if you could change one thing about C++’ and ‘any other feedback.’ The top topic areas are:

Primary topic category #votes
Reflection 154
Convenience/usability 129
Error handling 129
Compile-time programming 105
Pattern matching 73
Concurrency 70
Performance 58
Numeric 47
Systems programming 45

Here’s a partial graph:

The top-voted individual papers were:

Paper #votes
P0707 reflection & metaclasses 89
P0709 error handling, static EH 85
P1371 pattern matching 71
P1240 reflection 40
P0323 error handling, std::expected 24
P1767 package mgmt 23
P1485 coroutine keywords 23
P1750 std::process 22
P1717 compile-time programming 22
? no specific paper (a catchall when the response was useful and fit in a category, but didn’t include a specific paper number) 22
P1629 Unicode 20
P1031 low level file I/O 19
P1040 compile-time data embed 19
P0645 text formatting 16
P1729 text formatting 15

Finally, disclaimers:

  • Experimental: This was advertised as an experiment. The results may not be representative. However, as I mentioned already, the top-voted topic areas correlate well with the results from the 2019 global C++ developer survey open-ended write-in questions for ‘if you could change one thing about C++’ and ‘any other feedback.’
  • Bias: I posted the poll on my personal blog, which could bias results in favor of my proposals. In fact, the two top-voted individual proposals were my own papers on reflection and lightweight exceptions. But even if you remove those two, their topics are still in the top-voted categories of interest because there was strong voting in favor of “reflection” and “efficient exception/error handling” papers that were not written by me (e.g., the #4 and #5 papers were reflection and error handling papers that I didn’t write). Also, the post was Reddit’ed and it looks like most of the responses came via that. (Besides, happily/sadly, many of you who follow my blog freely disagree and argue with me so this isn’t just an echo chamber, and I also had a number of other papers that got fewer votes or none so you weren’t just upvoting my papers.)

Again, thank you very much to everyone who responded for your time and interest. I hope you find these results likewise interesting, and I’ve shared them with the committee as well.

I think this survey was a successful experiment. We’ll keep reading the data, especially the verbatims on how you would use the features you requested. But I think it’s fair that we can say already that, if nothing else, we now have some confirmation that the part of the community that responded supports the basic goals of P0592 so that paper is more than just the opinion of the author (who isn’t me)… that paper recommends that in the C++23 timeframe we focus our efforts on coroutines, executors, networking, reflection, and pattern matching. There’s a good correlation between that list and your “most-wanted papers’ topics” list above. That’s useful to learn, and we’ll look to learn more from your responses; thank you again.

Trip report: Summer ISO C++ standards meeting (Cologne)

Obligatory comment: The C++20 Eagle has wings.

At noon today, July 20 2019, the ISO C++ committee completed its summer meeting in Cologne, Germany, hosted with thanks by Think-Cell, SIGS Datacom, SimuNova, Silexica, Meeting C++, Josuttis Eckstein, Xara, Volker Dörr, Mike Spertus, and the Standard C++ Foundation.

As usual, we met for six days Monday through Saturday, and it was our biggest meeting yet with some 220 attendees. Here’s a visual, going back to when I first joined the committee:

For more details about how we have adapted organizationally to handle our size increase, see my San Diego “pre-trip” report and my San Diego trip report.

Thank you to all of the hundreds of people who participate in ISO C++ in person and electronically. Below, I want to at least try to recognize by name many of the authors of the proposals we adopted, but nobody succeeds with a proposal on their own. C++ is a team effort – this wouldn’t be possible without all of your help. So, thank you, and apologies for not being able to acknowledge everyone by name.

Notes:

  • See also the Reddit trip report, which was collaboratively edited by several dozen committee members. It has lots of excellent detail.
  • Some of the links below are to papers that will not be published until the post-meeting mailing a few weeks from now, and so the links will start working at that time.
  • You can find a brief summary of ISO procedures here.

The main news: C++20 is feature complete, CD ships

Today we achieved feature freeze for C++20.

Per our official C++20 schedule, we are now feature-complete for C++20 and will send out the C++20 draft for its international comment ballot (Committee Draft, or CD) to run over the summer. After that, we will address national body comments and other fit-and-finish work for the next two meetings (November in Belfast, and February in Prague) and plan to ship the final text of the C++20 start at the end of the February meeting.

Here are the main proposals that were added into C++20 at this meeting.

std::format for text formatting (Victor Zverovich) adds format strings support to the C++ standard library, including for type-safe and positional parameters. If you’re familiar with Boost.Format or POSIX format strings, or even just printf, you’ll know exactly what this is about: It gives us the best of printf (convenience) and the best of iostreams (type safety and extensibility of iostreams) – and it isn’t limited to iostreams, it lets you format any string. I’ve been waiting for this for a long time, so that I will never have to use header iomanip again.

Applying <=> (spaceship) comparisons throughout the standard library (Barry Revzin) uses the new <=> feature and applies it throughout namespace std to basically everything. It’s great to see a new language feature we added in C++20 already used widely in the standard library itself in the same release. And it’s not alone: C++20 also added concepts and also the concepts-based ranges library in the same release.

Stop token and joining thread (Nicolai Josuttis, Lewis Baker, Billy O’Neal, Herb Sutter, Anthony Williams) brings us two presents: First, a correctly RAII thread type whose destructor implicitly joins if you haven’t joined or detached already. And second, but even more importantly, a general composable cancellation mechanism into the standard library that all types can use, called stop_token.

And also, in the “moar constexpr!” and “constexpr all the things!” departments:

We added constexpr INVOKE (Tomasz Kamiński and Barry Revzin), constexpr std::vector (Louis Dionne), and constexpr std::string (Louis Dionne). If all this constexpr surprises you, and in particular if constexpr std::vector and std::string make you do a double-take: Remember that we already added constexpr new (think about that and grok what it means) and made much of the standard library constexpr and added consteval to the language… and all told this means that a whole lot of ordinary C++ code can run at compile time, now including even the standard dynamic vector and string containers. This is something that would have been difficult to imagine just a few years ago, but it shows ever more that we’re on a path where we can run plain C++ code at compile time instead of trying to express those computations as template metaprograms. As many people have said: “More metaprogramming, less template metaprogramming!” We didn’t have to invent a new ‘compile-time C++’ dialect; this is just C++, running at compile time. This trend is good.

There are dozens of additional proposals we adopted besides these, all of them good work, and I want to repeat our great thanks to all the proposal authors and also the even larger number of people who helped them with their proposals without whom this could not have happened. Thank you for your time and effort on behalf of millions of C++ developers worldwide who will benefit from your hard and careful work!

Contracts moved from draft C++20 to a new Study Group

At this meeting, it became clear that we were not quite done designing the contracts feature in time for C++20. Back when we adopted the feature one year ago we thought it was baked and ready for the standard, but since then we have discovered lingering design disagreements and concerns that we could not resolve sufficiently to get consensus (general agreement without sustained objections). All the main contracts proposers unanimously agreed that the right thing to do is to defer its release.

But contracts is an important feature, and work on contracts is not stopping, but actually increasing: The contracts proposers made good progress at this meeting toward starting to identify and iron out the differences, and we formed a new Study Group for Contracts, SG21, with John Spicer as the chair that will continue work with even greater participation. I’m hopeful that over the coming meetings we’ll see some solid further progress in this important area.

The shape of C++20

Now that we know the final feature set of C++20, we can see this is C++’s largest release since C++11. “Major” features include:

  • modules
  • coroutines
  • concepts including in the standard library via ranges
  • <=> spaceship including in the standard library
  • broad use of normal C++ for direct compile-time programming, without resorting to template metaprogramming (see last trip reports)
  • ranges
  • calendars and time zones
  • text formatting
  • span
  • … and lots more …

Combined with what came in C++14 and C++17, the C++14/17/20 nine-year cycle is arguably our biggest nine-year cycle alongside the previous two (C++98 and C++11). We understand that’s exciting, but we also understand that’s a lot for the community to absorb, and so I’m also pleased that along the way we’ve done things like create the Direction Group and, most recently, SG20 on Education, to help guide and absorb continued C++ evolution in our vibrant living language.

It was not lost on the room that today was not just the day we completed the feature set of C++20, but it was also the 50th anniversary of the Apollo 11 lunar landing. A number of people mentioned it throughout the week and especially today, some citing examples from that voyage and drawing parallels that ranged from insightful to side-splittingly funny. Here is one that I contributed after we had the vote to adopt the draft, with profuse apologies to Neil Armstrong that I couldn’t come up with something even better (sorry!):

There’s no doubt about it, C++20 is a big release with many new and important features. It’s almost done as we now enter the review and fit-and-finish phases… as we complete the work to ship the new standard over the coming two meetings, and as it then gradually becomes available for you to use in C++ implementations, we hope you will find it a useful and compelling release.

Oh, and here’s a public service announcement… even without contracts we can still make “co” jokes:

As someone pointed out to me after the session ended: This was, after all, the co_logne meeting.

Other progress and decisions

Because for the past year we’ve been focused on finishing C++20, I’ve been holding back from publishing updates to my own P0707 and P0709 proposals, generation+metaclasses and lightweight exception handling. At this meeting, I brought updates to both of those proposals again. In Study Group 7 (Compile-Time Programming) I showed an updated of P0707 and Andrew Sutton and Wyatt Childers presented updates on the Clang-based implementation progress. In the Evolution subgroup, I presented P0709 for the first time and received broad encouragement for most of the proposal along with a list of questions to which I’ll come back with answers in Belfast.

Thank you again to the approximately 220 experts who attended this meeting, and the many more who participate in standardization through their national bodies! If today our C++20 Eagle has wings, our next step is to land it, then bring it home… and so in our next two regular WG21 meetings, in November (Belfast, Northern Ireland, UK) and February (Prague, Czech Republic), we plan to respond to the C++20 international review ballot comments and make other bugfixes before sending final C++20 out for its approval ballot about seven months from now.

Have a good summer… and see many of you (and a large part of the committee) at CppCon in September!

Draft FAQ: Why does the C++ standard ship every three years?

WG21 has a strict schedule (see P1000) by which we ship the standard every three years. We don’t delay it.

Around this time of each cycle, we regularly get questions about “but why so strict?”, especially because we have many new committee members who aren’t as familiar with our history and the reasons why we do things this way now. And so, on the pre-Cologne admin telecon last Friday, several of the chairs encouraged me to write down the reasons why we do it this way, and some of the history behind the decision to adopt this schedule.

I’ve now done that by adding a FAQ section to the next draft of P1000, and I’ve sent a copy of it to the committee members now en route to Cologne. That FAQ material will appear in the next public revision of P1000 which will be in the post-meeting mailing a few weeks from now.

In the meantime, because the draft FAQ might be of general public interest too, here is a copy of that material. I hope you find it largely useful, occasionally illuminating, and maybe even a bit entertaining.

Now off to the airport… see (many of) you in Cologne. We are expecting about 220 people… by far our largest meeting ever. More about that in the post-meeting trip report after the meeting is over…


FAQs

(As of pre-Cologne, July 2019) There are bugs in the standard, so should we delay C++20?

Of course, and no.

We are on our planned course and speed: Fixing bugs is the purpose of this final year, and it’s why this schedule set the feature freeze deadline for C++“20” in early “19” (Kona), to leave a year to fix bugs including to get a round of international comments this summer. We have until early 2020 (three meetings: Cologne, Belfast, and Prague) to apply that review feedback and any other issue resolutions and bug fixes.

If we had just another meeting or two, we could add <feature> which is almost ready, so should we delay C++20?

Of course, and no.

Just wait a couple more meetings (post-Prague) and C++23 will be open for business and <feature> can be the first thing voted into the C++23 working draft. For example, that’s what we did with concepts; it was not quite ready to be rushed from its TS straight into C++17, so the core feature was voted into draft C++20 at the first meeting of C++20 (Toronto), leaving plenty of time to refine and adopt the remaining controversial part of the TS that needed a little more bake time (the non-“template” syntax) which was adopted the following year (San Diego). Now we have the whole thing.

This feels overly strict. Why do we ship releases of the IS at fixed time intervals (3 years)?

Because it’s one of only two basic project management options to release the C++ IS, and experience has demonstrated that it’s better than the other option.

What are the two project management options to release the C++ IS?

I’m glad you asked.

There are two basic release target choices: Pick the features, or pick the release time, and whichever you pick means relinquishing control over determining the other. It is not possible to control both at once. They can be summarized as follows:

Elaborating:

(1) “What”: Pick the features, and ship when they’re ready; you don’t get to pick the release time. If you discover that a feature in the draft standard needs more bake time, you delay the world until it’s ready. You work on big long-pole features that require multiple years of development by making a release big enough to cover the necessary development time, then try to stop working on new features entirely while stabilizing the release (a big join point).

This was the model for C++98 (originally expected to ship around 1994; Bjarne originally said if it didn’t ship by about then it would be a failure) and C++11 (called 0x because x was expected to be around 7). This model “left the patient open” for indeterminate periods and led to delayed integration testing and release. It led to great uncertainty in the marketplace wondering when the committee would ship the next standard, or even if it would ever ship (yes, among the community, the implementers, and even within the committee, some had serious doubts in both 1996 and 2009 whether we would ever ship the respective release). During this time, most compilers were routinely several years behind implementing the standard, because who knew how many more incompatible changes the committee would make while tinkering with the release, or when it would even ship? This led to wide variation and fragmentation in the C++ support of compilers available to the community.

Why did we do that, were we stupid? Not exactly, just inexperienced and… let’s say “optimistic,” for (1) is the road paved with the best of intentions. In 1994/5/6, and again in 2007/8/9, we really believed that if we just slipped another meeting or three we’d be done, and each time we ended up slipping up to four years. We learned the hard way that there’s really no such thing as slipping by one year, or even two.

Fortunately, this has changed, with option (2)…

(2) “When”: Pick the release time, and ship what features are ready; you don’t get to pick the feature set. If you discover that a feature in the draft standard needs more bake time, you yank it and ship what’s ready. You can still work on big long-pole features that require multiple releases’ worth of development time, by simply doing that work off to the side in “branches,” and merging them to the trunk/master IS when they’re ready, and you are constantly working on features because every feature’s development is nicely decoupled from an actual ship vehicle until it’s ready (no big join point).

This has been the model since 2012, and we don’t want to go back. It “closes the patient” regularly and leads to sustaining higher quality by forcing regular integration and not merging work into the IS draft until it has reached a decent level of stability, usually in a feature branch. It also creates a predictable ship cycle for the industry to rely on and plan for. During this time, compilers have been shipping conforming implementations sooner and sooner after each standard (which had never happened before), and in 2020 we expect multiple fully conforming implementations the same year the standard is published (which has never happened before). This is nothing but goodness for the whole market – implementers, users, educators, everyone.

Also, note that since we went to (2), we’ve also been shipping more work (as measured by big/medium/small feature counts) at higher quality (as measured by a sharp reduction in defect reports and comments on review drafts of each standard), while shipping whatever is ready (and if anything isn’t, deferring just that).

How serious are we about (2)? What if a major feature by a prominent committee member was “almost ready”… we’d be tempted to wait then, wouldn’t we?

Very serious, and no.

We have historical data: In Jacksonville 2016, at the feature cutoff for C++17, Bjarne Stroustrup made a plea in plenary for including concepts in C++17. When it failed to get consensus, Stroustrup was directly asked if he would like to delay C++17 for a year to get concepts in. Stroustrup said No without any hesitation or hedging, and added that C++17 without concepts was more important than a C++18 or possibly C++19 with concepts, even though Stroustrup had worked on concepts for about 15 years. The real choice then was between: (2) shipping C++17 without concepts and then C++20 with concepts (which we did), or (1) renaming C++17 to C++20 which is isomorphic to (2) except for skipping C++17 and not shipping what was already ready for C++17.

What about something between (1) and (2), say do basically (2) but with “a little” schedule flexibility to take “a little” extra time when we feel we need to stabilize a feature?

No, because that would be (1).

The ‘mythical small slip’ was explained by Fred Brooks in The Mythical Man-Month, with the conclusion: ”Take no small slips.”

For a moment, imagine we did slip C++20. The reality is that we would be switching from (2) back to (1), no matter how much we might try to deny it, and without any actual benefit. If we decided to delay C++20 for more fit-and-finish, we will delay the standard by at least two years. There is no such thing as a one-meeting or three-meeting slip, because during this time other people will continue to (rightly) say “well my feature only needs one more meeting too, since we’re slipping a meeting let’s add that too.” And once we slip at least two years, we’re saying that C++20 becomes C++22 or more likely C++23… but we’re already going to ship C++23! — So we’d still be shipping C++23 on either plan, and the only difference is that we’re not shipping C++20 in the meantime with the large amount of fully-baked work that’s ready to go, and making the world wait three more years for it. Gratuitously, because the delay will not benefit those baked features, which is most or all of them.

So the suggestion amounts to “let’s make C++20 be C++22 or C++23,” and the simple answer is “yes, we’re going to have C++23 too, but in addition to C++20 and not instead of it.” To delay C++20 actually means to skip C++20, instead of releasing the great good work that is stable and ready, and there’s no benefit to doing that.

But feature X is broken / needs more bake time than we have bugfix time left in C++20!

No problem! We can just pull it.

In that case, someone needs to write the paper aimed at EWG or LEWG (as appropriate) that shows the problem and/or the future doors we’re closing, and proposes removing it from the IS working draft. Those groups will consider it, and if they decide the feature is broken (and plenary agrees), that’s fine, the feature will be delayed to C++next. We have actually done this before, with C++0x concepts.

But under plan (1), we would be delaying, not only that feature, but the entire feature set of C++20 to C++23! That would be… excessive.

Does (2) mean “major/minor” releases?

No. We said that at first, before we understood that (2) really simply means you don’t get to pick the feature set, not even at a “major/minor” granularity.

Model (2) simply means “ship what’s ready.” That leads to releases that are:

  • similarly sized (aka regular medium-sized) for “smaller” features because those tend to take shorter lead times (say, < 3 years each) and so generally we see similar numbers completed per release; and
  • variable sized (aka lumpy) for “bigger” features that take longer lead times (say, > 3 years each) and each IS release gets whichever of those mature to the point of becoming ready to merge during that IS’s time window, so sometimes there will be more than others.

So C++14 and C++17 were relatively small, because a lot of the standardization work during that time was taking place in long-pole features that lived in proposal papers (e.g., contracts) and TS “feature branches” (e.g., concepts).

C++20 is a big release …

Yes. C++20 has a lot of major features. Three of the biggest all start with the letters “co” (concepts, contracts, coroutines) so perhaps we could call it co_cpp20. Or co_dependent. Wait, we’re digressing.

… and so aren’t we cramming a lot into a three-year cycle for C++20?

No, see “lumpy” above.

C++20 is big, not because we did more work in those three years, but because many long-pole items (including at least two that have been worked on in their current form since 2012, off to the side as P-proposals and TS “branches”) happened to mature and get consensus to merge into the IS draft in the same release cycle.

It has pretty much always been true that major features take many years. The main difference between plan (1) for C++98 and C++11 and plan (2) now is: In C++98 and C++11 we held the standard until they were all ready, now we still ship those big ones when they’re ready but we also ship other things that are ready in the meantime instead of going totally dark.

C++20 is the same 3-year cycle as C++14 and C++17; it’s not that we did more in these 3 years than in the previous two 3-year cycles, it’s just that more long-pole major features became ready to merge. And if any really are unready, fine, we can just pull them again and let them bake more for C++23. If there is, we need that to be explained in a paper that proposes pulling it, and why, for it to be actionable.

In fact, I think the right way to think about it is that C++14+17+20 taken as a whole is our third 9-year cycle (2011-2020), after C++98 (1989-1998) and C++11 (2002-2011), but because we were on plan (2) we also shipped the parts that were ready at the 3- and 6-year points.

Isn’t it better to catch bugs while the product is in development, vs. after it has been released to customers?

Of course.

But if we’re talking about that as a reason to delay the C++ standard, the question implies two false premises: (a) it assumes the features haven’t been released and used before the standard ships (many already have production usage experience); and (b) it assumes all the features can be used together before the standard ships (they can’t).

Elaborating:

Re (a): Most major C++20 features have been implemented in essentially their current draft standard form in at least one shipping compiler, and in most cases actually used in production code (i.e., has already been released to customers who are very happy with it). For example, coroutines (adopted only five months ago as of this writing) has been used in production in MSVC for two years and in Clang for at least a year with very happy customers at scale (e.g., Azure, Facebook).

Re (b): The reality is that we aren’t going to find many feature interaction problems until users are using them in production, which generally means until after the standard ships, because implementers will generally wait until the standard ships to implement most things. That’s why when we show any uncertainty about when we ship, what generally happens is that implementations wait – oh, they’ll implement a few things, but they will hit Pause on implementing the whole thing until they know we’re ready to set it in stone. Ask <favorite compiler> team what happened when they implemented <major feature before it was in a published standard>. In a number of cases, they had to implement it more than once, and break customers more than once. So it’s reasonable for implementers to wait for the committee to ship.

Finally, don’t forget the feature interaction problem. In addition to shipping when we are ready, we need time after we ship to find and fix problems with interactions among features and add support for such interactions that we on typically cannot know before widespread use of new features. No matter how long we delay the standard, there will be interactions we can’t discover until much later. The key is to manage that risk with design flexibility to adjust the features in a compatible way, not to wait until all risk is done.

The standard is never perfect… don’t we ship mistakes?

Yes.

If we see a feature that’s not ready, yes we should pull it.

If we see a feature that could be better, but we know that the change can be done in a backward-compatible way, that’s not a reason to not ship it now; it can be done as an extension in C++next.

We do intentionally ship features we plan to further improve, as long as we have good confidence we can do so in a backward-compatible way.

But shouldn’t we aim to minimize shipping mistakes?

Yes. We do aim for that.

However, we don’t aim to eliminate all risk. There is also a risk and (opportunity) cost to not shipping something we think is ready. So far, we’ve been right most of the time.

Are we sure that our quality now is better than when were on plan (1)?

Yes.

By objective metrics, notably national body comment volume and defect reports, C++14 and C++17 have been our most stable releases ever, each about 3-4 times better on those metrics than C++98 or C++11. And the reason is because we ship regularly, and put big items into TS branches first (including full wording on how they integrate with the trunk standard) and merge them later when we know they’re more baked.

In fact, since 2012 the core standard has always been maintained in a near-ship-ready state (so that even our working drafts are at least as high quality as the shipped C++98 and C++11 standards). That never happened before 2012, where we would often keep the patient open with long issues lists and organs lying around nearby that we meant to put back soon; now we know we can meet the schedule at high quality because we always stay close to a ship-ready state. If we wanted to, we could ship the CD right now without the Cologne meeting and still be way higher quality than C++98’s or C++11’s CDs (or, frankly, their published standards) ever were. Given that C++98 and C++11 were successful, recognizing that we’re now at strictly higher quality than that all the time means we’re in a pretty good place.

C++98 and C++11 each took about 9 years and were pretty good products …

Yes: 1989-1998, and 2002-2011.

… and C++14 and C++17 were minor releases, and C++20 is major?

Again, I think the right comparable is C++14+17+20 taken as a whole: That is our third 9-year cycle, but because we were on plan (2) we also shipped the parts that were ready at the 3- and 6-year points.

Does (2) allow making feature-based targets like P0592 for C++next?

Sure! As long as it doesn’t contain words like “must include these features,” because that would be (1).

Aiming for a specific set of features, and giving those ones priority over others, is fine – then it’s a prioritization question. We’ll still take only what’s ready, but we can definitely be more intentional about prioritizing what to work on first so it has the best chance of being ready as soon as possible.