GotW #101 Solution: Preconditions, Part 2 (Difficulty: 7/10)

This special Guru of the Week series focuses on contracts. We covered some basics of preconditions in GotW #100. This time, let’s see how we can use preconditions in some practical examples…

1. Consider these functions, expanded from an article by Andrzej Krzemieński: [1] … How many ways could a caller of each function get the arguments wrong, but that would silently compile without error? Name as many different ways as you can.

There are several ways to break this down. I’ll use three major categories of possible mistakes, the first two of which overlap:

  • wrong order: passing an argument in the wrong position
  • wrong value: passing an argument with a valid but wrong value (e.g., index out of range)
  • invalid value: passing an argument that is already invalid (e.g., an invalid iterator)

Let’s see how these play out with our three examples, starting with (a).

(a) is_in_values (int val, int min, int max)

// Example 1: Adapted from [1]

auto is_in_values (int val, int min, int max)
  -> bool;  // true iff val is in the values [min, max]

Oh my, three identically typed integer parameters… what could be confusing about that?!

Wrong order (5 ways): First, there are five ways to pass these in the wrong order, because there are 3! = 6 permutations, all of which compile but only the first of which is correct:

is_in_values( v,  lo, hi );    // correct

is_in_values( v,  hi, lo );    // all these are wrong, but compile :(
is_in_values( lo, v,  hi ); 
is_in_values( lo, hi, v  ); 
is_in_values( hi, v,  lo ); 
is_in_values( hi, lo, v  );

Some of these argument orders may seem strange, but some are orders other libraries’ similar APIs might use which makes confusion easier, we all make mistakes… and the type system isn’t helping us at all.

Wrong value (1 way): Second, there is an implicit precondition that min <= max, so passing arguments where min > max would be wrong, but would silently compile. Some of these are exercised by the “wrong order” permutations above, but even call sites that remember the right argument order can make mistakes about the actual values.

Invalid value (0 ways): Finally, all possible values of an int are valid — some may be suspiciously big or small, but int doesn’t have the concept of “not a number” (NaN) as we have with floats, or the concept of “invalidated” like we have with iterators.

(b) is_in_container (int val, int idx_min, int idx_max)

It sure doesn’t help that the next function has the identical signature as is_in_values, but with very different meaning:

auto is_in_container (int val, int idx_min, int idx_max)
  -> bool; // true iff container[i]==val for some i in [idx_min, idx_max]

Wrong order (5 ways): As in (a), we again have five ways to pass these in the wrong order, all of which compile but only the first is correct:

is_in_container( v,  lo, hi );    // correct

is_in_container( v,  hi, lo );    // all these are wrong, but compile :(
is_in_container( lo, v,  hi ); 
is_in_container( lo, hi, v  ); 
is_in_container( hi, v,  lo ); 
is_in_container( hi, lo, v  );

Wrong value (3 ways): Again as in (a), we have the implicit precondition that idx_min <= idx_max, so passing idx_min > idx_max would be wrong, but would silently compile. But this time there are two additional ways to go wrong, because idx_min and idx_max must both be valid subscripts into container, so if either is outside the range [0, container.size()) it is a valid integer but an out of bounds value for this use.

Invalid value (0 ways): Again as in (a), all possible values of an int are valid — though some may be wrong values if they’re out of bounds as we noted above, they’re still valid integers.

(c) is_in_range (T val, Iter first, Iter last)

template <typename T, typename Iter>
auto is_in_range (T val, Iter first, Iter last)
  -> bool; // true iff *i==val for some i in [first,last)

Wrong order (1 way): This time there’s only one way to pass the parameters in the wrong order (ignoring pathological cases where the same argument might convert both T and Iter):

is_in_container( v, istart, iend );    // correct

is_in_container( v, iend, istart );    // wrong, but compiles :(

Wrong value (2 ways): We could pass a first and last that are not a valid range in two ways:

  • they point into the same container, but first doesn’t precede last
  • they point into different containers

Invalid value (2 ways): And finally, either of first or last could actually be an invalidated iterator (e.g., dangling). For example, the container they point into may be destroyed so that both are invalid; or one of the two iterators might have been calculated before a more recent operation like vector::push_back that could have invalidated it.

But if the sight of these function signatures has had you pulling your hair and shouting “use the type system, Luke!” at your screen, you’re not alone… now let’s make things better.

2. Show how can you improve the function declarations in Question 1 by …

(a) just grouping parameters, using a struct with public variables

Interestingly, we actually get a lot of benefit simply by grouping ‘parameters that go together,’ using an creating an aggregate or “grouping” helper struct.[3] For example:

// Example 2(a)(i): Improving Example 1 with aggregate types

struct min_max { int min, max; };

auto is_in_values (int val, min_max minmax) -> bool;
auto is_in_container (int val, min_max rng) -> bool;

template <typename Iter> struct two_iters { Iter first, last; };

template <typename T, typename Iter>
auto is_in_range (T val, two_iters<Iter> rng) -> bool;

Or even just venerable anonymous std::pair is better than no grouping:

// Example 2(a)(ii): Improving Example 1 with aggregate types

auto is_in_values (int val, std::pair<int,int> minmax) -> bool;
auto is_in_container (int val, std::pair<int,int> rng) -> bool;

template <typename T, typename Iter>
auto is_in_range (T val, std::pair<Iter,Iter> rng) -> bool;

With either of the above, there’s only one way for callers to get the argument order wrong. And it requires only two extra characters at call sites, because we can use { } to group the arguments without creating actual named objects of the helper struct:

is_in_values( v, {lo, hi} );	// correct
is_in_values( v, {hi, lo} );	// wrong, but compiles

is_in_container( v, {lo, hi} );	// correct
is_in_container( v, {hi, lo} );	// wrong, but compiles

is_in_range( v, {i1, i2} );		// correct
is_in_range( v, {i2, i1} ); 	// wrong, but compiles

So just grouping parameters using a struct eliminates some errors. But really using the type system is even better…

(b) just using an encapsulated class, using a class with private variables (an abstraction with its own invariant)

Clearly all three functions are crying out for a “range”-like abstraction for its pair of parameters, in the first two cases a range of values and in the third a range of iterators. How do we know? Because:

Here’s one way we can apply class types we can find in the standard library or Boost today:

// Example 2(b): Improving Example 1 with encapsulated class types

auto is_in_values (int val, boost::integer_range<int> rng) -> bool;

auto is_in_container (int val, boost::integer_range<int> rng) -> bool;

template <typename T, std::ranges::input_range Range>
auto is_in_range (T val, Range&& rng) -> bool;

This gives us all the mistake-reduction goodness we got in (a), plus more.

First, as in (a), absent pathological conversions, it’s very difficult to get arguments in the wrong order simply because of being forced to group the parameters:

auto minmax = boost::irange(10, 100);
is_in_values( 42, minmax );

auto minmax2 = boost::irange(0, ssize(myvec)-1);
is_in_container( 42, minmax2 );

auto myvec = std::vector<int>();
is_in_range( 42, myvec );

But, unlike our helper structs in (a), we now get additional safety because the types can express constructor preconditions that move some of those mistakes (such as (hi,lo) misordering) to constructors of class abstractions that can then preserve them as invariants [4] – so the mistake can still be made but in fewer places, to where we construct or modify the abstracted object (e.g., range), rather than every time we use un-abstracted separately values (e.g., a couple of iterator objects we have lying around and whose relationship we have to maintain by hand over time). This is why we sometimes say “types are predicates,” because a type encapsulates a predicate, namely its invariant.

GUIDELINE: When multiple functions state the same precondition, it’s a telltale sign there’s a missing class that should turn it into an invariant. A repeated precondition is nearly always a “naked invariant” that should be encapsulated up inside a type. This is more obvious when the precondition involves multiple parameters (or ordinary variables for that matter); a poster child is the STL’s pervasive use of iterator pairs, which have long been crying out to be encapsulated using a range abstraction, and fortunately we now have that in C++20. Consider using a class instead.

GUIDELINE: Remember that a key reason why encapsulated classes are powerful is that they wrap up preconditions and turn them into invariants. Hiding data members is good dependency management because it limits the code that can depend on the details of the data and is responsible for maintaining the correct relationship among the data members.

(c) just using post-C++20 contract preconditions (not yet valid C++, but something like the syntax in [2])

Preconditions test values, so they can let us eliminate the “wrong values” kinds of mistakes. Consider this code:

// Example 2(c): Improving Example 1 with boolean preconditions

auto is_in_values (int val, int min, int max)
  -> bool // true iff val is in the values [min, max]
     [[pre (min <= max)]]
;

auto is_in_container (int val, int idx_min, int idx_max)
  -> bool // true iff container[i]==val for i in [idx_min, idx_max]
     [[pre (0       <= idx_min
         && idx_min <= idx_max
         && idx_max <  container.size())]]          // see note [5]
;

template <typename T, typename Iter>
auto is_in_range (T val, Iter first, Iter last)
  -> bool // true iff *i==val for some i in [first,last)
     [[pre (/*... is_reachable? is_not_dangling? hmm ...*/)]]
;

For the first two functions, we can write clear preconditions that can check the “wrong value” bugs.

In these particular examples, the best place to write the preconditions is right on the constructors of the class types we saw in (b), and if we write them there then we don’t have to repeat them as explicit contracts on every function.

But is (b) always better than (c), in other examples? This brings us to our last question, which is all about “can” versus “should”…

3. Consider these three examples, where each shows expressing a boolean condition either as a function precondition or as an encapsulated invariant inside a new type… In each of these cases, which way is better? Explain your answer.

In Question 2, writing a type was often the best choice, but it isn’t always.

The benefits to writing a type include:

  • Encapsulation. We limit the code that is responsible for maintaining the boolean condition.
  • Language support. We get the help of the type system to statically enforce requirements.

But there are costs and limitations too:

  • What’s the abstraction? There may not be a suitable one. We can’t write a good type unless we can discover a useful abstraction that the type’s interface should support. A good type represents a useful reusable domain abstraction that programmers can understand and that makes their code clearer by elevating the vocabulary of the code. There won’t always be a practical and reusable abstraction; when there isn’t, we won’t be able to write a useful and reusable type. — Even when there is, we have to design that all ahead of time, which requires a lot more advance knowledge and engineering than just writing ad-hoc boolean conditions on individual functions.
  • What’s the cost? It may not be feasible to maintain the invariant. We have to do any extra work it takes to maintain the invariant, and it has to be practical to do. When it isn’t, we can’t maintain the invariant without help from outside code, and so we won’t be able to really encapsulate it properly.
  • Does it make sense as an independent abstraction? Will the user be carrying around objects of this type, or are we just jamming a precondition common to a few functions (or only one) into a type and calling it useful? Occam’s Razor: Don’t multiply entities beyond necessity.
  • What’s the type the caller is using? This is where a real usable abstraction shines, because many callers will be using it independently of calling our function. But if the caller isn’t using this type, then there typically has to be an implicit or explicit conversion (because inheritance from all argument types our callers might already have usually isn’t an option), and that conversion would need to be usable and sufficiently cheap.

GUIDELINE: Remember that types and contracts are “better together.” Use both. They are complementary, neither is a substitute for the other. All we are trying to accomplish with contracts is to augment the language’s static type checking with runtime checking where that is more appropriate because we can’t design a practical abstraction. And this is why we want contracts on functions (preconditions, postconditions) even though we already have types, and why we also want contracts on types (invariants).

Let’s consider the three examples.

(a) A vector that is sorted

template <typename T>
void f( vector<T> const& v ) [[pre( is_sorted(v) )]] ;

template <typename T>
void f( sorted<vector<T>> const& v );

If this looks familiar, it’s because is_sorted is one of the classic examples we saw in GotW #98 of conditions that are often impractical to check and enforce as an assertion, in this case a precondition.

Can we do better by making it a type, perhaps a sorted wrapper around a container like vector that maintains the guarantee that it’s always sorted? Well, we have to answer some questions about a sorted<T>:

  • What’s the abstraction it provides? It can’t easily fulfill the requirements of a sequence container like vector itself; for example, push_back doesn’t make much sense because letting the caller insert an arbitrary value at a specific location would easily cause the container to be unsorted. Instead, it would naturally want a more general insert function instead, and the interface would be more like set. This part could be workable.
  • What’s the cost? This where it starts to breaks down: Keeping a vector sorted all the time means that every insertion would cost O(N) work all the time. Which leads into…
  • Does it make sense as an independent abstraction? … that it’s very common for code to maintain an “almost-sorted” vector, such as by inserting new elements at the end which is fast (and, hmm, affects our abstraction design, because then it would make sense to have push_back after all, wouldn’t it? hmm) but leaves a suffix of unsorted elements in the container, and then periodically sorting the whole container so that the sorting cost is amortized. But an almost-sorted vector isn’t good enough, and so doesn’t fit the bill. We don’t have empirical evidence of such types in general use.
  • What’s the type the caller is using? And now we’re busted all the way, because we want this interface to be usable by anyone who has a vector<T>, which would require a conversion to sorted<vector<T>>. If we do a deep copy, that’s prohibitively expensive. Even if the conversion is lightweight by avoiding a deep copy, such as by just wrapping an existing vector object, it wouldn’t be very useful unless it did O(N) work every time unconditionally to verify the invariant. And even then the abstraction design is affected and compromised: If the user can still see and modify the original vector, then that’s still part of the accessible interface to the data, so the user can make the container be not fully sorted and we’re unable to really encapsulate and maintain our intended invariant.

So is_sorted is much better as a function precondition.

// (b) A vector that is not empty

template <typename T>
void f( vector<T> const& v ) [[pre( !v.empty() )]] ;

template <typename T>
void f( not_empty<vector<T>> const& v );

This one is more feasible as a type, but still not ideal:

  • What’s the abstraction it provides? It’s a vector, and we can make the interface identical to vector with just extra preconditions on pop and erase functions to not remove the last element in the container.
  • What’s the cost? Emptiness is cheap enough to check and maintain.
  • Does it make sense as an independent abstraction? This is where it starts to get questionable… the answer is at best “maybe.” It’s not clear to me than a “nonempty vector” is a generally useful abstraction.
  • What’s the type the caller is using? This is where I think we break down again. Again, we want this interface to be usable by anyone who has a vector<T>, and that means a conversion to not_empty<vector<T>>. If we do a deep copy, that’s prohibitively expensive. This time if we just wrap an existing vector object to avoid the deep copy, the check is cheap. But then we still have the problem that the abstraction design is affected and compromised so that it can’t maintain its invariant, because if the user can still see and modify the original vector, they can remove the last element on us.

So not_empty seems better as a function precondition.

(c) A pointer that is not null

void f( int* p ) [[pre( p != nullptr )]] ;

void f( not_null<int*> p );

This time we can do better:

  • What’s the abstraction it provides? This one’s easy to state: It’s a not-null pointer. That’s a far simpler interface than a container, because we just need operator* and operator->, construction, destruction, and copying. Even so it’s not totally without subtlety, because not_null should not have move operations that modify the source object. This means that a not_null<unique_ptr<T>> is legal but there’s not much you can do with it besides dereference it and destroy it: It can’t be copyable because unique_ptr isn’t copyable, and it must not be movable because moving a unique_ptr leaves the source null.
  • What’s the cost? Nullness is cheap enough to check and maintain.
  • Does it make sense as an independent abstraction? Definitely. A “non-null pointer” has been widely rediscovered and reinvented as a generally useful abstraction.
  • What’s the type the caller is using? A not_null<int*> is a useful object in its own right in the calling code, independently of calling this particular function. And if our function is invoked by someone who has only an ordinary int*, doing a full copy of the pointer is cheap, and applying the nullness check as a precondition on that converting constructor is exactly equivalent to writing the precondition by hand, but is automated.

So not_null seems better as a type, primarily because it is independently useful. This is why it has been reinvented a number of times, including as gsl::not_null. [6]

GUIDELINE: Wherever practical, design interfaces so that incorrect call sites are illegal (won’t compile, using the type system) or loud (won’t pass unit tests, using preconditions). This is a key part of achieving the goal to “make interfaces easy to use correctly, and hard to use incorrectly.” Preconditions directly help with that by letting us catch entire groups of errors at test time, and are a complement to the type system which makes incorrect uses “not fit” through the compiler and also carries extra preconditions around for us in the form of invariants.

GUIDELINE: Remember that the type system is a hammer, and not every precondition is a nail. The type system is a powerful tool, but not every precondition is naturally (part of) an invariant of a useful type that provides a good reusable abstraction that’s generally useful independently of this function.

Notes

[1] A. Krzemieński. “Contracts, preconditions and invariants” (Andrzej’s C++ blog, December 2020).

[2] G. Dos Reis, J. D. Garcia, J. Lakos, A. Meredith, N. Myers, and B. Stroustrup. “P0542: Support for contract based programming in C++” (WG21 paper, June 2018). Subsequent EWG discussion favored changing “expects” to “pre” and “ensures” to “post,” and to keep it as legal compilable (if unenforced) C++20 for this article I also modified the syntax from : to ( ), and to name the return value _return_ for postconditions. That’s not a statement of preference, it’s just so the examples can compile today to make them easier to check.

[3] For 2(a) and 2(b), on platform ABIs that do not pass small structs/classes in registers, turning individual parameters into a struct/class could cause them to be passed in stack memory instead of in registers.

[4] Upcoming GotWs will cover invariants and violation handling.

[5] If C++ gets chained comparisons as proposed in P0515 and P0893 we could write this much more clearly, and with fewer opportunities for mistakes, as:

[[pre( 0 <= idx_min <= idx_max < container.size() )]]

[6] B. Stroustrup and H. Sutter (eds.) “I.12 Declare a pointer that must not be null as not_null” (C++ Core Guidelines.) If the not_null<T> type we are using is implicitly convertible from T, which is the intent of I.12 to provide a drop-in replacement for pointer parameters, then the usability is the same as with the precondition. Otherwise, the caller has to provide a not_null argument at the call site, either by doing an explicit conversion or by just using a not_null local variable in their own body.

Acknowledgments

Thank you to the following for their feedback on this material: Joshua Berne, Gabriel Dos Reis, J. Daniel Garcia, Gábor Horváth, Andrzej Krzemieński, Bjarne Stroustrup, Andrew Sutton, Ville Voutilainen

Trip report: Winter 2021 ISO C++ standards meeting (virtual)

Today, the ISO C++ committee held its second full-committee (plenary) meeting of the pandemic and adopted a few more features and improvements for draft C++23.

A record of 18 voting nations sent representatives to this meeting: Austria, Bulgaria, Canada, Czech Republic, Finland, France, Germany, Israel, Italy, Japan, Netherlands, Poland, Romania, Russia, Spain, Switzerland, United Kingdom, and United States. Japan had participated in person during C++98 and C++11, and has always given us good remote ballot feedback during C++14/17/20, and is attending again now; welcome back! Italy and Romania are our newest national bodies; welcome!

Our virtual 2021

We continue to have the same priorities and the same schedule we originally adopted for C++23. However, since the pandemic began, WG21 and its subgroups have had to meet all-virtually via Zoom, and we are not going to try to have a face-to-face meeting in 2021 (see What’s Next below). Some subgroups had already been having virtual meetings for years, but this was a major change for other groups including our two main design groups – the language and library evolution working groups (EWG and LEWG). In all, over the past year we have held approximately 200 virtual meetings.

Today: A few more C++23 features adopted

Today we formally adopted a second round of small features for C++23, as well as a number of bug fixes. Below, I’ll list some of the more user-noticeable changes and credit all those paper authors, but note that this is far from an exhaustive list of important contributors… even for these papers, nothing gets done without help from a lot of people and unsung heroes, so thank you first to all of the people not named here who helped the authors move their proposals forward! And thank you to everyone who worked on the adopted issue resolutions and smaller papers I didn’t include in this list.

P1102 by Alex Christensen and JF Bastien is the main noticeable change we adopted for the core language itself. It’s just a tiny bit of cleanup, but one that I’m personally fond of: In C++23 we will be able to omit empty ( ) lambda parameter lists even when we have to declare the lambda mutable. I’m the one who proposed the lambda syntax we have today (except for the mutable part which wasn’t mine and I never liked), including that it enabled making unused parts of the syntax optional so that we can write simple lambdas simply. For example, today we can already write

[x]{ return f(x); }

as a legal synonym for

[x] () -> auto { return f(x); }

and omit the empty parameter list and deduced return type. Even so, I’ve noticed a lot of people write the ( ) part anyway, which isn’t wrong or anything, it’s just that often they write it because they don’t know they can omit it too. And part of the problem was the oddity in pre-C++23 that if you need to write mutable, then you actually do have to also write the ( ) (but not the return type), which was just weird but was another reason for people to just write ( ) all the time, because sometimes they had to. With P1102, we don’t have to. That’s more consistent. Thanks, Alex and JF!

In the spirit of “completing C++20,” P2259 by Tim Song makes several fixes to iterator_category to make it work better with ranges and adaptors. Here is an example of code that does not compile today for arcane reasons (see the paper), but will be legal C++23 thanks to Tim:

std::vector<int> vec = {42};
auto r = vec | std::views::transform([](int c) { return std::views::single(c);})
             | std::views::join
             | std::views::filter([](int c) { return c > 0; });
r.begin();

Further in the “completing C++20” spirit, P2017 by Barry Revzin fixes some additional glitches in ranges to make them work better. Here is an example of safe and efficient code that does not compile today, where for arcane reasons the declaration of e isn’t supported and today’s workaround is to make the code more complex and less efficient. This will be legal C++23 thanks to Barry:

auto trim(std::string const& s) {
    auto isalpha = [](unsigned char c){ return std::isalpha(c); };
    auto b = ranges::find_if(s, isalpha);
    auto e = ranges::find_if(s | views::reverse, isalpha).base();
    return subrange(b, e);
}

P2212 by Alexey Dmitriev and Howard Hinnant generalizes time_point::clock to allow for greater flexibility in the kinds of clocks it supports, including stateful clocks, external system clocks that don’t really have time_points, representing “time of day” as a distinct time_point, and more.

P2162 by Barry Revzin takes an important first step toward cleaning up std::visit and lay the groundwork for its further generalization. Even if you don’t yet love std::visit, it’s a useful tool that P2162 makes more useful by making it work more regularly. We expect to see further generalization in the future, which is much easier to do with a cleaner and more regular existing feature to build upon.

Finally, I saw cheers and celebratory emoji erupt in the Zoom chat window when we adopted P1682 by JeanHeyd Meneide. It’s very small, but very useful. When passing an enum to an API that uses the underlying type, today we have to write a static_cast to the std::underlying_type, which makes us repeat the enum’s name and so is cumbersome all the time and brittle for type-safety under maintenance if we change to use a different enum:

some_untyped_api( static_cast<std::underlying_type_t<ABCD>>(some_value) );

Thanks to JeanHeyd, in C++23 we will be able to write:

some_untyped_api( std::to_underlying(some_value) );

Note that of course standard library vendors don’t have to wait until 2023 to provide to_underlying or any of these other fixes and improvements. Just having a feature like this one voted into the draft standard is often enough for vendors to be proactive in providing it… these days, vendors are more closely tracking our draft standard meeting by meeting rather than waiting for the official release, in part because we are shipping regularly and predictably and we don’t vote features into the draft standard until we think they’re pretty well baked so that vendors have less risk in implementing them early.

We also adopted a number of other issue resolutions and small papers that made additional improvements.

Finally, we came close to adopting P0533 by Edward Rosten and Oliver Rosten, which is about adding constexpr to many of the functions in math.h that we share with C. This is clearly a Good Thing and therefore many voted in favor of adopting the paper. The only hesitation that stopped it from getting consensus this time were concerns that it needed more time to iron out how implementations would implement it, such as how to deal with errno in a constexpr context. This is the kind of question that often arises when we want to make improvements to entities declare in the C headers, because not only are they governed by the C standard rather than the C++ standard, but typically they are provided and controlled by the operating system vendor rather than by the C++ compiler/library writer, and those constraints always mean a bit of extra work when we want to make improvements for C++ programmers and remain compatible. As far as I know, everyone wants to see these functions made constexpr, so we expect to see this paper come to plenary again in the future. Thanks for your perseverance, Edward and Oliver!

What’s next

As long as we are meeting virtually, we will continue to have virtual plenaries like the one we had this week to formally adopt new features as they progress through subgroups. Our next two virtual plenaries to adopt features into the C++23 working draft will be held in June and November. Progress will be slower than when we can meet face-to-face, and we’ll doubtless defer some topics that really need in-person discussion until we can meet again safely, but in the meantime we’ll make what progress we can and we’ll ship C++23 on time.

The next tentatively planned face-to-face meeting is February 2022 in Portland, OR, USA; however, we likely won’t know until well into the autumn whether we’ll be able to confirm that or need to postpone it. You can find a list of our meeting plans on the Upcoming Meetings page.

Thank you again to the hundreds of people who are working tirelessly on C++, even in our current altered world. Your flexibility and willingness to adjust are much appreciated by all of us in the committee and by all the C++ communities! Thank you, and see you on Zoom.

Firsts in 2020 (or, A little dose of good news)

2020 has been mostly terrible. That includes for the C++ committee and many of our communities, where just this month we lost Beman Dawes. Beman was one of the most important and influential C++ experts in the world, and made his many contributions mostly behind the scenes. I and everyone else who has ever benefited from any of the standardized STL, Boost, C++Now, std::filesystem, C++98/11/14/17, and more — so, really, most people who have ever used C++ — all owe Beman a debt of gratitude. We miss him greatly.

To end the year with a little dose of good news, I thought I’d mention a just few positive C++ accomplishments that did happen for 2020, and were happier “first-ever” achievements.

First, the big one…

C++20 is the first ever “D&E-complete” release of C++. In February, we completed C++20, which is the first release of Standard C++ that includes every feature that Bjarne Stroustrup envisioned for C++’s evolution in his 1994 book The Design and Evolution of C++ (aka D&E), including concepts, coroutines, modules, and more, except only for one minor feature (unified function call syntax). Thank you to Bjarne for sticking with it until we got there, and personally doing the heavy lifting to drive important features like concepts into Standard C++!

C++20 is the first release of C++ that added a feature that made the standard smaller. When I talk about the importance of simplifying C++ by judiciously adding features that let programmers express their intent directly, some people legitimately object that adding a feature makes C++ bigger and more complex. I reply “but it makes C++ code simpler” and “if it replaces something more complex then we can teach a simpler C++ for new code,” but those effects have been hard to measure concretely. Now in C++20 for the first time we added a new feature that made the standard smaller: We added the C++20 spaceship operator to the language, but we also applied it throughout the C++ standard library and that made the library specification nearly 20 pages shorter — a net reduction. So for the first time we can measure that, yes, adding a feature to C++ can make C++ smaller. Thank you to everyone who helped me with that proposal and who are listed in the Acknowledgements in the link, and especially to Walter E. Brown, Jens Maurer, Barry Revzin, and David Stone!

First year for all-virtual standards meetings, including EWG and LEWG. Since March, for the first time all major subgroups including the two main design subgroups of EWG (language) and LEWG (library) have been having virtual meetings by telecon or Zoom and making progress in between face-to-face meetings. We’ve also had a record number of nearly 20 virtual subgroup meetings on average per month. It’s great to see that, despite the pandemic, the committee has continued work on C++23 and other long-pole features, and in November we were able to formally adopt the first C++23 features into our brand-new C++23 working draft. Thank you once again to JF Bastien (EWG), Bryce Lelbach (LEWG), and their assistants, and all the other subgroup chairs and participants for patiently supporting these changes that we had to invent and transition to at short notice, and as we continue to work out the kinks as we go!

Many first virtual conferences. And of course 2020 saw many of our C++ conferences hold their first virtual events (and create new ones like Pure Virtual C++), in the face of huge uncertainties and technical challenges with bleeding-edge technologies, to make it all work far more smoothly than we really would have had any right to expect on such short notice. Thank you to the organizers for working so busily behind the scenes to make it possible to have a facsimile of our face-to-face conferences until we can meet again in person!

Here’s hoping that by this time next year we will all be doing better in every way, and have a happier 2021 to reflect upon. Thank you again, everyone, for your interest in C++ and support for our many C++ events, forums, and other communities large and small, and best wishes for a great 2021.

Trip report: Autumn ISO C++ standards meeting (virtual)

On Monday, the ISO C++ committee completed its final full-committee (plenary) meeting of 2020 and adopted the first changes to the C++23 working draft, including a few new features.

This was a first in several ways: It was our first-ever virtual plenary, held online via Zoom. It was also our first-ever plenary meeting that wasn’t held at the end of a long around-the-clock week of intensive subgroup meetings; instead, it was held at the end of nearly nine months of virtual subgroup meetings.

Our virtual 2020

The pandemic was just getting started when we held our February meeting in Prague, Czech Republic. Since then it has of course been impossible to meet in person; as I mentioned before, our ISO C++ meetings are virtual until further notice, but we continue to have the same priorities and the same schedule for C++23.

So since the pandemic began, WG21 subgroups have been meeting virtually via Zoom. Some subgroups had already been having virtual meetings for years, but this was a major change for other groups including our two main design groups – the language and library evolution working groups (EWG and LEWG).

In all, since Prague we have held about 150 virtual meetings. When lots of subgroups are meeting, some of them weekly, those meetings add up!

This week: First C++23 features adopted

On Monday we formally adopted the first features of C++23, including the first C++23 language feature, as well as a number of bug fixes.

First up was P0330 by JeanHeyd Meneide, which adds a literal suffix for (signed) size_t, so in C++23 we will be able to write literals like 100uz. (I wonder whether uz will be pronounced Uzi.) See the many excellent side-by-side examples in JeanHeyd’s paper for how this helps make uses of size_t safer and more convenient especially in naked for loops iterating over containers. Congratulations to JeanHeyd for C++23’s first language extension, and also for his persistence with this paper – the adopted version is revision 8, and that number and the paper’s change history indicates the level of rigor that can be required to get a feature into C++. Many thanks!

P1679 by Wim Leflere and Paul Fee add a basic_string::contains function so we can write code like if (str.contains(substr)) std::cout << “found!\n”; … I can already hear the chorus of “finally!”

P0881 by Alexey Gorgurov and Antony Polukhin add a stacktrace library to C++23. This is a much-anticipated extension based on Boost.Stacktrace that will enable much easier-to-debug portable diagnostic messages.

The charmingly numbered P1048 by Juan Alday gives us an is_scoped_enum type trait to detect when an enumeration is defined using the new-style (C++11, but well it’s still “new”!) enum class. As the paper points out, this is particularly useful as a migration aid, including to write code that detects and measures the adoption of “enum class” over plain old “enum.”

Finally, P0943 by Hans Boehm supports C atomics (spelled _Atomic) in C++ where the two did not already overlap, which helps write headers that work in both C and C++. (The adopted version is R6 which should be published in the next few weeks.) This is one example of the ongoing extra coordination we’ve had lately between the C and C++ committees, which leads to the next thing we did…

New SG22: C/C++ liaison

We appointed a new study group, SG22, for C/C++ liaison. This is a unique study group, because it is shared jointly by both the C and C++ committees, and it continues the tradition of closer coordination between the two committees. Thank you to WG14 (C) and its chair David Keaton for their continued interest in coordinating the two languages, to Aaron Ballman for agreeing to chair this new group, and for our WG14 and WG21 project editors Thomas Köppe, JeanHeyd Meneide, and Richard Smith to serve as assistant chairs. Thank you all for being willing to step up!

Other updates

Thank you to Richard Smith for his work for many years as project editor for the C++ standard, and completing C++20 this month! Thank you also to the many of you who have helped Richard and shared the editing workload by providing PRs and proofreading to apply plenary resolutions; that has been very much appreciated by Richard and by all of us especially given that C++20 is a “big” release with many new features, all of which have created an unusually high amount of editing work for this release.

Starting now, as we begin C++23, Thomas Köppe has graciously agreed to step up to be our primary project editor for the standard, with Richard as backup project editor. Thank you Thomas, and thank you again Richard and to all who have helped with the editing for the C++ IS!

Next steps

While we are meeting virtually until further notice, we will continue to have virtual plenaries like the one we had this week to formally adopt new features as they progress through subgroups. Our next virtual plenary will be in February, on the Monday of what would have been the Kona meeting.

Progress during this time will be slower than when we can meet face-to-face, and we’ll doubtless defer some topics that really need in-person discussion until we can meet again safely, but in the meantime we’ll make what progress we can and we’ll ship C++23 on time.

Thank you again to the hundreds of people who are working tirelessly on C++, even in our current altered world. Your flexibility and willingness to adjust are much appreciated by all of us in the committee and by all the C++ communities! Thank you, and see you on Zoom.

My plans at CppCon

It’s hard to believe CppCon 2020 is nearly here… in fact, pre-conference tutorials are already in progress.

I’ll be at the conference throughout the week in the hallways and session rooms. Here are some of the times I’ll be participating on the actual program:

  • Sunday 1300 MDT: Organizer’s Panel. In the middle of the Welcome Reception, we’re holding an Organizer’s Panel where several of us organizers will be talking about what to expect in the week ahead and available for extensive Q&A to answer any questions you may have. (We will have live music during the whole Welcome Reception, so be sure to watch the chat window for how to join that stream if you want to listen to our CppCon house band leader, Jim Basnight, perform for us.)
  • Monday 1500 MDT: My AMA. I’ll be available for an “ask me anything” session. Attendees can ask questions across the board from C++20 features, to how the committee is working during the pandemic on C++23, to specific C++ features I’ve designed or contributed to like concurrency or structured bindings or the spaceship operator, or other things you may think of.
  • Tuesday 0730 MDT: Committee Fireside Chat Panel (moderator). This year, I won’t be a panelist myself, so I don’t plan to answer any questions. Instead, I’ll be the panel moderator, and we have a great slate of panelists again this year: Bjarne Stroustrup (of course), Bryce Adelstein Lelbach (library evolution subgroup chair), Hana Dusíková (compile-time programming subgroup chair), Inbal Levi (Israel national chair), JC Van Winkel (Netherlands national chair and teaching subgroup chair), JF Bastien (language evolution subgroup chair), Michael Wong (low-latency/gaming/embedded subgroup chair and AI subgroup chair), and Tony Van Eerd (expert in many subgroups and popular speaker). I can’t wait!
  • Wednesday 0730 MDT: Bjarne Stroustrup AMA (moderator). This is Bjarne’s AMA, I’m just the moderator.
  • (will be recorded) Friday 1330 MDT: Empirically Measuring, and Reducing, C++’s Accidental Complexity. This is my one actual talk, and it’s the last talk of the conference. It will be a major update of the talk I’ve given publicly one time before in Prague earlier this year, which after a broader intro focused specifically on parameter passing as an example of where we could dramatically simplify C++. This time I’ll include lots of updates, including that I hope to demo a working compiler implementation of the proposal.

Notes:

  • Denver time zone (MDT) is the default, which is where the physical CppCon usually happens. The Sched.org view lets you show the schedule in your own time zone.
  • The conference starts Sunday. You don’t have to wait for Bjarne’s opening keynote on Monday… if you’re attending, be sure to make use of the Open House from 0900-1200 MDT where you can wander around, and especially the Welcome Reception from 1200-1430 MDT which includes the first panel (see above).

I look forward to seeing many of you there!

C++20 approved, C++23 meetings and schedule update

A couple of interesting things happened in the ISO C++ world this week…

C++20 passed unanimously, on track to publish later this year

On Friday September 4, C++20’s DIS (Draft International Standard) ballot ended, and it passed unanimously. This means that C++20 has now received final technical approval and is done with ISO balloting, and we expect it to be formally published toward the end of 2020 after we finish a final round of ISO editorial work.

As always, we are not counting on ISO’s publication speed to call it C++20, it’s C++20 because WG21 completed technical work in February. If for some reason ISO needs until January to get it out the door and assigns it a 2021 publication date, the standard will still be referred to as C++20. That is already its industry name, and 300,000+ search hits can’t be (retroactively made) wrong!

ISO C++ meetings are virtual until further notice, Kona postponed

A month ago, I notified the committee that our face-to-face meetings will be postponed until further notice. We still need to plan for face-to-face meetings so that we’re ready to resume when that’s possible and safe, but for now all currently planned meetings should be viewed as “tentative.”

Among other constraints such as national and corporate travel restrictions, we are subject to face-to-face meeting bans from several parent organizations. Two of those extended or enacted face-to-face meeting bans this week, on Tuesday September 1:

  • INCITS, the U.S. standards body, extended its face-to-face meeting ban through March 31, 2021. This means that our Kona meeting planned for February is now formally postponed to an unspecified future date.
  • ISO SC22, our corner of the international organization for standardization that handles programming languages, resolved to ban face-to-face meetings of more than 100 people until further notice. Since our meetings lately have regularly seen over 200 attendees, we’re currently evaluating how this affects future post-Kona tentative meeting plans.

All of these bans are subject to further extension, and we won’t meet in person again until it’s safe to do so. As of this writing, our next tentative face-to-face meeting would be the rescheduled Varna meeting, in the first week of June 2021, but that should be viewed as the earliest possible resumption of meetings. As the pandemic develops and INCITS and ISO meeting bans and other restrictions are extended, it’s certainly possible that we may not be able to meet again in 2021 at all. We’ll see.

In the meantime, though, we’re still making progress on our work: For several years, we have already been holding regular virtual meetings for some of our subgroups, including study groups (SGs) and CWG and LWG (language and library specification wording). Since the pandemic started, EWG and LEWG (language and library evolution, our primary design subgroups) have also begun meeting virtually, and we are continuing to adjust our process for how to approve design changes to progress proposals while not meeting in person. And starting in November, we will begin having virtual plenary (whole-group) meetings to formally approve changes, including potentially new features, to the C++23 working paper…

C++23 schedule and priorities

The C++23 schedule (P1000R4) and C++23 priorities (P0592R4) are unaffected by the pandemic. You may find this surprising, but that’s because the committee is on a “train model” that focuses on schedule and priorities for each release, instead of a specific feature set. One of the benefits of the train model is that it is very resilient, and can handle even major disruptions without change. We have already been in the mode of working on features all the time, including long-pole features that take many years, and each regular release train includes “whatever’s ready” with the next train opening up as soon as the previous one ships. So, that is unchanged.

What has changed, of course, is the speed at which we can work on features during the coming period. The pandemic disruptions have impacted all our lives, and reduced the time and energy WG21 participants have for standards work as well as our capacity to make progress face to face three times a year, and this has slowed down development of features we’re working on now that will land in { C++23, C++26, C++29 } . No virtual process will fully compensate for the lack of intense week-long face-to-face meetings, but as usual we’ll continue to make progress on baking features according to the P0592R4 priorities, including issue resolutions and an emphasis on completing C++20, and as usual we’ll load each feature into the currently loading train as the feature becomes ready. So progress continues, and the trains will continue to run on time to ship everything that’s ready.

Of course, the ISO C++ committee isn’t the only part of the C++ world that has “gone virtual” this year. We’ve been enjoying many virtual conferences, and just a week from now we’ll start the biggest C++ conference of the year: CppCon 2020, all online. I look forward to seeing many of you there, including literally seeing you at the video chat tables and in my AMA Q&A session early in the week, and the Committee Fireside Chat panel on Tuesday.

Thanks for your interest in C++ and C++ standardization! Be safe, everyone.

C++ on Sea video posted: Bridge to NewThingia (extended)

Two weeks ago, I had the privilege of speaking at the C++ on Sea 2020 virtual conference. The video of my talk has now been posted — it’s an extended version of the talk I gave at DevAroundTheSun in April. You can find it here:

Thanks very much to Phil Nash and all the other organizers and volunteers who made the virtual event run so smoothly! And thanks also to the post-processing team who did all the videos, including that they did a smooth job of fixing my video hiccup in the middle when I lost my link and had to reconnect to resume (entirely my pilot error).

A software note: I really enjoyed the Remo platform that C++ on Sea used this year. It allowed for lots of attendee interaction that’s surprisingly like mingling at a physical conference… you are in what looks and feels like a room with lots of tables, just like at a conference, and can easily talk face to face (video and audio) with all the folks who are at your table, and move from table to table to mix and mingle. It really felt seamless. I enjoyed spending time talking with some of you there that way — it was great to meet new people as well as see some familiar faces, and I look forward to doing it again soon this September because the current plan is for the also-all-virtual CppCon 2020 to use the same software. So, I hope to literally see many of you then! In the meantime, I hope you enjoy this talk.

AMA @cpp_russia on July 2

undefinedC++ Russia is an online event this year, and I’m happy to be one of many C++ folks to be invited to participate. On July 2 I’ll be doing a Q&A session, which is the first time I’m doing an “AMA” — no talk, just Q&A and discussion. I’m looking forward to it, and to see what kinds of questions are on people’s minds.

And don’t miss Bjarne Stroustrup’s similar AMA-style session at the same event, two days earlier on June 30!