C++20 approved, C++23 meetings and schedule update

A couple of interesting things happened in the ISO C++ world this week…

C++20 passed unanimously, on track to publish later this year

On Friday September 4, C++20’s DIS (Draft International Standard) ballot ended, and it passed unanimously. This means that C++20 has now received final technical approval and is done with ISO balloting, and we expect it to be formally published toward the end of 2020 after we finish a final round of ISO editorial work.

As always, we are not counting on ISO’s publication speed to call it C++20; it’s C++20 because WG21 completed technical work in February. If for some reason ISO needs until January to get it out the door and assigns it a 2021 publication date, the standard will still be referred to as C++20. That is already its industry name, and 300,000+ search hits can’t be (retroactively made) wrong!

ISO C++ meetings are virtual until further notice, Kona postponed

A month ago, I notified the committee that our face-to-face meetings will be postponed until further notice. We still need to plan for face-to-face meetings so that we’re ready to resume when that’s possible and safe, but for now all currently planned meetings should be viewed as “tentative.”

Among other constraints such as national and corporate travel restrictions, we are subject to face-to-face meeting bans from several parent organizations. Two of those extended or enacted face-to-face meeting bans this week, on Tuesday September 1:

  • INCITS, the U.S. standards body, extended its face-to-face meeting ban through March 31, 2021. This means that our Kona meeting planned for February is now formally postponed to an unspecified future date.
  • ISO SC22, our corner of the international organization for standardization that handles programming languages, resolved to ban face-to-face meetings of more than 100 people until further notice. Since our meetings lately have regularly seen over 200 attendees, we’re currently evaluating how this affects future post-Kona tentative meeting plans.

All of these bans are subject to further extension, and we won’t meet in person again until it’s safe to do so. As of this writing, our next tentative face-to-face meeting would be the rescheduled Varna meeting, in the first week of June 2021, but that should be viewed as the earliest possible resumption of meetings. As the pandemic develops and INCITS and ISO meeting bans and other restrictions are extended, it’s certainly possible that we may not be able to meet again in 2021 at all. We’ll see.

In the meantime, though, we’re still making progress on our work: For several years, we have already been holding regular virtual meetings for some of our subgroups, including study groups (SGs) and CWG and LWG (language and library specification wording). Since the pandemic started, EWG and LEWG (language and library evolution, our primary design subgroups) have also begun meeting virtually, and we are continuing to adjust our process for how to approve design changes to progress proposals while not meeting in person. And starting in November, we will begin having virtual plenary (whole-group) meetings to formally approve changes, including potentially new features, to the C++23 working paper…

C++23 schedule and priorities

The C++23 schedule (P1000R4) and C++23 priorities (P0592R4) are unaffected by the pandemic. You may find this surprising, but that’s because the committee is on a “train model” that focuses on schedule and priorities for each release, instead of a specific feature set. One of the benefits of the train model is that it is very resilient, and can handle even major disruptions without change. We have already been in the mode of working on features all the time, including long-pole features that take many years, and each regular release train includes “whatever’s ready” with the next train opening up as soon as the previous one ships. So, that is unchanged.

What has changed, of course, is the speed at which we can work on features during the coming period. The pandemic disruptions have impacted all our lives, and reduced the time and energy WG21 participants have for standards work as well as our capacity to make progress face to face three times a year, and this has slowed down development of features we’re working on now that will land in { C++23, C++26, C++29 } . No virtual process will fully compensate for the lack of intense week-long face-to-face meetings, but as usual we’ll continue to make progress on baking features according to the P0592R4 priorities, including issue resolutions and an emphasis on completing C++20, and as usual we’ll load each feature into the currently loading train as the feature becomes ready. So progress continues, and the trains will continue to run on time to ship everything that’s ready.

Of course, the ISO C++ committee isn’t the only part of the C++ world that has “gone virtual” this year. We’ve been enjoying many virtual conferences, and just a week from now we’ll start the biggest C++ conference of the year: CppCon 2020, all online. I look forward to seeing many of you there, including literally seeing you at the video chat tables and in my AMA Q&A session early in the week, and the Committee Fireside Chat panel on Tuesday.

Thanks for your interest in C++ and C++ standardization! Be safe, everyone.

C++ on Sea video posted: Bridge to NewThingia (extended)

Two weeks ago, I had the privilege of speaking at the C++ on Sea 2020 virtual conference. The video of my talk has now been posted — it’s an extended version of the talk I gave at DevAroundTheSun in April. You can find it here:

Thanks very much to Phil Nash and all the other organizers and volunteers who made the virtual event run so smoothly! And thanks also to the post-processing team who did all the videos, and who smoothly fixed my video hiccup in the middle when I lost my link and had to reconnect to resume (entirely my pilot error).

A software note: I really enjoyed the Remo platform that C++ on Sea used this year. It allowed for lots of attendee interaction that’s surprisingly like mingling at a physical conference… you are in what looks and feels like a room with lots of tables, just like at a conference, and can easily talk face to face (video and audio) with all the folks who are at your table, and move from table to table to mix and mingle. It really felt seamless. I enjoyed spending time talking with some of you there that way — it was great to meet new people as well as see some familiar faces, and I look forward to doing it again soon this September because the current plan is for the also-all-virtual CppCon 2020 to use the same software. So, I hope to literally see many of you then! In the meantime, I hope you enjoy this talk.

AMA @cpp_russia on July 2

C++ Russia is an online event this year, and I’m happy to be one of many C++ folks invited to participate. On July 2 I’ll be doing a Q&A session, which is the first time I’m doing an “AMA”: no talk, just Q&A and discussion. I’m looking forward to it, and to seeing what kinds of questions are on people’s minds.

And don’t miss Bjarne Stroustrup’s similar AMA-style session at the same event, two days earlier on June 30!

Talk video available: Bridge to NewThingia @DevAroundTheSun

Last month, I gave a new talk “Bridge to NewThingia” at DevAroundTheSun. Using examples from the evolution of programming languages and a few other tech products, it analyzes some key design factors that let you confidently answer the question, “why will your NewThing succeed, when a lot of things that look like it have failed in the past?”

The talk video is now available on YouTube, linked below. I hope you enjoy it, and thanks again to the organizers of DevAroundTheSun for putting together this global online event.

The New York ISO C++ meeting is postponed

A few minutes ago, I announced to the ISO C++ committee that all our meetings originally planned for 2020 have been postponed. We had already postponed the Varna meeting originally planned for June 1-6, and earlier today INCITS (the U.S. national body) announced that it was banning all face-to-face standards meetings for the rest of the year, so we are also postponing the New York meeting previously planned for November 9-14. We deeply appreciate all the work the Varna and New York hosts have put into planning to have us in town, and we still look forward to meeting there as soon as it is safe to do so.

The meeting after that is currently still planned in Kona for February 22-27, but that too is now under review as we monitor developments including any possible further ISO or INCITS meeting ban extensions. We will not meet unless we are permitted to do so and can be confident we can meet safely.

It still feels surreal that in just one week from now we would all have been traveling to Bulgaria for the first meeting of C++23. But that was in a different world, and reality and facts matter. In the meantime, the ISO C++ committee and the C++ community have been moving online, with more ISO subgroup meetings already working on C++23 features and more C++ community events taking place in wonderful locations called “Zoom” and “YouTube Live,” instead of in Varna and New York, at least for now. Thank you again to everyone who is organizing those meetings and events for us, including the recent Pure Virtual C++ online event and the upcoming C++ Europe and C++ Russia and C++ on Sea online conferences, and still more to come!

Stay safe, everyone. We hope that you and your loved ones are holding up well and that we’ll be able to see each other again soon.

Of feedback, and little things

I try hard to always ask for feedback on drafts of my talks and articles, and I always learn important things from the responses, including especially things I omitted but should include so as to pre-answer audience questions. Just like the best support call is the one the customer doesn’t have to make because they didn’t have a problem, the best post-talk question is the one the audience doesn’t have to ask because the answer was clear in the talk.

Here’s a very small example from the talk I gave last week… just a “little thing,” but like many little things one that could derail people’s minds and be a distraction from the talk’s intended message.

When we did the tech check a few days before the talk, I displayed my opening slide, which was…

When it came up on the stream, I heard the techs say “great, we can see it fine, Bridge to Newt… Hinge… Ya…” and they stumbled over the last word. Only then did I realize that the cute name I was using that was so clear to me (and kind of central to the talk’s message) was not clear at all — in fact, instead of setting the stage for the message, it created a “what does that mean?” question that distracted from the message.

(Aside: I think it’s pretty funny that the bug appeared to be a version of the “max munch” rule, but with the English language — it seems that their eyes scanned the word and found that the first four letters matched an English word, “newt,” and then they took that and scanned onward for the next part but their mental tokenizer was already derailed.)

So I updated the slide to try to preempt the problem, by capitalizing one letter to provide a visual cue about the intended word end (and doing some minor visual rebalancing so it fit):

I also tried variations like “NewThing-ia” but the extra punctuation seemed unnecessary and felt a bit stilted. It felt like just capitalizing the T was enough… and as far as I know this eliminated the problem, though admittedly I didn’t re-test the update on a new person before the talk. :)

It was just a little thing, but it’s an example of how little things can be important. I suppose it’s also an example of “naming is hard” and of “names matter.”

I appreciated the techs’ implicit feedback that helped me debug my title and pre-eliminate a “what does this mean” question.

Speaking at DevAroundTheSun

Next week, I’m honored to be part of DevAroundTheSun, a live 24-hour global event for COVID-19 relief that starts on May 12 at 12:00 UTC. It’s like LiveAid or Lady Gaga’s recent One World: #TogetherAtHome, but for developers. You can watch on Twitch and YouTube, and all the talks are relatively short at 25 minutes… think TED talk style and length, and hopefully similar quality in content even though the production is a little different since we have to present our talks from home. Some of the speakers may be familiar to you, such as Bjarne Stroustrup giving an updated version of his actual TED talk, and Kevlin Henney who is always entertaining and enlightening. Other speakers may be new to you and me, but are all excellent presenters with interesting topics.

The event starts at 12:00 UTC May 12, and I’m scheduled to be on in the second hour (13:20 UTC), which will be right around dawn for me here in Seattle. I’ll be giving a new talk with all-new material about how to get from OldThing to your shiny NewThing… and in this case “all-new material” means I have to finish writing it this weekend. Here’s the title slide and abstract:

Great new ideas are common, and their developers and users want them to succeed. So why do the large majority fail to ever achieve wide adoption? For example, thousands of genuinely interesting programming languages have been invented, but few ever achieved widespread or long-lasting use. This talk considers examples, and identifies some basic (but frequently-violated and hard-to-make-retroactive) strategic success predictors, that you can use to evaluate someone else’s new idea and to “design for success” in your own new project or language.

I hope you enjoy all parts of the event you’re able to watch, and I’ll post a link to the video recording when it’s available.

When the hot loop’s done: Four quick tips for high throughput

These short tips are useful to remember when writing high-throughput code. You may already know most of them, and if so then please spread the word — friends don’t let friends write performance bottlenecks.

In a high-throughput hot loop:

  • Avoid holding locks or other resources, unless you know it won’t block another performance-sensitive thread. Definitely don’t acquire any new locks or other resources! (While you’re at it, avoid closing resources too… sure, that .Close() may claim to be nonblocking, but can you really ever be sure?)
  • Keep all such blocking operations outside performance-critical sections, so in those sections you can run without yielding. That includes watching out for the OS scheduler: Start that section as soon as possible after an OS context switch so that it’s as close as possible to a fresh OS thread quantum, and make sure it can finish before the OS gets tired of waiting and does the next context switch.
  • Don’t do I/O in hot loops. No! Don’t! Not even intermittently, which can cause gaps in your throughput. Crikey, I sometimes still see people doing console logging inside a critical loop. In 2020.
  • I/O is still okay, just defer it: Inside the hot loop, buffer up any such side effects into something like a non-allocating lock-free ring buffer in memory, or similar guaranteed-nonblocking guaranteed-throughput service. Then, after the hot loop ends, send all the buffered work out — and then you can also do all your other deferred locking, resource closing, and other blocking work, as much as you want. It’s all good when you can afford to wait.
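The "defer it" tip can be sketched roughly as follows. This is a minimal single-threaded illustration, not a production design: the names EventLog, record, flush, hot_loop, and run_and_report are made up for this example, and a real system that shares the buffer with another thread would want a proper lock-free ring buffer instead.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iostream>

// Hypothetical fixed-capacity log: record() does no allocation, no locking,
// and no I/O. When full, it overwrites the oldest entry (ring-buffer style).
template <std::size_t N>
class EventLog {
    static_assert(N > 0, "capacity must be positive");
    std::array<std::uint64_t, N> slots_{};
    std::size_t next_ = 0;   // total number of events ever recorded
public:
    void record(std::uint64_t value) noexcept {
        slots_[next_ % N] = value;
        ++next_;
    }
    std::size_t size() const noexcept { return next_ < N ? next_ : N; }

    // The blocking work (stream I/O) happens here, after the hot loop.
    void flush(std::ostream& out) const {
        const std::size_t first = next_ < N ? 0 : next_ - N;
        for (std::size_t i = first; i < next_; ++i)
            out << slots_[i % N] << '\n';
    }
};

// The hot loop only buffers its side effects; it never prints.
std::uint64_t hot_loop(EventLog<1024>& log, std::uint64_t iterations) {
    std::uint64_t sum = 0;
    for (std::uint64_t i = 0; i < iterations; ++i) {
        sum += i * i;
        if (i % 100 == 0) log.record(sum);   // buffered, not written out
    }
    return sum;
}

// Usage: run the hot loop first, then do the deferred I/O.
std::uint64_t run_and_report(std::ostream& out) {
    EventLog<1024> log;
    const std::uint64_t result = hot_loop(log, 1000);
    assert(log.size() == 10);   // samples taken at i = 0, 100, ..., 900
    log.flush(out);
    out << "result: " << result << '\n';
    return result;
}
```

The design choice is that everything inside hot_loop is plain arithmetic and array writes, so nothing in it can block; all the stream output is concentrated in flush, where waiting is affordable.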

Those are just a few tips, but they’re so popular that you may remember learning them as a rhyming song as a child. For high throughput, if you’re going to play the game, boy, you’ve got to learn to play it right:

You got to know when to hold ’em
Know when to close ’em
Know when to block and wait
Know when to run
You never cout your message
When your throughput should be stable
There’ll be time enough for cout-ing
When the hot loop’s done

But seriously: Thank you, Kenny. We miss you.

The Varna ISO C++ meeting is postponed

Yesterday morning, I announced to the committee that the next ISO WG21 (C++) meeting originally planned for June 1-6 in Varna, Bulgaria, has been postponed due to the current health situation.

We appreciate very much all the hard work and expense that our hosts, VMware and Chaos Group, have invested in welcoming us to their beautiful country for our first Bulgarian meeting! And we haven’t given up: We still hope to meet in Varna as soon as possible, but it will be sometime after 2020. In the meantime, our current plan is to continue with our existing meeting schedule, and hold our next face-to-face ISO C++ meeting in New York City this November, subject of course to global developments as all of us throughout the world deal with this emergency together.

We all love C++, but our first priority is the health and safety of all of our experts and observers, all of our families, and of the global community. The ISO C++ committee leadership (about 60 current and former officers and chairs) had already been monitoring the situation closely and making fallback plans. Then the decision was made for us yesterday morning when, shortly after WHO declared a pandemic, ISO issued a global ban on all face-to-face standards meetings through at least June 30.


Interestingly, the ISO Secretary-General’s announcement included the note that approximately 275 ISO meetings scheduled to happen between 1 Feb and 30 June had already been cancelled, which gives a glimpse into the scope and scale of ISO standards work.

In the meantime, progress will continue. Many WG21 subgroups have already been having regular online meetings between our full face-to-face meetings, and we will expand that and have more subgroups making progress using ISO’s online meeting facilities throughout the spring and summer. But above all, please everyone be safe, take this crisis seriously, and follow the instructions on social distancing and isolation — by doing that we will all help ourselves and our communities. Of course this will still get worse before it gets better, but we are not helpless: The actions that we as individuals and communities are taking right now are going to make a major difference in reducing the damage done, because most of the potential damage is still ahead of us where we can influence it.

It’s hard to know yet how soon it will be before our conferences and standards meetings will resume as usual, because the situation is still changing rapidly. But we’ll keep watching, we’ll keep staying safe not just for our own sake but also to minimize danger to others, and after it’s over I look forward to seeing many of you again, face to face, at our regular events in the hopefully-near future.

References, simply

References are for parameter passing, including range-for. Sometimes they’re useful as local variables, but pointers or structured bindings are usually better. Any other use of references typically leads to endless design debates. This post is an attempt to shed light on this situation, and perhaps reduce some of the time spent on unresolved ongoing design debates in the C++ community. Thank you to the following for their feedback on drafts of this material: Howard Hinnant, Arthur O’Dwyer, Richard Smith, Bjarne Stroustrup, Ville Voutilainen.

Edited to add: mention the core language specification complexity, and that the list of examples is not exhaustive and other examples fall into the same categories as listed examples.


References

What references are and how to use them

In C++, a C& or C&& reference is an indirect way to refer to an existing object. Every reference has a dual nature: It’s implemented under the covers as a pointer, but semantically it usually behaves like an alias because most uses of its name automatically dereference it. (Other details are not covered here, including the usual parameter passing rules and that C&& has a different meaning depending on whether C is a concrete type or a template parameter type.)

C++ references were invented to be used as function parameter/return types, and that’s what they’re still primarily useful for. Since C++11, that includes the range-for loop which conceptually works like a function call (see Q&A).

Sometimes, a reference can also be useful as a local variable, though in modern C++ a pointer or structured binding is usually better (see Q&A).

That’s it. All other uses of references should be avoided.


Advanced note for experts

Please see the Q&A for more, including const& lifetime extension, pair<T&, U&>, optional<T&>, and other cases. Note that the examples explicitly listed below are not intended to be exhaustive; other examples (e.g., tie, reference_wrapper) fall under one or more of the cases listed below.





Appendix: Q&A

Historical question: Can you elaborate a little more on why references were invented for function parameter/return types?

Here is a summary, but for more detail please see The Design and Evolution of C++ (D&E) section 3.7, which begins: “References were introduced primarily to support operator overloading…”

In C, to pass/return objects to/from functions you have two choices: either pass/return a copy, or take their address and pass/return a pointer which lets you refer to an existing object.

Neither is desirable for overloaded operators. There are two motivating use cases, both described in D&E:

  • The primary use case is that we want to pass an existing object to an operator without copying it. Passing by reference lets calling code write just a - b, which is natural and consistent with built-in types’ operators. If we had to write &a - &b to pass by pointer, that would be (very) inconvenient, inconsistent with how we use the built-in operators, and a conflict when that operator already has a different meaning for raw pointers as it does in this example.
  • Secondarily, we want to return an existing object without copying it, especially from operators like unary * and []. Returning by reference lets calling code write str[0] = 'a'; which is natural and consistent with built-in arrays and operators. If we had to write *str[0] = 'a'; to return by pointer, that would be (slightly) inconvenient and also inconsistent with built-in operators, but not the end of the world and so this one is only a secondary motivating case.

Those are the only uses of references discussed in D&E, including in the section on smart references and operator., and the only places where references are really needed still today.
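To make the two motivating cases concrete, here is a small sketch. The types Money and Text and the function operators_demo are hypothetical and exist only for this illustration; they are not taken from D&E.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical value type used only for illustration.
struct Money {
    long cents;
};

// Primary case: operands are passed by reference, so nothing is copied
// and callers still write the natural expression a - b.
Money operator-(const Money& a, const Money& b) {
    return Money{a.cents - b.cents};
}

// Hypothetical minimal string wrapper.
class Text {
    std::string data_;
public:
    explicit Text(std::string s) : data_(std::move(s)) {}

    // Secondary case: returning a reference lets callers assign through
    // the result, as in t[0] = 'b';
    char& operator[](std::size_t i) { return data_[i]; }
    const char& operator[](std::size_t i) const { return data_[i]; }

    const std::string& str() const { return data_; }
};

void operators_demo() {
    Money a{500}, b{125};
    assert((a - b).cents == 375);   // reads like built-in subtraction

    Text t("cat");
    t[0] = 'b';                     // writes into t's own storage
    assert(t.str() == "bat");
}
```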

What was that about range-for being like a function call?

The C++11 range-for loop is semantically like function parameter passing: We pass a range to the loop which takes it as if by an auto&& parameter, and then the loop passes each element in turn to each loop iteration and the loop body takes the element in the way it declares the loop element variable. For example, this loop body takes its element parameter by const auto&:

// Using range-for: The loop variable is a parameter to
// the loop body, which is called once per loop iteration
for (const auto& x : rng) { ... }

If we were instead using the std::for_each algorithm with the loop body in a lambda, the parameter passing is more obvious: for_each takes the range via an iterator pair of parameters, and then calls the loop body lambda passing each element as an argument to the lambda’s parameter:

// Using std::for_each: Basically equivalent
for_each (begin(rng), end(rng), [&](const auto& x) { ... });

Is a reference a pointer to an object, or an alternate name for the object?

Yes — it is either or both, depending on what you’re doing at the moment.

This dual nature is the core problem of trying to use a reference as a general concept: Sometimes the language treats a reference as a pointer (one level of indirection), and sometimes it treats it as an alias for the referenced object (no level of indirection, as if it were an implicitly dereferenced pointer), but those are not the same thing and references make those things visually ambiguous.

When passing/returning an object by reference, this isn’t a problem because we know we’re always passing by pointer under the covers and when we use the name we’re always referring to the existing object by alias. That’s clear, and references are well designed for use as function parameter/return types.

But when trying to use references elsewhere in the language, we have to know which aspect (and level of indirection) we’re dealing with at any given time, which leads to confusion and woe. References have never been a good fit for non-parameter/return uses. And that is doubly sad, because supporting “reference types” throughout the language complicates the C++ core language specification, while simultaneously forcing us to keep teaching why to avoid using nearly all of that generalized language support.
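A short sketch of the alias side of that dual nature, using a local reference (the function name dual_nature_demo is just for this example). The point is that every use of the name r silently goes through to the referenced object, including assignment:

```cpp
#include <cassert>

int dual_nature_demo() {
    int a = 1;
    int b = 2;
    int& r = a;      // r is bound to a, once and for all

    // Alias behavior: operations on the name r apply to a itself.
    assert(&r == &a);                 // taking r's address yields a's address
    assert(sizeof(r) == sizeof(a));   // sizeof(r) is the size of the int

    r = b;           // not a rebind: copies b's value through r into a
    assert(a == 2);
    assert(&r == &a);

    b = 3;           // r still refers to a, so changing b does not affect r
    assert(r == 2);
    return r;
}
```

Someone reading r = b; has to already know that r is a reference to see that this line writes to a; with a pointer the same operation would be spelled *p = b; and the indirection would be visible.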

Aren’t local references useful because of lifetime extension?

We “made it useful” as an irregular extension, but that’s brittle and now basically unnecessary as of C++17.

A brief history of lifetime extension: After references were first added in the 1980s, C++ later added a special case where binding a temporary object to a local variable of type const& and still later auto&& (but not generally other kinds of local references) was “made useful” by imbuing only those references with the special power of extending the lifetime of a temporary object, just because we could (and because there were use cases where it was important for performance, before C++17 guaranteed copy elision). However, these cases have always been:

  • brittle and inconsistent (e.g., const T& t = f(); and const T& t = f().x; and struct X { const T& r; } x = { f() }; extend the lifetime of an object returned by value from f(), but const T& t = f().g(); does not);
  • irregular (e.g., T& t = f(); is ill-formed, whereas const T& t = f(); and T t = f(); still uniformly work); and
  • unnecessary now that C++17 has guaranteed copy elision (e.g., just write T t = f(); and the meaning is both obvious and correct, as well as way easier to teach and learn and use).
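Here is a minimal sketch of the last point, assuming a C++17 compiler. Tracer, make, and lifetime_demo are hypothetical names invented for this example; Tracer just counts live objects and copy/move operations so the two forms can be compared.

```cpp
#include <cassert>

// Hypothetical tracer type that counts live objects and copies/moves.
struct Tracer {
    static int alive;
    static int copies;
    Tracer() { ++alive; }
    Tracer(const Tracer&) { ++alive; ++copies; }
    Tracer(Tracer&&) noexcept { ++alive; ++copies; }
    ~Tracer() { --alive; }
};
int Tracer::alive = 0;
int Tracer::copies = 0;

Tracer make() { return Tracer{}; }   // returns by value

void lifetime_demo() {
    {
        const Tracer& t = make();   // temporary's lifetime is extended to t's scope
        (void)t;
        assert(Tracer::alive == 1);
    }
    assert(Tracer::alive == 0);

    {
        Tracer t = make();          // C++17: initialized in place, no copy or move
        (void)t;
        assert(Tracer::alive == 1);
    }
    assert(Tracer::alive == 0);
    assert(Tracer::copies == 0);    // neither form copied or moved anything
}
```

Both forms keep exactly one object alive for the scope and perform no copies, so the plain value declaration gives the same result without relying on the special lifetime-extension rule.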

Aren’t local references useful to get meaningful names for parts of an object returned from a function?

Yes, but since C++17 structured bindings are strictly better.

For example, given a set<int> s and calling an insert function that returns a pair<iterator, bool>, just accessing the members of the pair directly means putting up with hard-to-read code:

// accessing the members of a pair directly (unmeaningful names)
auto value = s.insert(4);
if (value.second) {
    do_something_with(value.first);
}

Structured bindings let us directly name the members. Note that this just invents names for them; it does not create any actual pointer indirection:

// using structured bindings (easy to use meaningful names)
auto [position, succeeded] = s.insert(4);
if (succeeded) {
    do_something_with(position);
}

In the olden days before structured bindings, some people liked to use references to indirectly name the members, which like the above gives them readable names, but unlike the above does create new pointer-equivalent indirect variables and follows those pointers, which can incur a little space and time overhead (and also isn’t as readable)…

// using references (cumbersome, don't do this anymore)
auto value      = s.insert(4);
auto& position  = value.first;          // equivalent to pointers
auto& succeeded = value.second;
if (succeeded) {                        // invisible dereference
    do_something_with(position);        // invisible dereference
}

// or using pointers (ditto)
auto value     = s.insert(4);
auto position  = &value.first;          // self-documenting pointers
auto succeeded = &value.second;
if (*succeeded) {                       // visible dereference
    do_something_with(*position);       // visible dereference
}

… but even in the olden days, references were never significantly better than using pointers since the code is basically identical either way. Today, prefer structured bindings.

Aren’t local references useful to express aliases, for example to a member of an array or container?

Yes, though pointers can do it equivalently; it’s a style choice.

For example, this local reference is useful:

auto& r = a[f(i)];
// ... then use r repeatedly ...

Or you can equivalently use a pointer:

auto p = &a[f(i)];
// ... then use *p repeatedly ...

Isn’t T& convenient for easily expressing a pointer that can’t be rebound to another object?

Yes, though T* const does equally well.

Either is mainly useful as a local variable. (See also previous answer.)
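As a small side-by-side sketch (the function name const_pointer_demo is made up for this example), both declarations below fix what they refer to for their whole lifetime:

```cpp
#include <cassert>

int const_pointer_demo() {
    int x = 1;
    int y = 2;

    int& r = x;          // cannot be rebound
    int* const p = &x;   // cannot be rebound; the syntax shows the indirection

    r = 10;              // writes x through the reference
    *p += 5;             // writes x through the pointer
    // p = &y;           // would not compile: p itself is const
    (void)y;

    assert(&r == p);     // both still designate x
    return x;            // 15
}
```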

Isn’t T& convenient for easily expressing a pointer that is not null?

Not exactly — T& lets you express a pointer that’s not-null and that can’t be rebound.

You can also express not-null by using gsl::not_null<> (see for example the Microsoft GSL implementation), and one advantage of doing it this way is that it also lets you independently specify whether the pointer can be rebound or not — if you want it not to be rebindable, just add const as usual.

What about lambda [&] capture?

[&] is the right default for a lambda that’s passed to a function that will just use it and then return (aka structured lifetime) without storing it someplace where it will outlive the function call. Those structured uses fall under the umbrella of using references as parameter/return types. For non-parameter/return uses, prefer using pointers.
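For instance, here is a minimal sketch of a structured use, with a hypothetical helper named sum_of_evens. The lambda is handed to std::for_each, called only during that call, and never stored anywhere that outlives the local variable it captures:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

int sum_of_evens(const std::vector<int>& values) {
    int total = 0;
    // [&] captures total by reference. That is safe here because the lambda
    // is only invoked while for_each runs, and we do not keep it afterwards.
    std::for_each(values.begin(), values.end(), [&](int v) {
        if (v % 2 == 0) total += v;
    });
    return total;
}
```

Calling sum_of_evens({1, 2, 3, 4}) adds 2 and 4 into total through the captured reference.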

What about pair<T&, U&> and tuple<T&, U&> and struct { T& t; U& u; }?

I’ve mainly seen these come up as parameter and return types, where for the struct case the most common motivation is that C++ doesn’t (yet) support multiple return values, or as handwritten equivalents of what lambda [&] capture does. For those uses, they fall under the umbrella of using references as parameter/return types. For non-parameter/return uses, prefer using pointers.

[GENERAL UMBRELLA QUESTION] But what about using a reference for ((other use not as a parameter or return type or local variable))?

Don’t. WOPR said it best, describing something like the game of trying to answer this class of question: “A strange game. The only winning move is not to play.”

Don’t let yourself be baited into even trying to answer this kind of question. For example, if you’re writing a class template, just assume (or document) that it can’t be instantiated with reference types. The question itself is a will o’ the wisp, and to even try to answer it is to enter a swamp, because there won’t be a general reasonable answer.

(Disclaimer: You, dear reader, may at this very moment be thinking of an ((other use)) for which you think you have a reasonable and correct answer. Whatever it is, it’s virtually certain that a significant fraction of other experts are at this very moment reading this and thinking of that ((other use)) with a different answer, and that you can each present technical arguments why the other is wrong. See optional<T&> below.)

All of the remaining questions are specific cases of this general umbrella question, and so have the same answer…

… But what about using a reference type as a class data member?

For the specific case of pair<T&, U&> and tuple<T&, U&> and struct { T& t; U& u; }, see the earlier answer regarding those. Otherwise:

Don’t, see previous. People keep trying this, and we keep having to teach them not to try because it makes classes work in weird and/or unintended ways.

Pop quiz: Is struct X { int& i; }; copyable? If not, why not? If so, what does it do?

Basic answer: X is not copy assignable, because i cannot be modified to point at something else. But X is copy constructible, where i behaves just as if it were a pointer.

Better answer: X behaves the same as if the member were int* const i; — so why not just write that if that’s what’s wanted? Writing a pointer is arguably simpler and clearer.
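The pop quiz answer can be checked with the standard type traits. In this sketch, Y is a hypothetical counterpart to X that spells the member as a const pointer, and copy_demo is a made-up name:

```cpp
#include <cassert>
#include <type_traits>

struct X { int& i; };         // reference member
struct Y { int* const i; };   // const pointer member

// Both types can be copy constructed, and neither can be copy assigned.
static_assert(std::is_copy_constructible<X>::value, "X can be copy constructed");
static_assert(!std::is_copy_assignable<X>::value, "X cannot be copy assigned");
static_assert(std::is_copy_constructible<Y>::value, "Y can be copy constructed");
static_assert(!std::is_copy_assignable<Y>::value, "Y cannot be copy assigned");

int copy_demo() {
    int n = 1;

    X a{n};
    X b = a;      // copy construction: b.i refers to the same int as a.i
    b.i = 42;     // writes n through b.i
    assert(&a.i == &n && &b.i == &n);

    Y c{&n};
    Y d = c;      // copy construction: d.i points to the same int as c.i
    assert(d.i == &n);

    return n;     // 42
}
```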

… But what about using a reference type as an explicit template argument?

Don’t, see above. Don’t be drawn into trying to answer when this could be valid or useful.

Explicitly jamming a reference type into a template that didn’t deduce it and isn’t expecting it, such as calling std::some_algorithm<std::vector<int>::iterator&>(vec.begin(), vec.end());, will be either very confusing or a compile-time error (or both, a very confusing compile-time error — try std::sort).
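
Even when it compiles, an explicit reference argument can quietly change what a template does. Here is a minimal sketch with a hypothetical function template, bumped, that was written with value semantics in mind:

```cpp
#include <cassert>

// Hypothetical function template: takes a copy, increments it, returns it.
template <class T>
T bumped(T value) {
    ++value;
    return value;
}

inline void demo_explicit_reference_argument() {
    int x = 1;

    int r1 = bumped(x);         // T deduced as int: works on a copy
    assert(r1 == 2 && x == 1);  // x is untouched

    int r2 = bumped<int&>(x);   // T forced to int&: works on x itself
    assert(r2 == 2 && x == 2);  // x was modified behind the caller's back
}
```

The body of bumped is unchanged, but with T = int& the parameter and the return type both become references, so the same call now mutates the caller's variable.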

… But what about using a reference type for a class template specialization?

Don’t, see above. Don’t be drawn into trying to answer when this could be valid or useful.

… But wait, not even optional<T&>?

Don’t, see above. Especially not this one.

An astonishing amount of ink has been spilled on this particular question for years, and it’s not slowing down: the pre-Prague mailing had yet another paper proposing an optional<T&> as one alternative, and we’ve had multiple Reddit’d posts about it in the past few weeks (example, example). Those posts are what prompted me to write this post, expanding on private email I wrote to one of the authors.

Merely knowing that the discussion has continued for so many years with no consensus is a big red flag that the question itself is flawed. And if you’re reading this and think you have an answer, ask yourself whether in your answer optional<T&> really IS-AN optional<T>. Template specializations should be substitutable for the primary template (ask vector<bool>), and the proposed answers I’ve seen for optional<T&> are not substitutable semantically: you can’t write generic code that uses an optional<T> and works for that optional<T&>. Some of them go so far as actually removing common functions that are available on optional<T>, which clearly isn’t substitutable.

There’s a simple way to cut this Gordian knot: Simply knowing that references are for parameter/return types will warn us away from even trying to answer “what should optional<T&> do?” as a design trap, and we won’t fall into it. Don’t let yourself be baited into trying to play the game of answering what it should mean. “The only winning move is not to play.”

Use optional<T> for values, and optional<T*> or optional<not_null<T*>> for pointers.
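
As a rough sketch of what the pointer spelling looks like in practice, the function below uses std::optional<int*>. The function name is invented for this illustration, and optional<not_null<T*>> is not shown because not_null comes from the GSL rather than the standard library.

```cpp
#include <cassert>
#include <optional>

// Writes 10 to a through the optional, re-points it at b, writes 20 to b,
// then empties it. Returns whether the optional is still engaged at the end.
inline bool pointer_optional_demo(int& a, int& b) {
    std::optional<int*> opt = &a;   // engaged, pointing at a

    **opt = 10;                     // assign through: a becomes 10
    opt = &b;                       // re-point: opt now refers to b, a is untouched
    assert(*opt == &b);
    **opt = 20;                     // assign through: b becomes 20

    opt.reset();                    // "no pointer at all"
    return opt.has_value();
}
```

Changing which object is referred to and changing the referred-to object are spelled differently here (opt = &b versus **opt = 20), so a reader does not have to guess which one an assignment means.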

Epilogue: But wait, what about ((idea for optional<T&>))?

If after all the foregoing you still believe you have a clear answer to what optional<T&> can mean that:

  • is still semantically IS-A substitutable for the optional<> primary template (e.g., generic code can still use it as a more general optional);
  • cannot be represented about equally well by optional<not_null<T*>>; and
  • does not already have published technical arguments against it showing problems with the approach;

then please feel free to post a link below to a paper that describes that answer in detail.

Fair warning, though: Even while reviewing this article, a world-class expert reviewer responded regarding experience with one of the world’s most popular versions of optional<T&>:

“I know that Boost has optional<T&> so I tried it for my use case … ((code example)) is a run-time error for me. I expected ((a different behavior)) and it did not. I suspect the mistake is in the ambiguity: Does assigning an optional<T&> assign through the reference, or rebind the reference?”

My answer: Exactly, the dual nature of references is always the problem.

  • If the design embraces the pointer-ness of references (one level of indirection), then one set of use cases works and people with alias-like use cases get surprised.
  • If the design embraces the alias-ness of references (no indirection), then the other set of use cases works and people with pointer-like use cases get surprised.
  • If the design mixes them, then a variety of people get surprised in creative ways.
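
The two meanings of assignment already exist side by side in the standard library, which may help make the bullets above concrete. In this sketch (the function names are made up for the illustration), a plain reference behaves alias-like on assignment, while std::reference_wrapper behaves pointer-like:

```cpp
#include <cassert>
#include <functional>

// Alias-like: assigning to a plain reference copies the value into a.
inline void assign_plain_reference(int& a, int& b) {
    int& r = a;
    r = b;                      // a now holds b's value; r still refers to a
    assert(&r == &a);
}

// Pointer-like: assigning to a reference_wrapper rebinds it to b.
inline void assign_reference_wrapper(int& a, int& b) {
    auto r = std::ref(a);       // std::reference_wrapper<int> referring to a
    r = std::ref(b);            // rebinds r to b; a is untouched
    assert(&r.get() == &b);
    r.get() = 99;               // writes to b
}
```

Neither behavior is wrong, but code that expects one and gets the other is surprised in exactly the way described above.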

Java object references encounter similar problems — everything is implicitly a pointer, but there’s no clean way to syntactically distinguish the pointer vs. the pointee. Being able to talk separately about the pointer vs. the pointee is an important distinction, and an important and underestimated advantage of the Pointer-like things (e.g., raw pointers, iterators, ranges, views, spans) we have in C++.