GotW #100 Solution: Preconditions, Part 1 (Difficulty: 8/10)

This special Guru of the Week series focuses on contracts. We’ve seen how postconditions are directly related to assertions (see GotWs #97 and #99). So are preconditions, but in one important way they are fundamentally different. What is that? And why would having language support benefit us even more for writing preconditions than for the other two?

1. What is a precondition, and how is it related to an assertion?

A precondition is a “call site prerequisite on the inputs”: a condition that must be true at each call site before the caller can invoke this function. In math terms, it’s about expressing the domain of recognized inputs for the function. If preconditions don’t hold, the function can’t possibly do its work (achieve its postconditions), because the caller hasn’t given it a starting point it understands.

A precondition IS-AN assertion in every way described in GotW #97, with the special addition that whereas a general assertion is always checked where it is written, a precondition is written on the function and conceptually checked at every call site. (In less-ideal implementations, including if we write it as a library today, the precondition check might be in the function body; see Question 2.)

Explain your answer using the following example, which uses a variation of a proposed post-C++20 syntax for preconditions. [1]

// Example 1(a): A precondition along the lines proposed in [1]

void f( int min, int max )
    [[pre( min <= max )]]
{
    // ...
}

The above would be roughly equivalent to writing the test before the call at every call site instead. For example, for a call site that performs f(x, y), we want to check the precondition at this specific call site at least when it is being tested (and possibly earlier and/or later, see GotW #97 Question 4):

// Example 1(b): What a compiler might generate at a call site
//               “f(x, y)” for the precondition in Example 1(a)

assert( x <= y ); // implicitly injected assertion at this call site,
                  // checked (at least) when this call site is tested
f(x, y);

And, as we’ll see in Question 4, language support for preconditions should apply this rewrite recursively for subexpressions that are themselves function calls with preconditions.

GUIDELINE: Use a precondition to write “this is what a bug is” as code the caller can check. A precondition states in code the circumstances under which this function’s behavior is not documented.

2. Rewrite the example in Question 1 to show how to approximate the same effect using assertions in today’s C++.

Here’s one way we can do it, which extends the MY_POST technique from GotW #99 Example 2 to also support preconditions. Again, instead of MY_ you’d use your company’s preferred unique macro prefix: [2]

// Eliminate forward-boilerplate with a macro (written only once)
#define MY_PRE_POST(preconditions, postconditions)         \
    assert( preconditions );                               \
    auto post = [&](auto&& _return_) -> auto&& {           \
        assert( postconditions );                          \
        return std::forward<decltype(_return_)>(_return_); \
    };

And then the programmer can just write:

// Example 2: Sample precondition

void f( int min, int max )
{   MY_PRE_POST_V( min <= max, true ); // true == no postconditions here
    // ...
}

This has the big benefit that it works using today’s C++. It has the same advantages as MY_POST in GotW #99, including that it’s future-friendly… if we use the macro as shown above, then if in the future C++ has language support for preconditions and postconditions with a syntax like [1], migrating your code to that could be as simple as search-and-replace:

{ MY_PRE_POST( **, * )   →   [[pre: ** ]] [[post _return_: * ]] {

return post( * )   →   return *

GUIDELINE (extended from GotW #99): If you don’t already use a way to write preconditions and postconditions as code, consider trying something like MY_PRE_POST until language support is available. It’s legal C++ today, it’s not terrible, and it’s future-friendly to adopting future C++ language contracts.
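To see the whole pattern in one place, here is a minimal sketch that exercises MY_PRE_POST on a value-returning function. The function midpoint and its contract are invented for illustration; the macro is the one shown above, repeated so the sketch is self-contained:

```cpp
#include <cassert>
#include <utility>

// The MY_PRE_POST macro from above, repeated so this sketch
// is self-contained
#define MY_PRE_POST(preconditions, postconditions)         \
    assert( preconditions );                               \
    auto post = [&](auto&& _return_) -> auto&& {           \
        assert( postconditions );                          \
        return std::forward<decltype(_return_)>(_return_); \
    };

// Hypothetical function for illustration: both a precondition and a
// postcondition, each written once, with every return going through post
int midpoint( int min, int max )
{   MY_PRE_POST( min <= max, min <= _return_ && _return_ <= max );
    return post( min + (max - min) / 2 );
}
```

A call like midpoint(5, 1) violates the precondition and fires the first assert in the callee body, which is the library approximation of the call-site check a compiler could inject as in Example 1(b).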

But even if macros don’t trigger your fight-or-flight response, it’s still a far cry from language support …

Are there any drawbacks to your solution compared to having language support for preconditions?

Yes:

  • Callee-body checking only. This method can run the check only inside the function’s body. First, this means we can’t easily perform the check at each call site, which would be ideal, including so that we can turn the check on for one call site but not another when we are testing a specific caller. Second, for constructors it can’t run at the very beginning of construction, because member initialization happens before we enter the constructor body.
  • Doesn’t directly handle nested preconditions, meaning preconditions of functions invoked as part of the precondition itself. We’ll come to this in Question 4.

3. If a precondition fails, what does that indicate, and who is responsible for fixing the failure?

Each call site is responsible for making sure it meets all of a function’s preconditions before calling that function. If a precondition is false, it’s a bug in the calling code, and it’s the calling code author who is responsible for fixing it.

Explain how this makes a precondition fundamentally different from every other kind of contract.

A precondition is the only kind of contract you can write that someone else has to fulfill, and so if it’s ever false then it’s someone else’s fault — it’s the caller’s bug that they need to go fix.

GUIDELINE: Remember the fundamental way preconditions are unique… if they’re false, then it’s someone else’s fault (the calling code author). When you write any of the other contracts (assertions, function postconditions, class invariants), you state something that must be true about your own function or class, and if prior contracts were written and well tested then likely it’s your function or class that created the first unexpected state.

4. Consider this example, expanded from a suggestion by Gábor Horváth:

// Example 4(a): What are the implicit preconditions?

auto calc( std::vector<int> const&  x ,
           std::floating_point auto y ) -> double
    [[pre( x[0] <= std::sqrt(y) )]] ;

Note that std::floating_point is a C++20 concept.

a) What kinds of preconditions must a caller of calc satisfy that can’t generally be written as testable boolean expressions?

The language requires the number and types of arguments to match the parameter list. Here, calc must be called with two arguments. The first must be a std::vector<int> or something convertible to that. The second one’s type has to satisfy the floating_point concept (it must be float, double, or long double).

It’s worth remembering that these language-enforced rules are conceptually part of the function’s precondition, in the sense that they are requirements on call sites. Even though we generally can’t write testable boolean predicates for these to check that we didn’t write a bug, we also never need to do that because if we write a bug the code just won’t compile. [3] Code that is “correct by construction” doesn’t need to add assertions to find potential bugs.

GUIDELINE: Remember that a static type is a (non-boolean) precondition. It’s just enforced by language semantics with always-static checking (edit or compile time), and never needs to be tested using a boolean predicate whose test could be delayed until dynamic checking (test or run time).

COROLLARY: A function’s number, order, and types of parameters are all (non-boolean) parts of its precondition. This falls out of the “static type” statement because the function’s own static type includes those things. For example, the language won’t let us invoke this function with the argument lists () or (1,2,3,4,5) or (3.14, myvector). We’ll delve into this more deeply in GotW #101.

COROLLARY: All functions have preconditions. Even void f() { }, which takes no inputs at all including that it reads no global state, has the precondition that it must be passed zero arguments. The only counterexample I can think of is pathological: void f(...) { } can be invoked with any number of arguments but ignores them all.
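As a small illustration of “a static type is a precondition,” here is a hypothetical function half whose parameter type can only be “tested” at compile time, using the is_invocable trait mentioned in note [3] rather than a runtime assert:

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical function for illustration. Its parameter type is part
// of its precondition, enforced statically; we can probe that with a
// type trait, never with a boolean predicate checked at run time.
double half( double x ) { return x / 2; }

static_assert(  std::is_invocable_v<decltype(half), double> );      // ok to call
static_assert( !std::is_invocable_v<decltype(half), std::string> ); // such a call won't compile
```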

b) What kinds of boolean-testable preconditions are implicit within the explicitly written declaration of calc?

There are three possible kinds of implied boolean preconditions. All three are present in this example.

(1) Type invariants

Each object must meet the invariants of its type. This is subtly different from “the object’s type matches” (a static property) that we saw in 4(a), because this means additionally “the object’s value is not corrupt” (a dynamic property).

Here, this means x must obey the invariant of vector<int>, even though that invariant isn’t expressed in code in today’s C++. [4] For y this is fairly easy because all bit patterns are valid floating point values (more about NaNs in just a moment).

(2) Subexpression preconditions

The subexpression x[0] calls x.operator[] which has its own precondition, namely that the subscript be non-negative and less than x.size(). For 0, that’s true if x.size() > 0 is true, or equivalently !x.empty(), so that becomes an implicit part of our whole precondition.
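One way to see that this is a real precondition is to compare operator[] with its checked cousin at(). The helper first_element_at_most below is invented for illustration; by using x.at(0) it turns a violated subscript precondition into a thrown std::out_of_range instead of undefined behavior:

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical helper for illustration: x[0] would have the
// precondition !x.empty(); x.at(0) checks the same condition and
// throws std::out_of_range instead of invoking undefined behavior.
bool first_element_at_most( const std::vector<int>& x, int limit ) {
    return x.at(0) <= limit;
}
```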

(3) Subexpressions that make the whole precondition false

The subexpression std::sqrt(y) invokes C’s sqrt. The C standard says that unless y >= 0, the result of sqrt(y) is NaN (“not a number”), which means our precondition amounts to something <= NaN which is always false. Therefore, y >= 0 is effectively part of calc’s precondition too. [5]
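Here is a quick sketch of that effect. The helper le_sqrt is invented for illustration and models the comparison inside calc’s precondition; for negative y, std::sqrt(y) is NaN, and every ordered comparison against NaN is false:

```cpp
#include <cmath>

// Hypothetical helper for illustration: for y < 0, std::sqrt(y) is
// NaN, and "anything <= NaN" evaluates to false, so the condition
// quietly becomes false rather than undefined.
bool le_sqrt( double lhs, double y ) {
    return lhs <= std::sqrt(y);
}
```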

Putting it all together

If we were to write this all out, the full precondition would be something like this — and note that the order is important! Here we’ll ignore the parts that are enforced by the language, such as parameter arity, and focus on the parts that can be written as boolean expressions:

// Example 4(b): Trying to write the precondition out more explicitly
//               (NOT all are recommended, this is for exposition)

auto calc( std::vector<int> const&  x ,
           std::floating_point auto y ) -> double
    [[pre(

//  1. parameter type invariants:
           /* x is a valid object, but we can’t spell that, so: */ true

//  2. subexpression preconditions:
        && x.size() > 0  // so checking x[0] won’t be undefined (!)

//  3. subexpression values that make our precondition false:
        && y >= 0        // redundant with the expression below

// finally, our explicit precondition itself:
        && x[0] <= std::sqrt(y) 
    )]] ;

GUIDELINE: Remember that your function’s full effective precondition is the precondition you write plus all its implicit prerequisites. Those are: (1) each parameter’s type invariants, (2) any preconditions of other function calls within the precondition, and (3) any defined results of function calls within the precondition that would make the precondition false.

c) Should any of these boolean-testable implicit preconditions also be written explicitly here in this precondition code? Explain.

For #1 and #3, we generally shouldn’t be repeating them as in 4(b):

  • We can skip repeating #1 because it’s enforced by the type system, plus if there is a bug it’s likely in the type itself rather than in our code or our caller’s code and will be checked when we check the type’s invariants.
  • We can skip repeating #3 because it’ll just make the whole condition be false and so is already covered.

But #2 is the problematic case: If x is actually empty, the violated subexpression precondition makes our whole precondition undefined to evaluate! “Undefined” is a very bad answer if we ever check this precondition, because if in our checking the full precondition is ever violated then we absolutely want that check to do something well-defined — we want it to evaluate to false and fail.

If a subexpression of our precondition itself has a real precondition, then we do want to check that first, otherwise we cannot check our full precondition without undefined behavior if that subexpression’s precondition was not met:

// Example 4(c): Today, we should repeat our subexpressions’ real
//               preconditions, so we can check our precondition
//               without undefined behavior

auto calc( std::vector<int> const&  x ,
           std::floating_point auto y ) -> double
    [[pre( x.size() > 0 && x[0] <= std::sqrt(y) )]] ;

With today’s library-based preconditions, such as the one shown in Question 2, we need to repeat subexpressions’ preconditions if we want to check our precondition without undefined behavior. One of the potential advantages of a language-supported contracts system is that it can “flatten” the preconditions to automatically test category #2, so that nested preconditions like this one don’t need to be repeated (assuming that the types and functions you use, here std::vector and its member functions, have written their preconditions and invariants). Then we could still debate whether or not to explicitly repeat subexpression preconditions in our preconditions, but it would be just a water-cooler stylistic debate, not a “can this even be checked at all without invoking undefined behavior” correctness debate.
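A sketch of why the ordering in Example 4(c) works: && short-circuits, so when the subexpression precondition fails, the risky subexpression is never evaluated and the whole condition is simply false. The function calc_pre below is a hypothetical stand-in for the full checked precondition:

```cpp
#include <cmath>
#include <vector>

// Hypothetical stand-in for calc's full checked precondition.
// Because && short-circuits, an empty x makes the whole expression
// false without ever evaluating the risky x[0].
bool calc_pre( const std::vector<int>& x, double y ) {
    return x.size() > 0 && x[0] <= std::sqrt(y);
}
```

Reversing the two operands would reintroduce exactly the undefined behavior the ordering is meant to avoid.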

Here’s a subtle variation suggested by Andrzej Krzemieński. For the sake of discussion, suppose we have a nested precondition that is not used in the function body (which I think is terribly unlikely, but let’s just consider it):

void display( /*...*/ )
    [[pre( globalData->helloMessageHasBeenPrinted() )]]
{
    // assume for sake of discussion that globalData is not
    // dereferenced directly or indirectly by this function body
}

Here, someone could argue: “If globalData is null, only actually checking the precondition would be undefined behavior, but executing the function body would not be undefined behavior.”

Question: Is globalData != nullptr an implicit precondition of display, since it applies only to the precondition, and is not actually used in the function body? Think about it for a moment before continuing…

… okay, here’s my answer: Yes, it’s absolutely part of the precondition of display, because by definition a precondition is something the caller is required to ensure is true before calling display, and a condition that is undefined to evaluate at all cannot be true.

GUIDELINE: If your checked precondition has a subexpression with its own preconditions, make sure those are checked first. Otherwise, you might find your precondition check doesn’t fire even when it’s violated. In the future, language support for preconditions might automate this for you; until then, be careful to write out the subexpression precondition by hand and put it first.

Notes

[1] G. Dos Reis, J. D. Garcia, J. Lakos, A. Meredith, N. Myers, and B. Stroustrup. “P0542: Support for contract based programming in C++” (WG21 paper, June 2018). Subsequent EWG discussion favored changing “expects” to “pre” and “ensures” to “post,” and to keep it as legal compilable (if unenforced) C++20 for this article I also modified the syntax from : to ( ), and to name the return value _return_ for postconditions. That’s not a statement of preference, it’s just so the examples can compile today to make them easier to check.

[2] Again, as in GotW #99 Note 4, in a real system we’d want a few more variations, such as:

// A separate _V version for functions that don’t return
// a value, because 'void' isn’t regular
#define MY_PRE_POST_V(preconditions, postconditions) \
    assert( preconditions );                         \
    auto post = [&]{ assert( postconditions ); };

// Parallel _DECL forms to work on forward declarations,
// for people who want to repeat the postcondition there
#define MY_PRE_POST_DECL(preconditions, postconditions)
#define MY_PRE_POST_V_DECL(preconditions, postconditions)

And see GotW #99 Note 5 for how to guarantee the programmer didn’t forget to write “return post” at each return.

[3] Sure, there are things like is_invocable, but the point is we can’t always write those expressions, and we don’t have to here.

[4] Upcoming GotWs will cover invariants and violation handling. For type invariants, today’s C++ doesn’t yet provide a way to write those as checkable assertions to help us find bugs where we got it wrong and corrupted an object. The language just flatly assumes that every object meets the invariants of its type during the object’s lifetime, which is from the end of its construction to the beginning of its destruction.

[5] There’s more nuance to the details of what the C standard says, but it ends up that we should expect the result of passing a negative or NaN value to sqrt will be NaN. Although C calls negative and NaN inputs “domain errors,” which hints at a precondition, it still defines the results for all inputs and so strictly speaking doesn’t have a precondition.

Acknowledgments

Thank you to the following for their feedback on this material: Joshua Berne, Gabriel Dos Reis, J. Daniel Garcia, Gábor Horváth, Andrzej Krzemieński, Jean-Heyd Meneide, Bjarne Stroustrup, Andrew Sutton, Jim Thomas, Ville Voutilainen.

GotW #100: Preconditions, Part 1 (Difficulty: 8/10)

This special Guru of the Week series focuses on contracts. We’ve seen how postconditions are directly related to assertions (see GotWs #97 and #99). So are preconditions, but in one important way they are fundamentally different. What is that? And why would having language support benefit us even more for writing preconditions than for the other two?

JG Question

1. What is a precondition, and how is it related to an assertion? Explain your answer using the following example, which uses a variation of a proposed post-C++20 syntax for preconditions. [1]

// A precondition along the lines proposed in [1]

void f( int min, int max )
    [[pre( min <= max )]]
{
    // ...
}

Guru Questions

2. Rewrite the example in Question 1 to show how to approximate the same effect using assertions in today’s C++. Are there any drawbacks to your solution compared to having language support for preconditions?

3. If a precondition fails, what does that indicate, and who is responsible for fixing the failure? Explain how this makes a precondition fundamentally different from every other kind of contract.

4. Consider this example, expanded from a suggestion by Gábor Horváth:

auto calc( std::vector<int> const&  x ,
           std::floating_point auto y ) -> double
    [[pre( x[0] <= std::sqrt(y) )]] ;

Note that std::floating_point is a C++20 concept.

  • What kinds of preconditions must a caller of calc satisfy that can’t generally be written as testable boolean expressions?
  • What kinds of boolean-testable preconditions are implicit within the explicitly written declaration of calc?
  • Should any of these boolean-testable implicit preconditions also be written explicitly here in this precondition code? Explain.

Notes

[1] G. Dos Reis, J. D. Garcia, J. Lakos, A. Meredith, N. Myers, and B. Stroustrup. “P0542: Support for contract based programming in C++” (WG21 paper, June 2018). Subsequent EWG discussion favored changing “expects” to “pre” and “ensures” to “post,” and to keep it as legal compilable (if unenforced) C++20 for this article I also modified the syntax from : to ( ). That’s not a statement of preference, it’s just so the examples can compile today to make them easier to check.

GotW #99 Solution: Postconditions (Difficulty: 7/10)

This special Guru of the Week series focuses on contracts. Postconditions are directly related to assertions (see GotW #97)… but how, exactly? And since we can already write postconditions using assertions, why would having language support benefit us more for writing postconditions than for writing (ordinary) assertions?

1. What is a postcondition, and how is it related to an assertion?

A function’s postconditions document “what it does” — they assert the function’s intended effects, including the return value and any other caller-visible side effects, which must hold at every return point when the function returns to the caller.

A postcondition IS-AN assertion in every way described in GotW #97, with the special addition that whereas a general assertion is always checked where it is written, a postcondition is written on the function and checked at every return (which could be multiple places). Otherwise, it’s “just an assertion”: As with an assertion, if a postcondition is false then it means there is a bug, likely right there inside the function on which the postcondition is written (or in the postcondition itself), because if prior contracts were well tested then likely this function created the first unexpected state. [2]

Explain your answer using the following example, which uses a variation of a proposed post-C++20 syntax for postconditions. [1]

// Example 1(a): A postcondition along the lines proposed in [1]

string combine_and_decorate( const string& x, const string& y )
    [[post( _return_.size() > x.size() + y.size() )]]
{
    if (x.empty()) {
        return "[missing] " + y + optional_suffix();
    } else {
        return x + ' ' + y + something_computed_from(x);
    }
}

The above would be roughly equivalent to writing the test before every return statement instead:

// Example 1(b): What a compiler might generate for Example 1(a)

string combine_and_decorate( const string& x, const string& y )
{
    if (x.empty()) {
        auto&& _return_ = "[missing] " + y + optional_suffix();
        assert( _return_.size() > x.size() + y.size() );
        return std::forward<decltype(_return_)>(_return_);
    } else {
        auto&& _return_ = x + ' ' + y + something_computed_from(x);
        assert( _return_.size() > x.size() + y.size() );
        return std::forward<decltype(_return_)>(_return_);
    }
}

2. Rewrite the example in Question 1 to show how to approximate the same effect using assertions in today’s C++. Are there any drawbacks to your solution compared to having language support for postconditions?

We could always write Example 1(b) by hand, but language support for postconditions is better in two key ways:

(A) The programmer should only write the condition once.

(B) The programmer should not need to write forwarding boilerplate by hand to make looking at the return value efficient.

How can we approximate those advantages?

Option 1 (basic): Named return object + an exit guard

The simplest way to achieve (A) would be to use the C-style goto exit; pattern:

// Example 2(a)(i): C-style “goto exit;” postcondition pattern

string combine_and_decorate( const string& x, const string& y )
{
    auto _return_ = string();
    if (x.empty()) {
        _return_ = "[missing] " + y + optional_suffix();
        goto post;
    } else {
        _return_ = x + ' ' + y + something_computed_from(x);
        goto post;
    }

post:
    assert( _return_.size() > x.size() + y.size() );
    return _return_;
}

If you were thinking, “in C++ this wants a scope guard,” you’re right! [3] Guards still need access to the return value, so the structure is basically similar:

// Example 2(a)(ii): scope_guard pattern, along the lines of [3]

string combine_and_decorate( const string& x, const string& y )
{
    auto _return_ = string();
    auto post = std::experimental::scope_success([&]{
        assert( _return_.size() > x.size() + y.size() );
    });

    if (x.empty()) {
        _return_ = "[missing] " + y + optional_suffix();
        return _return_;
    } else {
        _return_ = x + ' ' + y + something_computed_from(x);
        return _return_;
    }
}

Advantages:

  • Achieved (A). The programmer writes the condition only once.

Drawbacks:

  • Didn’t achieve (B). There’s no forwarding boilerplate, but only because we’re not even trying to forward…
  • Overhead (maybe). … and to look at the return value we require a named return value and a move assignment into that object, which is overhead if the function wasn’t already doing that.
  • Brittle. The programmer has to remember to convert every return site to _return_ = ...; goto post; or _return_ = ...; return _return_;… If they forget, the code silently compiles but doesn’t check the postcondition.

Option 2 (better): “return post” postcondition pattern

Here’s a second way to do it that achieves both goals, using a local function (which we have to write as a lambda in C++):

// Example 2(b): “return post” postcondition pattern

string combine_and_decorate( const string& x, const string& y )
{
    auto post = [&](auto&& _return_) -> auto&& {
        assert( _return_.size() > x.size() + y.size() );
        return std::forward<decltype(_return_)>(_return_);
    };

    if (x.empty()) {
        return post( "[missing] " + y + optional_suffix() );
    } else {
        return post( x + ' ' + y + something_computed_from(x) );
    }
}

Advantages:

  • Achieved (A). The programmer writes the condition only once.
  • Efficient. We can look at return values efficiently, without requiring a named return value and a move assignment.

Drawbacks:

  • Didn’t achieve (B). We still have to write the forwarding boilerplate, but at least it’s only in one place.
  • Brittle. The programmer has to remember to convert every return site to return post. If they forget, the code silently compiles but doesn’t check the postcondition.

Option 3 (mo’betta): Wrapping up option 2… with a macro

We can improve Option 2 by wrapping the boilerplate up in a macro (sorry). Note that instead of “MY_” you’d use your company’s preferred unique macro prefix: [4]

// Eliminate forward-boilerplate with a macro (written only once)
#define MY_POST(postconditions)                            \
    auto post = [&](auto&& _return_) -> auto&& {           \
        assert( postconditions );                          \
        return std::forward<decltype(_return_)>(_return_); \
    };

And then the programmer can just write:

// Example 2(c): “return post” with boilerplate inside a macro

string combine_and_decorate( const string& x, const string& y )
{   MY_POST( _return_.size() > x.size() + y.size() );

    if (x.empty()) {
        return post( "[missing] " + y + optional_suffix() );
    } else {
        return post( x + ' ' + y + something_computed_from(x) );
    }
}

Advantages:

  • Achieved (A) and (B). The programmer writes the condition only once, and doesn’t write the forwarding boilerplate.
  • Efficient. We can look at the return value without requiring a local variable for the return value, and without an extra move operation to put the value there.
  • Future-friendly. You may have noticed that I changed my usual brace style to write { MY_POST on a single line; that’s to make it easily replaceable with search-and-replace. If you systematically declare the condition as { MY_POST at the start of the function, and systematically write return post() to use it, the code is likely more future-proof — if we get language support for postconditions with a syntax like [1], migrating your code to that could be as simple as search-and-replace:

{ MY_POST( * )   →   [[post _return_: * ]] {

return post( * )   →   return *

Drawbacks:

  • (improved) Brittle. It’s still a manual pattern, but now we have the option of making it impossible for the programmer to forget return post by extending the macro to include a check that post was used before each return (see [5]). That’s feasible to put into the Option 3 macro, whereas it was not realistic to ask the programmer to write out by hand in Options 1 and 2.

GUIDELINE: If you don’t already use a way to write postconditions as code, consider trying something like MY_POST until language support is available. It’s legal C++ today, it’s not terrible, and it’s future-friendly to adopting future C++ language contracts.
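As a quick sanity check of the Option 3 pattern, here is the MY_POST macro exercised on a function whose postcondition relates the result to its input. The function doubled and its contract are invented for illustration; the macro is the one shown above, repeated so the sketch is self-contained:

```cpp
#include <cassert>
#include <string>
#include <utility>

// The MY_POST macro from Option 3, repeated so this sketch
// is self-contained
#define MY_POST(postconditions)                            \
    auto post = [&](auto&& _return_) -> auto&& {           \
        assert( postconditions );                          \
        return std::forward<decltype(_return_)>(_return_); \
    };

// Hypothetical function for illustration: the postcondition is
// written exactly once, and every return goes through post
std::string doubled( const std::string& s )
{   MY_POST( _return_.size() == 2 * s.size() );
    return post( s + s );
}
```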

Finally, all of these options share a common drawback:

  • Less composable/toolable. The next library or team will have THEIR_POST convention that’s different, which makes it hard to write tools to support both styles. Language support has an important incidental benefit of providing a common syntax that portable code and tools can rely upon.

3. Should a postcondition be expected to be true if the function throws an exception back to the caller?

No.

First, let’s generalize the question: Anytime you see “if the function throws an exception,” mentally rewrite it to “if the function reports that it couldn’t do what it advertised, namely complete its side effects.” That’s independent of whether it reports said failure using an exception, std::error_code, HRESULT, errno, or any other way.

Then the question answers itself: No, by definition. A postcondition documents the side effects, and if those weren’t achieved then there’s nothing to check. And for postconditions involving the return value we can add: No, those are meaningless by construction, because it doesn’t exist.

“But wait!” someone might interrupt. “Aren’t there still things that need to be true on function exit even if the function failed?” Yes, but those aren’t postconditions. Let’s take a look.

Justify your answer with example(s).

Consider this code:

// Example 3: (Not) a reasonable postcondition?

void append_and_decorate( string& x, string&& y )
    [[post( x.size() <= x.capacity() && /* other non-corruption */ )]]
{
    x += y + optional_suffix();
}

This can seem like a sensible “postcondition” even when an exception is thrown, but it is testing whether x is still a valid object of its type… and sure, that had better be true. But that’s an invariant, which should be written once on the type [2], not a postcondition to be laboriously repeated arbitrarily many times on every function that ever might touch an object of that type.

When reasoning about function failures, we use the well-known Abrahams error safety guarantees, and now it becomes important to understand them in terms of invariants:

  • The nofail guarantee is “the function cannot fail” (e.g., such functions should be noexcept), and so doesn’t apply here since we’re discussing what happens if the function does fail.
  • The basic guarantee is “no corruption,” every object we might have tried to modify is still a valid object of its type… but that’s identical to saying “the object still meets the invariants of its type.”
  • The strong guarantee is “all or nothing,” so in the case we’re talking about where an error is being reported, a strong guarantee function is again saying that all invariants hold. (It also says observable state did not change, but I’ll ignore that for now; for how we might want to check that, see [6].)

So we’re talking primarily about class invariants… and those should hold on both successful return and error exit, and they should be written on the type rather than on every function that uses the type.

GUIDELINE: If you’re trying to write a “postcondition” that should still be true even if an exception or other error is reported, you’re probably either trying to write an invariant instead [2], or trying to check the strong did-nothing guarantee [6].
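To make the invariants-versus-guarantees distinction concrete, here is a minimal sketch of the usual way to provide the strong guarantee: do all the work on the side, then commit. The function append_twice and its failure trigger are invented for illustration:

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Strong guarantee via "work on a copy, then commit": if anything
// throws before the commit, x is observably unchanged (all-or-
// nothing), and x's type invariants hold either way.
void append_twice( std::string& x, const std::string& y ) {
    std::string tmp = x;                  // 1. work on the side
    tmp += y;
    if (y == "boom")                      // hypothetical failure point
        throw std::runtime_error("could not complete side effects");
    tmp += y;
    x = std::move(tmp);                   // 2. commit
}
```

Note that nothing here is a postcondition check on the error path; the error path promises only that the invariants (and here, additionally, the old observable state) are preserved.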

4. Should postconditions be able to refer to both the initial (on entry) and final (on exit) value of a parameter, if those could be different?

Yes.

If so, give an example.

Consider this code, which uses a strawman _in_() syntax for referring to subexpressions of the postcondition that should be computed on entry so they can refer to the “in” value of the parameter (note: this was not proposed in [1]):

// Example 4(a): Consulting “in” state in a postcondition

void instrumented_push( vector<widget>& c, const widget& value )
    [[post( _in_(c.size())+1 == c.size() )]]
{

    c.push_back(value);

    // perform some extra work, such as logging which
    // values are added to which containers, then return
}

Postconditions like this one express relative side effects, where the “out” state is a delta from the “in” state of the parameter. To write postconditions like this one, we have to be able to refer to both states of the parameter, even for parameters that must be modifiable.

Note that this doesn’t require taking a copy of the parameter… that would be expensive for c! Rather, an implementation would just evaluate any _in_ subexpression on entry and store only that result as a temporary, then evaluate the rest of the expression on exit. For example, in this case the implementation could generate something like this:

// Example 4(b): What an implementation might generate for 4(a)

void instrumented_push( vector<widget>& c, const widget& value )
{
    auto __in_c_size = c.size();

    c.push_back(value);

    // perform some extra work, such as logging which
    // values are added to which containers, then return

    assert( __in_c_size+1 == c.size() );
}

Notes

[1] G. Dos Reis, J. D. Garcia, J. Lakos, A. Meredith, N. Myers, and B. Stroustrup. “P0542: Support for contract based programming in C++” (WG21 paper, June 2018). Subsequent EWG discussion favored changing “expects” to “pre” and “ensures” to “post.” To keep the examples as legal, compilable (if unenforced) C++20 for this article, I also modified the syntax from : to ( ), and named the return value _return_ for postconditions. That’s not a statement of preference; it’s just so the examples can compile today, to make them easier to check.

[2] Upcoming GotWs will cover preconditions and invariants, including how invariants relate to postconditions.

[3] P. Sommerlad and A. L. Sandoval. “P0052: Generic Scope Guard and RAII Wrapper for the Standard Library” (WG21 paper, February 2019). Based on pioneering work by Andrei Alexandrescu and Petru Marginean starting with “Change the Way You Write Exception-Safe Code – Forever” (Dr. Dobb’s Journal, December 2000), and widely implemented in D and other languages, the Folly library, and more.

[4] In a real system we’d want a few more variations, such as:

// A separate _V version for functions that don’t return
// a value, because 'void' isn’t regular
#define MY_POST_V(postconditions)                          \
    auto post = [&]{ assert( postconditions ); };

// Parallel _DECL forms to work on forward declarations,
// for people who want to repeat the postcondition there
#define MY_POST_DECL(postconditions)   // intentionally empty 
#define MY_POST_V_DECL(postconditions) // intentionally empty

Note: We could try to combine MY_POST_V and MY_POST by always creating both a single-parameter lambda and a no-parameter lambda, and then “overloading” them using something like compose from Boost’s wonderful Higher-Order Functions library by Paul Fultz II. Then in a void-returning function, return post() still works fine even with empty parens. I didn’t do that because the in-language contracts proposed in [1] use a slightly different syntax depending on whether there’s a return value, so if our syntax doesn’t somehow make the same distinction, it will be harder to migrate this macro to a syntax like [1] with a simple search-and-replace.

[5] We could add extra machinery to help the programmer remember to write return post, so that just executing a return without post will assert… set a flag that gets set on every post() evaluation, and then assert that flag in the destructor of an RAII object on every normal return. The code is pretty simple with a scope guard [3]:

// Check that the programmer wrote “return post” each time
#define MY_POST_CHECKED                                     \
    auto post_checked = false;                              \
    auto post_guard = std::experimental::scope_success([&]{ \
        assert( post_checked );                             \
    });

Then in MY_POST and MY_POST_V, pull in this machinery and then also set post_checked:

#define MY_POST(postconditions)                             \
    MY_POST_CHECKED                                         \
    auto post = [&](auto&& _return_) -> auto&& {            \
        assert( postconditions );                           \
        post_checked = true;                                \
        return std::forward<decltype(_return_)>(_return_);  \
    };

#define MY_POST_V(postconditions)                           \
    MY_POST_CHECKED                                         \
    auto post = [&]{                                        \
        assert( postconditions );                           \
        post_checked = true;                                \
    };

If you don’t have a scope guard helper, you can roll your own, where “successful exit” is detectable by seeing that the std::uncaught_exceptions() exception count hasn’t changed:

// Hand-rolled alternative if you don’t have a scope guard
#define MY_POST_CHECKED                                     \
    auto post_checked = false;                              \
    struct post_checked_ {                                  \
        const bool *pflag;                                  \
        const int  ecount = std::uncaught_exceptions();     \
        post_checked_(const bool* p) : pflag{p} {}          \
        ~post_checked_() {                                  \
            assert( *pflag ||                               \
                    ecount != std::uncaught_exceptions() ); \
        }                                                   \
    } post_checked_guard{&post_checked}; 

[6] For strong-guarantee functions, we could try to check that all observable state is the same as on function entry. In some cases, we can partly do that… for example, writing the test that a failed vector::push_back didn’t invalidate any pointers into the container may sound hard, but it’s actually the easy part of that function’s “error exit” condition! Using a strawman syntax like [1], extended to include an “error” exit condition:

// (Using a hypothetical “error exit” condition)
// This is enough to check that no pointers into *this were invalidated

template <typename T, typename Allocator>
constexpr void vector<T>::push_back( const T& )
    [[error( _in_.data() == data() && _in_.size() == size() )]] ;

But other “error exit” checks for this same function would be hard, expensive, or impossible to express. For example, it would be expensive to write the check that all elements in the vector have their original values, which would require first taking a deep copy of the container.

Acknowledgments

Thank you to the following for their feedback on this material: Joshua Berne, Gábor Horváth, Andrzej Krzemieński, James Probert, Bjarne Stroustrup, Andrew Sutton.

GotW #99: Postconditions (Difficulty: 7/10)

This special Guru of the Week series focuses on contracts. Postconditions are directly related to assertions (see GotW #97)… but how, exactly? And since we can already write postconditions using assertions, why would having language support benefit us more for writing postconditions more than for writing (ordinary) assertions?

JG Question

1. What is a postcondition, and how is it related to an assertion? Explain your answer using the following example, which uses a variation of a proposed post-C++20 syntax for postconditions. [1]

// A postcondition along the lines proposed in [1]

string combine_and_decorate( const string& x, const string& y )
    [[post( _return_.size() > x.size() + y.size() )]]
{
    if (x.empty()) {
        return "[missing] " + y + optional_suffix();
    } else {
        return x + ' ' + y + something_computed_from(x);
    }
}

Guru Questions

2. Rewrite the example in Question 1 to show how to approximate the same effect using assertions in today’s C++. Are there any drawbacks to your solution compared to having language support for postconditions?

3. Should a postcondition be expected to be true if the function throws an exception back to the caller? Justify your answer with example(s).

4. Should a postcondition be able to refer to both the initial (on entry) and final (on exit) value of a parameter, if those could be different? If so, give an example.

Notes

[1] G. Dos Reis, J. D. Garcia, J. Lakos, A. Meredith, N. Myers, and B. Stroustrup. “P0542: Support for contract based programming in C++” (WG21 paper, June 2018). Subsequent EWG discussion favored changing “expects” to “pre” and “ensures” to “post.” To keep the examples as legal, compilable (if unenforced) C++20 for this article, I also modified the syntax from : to ( ), and named the return value _return_ for postconditions. That’s not a statement of preference; it’s just so the examples can compile today, to make them easier to check.

GotW #98 Solution: Assertion levels (Difficulty: 5/10)

This special Guru of the Week series focuses on contracts. We covered basic assertions in GotW #97… but not all asserted conditions are created equal.

Given some assertion facility that can be used like this:

MyAssert( condition );  // expresses that ‘condition’ must be true

1. Give one example each of an asserted condition whose run-time evaluation is:

a) super cheap

Without resorting to constexpr expressions, it’s hard to find one cheaper than the one we saw in GotW #97 Example 3, which we can simplify down to this:

// Example 1(a): A dirt cheap assertion (from GotW #97 Example 3)

int min = /* some computation */;
int max = /* some other computation */;
MyAssert( min <= max );

This is always going to be dirt cheap. Not only is the integer comparison operation cheap, but min and max are already being accessed by this function and so they’re already “hot” in registers or cache.

b) arbitrarily expensive

“A condition that’s expensive? That sounds pretty easy,” you might say, and you’re right!

One commonly cited example is is_sorted. Just to emphasize how expensive it can be, both in absolute terms and relative to the program’s normal execution, let’s put it inside a well-known function… this is a precondition, but we’ll write it as an assertion in the function body for now, which doesn’t affect this question: [1]

// Example 1(b): An arbitrarily expensive assertion

template <typename Iter, typename T>
bool binary_search( Iter begin, Iter end, const T& value ) {
    MyAssert( is_sorted(begin, end) );
    // ...
}

Checking that all the container’s elements are in ascending order requires visiting them all, and that O(N) linear complexity is arbitrarily expensive when the container’s size N can be arbitrarily large.

The icing on the cake: In this example, just evaluating the assertion requires doing more work than the entire function it appears in, which is only O(log N) complexity!

On a humorous note, O(N) remains O(N) no matter how hard we try to make it efficient:

MyAssert( std::is_sorted( std::execution::par, begin(s), end(s) ) );
                  // still O(N) arbitrarily expensive, but good try!

2. What does the answer to Question 1 imply for assertion checking? Explain.

We want a way to enable checking for some assertions but not others in the same code, because we may not always be able to afford expensive checks. That’s true whether we’re enabling checking at test time (e.g., on the developer’s machine) or at run time (e.g., in production).

Some conditions are so expensive that we may never check them without a good reason, even in testing. Example 1(b)’s is_sorted is a great example: You probably won’t ever enable it in production, and likely not by default in testing, except by explicitly turning it on during a bug hunt after enabling checking for cheaper assertions wasn’t enough or pointed at this data structure for deeper investigation. [2]

Other conditions are so cheap we’ll probably always check them absent a good reason not to, even in production. Example 1(a)’s min <= max is at this other end of the scale: It’s so dirt cheap to check that it’s unlikely we’ll ever have a performance reason to disable it. [2]

So it makes perfect sense that if Examples 1(a) and 1(b) appear in the same source file, the developer will want to enable checking for 1(b)’s assertion only by some kind of manual override to explicitly request it, and enable checking for 1(a)’s assertion all the time.

3. Give an example of an asserted condition that is in general impossible to evaluate, and so cannot be checked.

One common example is is_reachable for pointers or other iterators, to say that if we increment an iterator enough times we can make it equal to (refer to the same object as) a second iterator:

// Example 3: Very reasonable, but generally not checkable

auto first_iterator = /*...*/;
auto last_iterator  = /*...*/;
MyAssert( is_reachable(first_iterator, last_iterator) );
std::find( first_iterator, last_iterator, value );

In general, there’s no way to write is_reachable. You could try to increment first_iterator repeatedly until it becomes equal to last_iterator, but when the latter is not reachable that might never happen and even just trying would often be undefined behavior.

You might be tempted to test is_reachable using std::distance:

MyAssert( std::distance(first_iterator, last_iterator) >= 0 );

… but that would be horribly wrong. Can you see why?

Take a moment to think about it before continuing…

… okay. The answer is that std::distance itself requires that last_iterator is reachable from first_iterator, otherwise it’s undefined behavior. So this maybe-tempting-looking alternative actually assumes what we want to prove, and so it’s not useful for this purpose. (GotW #100 will consider in detail the general question of preconditions of contract subexpressions, which covers examples like this one.)

Can these kinds of conditions still be useful?

Yes. In practice, these kinds of conditions spell out “this is a formal comment.” Static analyzers and other tools may be able to test such a condition in a subset of cases; for example, at some call sites an analyzer may be able to infer statically that two iterators point into different containers and so one isn’t reachable from the other. Alternatively, the tools might support special pseudofunction names that they recognize when you use them in assertion expressions to give information to the tool. So the conditions can still be useful, even if they can’t generally be checked the normal way, by just evaluating them and inspecting the result.

4. How do these questions help answer:

a) what “levels” of asserted conditions we should be able to express?

There’s a wide spectrum of “expensiveness” of assertion conditions, ranging from cheap to arbitrarily high to even impossible. In the post-C++20 contracts proposal at [3], this is partly captured by the proposed basic levels of default, audit, and axiom, roughly intended to represent “cheap,” “expensive,” and “impossible” respectively.

Because we need to check these with different frequencies (or not at all), we need a way to enable and disable checking for subsets of assertions independently, even when they’re in the same piece of code.

GUIDELINE: Distinguish between (at least) “cheap,” “expensive,” and “impossible” to evaluate assertions. If you develop your own assertion system for in-house use, support enabling/disabling at least these kinds of assertions independently. [1] I say “at least” because what’s “expensive” is subjective and will vary from program to program, from team to team… and even within a program from your code to your third-party library’s code that you’re calling. Having just two preset “cheap” and “expensive” levels is minimal, but useful.
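For example, a minimal in-house sketch along these lines (the macro names are hypothetical, not taken from [3]) might expose the two preset levels as independently controlled macros:

```cpp
#include <cassert>

// Cheap ("default"-level) checks: on unless explicitly disabled.
#ifndef MY_DISABLE_ASSERTS
  #define MY_ASSERT(cond) assert(cond)
#else
  #define MY_ASSERT(cond) ((void)0)
#endif

// Expensive ("audit"-level) checks: off unless explicitly enabled,
// e.g., with -DMY_ENABLE_AUDIT during a bug hunt.
#ifdef MY_ENABLE_AUDIT
  #define MY_ASSERT_AUDIT(cond) assert(cond)
#else
  #define MY_ASSERT_AUDIT(cond) ((void)0)
#endif
```

With this, `MY_ASSERT( min <= max )` is checked in ordinary builds, while `MY_ASSERT_AUDIT( is_sorted(begin, end) )` costs nothing until someone explicitly opts in.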

b) why the assertions we can “practically” write are a subset of all the ones we might “ideally” like to write?

It can be useful to distinguish between all ideal assertions, meaning everything that has to be true at various points in the program for it to run correctly, and the practical assertions, meaning the subset of those that can be reasonably expressed as a C++ expression and checked. In GotW #97 question 3, part of the solution says that “if an assertion fails” then…

there is a program bug, possibly in the assertion itself. The first place to look for the bug is in this same function, because if prior contracts were well tested then likely this function created the first unexpected state.

If we could write all ideal assertions, and exercise all control flow and data flow during testing, then a failed assertion would definitely mean a bug in the same function where it was written. Because we realistically can’t write and exercise them all, though, we could be observing a secondary effect from a bug that happened earlier. Still, this function is the first place to start looking for the bug.

Notes

[1] Upcoming GotWs will cover preconditions and violation handling. For handlers, we’ll cover additional distinctions such as categories of violations (e.g., to distinguish safety-related checks vs. other checks).

[2] As always, any checks left on in production would often install a different handler, such as a log-and-continue handler rather than a terminating handler; see GotW #97 Question 4, and note [1].

[3] G. Dos Reis, J. D. Garcia, J. Lakos, A. Meredith, N. Myers, and B. Stroustrup. “P0542: Support for contract based programming in C++” (WG21 paper, June 2018).

Acknowledgments

Thank you to the following for their feedback on this material: Joshua Berne, Guy Davidson, J. Daniel Garcia, Gábor Horváth, Maciej J., Andrzej Krzemieński.

GotW #98: Assertion levels (Difficulty: 5/10)

This special Guru of the Week series focuses on contracts. We covered basic assertions in GotW #97… but not all asserted conditions are created equal.

JG Questions

Given some assertion syntax:

SomeAssert( condition );  // expresses that ‘condition’ must be true

1. Give one example each of an asserted condition whose run-time evaluation is:

a) super cheap

b) arbitrarily expensive

Guru Questions

2. What does the answer to Question 1 imply for assertion checking? Explain.

3. Give an example of an asserted condition that is in general impossible to evaluate, and so cannot be checked. Can these kinds of conditions still be useful?

4. How do these questions help answer:

a) what “levels” of asserted conditions we should be able to express?

b) why the assertions we can “practically” write are a subset of all the ones we might “ideally” like to write?

GotW #97 Solution: Assertions (Difficulty: 4/10)

Assertions have been a foundational tool for writing understandable computer code since we could write computer code… far older than C’s assert() macro, they go back to at least John von Neumann and Herman Goldstine (1947) and Alan Turing (1949). [1,2] How well do we understand them… exactly?

1. What is an assertion, and what is it used for?

An assertion documents the expected state of specific program variables at the point where the assertion is written, in a testable way so that we can find program bugs — logic errors that have led to corrupted program state. An assertion is always about finding bugs, because something the programmer thought would always be true was found to actually be false (oops).

For example, this line states that the program does not expect min to exceed max, and if it does the code has a bug somewhere:

// Example 1: A sample assertion

assert( min <= max );

If in this example min did exceed max, that would mean we have found a bug and we need to go fix this code.

GUIDELINE: Assert liberally. [3] The more sure you are that an assertion can’t be false, the more valuable it is if it is ever found to be false. And in addition to finding bugs today, assertions verify that what you believe is “obviously true” and wrote correctly today actually stays true as the code is maintained in the future.

GUIDELINE: Asserted conditions should never have side effects on normal execution. Assertions are only about finding bugs, not about doing program work. And asserted conditions are only evaluated if they’re enabled, so any side effects won’t happen when they’re not enabled; they might sometimes perform local side effects, such as logging or allocating memory, but the program should never rely on those happening or not happening. For example, adding an assertion to your code should never turn a logically “pure” function into an impure function. (Note that “no side effects on normal execution” is always automatically true for violation handlers, even when an assertion system such as the one proposed in [4] allows arbitrary custom violation handlers to be installed, because those are executed only if we discover that we’re in a corrupted state and so are already outside normal execution. [5] For conditions, it’s up to us to make sure it’s true.)
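To illustrate (next_id and g_calls are invented for this sketch), compare a pure condition with one that smuggles program work into the check:

```cpp
#include <cassert>

int g_calls = 0;                  // hypothetical program state

int next_id() { return ++g_calls; }

void good(int min, int max) {
    assert( min <= max );         // pure condition: safe to enable or disable
}

void bad() {
    // BAD: the condition does program work, so g_calls changes only
    // in builds where assertion checking is enabled.
    assert( next_id() > 0 );
}
```

Compile `bad()` with and without NDEBUG and the program computes different values of g_calls, which is exactly the divergence this guideline rules out.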

2. C++20 supports two main assertion facilities… For each one, briefly summarize how it works, when it is evaluated, and whether it is possible for the programmer to specify a message to be displayed if the assertion fails.

assert

The C-style assert is a macro that is evaluated at execution time (hopefully enabled at least during your unit testing! see question 4) if NDEBUG is not set. The condition we pass it must be a compilable boolean expression, something that can be evaluated and converted to bool.

It doesn’t directly support a separate message string, but implementations will print the failed condition, and so a common technique is to embed extra information in the condition in a way that doesn’t affect its result. For example, a common idiom is to append && "message" (a fancy way of saying && true):

// Example 2(a): A sample assert() with a message

assert( min <= max
        && "BUG: argh, miscalculated min and max again" );

static_assert

The C++11 static_assert is evaluated at compile time, and so the condition has to be a “boolean constant expression” that only refers to compile-time known values. For example:

// Example 2(b): A sample static_assert() with a message

static_assert( sizeof(int) >= 4,
               "UNSUPPORTED PLATFORM: int must be at least 4 bytes" );

It has always supported a message string, and C++17 made the message optional.

Bonus: [[assert: ?

Looking forward, a proposed post-C++20 syntax would support assertions as a language feature, which has a number of advantages, including that it’s not a macro. [4] This version would be evaluated at execution time if checking is enabled. Currently that proposal does not have an explicit provision for a message, and so programmers would use the && "message" idiom to add one. For example:

// Example 2(c): An assertion along the lines proposed in [4]

[[assert( min <= max
          && "BUG: argh, miscalculated min and max again" )]] ;

3. If an assertion fails, what does that indicate, and who is responsible for fixing the failure?

A failed assertion means that we checked and found the tested variables to be in an unexpected state, which means at least that part of the program’s state is corrupt. Because the program should never have been able to reach that state, two things are true:

  • There is a program bug, possibly in the assertion itself. The first place to look for the bug is in this same function, because if prior contracts were well tested then likely this function created the first unexpected state. [5]
  • The program cannot recover programmatically by reporting a run-time error to the calling code (via an exception, error code, or similar), because by definition the program is in a state it was not designed to handle, so the calling code isn’t ready for that state. It’s time to terminate and restart the program. (There are advanced techniques that involve dumping and restarting an isolated portion of possibly tainted state, but that’s a system-level recovery strategy for an impossible-to-handle fault, not a handling strategy for run-time error.) Instead, the bug should be reported to the human developer who can fix the bug.

GUIDELINE: Don’t use assertions to report run-time errors. Run-time errors should be reported using exceptions, error codes, or similar. For example, don’t use an assertion to check that a remote host is available, or that the user types valid input. Yes, std::logic_error was originally created to report bugs (logic errors) using an exception, but this is now widely understood to be a mistake; don’t follow that pattern.
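Here is a hedged sketch of the distinction (parse_age and spaces_needed are invented for illustration): invalid user input is an expected run-time error and is reported to the caller, while an impossible internal value is a bug and is asserted:

```cpp
#include <cassert>
#include <optional>
#include <string>

// Run-time error: bad *user input* is expected, so it's reported to
// the caller (here via std::optional), never asserted.
std::optional<int> parse_age(const std::string& s) {
    try {
        int age = std::stoi(s);
        if (age < 0 || age > 150) return std::nullopt;
        return age;
    } catch (...) {                  // not a number, or out of range
        return std::nullopt;
    }
}

// Bug: a negative count can only come from broken caller logic,
// so it's asserted, not reported.
int spaces_needed(int count) {
    assert( count >= 0 && "BUG: caller computed a negative count" );
    return count;
}
```

The caller of parse_age is designed to handle the error case; no caller of spaces_needed is designed to handle a negative count, which is why that check is an assertion.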

Referring to this example:

// Example 3

void f() {
    int min = /* some computation */;
    int max = /* some other computation */;

    // still yawn more yawn computation

    assert( min <= max );         // A

    // ...
}

In this code, if the assertion at line A is false, that means what the function actually did before the assertion doesn’t match what the assertion condition expected, so there is a bug somewhere in this function — either before or within the assertion.

This demonstrates why assertions are primarily about eliminating bugs, which is why we test…

4. Are assertions primarily about checking at compile time, at test time, or at run time? Explain.

Assertions are primarily about finding bugs at test time. But assertions can also be useful at other times because of some well-known adages: “the sooner the better,” “better late than never,” and “never [will] I ever.”

Bonus points for pointing out that there is also a fourth time in the development cycle I didn’t list in the question, when assertions can be profitably checked: edit time, in addition to compile time, test time, and run time.

Of course, this can be even more nuanced. For example, you might make different decisions about enabling assertions depending on whether your “run time” is an end user’s machine, a server farm, or a honeypot. Also, checking isn’t free, and so you may enable run-time checking for severe classes of bugs but not others: for example, an operating system component may require checking in production for all out-of-bounds violations and other potential security bugs, but not for non-security classes of bugs.

Here they are:

First, “the sooner the better”: It’s always legal and useful to find bugs as early as possible. If we can find a bug even before actually executing a compiled test, then that’s wonderful. This is a form of shift-left. We love shift-left. There are two of these times in the graphic:

  • (Earliest, best) Edit time: By using a static analysis tool that is aware of assert and can detect some violations statically, you can get some diagnostics as you’re writing your code, even before you try to compile it! Note that to recognize the assert macro, you want to run the static analyzer in debug mode; analyzers that run after macro substitution won’t see an assert condition when the code is set to make release builds since the macro will expand to nothing. Also, usually this kind of diagnostic uses heuristics and works on a best-effort basis that catches some mistakes while not diagnosing others that look similar. But it does shift some diagnostics pretty much all the way left to the point where you’re actually writing the code, which is great when it works… and you still always have the next three assertion checking times available as a safety net.
  • (Early) Compile time: If a bug that depends only on compile-time information can be detected at compile time even before actually executing a compiled test, then that’s wonderful. This is one reason static_assert exists: so that we can express tests that are guaranteed to be performed at compile time.

Next, the primary target:

  • Test time: This is the main time tests are executed. It can be during developer-machine unit testing, regression testing, build-lab integration testing, or any other flavor of testing. The point is to find bugs before they escape into production, and inform the programmer so they can fix their bug.

Finally, “better late than never” (safety net) or “never [will] I ever” (intolerable severe condition):

  • (Late) Run time: Even after we release our code, it can be useful to have a way to enable checking at run time to at least log-and-continue (e.g., using facilities such as [6] or [7]). One motivation is to know if a bug made it through testing and out into the field, and to get better late-debug diagnostics; this is sometimes called shift-right, but I think of it as much as being about belt-and-suspenders. Another motivation is to ensure that, for severe classes of bugs, execution will halt outright if we cannot tolerate continuing after such a fault is detected.

Importantly, in all cases the motivation is still debugging: Finding bugs early is still debugging, just better (sooner and less expensive). And finding bugs late, after they escaped into production, is still debugging, just worse (later and more expensive). Each of these times is a successive safety net for bugs that make it past the earlier times.

Because at run time we may want to log a failed assertion, our assertion violation handler should be able to USE-A logging system, but the relationship really is USES-A. An assertion violation handling system IS-NOT-A general-purpose logging system, and so a contracts language feature shouldn’t be designed around such a goal. [5]

Finally, speaking of run time: Note that it can be useful to write an assertion, and also write code that does some handling if the assertion is false. Here’s an example from [8]:

// Example 4: Defense in depth

int DoSomething(int x) {

    assert( x != 0 && "x should be nonzero" ); // finds bug, if checked
    if( x == 0 ) {
        return INVALID_COOKIE; // robustness fallback, if not checked
    }

    // do useful work

}

You might see this pattern written interleaved as follows to avoid duplicating the condition, and this is one of the major patterns that leads to writing assert(!"message"):

    if( x == 0 ) {
        assert( !"x should be nonzero" ); // finds bug, if checked
        return INVALID_COOKIE; // robustness fallback, if not checked
    }

At first this may look like it’s conflating the distinct “bug” and “error” categories we saw in Question 3’s table. But that’s not the case at all, it’s actually deliberately using both categories to implement “defense in depth”: We assert something in testing to minimize actual occurrences, but then in production still provide fallback handling for robustness in case a bug does slip through, for example if our test datasets didn’t exercise the bug but in production we hit some live data that does.

Notes

With thanks to Wikipedia for the first two references.

[1] H. H. Goldstine and J. von Neumann. “Planning and Coding of problems for an Electronic Computing Instrument” (Report on the Mathematical and Logical Aspects of an Electronic Computing Instrument, Part II, Volume I, p. 12; Institute for Advanced Study, April 1947).

[2] Alan Turing. “Checking a Large Routine” (Report of a Conference on High Speed Automatic Calculating Machines, pp. 67-9, June 1949).

[3] H. Sutter and A. Alexandrescu. C++ Coding Standards (Addison-Wesley, 2004). Item 68, “Assert liberally to document internal assumptions and invariants.”

[4] G. Dos Reis, J. D. Garcia, J. Lakos, A. Meredith, N. Myers, and B. Stroustrup. “P0542: Support for contract based programming in C++” (WG21 paper, June 2018). To keep it as legal, compilable (if unenforced) C++20 for this article, I modified the syntax from : to ( ). That’s not a statement of preference; it’s just so the examples can compile today, to make them easier to check.

[5] Upcoming GotWs will cover postconditions, preconditions, invariants, and violation handling.

[6] G. Melman. spdlog: Fast C++ logging library (GitHub).

[7] Event Tracing for Windows (ETW) (Microsoft, 2018).

[8] H. Sutter. “P2064: Assumptions” (WG21 paper, 2020).

Acknowledgments

Thank you to the following for their comments on drafts of this article: Joshua Berne, Gábor Horváth, Andrzej Krzemieński, Andrew Sutton. Thanks also to Reddit user “evaned” and Anton Dyachenko for additional feedback.

GotW #97: Assertions (Difficulty: 4/10)

Assertions have been a foundational tool for writing understandable computer code since we could write computer code… far older than C’s assert() macro, they go back to at least John von Neumann and Herman Goldstine (1947) and Alan Turing (1949). [1,2] How well do we understand them… exactly?

[Update: On second thought, I’ll break the “assertions” and “postconditions” into two GotWs. This GotW has the assertion questions, slightly reordered for flow, and GotW #99 will cover postconditions.]

JG Questions

1. What is an assertion, and what is it used for?

2. C++20 supports two main assertion facilities:

  • assert
  • static_assert

For each one, briefly summarize how it works, when it is evaluated, and whether it is possible for the programmer to specify a message to be displayed if the assertion fails.

Guru Questions

3. If an assertion fails, what does that indicate, and who is responsible for fixing the failure? Refer to the following example assertion code in your answer.

void f() {
    int min = /* some computation */;
    int max = /* some other computation */;

    // still yawn more yawn computation

    assert (min <= max);         // A

    // ...
}

4. Are assertions primarily about checking at compile time, at test time, or at run time? Explain.

Notes

Thanks to Wikipedia for pointing out these references.

[1] H. H. Goldstine and J. von Neumann. “Planning and Coding of problems for an Electronic Computing Instrument” (Report on the Mathematical and Logical Aspects of an Electronic Computing Instrument, Part II, Volume I, p. 12; Institute for Advanced Study, April 1947).

[2] Alan Turing. “Checking a Large Routine” (Report of a Conference on High Speed Automatic Calculating Machines, pp. 67-9, June 1949).

Trip report: Summer ISO C++ standards meeting (Rapperswil)

On Saturday June 9, the ISO C++ committee completed its summer meeting in beautiful Rapperswil, Switzerland, hosted with thanks by HSR Rapperswil, Zühlke, Netcetera, Bbv, SNV, Crealogix, Meeting C++, and BMW Car IT GmbH. We had some 140 people at the meeting, representing 11 national bodies. As usual, we met for six days Monday through Saturday, this time including all evenings.

Per our C++20 schedule, this was the second-last meeting for merging major language features into C++20. So we gave priority to the major proposals that might make C++20 or otherwise could make solid progress, and we deferred other proposals to be considered at a future meeting — not at all as a comment on the proposals, but just for lack of time. We expect to get to these soon once C++20 is well in hand.

Top news: Contracts adopted for C++20

Contracts (Gabriel Dos Reis, J. Daniel Garcia, John Lakos, Alisdair Meredith, Nathan Myers, Bjarne Stroustrup) was formally adopted for C++20.

Contracts allow preconditions, postconditions, and assertions to be expressed in code using a uniform syntax, with options to have different contract levels, custom violation handlers, and more.

Here are a few quick examples to get the flavor. Let’s start with perhaps the most familiar contract, assert:

void f() {
    int x = g();
    int y = h();
    [[assert: x+y > 0]]
}

“But wait,” someone might say, “C’s assert macro was good enough for my grandpappy, so shouldn’t it be good enough for me?” Never fear, you can still assert(x), but if you respell it as [[assert:x]] you get some benefits. For example:

  • You’re not relying on a macro (unlike C assert). Yes, this matters, because macros are outside the language and routinely cause problems when used with language features. For example, this was demonstrated again just a few days ago on Twitter by Nico Josuttis (HT: Alisdair Meredith), who pointed out that assert(c==std::complex<float>{0,0}) does not compile; the reason is that macros don’t understand language commas, but [[assert: c==std::complex<float>{0,0}]] works just fine without any surprises.
  • You get to install your own violation handler and ship a release build with the option of turning on enforcement at run time.
  • You get to express audit to distinguish expensive checks to be run only when explicitly requested.
  • You get to express axiom contracts that are intended to never generate run-time code but are available to static analysis tools.
  • Finally, you will likely get better performance, because contracts should enable compilers to perform more optimizations, more easily, than expressing them using assertions.
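To make the macro-comma pitfall concrete, here is a minimal compilable sketch (the function name is mine, not from the article). The classic workaround with C assert is an extra pair of parentheses around the whole condition:

```cpp
#include <cassert>
#include <complex>

// The macro-comma pitfall and its classic workaround.
void check_zero(const std::complex<float>& c) {
    // assert(c == std::complex<float>{0, 0});   // does not compile: the
    //   preprocessor splits at the unparenthesized comma inside {0, 0},
    //   so the assert macro thinks it was passed two arguments
    assert((c == std::complex<float>{0, 0}));    // extra parens fix it
}
```

A language-level [[assert: …]] would need no such workaround, because the compiler, unlike the preprocessor, understands commas in expressions.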

But there’s more, because of course contracts are not just about assertions. They also include expects preconditions and ensures postconditions, which are part of the function declaration and so are visible at call sites:

double sqrt(double x) [[expects: x >= 0]];

void sort(vector<emp>& v) [[ensures audit: is_sorted(v)]];
    // could be expensive, check only when audit is requested

In addition to similar benefits as for assert, expects preconditions in particular deliver some enforcement benefits that are very desirable and difficult or impossible to get by hand:

  • Preconditions are usually enforced at the call site, which is what we want most of the time because a precondition violation always means a programming bug in the calling code.
  • But preconditions can also be enforced in the callee, which can sometimes be necessary for pragmatic reasons, such as when the function is invoked through an opaque function pointer.
  • Preconditions and postconditions that are known at the call site also give the optimizer more information to potentially make your code fast.
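As a rough sketch of what call-site enforcement conceptually means, here is the sqrt example hand-expanded with assert as a stand-in for the contract machinery (my_sqrt and call_my_sqrt are hypothetical names for illustration):

```cpp
#include <cassert>
#include <cmath>

// Standing in for: double my_sqrt(double x) [[expects: x >= 0]];
double my_sqrt(double x) { return std::sqrt(x); }

// Conceptually, call-site enforcement injects the precondition check
// into the calling code, just before the call itself:
double call_my_sqrt(double x) {
    assert(x >= 0);     // the injected [[expects]] check, in the caller
    return my_sqrt(x);  // the actual call
}
```

Because the check is in the calling code, a violation points directly at the buggy caller, which is exactly where a precondition violation should be reported.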

As a cheeky aside, if you noticed that I mentioned optimization several times, it’s for the usual reason: The simplest way to get C++ programmers to want a feature is to show that it can make their code faster, even if performance isn’t the only motivation or benefit.

Why contracts are a big deal, and related Rapperswil progress

In my opinion, contracts is the most impactful feature of C++20 so far, and arguably the most impactful feature we have added to C++ since C++11. That statement might surprise you, so let me elaborate why I think so.

Having first-class contracts support is the major “step 1” of reforming error handling in C++ and applying 30 years’ worth of learnings.

Step 2 is to (gradually) migrate std:: standard library precondition violations in particular from exceptions (or error codes) to contracts. The programming world now broadly recognizes that programming bugs (e.g., out-of-bounds access, null dereference, and in general all pre/post/assert-condition violations) cause a corrupted state that cannot be recovered from programmatically, and so they should never be reported to the calling code as exceptions or error codes that code could somehow handle. Over the past three meetings (Albuquerque through Rapperswil), the Library Evolution Working Group (LEWG) voted unanimously to pursue P0788R2 (Walter Brown) and relatedly in Rapperswil voted unanimously to pursue section 4.2 of P0709 (my paper), which includes pursuing a path of gradually migrating all standard library preconditions from exceptions toward contracts. (Note that in the near term, the idea is for implementations to be allowed, but not required, to use contracts. We are bringing the community forward gently here.)

Why is step 2 so important? Because it’s not just window dressing: Gradually switching precondition violations from exceptions to contracts promises to eventually remove a majority of all exceptions thrown by the standard library(!). This principle applies across programming languages; for example, in Java and .NET some 90% of all exceptions are thrown for precondition violations (e.g., ArgumentNullException). Being able to eliminate a majority of all exceptions, which eventually enables many more functions to be noexcept, is a huge improvement for both correctness and performance:

  • Correctness: It eliminates a majority of the invisible control flow paths in our code. For example, over 20 years ago I wrote GotW #20 [Sutter 1997] that shows how today a 4-line function has 3 normal execution paths and 20 invisible exceptional execution paths; if we can eliminate a majority of the functions that can throw, we immediately remove a majority of the invisible possible execution paths in functions like this one, in all calling code.
  • Performance: More noexcept enables more optimization and faster code. (You knew that was coming, right?)

Once you change all preconditions (and postconditions and assertions) from exceptions to contracts, eliminating some of the largest categories of exceptions, one specific kind of exception dominates all others: bad_alloc. Which brings us to step 3…

Step 3 is to consider handling heap exhaustion (out-of-memory, OOM) differently from other errors. If in addition to not throwing on precondition violations we also do not throw on OOM, the vast majority of all C++ standard library functions can be noexcept. — Needless to say, this would be a huge change where we need to tread carefully and measure impact and high-fidelity compatibility in major real code bases, and we don’t want people to panic about it or think we’ve lost our minds: We are serious about bringing code forward gently after validating good adoptability and low impact long before this gets near an actual standard.

Nevertheless, we are also serious about improving this, and the fundamental change is simple and already fully supported in C++ today: In Rapperswil, LEWG also voted unanimously to pursue section 4.3 of P0709 (my paper), which proposes a path of gradually migrating all OOM reporting from bad_alloc to new(nothrow)-like mechanisms. The initial contemplated step, definitely not for C++20, would be to change the default new_handler from throwing bad_alloc to calling terminate. That might sound like a big change, but it’s not: you can already change the new_handler to terminate today with a single line of code (std::set_new_handler([]{terminate();});). This would just change the default, and existing code that wants to keep the current behavior could write the reverse single line (std::set_new_handler([]{throw std::bad_alloc();});) to get exactly the current behavior.
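For flavor, the two one-line choices mentioned above look like this in context (a sketch; the wrapper function names are mine):

```cpp
#include <exception>
#include <new>

// Sketch: the contemplated future default, where heap exhaustion terminates
// instead of throwing.
void terminate_on_oom() {
    std::set_new_handler([] { std::terminate(); });
}

// Sketch: the single line that would restore today's throwing behavior
// exactly, for code that wants to keep handling bad_alloc.
void throw_on_oom() {
    std::set_new_handler([] { throw std::bad_alloc{}; });
}
```

Either way it is one line of setup; the proposal only contemplates changing which line is the default.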

To repeat, this is a feature that will want a clear high-fidelity zero-breakage migration path, and we’re treating that compatibility seriously, even as we are also treating solving this problem seriously to modernize C++ error handling and move toward a mostly-noexcept world.

You can find a more detailed writeup in my new proposal P0709, particularly sections 1.1, 4.2, and 4.3. Again, P0709 is not for C++20, it is to illustrate a direction and potential path. The other parts of P0709 have not yet been reviewed by the full committee, so for now they should not be treated as anything more than a proposal, subject to much discussion and feedback over the coming several years.

Other new features approved for C++20

We adopted several other new features into the draft standard.

Feature test macros (Ville Voutilainen, Jonathan Wakely). These enable code to portably test whether a certain new C++ feature exists. “Why would I do that?” someone might ask. The primary benefit is that they help your team start adopting new C++ features even before all your compilers support them; just write

#if /*new feature is present*/ // temporary
    nice clean modern code!    // <-- keep this long-term
#else                          // temporary
    do it the old way          // temporary
#endif                         // temporary

and eventually drop the #if test and the whole #else block as soon as all your compilers are up to date. This is so much better than today, where one of two things happens: (1) Teams often wait until all their compilers support a given new C++ feature before they start to use it, which delays adopting the feature and getting its benefits even on the compilers that do support it. Or, (2) teams roll their own nonstandard, nonportable, compiler-specific “approximate” feature tests (e.g., they hand-write a macro for “I know this feature is available on version ## of MSVC and version ## of GCC”).
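As a concrete (hypothetical) instance of the pattern above, here is the standard macro __cpp_structured_bindings used to adopt C++17 structured bindings where available; the function name is mine:

```cpp
#include <map>
#include <string>

// Use structured bindings if the compiler supports them, else fall back.
std::string join_keys(const std::map<std::string, int>& m) {
    std::string out;
#if defined(__cpp_structured_bindings)     // temporary
    for (const auto& [key, value] : m) {   // <-- keep this long-term
        (void)value;                       // only the key is needed here
        out += key;
        out += ';';
    }
#else                                      // temporary
    for (const auto& kv : m) {             // temporary
        out += kv.first;                   // temporary
        out += ';';                        // temporary
    }
#endif                                     // temporary
    return out;
}
```

Once every compiler in your build matrix defines __cpp_structured_bindings, the #else block and the #if/#endif lines can simply be deleted.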

We all agree we don’t like macros. However, we don’t yet have replacements for all uses of macros, including this one, and these standard macros are better than rolling your own nonstandard ones or, more likely, just not using new C++ features at all for a longer time.

Some experts still disagree, and we respect their views, but in my view these feature test macros are an important and pragmatic help to improve the speed of adoption of new standard C++ features.

Standard library concepts (Casey Carter, Eric Niebler). This is the first part of the Ranges TS to be merged into C++20 at an upcoming meeting, and contains the core concepts from the Ranges TS. It is also the first appearance of the concepts language feature in the standard library. Expect more to come for C++20 (see Ranges, below).

Class Types in Non-Type Template Parameters (Jeff Snyder, Louis Dionne). Yes, you can now have types other than int and char as non-type template parameters. For example, in addition to template<int Size>, you can now have things like template<fixed_string S> for a suitably defined class type. It turns out that this builds on the <=> spaceship comparison operator; and if you’re wondering why, it’s because we know the semantics of a defaulted <=> comparison, which is essential because the compiler has to perform comparisons to determine whether two template instantiations are the same.

Note: I did not have this benefit in mind at all when I proposed <=>. I think this is a great example where, when you let programmers express intent explicitly as with <=>, and provide useful defaults, you are inherently adding more reliable information to the source code and will reap additional benefits because you can now build on knowledge of that clear programmer intent.

explicit(bool) (Barry Revzin). This is conditional explicit, along the same lines as conditional noexcept. It lets library writers write explicit at a finer granularity, to turn off conversions where possible and feasible, without having to write two functions all the time.

Bit-casting object representations (JF Bastien). Header <bit> now lets you bit_cast. “But wait,” someone may be thinking, “don’t we already have reinterpret_cast?” Yes we do, and this is still a reinterpreting operation, but bit_cast is less unsafe and somewhat more flexible: it ensures the sizes of the From and To types match, guarantees that both are actually trivially copyable, and as a bonus makes the operation constexpr wherever possible.
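To see what the operation does, here is its core modeled with memcpy (a sketch; bits_of is my name, and the bit patterns in the comments assume IEEE-754 float, which all mainstream platforms use). std::bit_cast packages this up with compile-time checks and constexpr support:

```cpp
#include <cstdint>
#include <cstring>

// bit_cast's core operation, modeled with memcpy. std::bit_cast additionally
// verifies at compile time that the sizes match and that both types are
// trivially copyable, and can be used in constant expressions.
std::uint32_t bits_of(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "the size check bit_cast would do for us");
    std::uint32_t u = 0;
    std::memcpy(&u, &f, sizeof u);  // the reinterpretation itself
    return u;                       // e.g., bits_of(1.0f) is 0x3f800000 on IEEE-754
}
```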

Speaking of constexpr bit_cast, here are some more items in the “constexpr all the things!” department:

Other progress and decisions

Reflection TS is feature-complete: The Reflection TS was declared feature-complete and is being sent out for its main comment ballot over the summer. Note again that the TS’s current template metaprogramming-based syntax is just a placeholder; the feedback being requested is on the core “guts” of the design, and the committee already knows it intends to replace the surface syntax with a simpler programming model that uses ordinary compile-time code and not <>-style metaprogramming.

Parallelism TS2 is done: The second Parallelism TS is done! We completed ballot comment resolution, and approved the final TS for publication. This TS includes more execution policies, an exception_list type to communicate multiple parallel exceptions, and more parallel algorithms including wavefront, reductions, inductions, parallel for, and more. It also includes task_block support to enable writing custom parallel algorithms.

Graphics (io2d) is deferred: Many thanks to Michael McLaughlin especially, and to Guy Davidson, Michael Kazakov, and David Ludwig for their assistance with the design, specification, and implementation of this cross-platform library. This is a project that has been worked on for several years, and for the past two years has been primarily responding to committee tweak requests. Unfortunately, although the committee requested a series of updates to the proposal, all of which were applied, at this meeting the committee decided that it is not interested in pursuing further work on graphics in any form at this time after all. However, the io2d project will continue and be available on package managers (Conan, vcpkg), and we expect a renewed proposal in the medium term; in the meantime, watch Guy Davidson’s blog for news and updates.

Also, LEWG adopted P0921r2 as a statement of their direction for future C++ standard library evolution compatibility guarantees.

Updates on other major proposals

Here are other major items in progress. You’ll notice that the first six (6!) of them mention expectations for our next meeting this November in San Diego. Not all of those items will make C++20 in San Diego, but people are going to try. It’s not surprising that San Diego is going to be a busy meeting, though; that was expected, because it’s the last meeting to merge major features into C++20, and deadlines are famously motivational. — Just do not expect all of the following to make C++20, and I’m listing them in rough order starting with the most likely to make it in.

(very likely for C++20) Ranges: In my previous trip report I mentioned that the core concepts from the Ranges TS were expected to make C++20, but the rest of the Ranges TS would be “C++20 or C++23.” Since then we have made faster-than-expected progress, and it now looks like Ranges is “likely” to make C++20 in the next meeting or two. For those interested in details: in addition to all of the Ranges TS, paper P0789 on Range Adaptors and Utilities has now also progressed to detailed wording review and is targeting C++20. In sum, to quote Eric Niebler: “If you liked the Ranges TS, you’ll love C++20.”

(likely for C++20) Concepts: “convenience” notation for constrained templates: We already added the concepts core feature to C++20, and at this meeting we had further discussions on adding a convenience syntax to write constrained templates without resorting to the low-level “requires” keyword. The two major active proposals that received discussion were from Bjarne Stroustrup and from me. The good news is that the Evolution Working Group (EWG) liked both, which means that for the first time we have a proposal based on the TS syntax (Bjarne’s preference) that could get consensus to be approved!

The key people are continuing to work on a merged proposal that might be adoptable for C++20 in November in San Diego, and I’m pleased to report that as of this post-meeting mailing we have, for the first time, a unified proposal that lists most of the previous authors of papers in this area as coauthors; you can find it here: P1141R0: “Yet another approach for constrained declarations.” I’m guardedly optimistic that we may have a winner here; we’ll know in San Diego. (I sometimes delay my trip report until the post-meeting mailing is available so that everyone can see the latest papers, and knowing that this new paper was coming was one reason I delayed this report.)

(maybe for C++20) Coroutines: EWG considered an alternative, then decided to go forward with the TS approach. That came up for full committee vote but fell just short and was not adopted for C++20; the proposers will continue to work on improving consensus over the summer by addressing remaining concerns and we expect coroutines to be proposed again for C++20 at our November meeting in San Diego.

Modules: For the first time, the committee saw a merged approach that both major proposers said satisfies their requirements; that announcement was met by applause in the room. The merged proposal aims to combine the “best of” the Modules TS and the Atom alternative proposal, and that direction was approved by EWG. EWG did not approve the poll to incorporate a subset of it into C++20 at this meeting; it is not yet clear whether part of the proposal can make C++20 but we are going to have an extra two-day meeting in September to help make progress and we expect it to be proposed again for C++20 at our November meeting in San Diego.

Executors: This is still not expected to be part of C++20, but key people have not given up and are trying to make it happen, and one never can tell. We are going to hold an extra two-day meeting in September to help make progress on Executors, and expect to have a lively discussion about options to merge all or parts of it into C++20 in November in San Diego.

Networking: This is pretty much ready except that it depends on Executors. Soon after Executors are ready for C++20, Networking is ready to slide in right behind it.

Clearly San Diego is going to be a busy meeting. But before then we have two extra design meetings on modules and executors to help improve their chances of progress; those will be co-located with CppCon in September, to take place near the CppCon site on the days just before the conference begins. On top of that, there will also be an extra library wording issues meeting in the Chicago area in August… all in all, it’ll be a full summer and fall before we even get to San Diego.

Additionally, SG12 had productive discussions about undefined behavior (including with participation from our sister ISO working group WG23, Programming Language Vulnerabilities), and SG15 had a second exploratory evening session focusing on package managers for C++.

What’s next

Here is a cheat-sheet summary of our current expectations for some of the major pieces of work. Note that, as always, this is an estimate only. The bolded parts are the major changes from last time, including that Ranges as a whole is looking very likely for C++20.

[Image: wg21-schedule-2018-06]

And here is an updated snapshot of where we are on the timeline for C++20 and the TSes that are completed, in flight, or expected to begin:

[Image: wg21-timeline-2018-06]

Thank you again to the approximately 140 experts who attended this meeting, and the many more who participate in standardization through their national bodies! Have a good summer… we look forward now to our next “extra” meetings in September (Bellevue, WA, USA) and the next regular WG21 meeting in November (San Diego, CA, USA).

Interview: On simplifying C++

I was also interviewed recently by Anastasia Kazakova for the CLion blog, and that interview is now live:

Toward a more powerful and simpler C++ with Herb Sutter

Topics include:

  • Concepts and modules (and coroutines) as the true hot topics right now
  • How my work on metaclasses was motivated and developed
  • Obligatory aside on operator<=> which grew out of the same work
  • Good and bad ways to learn from other languages and their experience
  • What are the next questions to be answered for metaclasses proposal
  • What has been the committee’s feedback so far
  • How can we expect to see reflection, compile-time code, injection, and metaclasses both progress in committee and get built into production compilers
  • How toolable are today’s C++11/14/17 features, and what about toolability for metaclasses