GotW #91: Smart Pointer Parameters

NOTE: Last year, I posted three new GotWs numbered #103-105. I decided leaving a gap in the numbers wasn’t best after all, so I am renumbering them to #89-91 to continue the sequence. Here is the updated version of what was GotW #105.

How should you prefer to pass smart pointers, and why?

Problem

JG Question

1. What are the performance implications of the following function declaration? Explain.

void f( shared_ptr<widget> );

Guru Questions

2. What are the correctness implications of the function declaration in #1? Explain with clear examples.

3. A colleague is writing a function f that takes an existing object of type widget as a required input-only parameter, and trying to decide among the following basic ways to take the parameter (omitting const):

void f( widget* );              (a)
void f( widget& );              (b)
void f( unique_ptr<widget> );   (c)
void f( unique_ptr<widget>& );  (d)
void f( shared_ptr<widget> );   (e)
void f( shared_ptr<widget>& );  (f)

Under what circumstances is each appropriate? Explain your answer, including where const should or should not be added anywhere in the parameter type.

(There are other ways to pass the parameter, but we will consider only the ones shown above.)

GotW #90 Solution: Factories

NOTE: Last year, I posted three new GotWs numbered #103-105. I decided leaving a gap in the numbers wasn’t best after all, so I am renumbering them to #89-91 to continue the sequence. Here is the updated version of what was GotW #104.

What should factory functions return, and why?

Problem

While spelunking through the code of a new project you recently joined, you find the following factory function declaration:

widget* load_widget( widget::id desired );

JG Question

1. What’s wrong with this return type?

Guru Questions

2. Assuming that widget is a polymorphic type, what is the recommended return type? Explain your answer, including any tradeoffs.

3. You’d like to actually change the return type to match the recommendation in #2, but at first you worry about breaking source compatibility with existing calling code; recompiling existing callers is fine, but having to go change them all is not. Then you have an “aha!” moment, realizing that this is a fairly new project and all of your calling code is written using modern C++ idioms, and you go ahead and change the return type without fear, knowing it will require few or no code changes to callers. What makes you so confident?

4. If widget is not a polymorphic type, what is the recommended return type? Explain.

Solution

1. What’s wrong with this return type?

First, what can we know or reasonably expect from the two-line problem description?

We’re told load_widget is a factory function. It produces an object by “loading” it in some way and then returning it to the caller. Since the return type is a pointer, the result might be null.

The caller will reasonably expect to use the object, whether by calling member functions on it or passing it to other functions or in some other way. That isn’t safe unless the caller owns the object to ensure it is alive—either the caller gets exclusive ownership, or it gets shared ownership if the factory also maintains an internal strong or weak reference.

Because the caller has or shares ownership, he has to do something when the object is no longer needed. If the ownership is exclusive, he should destroy the object somehow. Otherwise, if the ownership is shared, he should decrement some shared reference count.

Unfortunately, returning a widget* has two major problems. First, it’s unsafe by default, because the default mode of operation (i.e., when the caller writes whitespace) is to leak a widget:

// Example 1: Leak by default. Really, this is just so 20th-century...
// 
widget* load_widget( widget::id desired );

:::

load_widget( some_id ); // oops

The Example 1 code compiles cleanly, runs, and (un)happily leaks the widget.

Guideline: Don’t use explicit new, delete, and owning * pointers, except in rare cases encapsulated inside the implementation of a low-level data structure.

Second, the signature conveys absolutely no information other than “a widget? sure, here you go! enjoy.” The documentation may state that (and how) the caller is to own the object, but the function declaration doesn’t—it’s either exclusive ownership or shared ownership, but which? Read and remember the function’s documentation, because the function declaration isn’t telling. The signature alone doesn’t even say whether the caller shares in ownership at all.

2. Assuming that widget is a polymorphic type, what is the recommended return type? Explain your answer, including any tradeoffs.

If widget is a type that’s intended to be used polymorphically and held by pointer or reference, the factory should normally return a unique_ptr to transfer ownership to the caller, or a shared_ptr if the factory shares ownership by retaining a strong reference inside its internal data structures.

Guideline: A factory that produces a reference type should return a unique_ptr by default, or a shared_ptr if ownership is to be shared with the factory.

This solves both problems: safety, and self-documentation.

First, consider how this immediately solves the safety problem in Example 1:

// Example 2: Clean up by default. Much better...
// 
unique_ptr<widget> load_widget( widget::id desired );

:::

load_widget( some_id ); // cleans up

The Example 2 code compiles cleanly, runs, and happily cleans up the widget. But it’s not just correct by default—it’s correct by construction, because there’s no way to make a mistake that results in a leak.

Aside: Someone might say, “but can’t someone still write load_widget(some_id).release()?” Of course they can, if they’re pathological; the correct answer is, “don’t do that.” Remember, our concern is to protect against Murphy, not Machiavelli—against bugs and mistakes, not deliberate crimes—and such pathological abuses fall into the latter category. That’s no different from, and doesn’t violate type safety any more than, explicitly calling Dispose early in a C# using block or explicitly calling close early in a Java try-with-resources block.

What if the cleanup should be done by something other than a plain delete call? Easy: Just use a custom deleter. The icing on the cake is that the factory itself knows which deleter is appropriate and can state it at the time it constructs the return value; the caller doesn’t need to worry about it, especially if he takes the result using an auto variable.

Second, this is self-documenting: A function that returns a unique_ptr or a value clearly documents that it’s a pure “source” function, and one that returns a shared_ptr clearly documents that it’s returning shared ownership and/or observation.

Finally, why prefer unique_ptr by default, if you don’t need to express shared ownership? Because it’s the Right Thing To Do for both performance and correctness, as noted in GotW #89, and leaves all options open to the caller:

Returning unique_ptr expresses returning unique ownership, which is the norm for a pure “source” factory function.
unique_ptr can’t be beat in efficiency—moving one is about as cheap as moving/copying a raw pointer.
If the caller wants to manage the produced object’s lifetime via shared_ptr, they can easily convert to shared_ptr via an implicit move operation—no need to say std::move because the compiler already knows that the returned value is a temporary object.
If the caller is using any other arbitrary method of maintaining the object’s lifetime, he can convert to a custom smart pointer or other lifetime management scheme simply by calling .release(). This can be useful, and isn’t possible with shared_ptr.

Here it is in action:

// Example 2, continued
//

// Accept as a unique_ptr (by default)
auto up = load_widget(1);

// Accept as a shared_ptr (if desired)
auto sp = shared_ptr<widget>{ load_widget(2) };

// Accept as your own smart pointer (if desired)
auto msp = my::smart_ptr<widget>{ load_widget(3).release() };

Of course, if the factory retains some shared ownership or observation, whether via an internal shared_ptr or weak_ptr, return shared_ptr. The caller will be forced to continue using it as a shared_ptr, but in that case that’s appropriate.

3. […] you go ahead and change the return type without fear, knowing it will require few or no code changes to callers. What makes you so confident?

Modern portable C++ code uses unique_ptr, shared_ptr, and auto. Returning unique_ptr works with all three; returning shared_ptr works with the last two.

If the caller accepts the return value in an auto variables, such as auto w = load_widget(whatever);, then the type will just naturally be correct, normal dereferencing will just work, and the only source ripple will be if the caller tries to explicitly delete (if so, the delete line can rather appropriately be deleted) or tries to store into a non-local object of a different type.

Guideline: Prefer declaring variables using auto. It’s shorter, and helps to insulate your code from needless source ripples due to minor type changes.

Otherwise, if the caller isn’t using auto, then it’s likely already using the result to initialize a unique_ptr or shared_ptr because modern C++ calling code does not traffick in raw pointers for non-parameter variables (more on this next time). In either case, returning a unique_ptr just works: A unique_ptr can be seamlessly moved into either of those types, and if the semantics are to return shared ownership and then the caller should already be using a shared_ptr and things will again work just fine (only probably better than before, because for the original return by raw pointer to work correctly the return type was probably forced to jump through the enable_shared_from_this hoop, which isn’t needed if we just return a shared_ptr explicitly).

4. If widget is not a polymorphic type, what is the recommended return type? Explain.

If widget is not a polymorphic type, which typically means it’s a copyable value type or a move-only type, the factory should return a widget by value. But what kind of value?

In C++98, programmers would often resort to returning a large object by pointer just to avoid the penalty of copying its state:

// Example 4(a): Obsolete convention: return a * just to avoid a copy 
//
/*BAD*/ vector<gadget>* load_gadgets() {
    vector<gadget>* ret = new vector<gadget>();
    // ... populate *ret ...
    return ret;
}

// Obsolete calling code (note: NOT exception-safe) 
vector<gadget>* p = load_gadgets();
if(p) use(*p);
delete p;

This has all of the usability and fragility problems discussed in #1. Today, normally we should just return by value, because we will incur only a cheap move operation, not a deep copy, to hand the result to the caller:

// Example 4(b): Default recommendation: return the value
//
vector<gadget> load_gadgets() {
    vector<gadget> ret;
    // ... populate ret ...
    return ret;
}

// Calling code (exception-safe)
auto v = load_gadgets();
use(v);

Most of the time, return movable objects by value. That’s all there is to it, if the only reason for the pointer on the return type was to avoid the copy.

There could be one additional reason the function might have returned a pointer, namely to return nullptr to indicate failure to produce an object. Normally it’s better throw an exception to report an error if we fail to load the widget. However, if not being able to load the widget is normal operation and should not be considered an error, return an optional<widget>, and probably make the factory noexcept if no other kinds of errors need to be reported than are communicated well by returning an empty optional<widget>.

// Example 4(c): Alternative if not returning an object is normal
//
optional<vector<gadget>> load_gadgets() noexcept {
    vector<gadget> ret;
    // ... populate ret ...
    if( success )            // return vector (might be empty)
        return move(ret);    // note: move() here to avoid a silent copy
    else
        return {};           // not returning anything
}

// Calling code (exception-safe)
auto v = load_gadgets();
if(v) use(*v);

Guideline: A factory that produces a non-reference type should return a value by default, and throw an exception if it fails to create the object. If not creating the object can be a normal result, return an optional<> value.

Coda

By the way, see that test for if(v) in the last line of Example 4(c)? It calls a cool function of optional<T>, namely operator bool. What makes bool so cool? In part because of how many C++ features it exercises. Here is its declaration… just think of what this lets you safely do, including at compile time! Enjoy.

constexpr explicit optional<T>::operator bool() const noexcept;

Acknowledgments

Thanks in particular to the following for their feedback to improve this article: Johannes Schaub, Leo, Vincent Jacquet.

GotW #90: Factories

NOTE: Last year, I posted three new GotWs numbered #103-105. I decided leaving a gap in the numbers wasn’t best after all, so I am renumbering them to #89-91 to continue the sequence. Here is the updated version of what was GotW #104.

What should factory functions return, and why?

Problem

While spelunking through the code of a new project you recently joined, you find the following factory function declaration:

widget* load_widget( widget::id desired );

JG Question

1. What’s wrong with this return type?

Guru Questions

2. Assuming that widget is a polymorphic type, what is the recommended return type? Explain your answer, including any tradeoffs.

4. If widget is not a polymorphic type, what is the recommended return type? Explain.

GotW #89 Solution: Smart Pointers

NOTE: Last year, I posted three new GotWs numbered #103-105. I decided leaving a gap in the numbers wasn’t best after all, so I am renumbering them to #89-91 to continue the sequence. Here is the updated version of what was GotW #103.

There’s a lot to love about standard smart pointers in general, and unique_ptr in particular.

Problem

JG Question

1. When should you use shared_ptr vs. unique_ptr? List as many considerations as you can.

Guru Question

2. Why should you almost always use make_shared to create an object to be owned by shared_ptrs? Explain.

3. Why should you almost always use make_unique to create an object to be initially owned by a unique_ptr? Explain.

4. What’s the deal with auto_ptr?

Solution

1. When should you use shared_ptr vs. unique_ptr?

When in doubt, prefer unique_ptr by default, and you can always later move-convert to shared_ptr if you need it. If you do know from the start you need shared ownership, however, go directly to shared_ptr via make_shared (see #2 below).

There are three major reasons to say “when in doubt, prefer unique_ptr.”

First, use the simplest semantics that are sufficient: Choose the right smart pointer to most directly express your intent, and what you need (now). If you are creating a new object and don’t know that you’ll eventually need shared ownership, use unique_ptr which expresses unique ownership. You can still put it in a container (e.g., vector<unique_ptr<widget>>) and do most other things you want to do with a raw pointer, only safely. If you later need shared ownership, you can always move-convert the unique_ptr to a shared_ptr.

Second, a unique_ptr is more efficient than a shared_ptr. A unique_ptr doesn’t need to maintain reference count information and a control block under the covers, and is designed to be just about as cheap to move and use as a raw pointer. When you don’t ask for more than you need, you don’t incur overheads you won’t use.

Third, starting with unique_ptr is more flexible and keeps your options open. If you start with a unique_ptr, you can always later convert to a shared_ptr via move, or to another custom smart pointer (or even to a raw pointer) via .get() or .release().

Guideline: Prefer to use the standard smart pointers, unique_ptr by default and shared_ptr if sharing is needed. They are the common types that all C++ libraries can understand. Use other smart pointer types only if necessary for interoperability with other libraries, or when necessary for custom behavior you can’t achieve with deleters and allocators on the standard pointers.

2. Why should you almost always use make_shared to create an object to be owned by shared_ptrs? Explain.

Note: If you need to create an object using a custom allocator, which is rare, you can use allocate_shared. Note that even though its name is slightly different, allocate_shared should be viewed as “just the flavor of make_shared that lets you specify an allocator,” so I’m mainly going to talk about them both as make_shared here and not distinguish much between them.

There are two main cases where you can’t use make_shared (or allocate_shared) to create an object that you know will be owned by shared_ptrs: (a) if you need a custom deleter, such as because of using shared_ptrs to manage a non-memory resource or an object allocated in a nonstandard memory area, you can’t use make_shared because it doesn’t support specifying a deleter; and (b) if you are adopting a raw pointer to an object being handed to you from other (usually legacy) code, you would construct a shared_ptr from that raw pointer directly.

Guideline: Use make_shared (or, if you need a custom allocator, allocate_shared) to create an object you know will be owned by shared_ptrs, unless you need a custom deleter or are adopting a raw pointer from elsewhere.

So, why use make_shared (or, if you need a custom allocator, allocate_shared) whenever you can, which is nearly always? There are two main reasons: simplicity, and efficiency.

First, with make_shared the code is simpler. Write for clarity and correctness first.

Second, using make_shared is more efficient. The shared_ptr implementation has to maintain housekeeping information in a control block shared by all shared_ptrs and weak_ptrs referring to a given object. In particular, that housekeeping information has to include not just one but two reference counts:

A “strong reference” count to track the number of shared_ptrs currently keeping the object alive. The shared object is destroyed (and possibly deallocated) when the last strong reference goes away.
A “weak reference” count to track the number of weak_ptrs currently observing the object. The shared housekeeping control block is destroyed and deallocated (and the shared object is deallocated if it was not already) when the last weak reference goes away.

If you allocate the object separately via a raw new expression, then pass it to a shared_ptr, the shared_ptr implementation has no alternative but to allocate the control block separately, as shown in Example 2(a) and Figure 2(a).

// Example 2(a): Separate allocation
auto sp1 = shared_ptr<widget>{ new widget{} };
auto sp2 = sp1;

Figure 2(a): Approximate memory layout for Example 2(a).

We’d like to avoid doing two separate allocations here. If you use make_shared to allocate the object and the shared_ptr all in one go, then the implementation can fold them together in a single allocation, as shown in Example 2(b) and Figure 2(b).

// Example 2(b): Single allocation
auto sp1 = make_shared<widget>();
auto sp2 = sp1;

Figure 2(b): Approximate memory layout for Example 2(b).

Note that combining the allocations has two major advantages:

It reduces allocation overhead, including memory fragmentation. First, the most obvious way it does this is by reducing the number of allocation requests, which are typically more expensive operations. This also helps reduce contention on allocators (some allocators don’t scale well). Second, using only one chunk of memory instead of two reduces the per-allocation overhead. Whenever you ask for a chunk of memory, the system must give you at least that many bytes, and often gives you a few more because of using fixed-size pools or tacking on housekeeping information per allocation. So by using a single chunk of memory, we tend to reduce the total extra overhead. Finally, we also naturally reduce the number of “dead” extra in-between gaps that cause fragmentation.
It improves locality. The reference counts are frequently used with the object, and for small objects are likely to be on the same cache line, which improves cache performance (as long as there isn’t some thread copying the smart pointer in a tight loop; don’t do that).

As always, when you can express more of what you’re trying to achieve as a single function call, you’re giving the system a better chance to figure out a way to do the job more efficiently. This is just as true when inserting 100 elements into a vector using a single range-insert call to v.insert( first, last ) instead of 100 calls to v.insert( value ) as it is when using a single call to make_shared instead of separate calls to new widget() and shared_ptr( widget* ).

There are two more advantages: Using make_shared avoids explicit new and avoids an exception safety issue. Both of these also apply to make_unique, so we’ll cover them under #3.

3. Why should you almost always use make_unique to create an object to be initially owned by a unique_ptr? Explain.

As with make_shared, there are two main cases where you can’t use make_unique to create an object that you know will be owned (at least initially) by a unique_ptr: if you need a custom deleter, or if you are adopting a raw pointer.

Otherwise, which is nearly always, prefer make_unique.

Guideline: Use make_unique to create an object that isn’t shared (at least not yet), unless you need a custom deleter or are adopting a raw pointer from elsewhere.

Besides symmetry with make_shared, make_unique offers at least two other advantages. First, you should prefer use make_unique<T>() instead of the more-verbose unique_ptr<T>{ new T{} } because you should avoid explicit new in general:

Guideline: Don’t use explicit new, delete, and owning * pointers, except in rare cases encapsulated inside the implementation of a low-level data structure.

Second, it avoids some known exception safety issues with naked new. Here’s an example:

void sink( unique_ptr<widget>, unique_ptr<gadget> );

sink( unique_ptr<widget>{new widget{}},
      unique_ptr<gadget>{new gadget{}} ); // Q1: do you see the problem?

Briefly, if you allocate and construct the new widget first, then get an exception while allocating or constructing the new gadget, the widget is leaked. You might think: “Well, I could just change new widget{} to make_unique<widget>() and this problem would go away, right?” To wit:

sink( make_unique<widget>(),
      unique_ptr<gadget>{new gadget{}} );         // Q2: is this better?

The answer is no, because C++ leaves the order of evaluation of function arguments unspecified and so either the new widget or the new gadget could be performed first. If the new gadget is allocated and constructed first, then the make_unique<widget> throws, we have the same problem.

But while just changing one of the arguments to use make_unique doesn’t close the hole, changing them both to make_unique really does completely eliminate the problem:

sink( make_unique<widget>(), make_unique<gadget>() );  // exception-safe

This exception safety issue is covered in more detail in GotW #56.

Guideline: To allocate an object, prefer to write make_unique by default, and write make_shared when you know the object’s lifetime is going to be managed by using shared_ptrs.

4. What’s the deal with auto_ptr?

auto_ptr is most charitably characterized as a valiant attempt to create a unique_ptr before C++ had move semantics. auto_ptr is now deprecated, and should not be used in new code.

If you have auto_ptr in an existing code base, when you get a chance try doing a global search-and-replace of auto_ptr to unique_ptr; the vast majority of uses will work the same, and it might expose (as a compile-time error) or fix (silently) a bug or two you didn’t know you had.

Acknowledgments

Thanks in particular to the following for their feedback to improve this article: celeborn2bealive, Andy Prowl, Chris Vine, Marek.

GotW #89: Smart Pointers

NOTE: Last year, I posted three new GotWs numbered #103-105. I decided leaving a gap in the numbers wasn’t best after all, so I am renumbering them to #89-91 to continue the sequence. Here is the updated version of what was GotW #103.

There’s a lot to love about standard smart pointers in general, and unique_ptr in particular.

Problem

JG Question

1. When should you use shared_ptr vs. unique_ptr? List as many considerations as you can.

Guru Question

2. Why should you almost always use make_shared to create an object to be owned by shared_ptrs? Explain.

3. Why should you almost always use make_unique to create an object to be initially owned by a unique_ptr? Explain.

4. What’s the deal with auto_ptr?

GotW #6b Solution: Const-Correctness, Part 2

const and mutable are powerful tools for writing safer code. Use them consistently.

Problem

Guru Question

In the following code, add or remove const (including minor variants and related keywords) wherever appropriate. Note: Don’t comment on or change the structure of this program. It’s contrived and condensed for illustration only.

For bonus points: In what places are the program’s results undefined or uncompilable due to const errors?

class polygon {
public:
    polygon() : area{-1} {}

    void add_point( const point pt ) { area = -1;
                                       points.push_back(pt); }

    point get_point( const int i ) { return points[i]; }

    int get_num_points() { return points.size(); }

    double get_area() {
        if( area < 0 )   // if not yet calculated and cached
            calc_area();     // calculate now
        return area;
    }

private:
    void calc_area() {
        area = 0;
        vector<point>::iterator i;
        for( i = begin(points); i != end(points); ++i )
            area += /* some work using *i */;
    }

    vector<point> points;
    double        area;
};

polygon operator+( polygon& lhs, polygon& rhs ) {
    auto ret = lhs;
    auto last = rhs.get_num_points();
    for( auto i = 0; i < last; ++i ) // concatenate
        ret.add_point( rhs.get_point(i) );
    return ret;
}

void f( const polygon& poly ) {
    const_cast<polygon&>(poly).add_point( {0,0} );
}

void g( polygon& const poly ) { poly.add_point( {1,1} ); }

void h( polygon* const poly ) { poly->add_point( {2,2} ); }

int main() {
    polygon poly;
    const polygon cpoly;

    f(poly);
    f(cpoly);
    g(poly);
    h(&poly);
}

Solution

When I pose this kind of problem, I find that most people think the problem is on the easy side and address only the more-usual const issues. There are, however, subtleties that are worth knowing, and hence this Item.

1. The point object is passed by value, so there is little benefit to declaring it const.

    void  add_point( const point pt )

In this particular case, because the function is defined inline, the const value parameter can make sense. This is because for inline functions the declaration and definition are the same. Otherwise, const value parameters should appear only on the definition, not on the declaration. Let’s see why.

Putting the const on a value parameter in a function declaration is irrelevant outside the function—it makes no difference to the caller and can only confuse readers. To the compiler, the function signature is the same whether you include this const in front of a value parameter or not. For example:

// value parameter: top-level const is not part of function signature
int f( int );
int f( const int );    // redeclares f(int): this is the same function

// non-value parameter: top-level const is part of function signature
int g( int& );
int g( const int& );   // overloads g(int&): these are two functions

But putting the const on the value parameter does make a difference to how it can be used inside the function’s actual definition. Remember that, inside a function, the parameters are just the first set of local variables, so putting a const on a value parameter simply means that the function can’t modify its local variable, which only happens to be a parameter. Here’s an example that declares and then defines the same function f:

int f( int );          // declaration: no const

int f( const int i ) { // definition: use const to express "read-only"

    vector<int> v;
    v.push_back(i);    // ok, only reads from i

    i = 42;            // error, attempts to modify i

}

Guideline: Consider not writing const on pass-by-value function parameters when only forward-declaring a function. You can always add it on the definition to express a read-only parameter.

2. get_point and get_num_points should be const.

    point get_point( const int i ) { return points[i]; }

    int   get_num_points() { return points.size(); }

These functions should be marked const, because they don’t change the state of the object.

3. get_area should be const.

    double get_area() {
        if( area < 0 )   // if not yet calculated and cached
            calc_area();     // calculate now
        return area;
    }

Even though this function modifies the object’s internal state, we should consider making it const. Why? Because this function does not modify the object’s observable state; we are doing some caching here, but that’s an internal implementation detail and the object is logically const even if it isn’t physically const.

4. Therefore calc_area should also be const.

    void calc_area() {
        area = 0;
        vector<point>::iterator i;
        for( i = begin(points); i != end(points); ++i )
            area += /* some work using *i */;
    }

Once we make get_area be const, this private helper function ought also to be const.

In turn, once you make this function const, the compiler will tell you that you also need to do something about the member variable area, which should be:

declared mutable, so that it’s writable in a const member function; and
synchronized using a mutex or made atomic<>, so that it’s concurrency-safe, as discussed in GotW #6a.

5. Also, calc_area should use a const_iterator.

The iterator should not change the state of the points collection, and so it ought to be a const_iterator. We’re now forced to make this change anyway if we’re making calc_area be a const member function, but note that if we had said auto for the iterator’s type we wouldn’t have had to make any change at all. While we’re at it, the for loop inside calc_area: It should prefer to use the range-based for loop, as well as auto.

Combining all that, we get this simpler and const-correct code:

      for( auto& pt : points )
          area += /* some work using pt */;

Guideline: Prefer declaring variables using auto.

Guideline: Prefer range-based for loops to naked iterator-incrementing for loops when visiting the elements of the range in order.

6. area should be mutable and synchronized.

    double        area;

As noted already, in conjunction with the other changes the internal cache variable will area now want to be mutable so that it can be used correctly and safely inside const member functions, and because it is now a shared variable potentially used by multiple concurrent const operations it must also be synchronized—protected with a mutex or made atomic.

Bonus Question: Before reading on, which should it be: Protected by a mutex? or made atomic<double>?

Have you thought about it? All right, let’s continue…

Both work, but a mutex is usually overkill for a single variable.

Option 1 is to use a mutex in the perhaps-soon-to-be-canonical “mutable mutex mutables” pattern:

// Option 1: Use a mutex 

    double get_area() const {
        auto lock = unique_lock<mutex>{mutables};
        if( area < 0 )   // if not yet calculated and cached
            calc_area();     // calculate now
        return area;
    }

private:
    // ...
    mutable mutex  mutables;      // canonical pattern: mutex that
    mutable double area;          //   covers all mutable members

Option 1 generalizes well if you add more data members in the future. However, it’s also more invasive and generalizes less well if you add more const member functions in the future that use area, because they will all have to remember to acquire a lock on the mutex before using area.

Option 2 is to just change double to mutable atomic<double>. This is attractive because the “mutable part” of polygon is just a single variable. That can work, but you have to be careful because that’s not the only necessary change, for two reasons:

The minor reason is that atomic<double> doesn’t support +=, so if we only change area‘s type then calc_area will no longer compile. That can be worked around, but leads us to the major reason…
The major reason is that, because calc_area is a compound operation and must be safe to run on multiple threads concurrently, we must restructure calc_area to be safe to run concurrently. In particular it should not perform intermediate updates to area, and should ensure that multiple competing concurrent updates to area don’t cause overwrites that lose written values.

There are several ways to do it, but the simplest is probably to allow benign redundant recalculations in the case of concurrent calls to calc_area, on the grounds that it’s probably no worse than blocking the concurrent calls which would have to wait anyway.

// Option 2: Use an atomic

    void calc_area() const {
        auto tmp = 0.0;              // do all the work off to the side
        for( auto& pt : points )
            tmp += /* some work using pt */;
        area = tmp;                  // then commit with a single write
    }

private:
    // ...
    mutable atomic<double> area;

Notice that concurrent const operations that call to calc_area can still overlap and overwrite each other’s results, but that’s benign because they’re concurrent const operations so they will all calculate the same value. Also, concurrent calc_area calls use the shared points variable in a loop which should make us mentally check that it doesn’t cause cache contention, but because they’re all readers it won’t and so this too is fine.

7. operator+’s rhs parameter should be a reference to const.

polygon operator+( polygon& lhs, polygon& rhs ) {

The rhs parameter should be passed by reference to const, of course.

Guideline: Prefer passing a read-only parameter by const& if you are only going to read from it (not make a copy of it).

“But wait!” I can just hear some of you saying, “you forgot about lhs! Shouldn’t it be const& too?” Actually, not so much:

8. operator+’s lhs parameter should be passed by value.

The key point is that we’re going to copy from it anyway, in this case immediately:

    auto ret = lhs;

When you’re in the special case of “read-only parameter that you’re going to take copy of anyway,” there are several ways to accept the parameter, which I’ll cover in detail in another GotW. For now, suffice it to say that usually you shouldn’t overthink these options, and just use pass-by-value as the simplest method, which offers some advantages that we also touched on in GotW #4:

If the caller passes a named polygon object (an lvalue), there’s no difference. Both pass-by-const& followed by an explicit copy and pass-by-value will perform one copy.
If the caller passes a temporary polygon object (an rvalue), the compiler automatically move-constructs lhs from that, which probably makes no difference for a small type like polygon but can be considerably cheaper for many types.

Guideline: Prefer passing a read-only parameter by value if you’re going to make a copy of the parameter anyway, because it enables move from rvalue arguments.

9. Also in operator+, last should be const.

    auto last = rhs.get_num_points();
    for( auto i = 0; i < last; ++i ) // concatenate
        ret.add_point( rhs.get_point(i) );
    return ret;
}

Since last should never change, prefer to say so by making it const.

Guideline: Prefer to make variables, including locals, const if they should not change.

Incidentally, notice that once we make rhs a reference-to-const parameter as noted above, we see another reason why get_point should be a const member function.

10. f’s const_cast may give undefined behavior, and is morally wrong anyway.

void f( const polygon& poly ) {
    const_cast<polygon&>(poly).add_point( {0,0} );
}

Bonus: The result of the const_cast is undefined if the referenced object was declared as const—which it is in the case of f(cpoly) below.

The parameter isn’t really const, so don’t declare it as const and then try to modify it anyway. Lying to the compiler, never mind to the caller, is a bad idea, never mind morally reprehensible in most value systems.

11. g’s const is illegal and useless.

void g( polygon& const poly ) { poly.add_point( {1,1} ); }

This const is illegal; you can’t apply const directly to the reference itself, besides which references are already const inasmuch as they cannot be reseated to refer to a different object.

void h( polygon* const poly ) { poly->add_point( {2,2} ); }

Note that h‘s const merely ensures that h‘s body won’t modify the pointer value. This is the same as the const value parameters in add_point and get_point, and perfectly fine on the definition.

12. Examining the mainline.

int main() {
    polygon poly;
    const polygon cpoly;

    f(poly);

This is fine.

    f(cpoly);

As already noted, this causes undefined results when f tries to cast away the const-ness of its parameter and then modify it.

    g(poly);

This is fine.

    h(&poly);
}

This is fine.

Summary

Here is a revised version of the code that corrects the const issues noted above, but does not attempt to correct any other poor style. Note that because of the atomic member, which is not copyable, we now provide polygon‘s copy and move operations explicitly.

class polygon {
public:
    polygon() : area{-1} {}

    polygon( const polygon& other ) : points{other.points}, area{-1} { }

    polygon( polygon&& other ) 
        : points{move(other.points)}, area{other.area.load()}
        { other.area = -1; }

    polygon& operator=( const polygon& other )
        { points = other.points; area = -1; return *this; }

    polygon& operator=( polygon&& other ) {
        points = move(other.points);
        area = other.area.load();
        other.area = -1;
        return *this;
    }

    void add_point( point pt ) 
        { area = -1; points.push_back(pt); }

    point get_point( int i ) const { return points[i]; }

    int get_num_points() const { return points.size(); }

    double get_area() const {
        if( area < 0 )   // if not yet calculated and cached
            calc_area();     // calculate now
        return area;
    }

private:
    void calc_area() const {
        auto tmp = 0.0;
        for( auto& pt : points )
            tmp += /* some work using pt */;
        area = tmp;
    }

    vector<point>          points;
    mutable atomic<double> area;
};

polygon operator+( polygon lhs, const polygon& rhs ) {
    const auto last = rhs.get_num_points();
    for( auto i = 0; i < last; ++i ) // concatenate
        lhs.add_point( rhs.get_point(i) );
    return lhs;
}

void f( polygon& poly ) { poly.add_point( {0,0} ); }

void g( polygon& poly ) { poly.add_point( {1,1} ); }

void h( polygon* poly ) { poly->add_point( {2,2} ); }

int main() {
    auto poly = polygon{};

    f(poly);
    g(poly);
    h(&poly);
}

Acknowledgments

Thanks in particular to the following for their feedback to improve this article: mttpd, Vincent Lascaux, jlehrer, Motti, Fernando Pelliccioni, Leo, Mathias Stearn.

C++ and Beyond: My material for December, and early-bird registration (through June 9)

If you’re thinking of coming to C++ and Beyond this December, consider registering in the next two weeks to get the $300 discount.

I’ve just announced that much (and possibly all) of my material will be in heavily interactive sessions about modern C++11/C++14 style and idioms, covering the “complete C++11 package” that we’re calling C++14. We know C++14’s shape because as of the Bristol meeting in April it’s now feature-complete, and only international commenting and other fine-tuning remains. We know C++14 is real and that at least two of the major commercial compilers will be implementing all of C++14 by next year. And we know C++14 really does complete C++11, and it’s compelling enough that it’s made me rewrite my Guru of the Week series and Exceptional C++ books, targeting C++14. From the description:

This session will cover modern and current C++ style, focusing on C++14. I’ll demonstrate how major features and idioms from C++98 are now entirely replaced or subsumed and should be used no more; how other major features and idioms have been dramatically improved to the point where you code is cleaner and safer and you’ll even think in a different style; and how pervasive styles as common as variable declarations are changed forever, and not just for style but for serious technical safety and efficiency benefits. For one thing, you’ll never look at ‘auto’ the same way again (there, I said it; bring out the lederhosen and pitchforks! or attend the session and challenge in person to dig deep into the good reasons for the new reality).

Scott has announced one of his prospective talks: “Concurrent Data Structures and Standard C++.” From his description:

Concurrent data structures permit multiple writers to simultaneously modify a single data structure. Used properly, they can avoid scalability bottlenecks in multithreaded systems. Used improperly, they can decrease program performance.

There are no concurrent data structures in the C++98 standard library. The C++11 standard library is similarly bare, and C++14 is unlikely to change that. Nevertheless, concurrent data structures for C++ developers are widely available from sources such as Microsoft’s PPL, Intel’s TBB, and Boost. In this talk, I’ll examine the motivation and use cases for concurrent data structures, discuss their limitations, survey offerings common to PPL and TBB, and contrast concurrent APIs with those of seemingly similar serial counterparts (e.g., concurrent_vector vs. std::vector). I’ll also explain why writing your own concurrent data structure is much more complicated and error-prone than most people initially imagine. (If you’re not familiar with the ABA problem, this presentation will explain why you should be.)

For more information, see the C&B blog. I look forward to meeting and re-meeting many of you at C&B. The attendance is limited to 64 this year, and it will be a delightful technical C++-fest in a cozy lodge far from distractions and with lots of fireplaces.

Lost two comments

As mentioned in my GotW kickoff post, I’m experimenting with software and a workflow that lets me maintain a single source document and use it to produce the work in multiple targets, in particular to post to the blog here, to produce print books, and to produce e-books.

However, there have been kinks. In particular, on the blog I sometimes repost an update to an already-posted solution, and though it usually correctly updates the existing one, sometimes it chooses to create a new post instead and I have to delete the duplicate. That happened this morning, and I had a remove an accidental duplicate of the GotW #6a solution post.

Unfortunately, people had already left two comments on that duplicate post, and those comments seem to have got lost in the ether. I apologize for that; I think this is the first time I’ve ever lost someone’s comment, but it did happen this morning as I’m still working out kinks in the software. I believe I have a workaround and that it won’t happen again.

GotW #6b: Const-Correctness, Part 2

const and mutable are powerful tools for writing safer code. Use them consistently.

Problem

Guru Question

For bonus points: In what places are the program’s results undefined or uncompilable due to const errors?

class polygon {
public:
    polygon() : area{-1} {}

    void add_point( const point pt ) { area = -1;
                                       points.push_back(pt); }

    point get_point( const int i ) { return points[i]; }

    int get_num_points() { return points.size(); }

    double get_area() {
        if( area < 0 )   // if not yet calculated and cached
            calc_area();     // calculate now
        return area;
    }

private:
    void calc_area() {
        area = 0;
        vector<point>::iterator i;
        for( i = begin(points); i != end(points); ++i )
            area += /* some work using *i */;
    }

    vector<point> points;
    double        area;
};

polygon operator+( polygon& lhs, polygon& rhs ) {
    auto ret = lhs;
    auto last = rhs.get_num_points();
    for( auto i = 0; i < last; ++i ) // concatenate
        ret.add_point( rhs.get_point(i) );
    return ret;
}

void f( const polygon& poly ) {
    const_cast<polygon&>(poly).add_point( {0,0} );
}

void g( polygon& const poly ) { poly.add_point( {1,1} ); }

void h( polygon* const poly ) { poly->add_point( {2,2} ); }

int main() {
    polygon poly;
    const polygon cpoly;

    f(poly);
    f(cpoly);
    g(poly);
    h(&poly);
}

GotW #6a Solution: Const-Correctness, Part 1

const and mutable have been in C++ for many years. How well do you know what they mean today?

Problem

JG Question

1. What is a “shared variable”?

Guru Questions

2. What do const and mutable mean on shared variables?

3. How are const and mutable different in C++98 and C++11?

Solution

1. What is a “shared variable”?

A “shared variable” is one that could be accessed from more than one thread at the same time.

This concept is important in the C++ memory model. For example, the C++ memory model (the core of which is described in ISO C++ §1.10) prohibits the invention of a write to a “potentially shared memory location” that would not have been written to in a sequentially consistent execution of the program, and the C++ standard library refers to this section when it prohibits “modify[ing] objects accessible by [other] threads” through a const function, as we will see in #2.

2. What do const and mutable mean on shared variables?

Starting with C++11, const on a variable that is possibly shared means “read-only or as good as read-only” for the purposes of concurrency. Concurrent const operations on the same object are required to be safe without the calling code doing external synchronization.

If you are implementing a type, unless you know objects of the type can never be shared (which is generally impossible), this means that each of your const member functions must be either:

truly physically/bitwise const with respect to this object, meaning that they perform no writes to the object’s data; or else
internally synchronized so that if it does perform any actual writes to the object’s data, that data is correctly protected with a mutex or equivalent (or if appropriate are atomic<>) so that any possible concurrent const accesses by multiple callers can’t tell the difference.

Types that do not respect this cannot be used with the standard library, which requires that:

“… to prevent data races (1.10). … [a] C++ standard library function shall not directly or indirectly modify objects (1.10) accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function’s non-const arguments, including this.”—ISO C++ §17.6.5.9

Similarly, writing mutable on a member variable means what it has always meant: The variable is “writable but logically const.” Note what this implies:

The “logically const” part now means “can be used safely by multiple concurrent const operations.”
The “mutable” and “writable” part further means that some const operations may actually be writers of the shared variable, which means it’s inherently got to be correct to read and write concurrently, so it should be protected with a mutex or similar, or made atomic<>.

In general, remember:

Guideline: Remember the “M&M rule”: For a member variable, mutable and mutex (or atomic) go together.

This applies in both directions, to wit:

(1) For a member variable, mutable implies mutex (or equivalent): A mutable member variable is presumed to be a mutable shared variable and so must be synchronized internally—protected with a mutex, made atomic, or similar.

(2) For a member variable, mutex (or similar synchronization type) implies mutable: A member variable that is itself of a synchronization type, such as a mutex or a condition variable, naturally wants to be mutable, because you will want to use it in a non-const way (e.g., take a std::lock_guard<mutex>) inside concurrent const member functions.

We’ll see an example of (2) in Part 2, GotW #6b.

3. How are const and mutable different in C++98 and C++11?

First, let’s be clear: C++98 single-threaded code still works. C++11 has excellent C++98 compatibility, and even though the meaning of const has evolved, C++98 single-threaded code that uses the old “logically const” meaning of const is still valid.

With C++98, we taught a generation of C++ developers that “const means logically const, not physically/bitwise const.” That is, in C++98 we taught that const meant only that the observable state of the object (say, via its non-private member functions) should not change as far as the caller could tell, but its internal bits might change in order to update counters and instrumentation and other data not accessible via the type’s public or protected interface.

That definition is not sufficient for concurrency. With C++11 and onward, which now includes a concurrency memory model and thread safety specification for the standard library, this is now much simpler: const now really does mean “read-only, or safe to read concurrently”—either truly physically/bitwise const, or internally synchronized so that any actual writes are synchronized with any possible concurrent const accesses so the callers can’t tell the difference.

Although existing C++98-era types still work just fine in C++98-era single-threaded code for compatibility, those types and any new ones you write today should obey the new stricter requirement if they could be used on multiple threads. The good news is that most existing types already followed that rule, and code that relies on casting away const and/or using mutable data members in single-threaded code has already been generally questionable and relatively rare.

Summary

Don’t shoot yourself (or your fellow programmers) in the foot. Write const-correct code.

Using const consistently is simply necessary for correctly-synchronized code. That by itself is ample reason to be consistently const-correct, but there’s more: It lets you document interfaces and invariants far more effectively than any mere /* I promise not to change this */ comment can accomplish. It’s a powerful part of “design by contract.” It helps the compiler to stop you from accidentally writing bad code. It can even help the compiler generate tighter, faster, smaller code. That being the case, there’s no reason why you shouldn’t use it as much as possible, and every reason why you should.

Remember that the correct use of mutable is a key part of const-correctness. If your class contains a member that could change even for const objects and operations, make that member mutable and protect it with a mutex or make it atomic. That way, you will be able to write your class’ const member functions easily and correctly, and users of your class will be able to correctly create and use const and non-const objects of your class’ type.

It’s true that not all commercial libraries’ interfaces are const-correct. That isn’t an excuse for you to write const-incorrect code, though. It is, however, one of the few good excuses to write const_cast, plus a detailed comment nearby grumbling about the library vendor’s laziness and how you’re looking for a replacement product.

Acknowledgments

Thanks in particular to the following for their feedback to improve this article: mttpd, jlehrer, Chris Vine.