Feeds:
Posts
Comments

If you’re thinking of coming to C++ and Beyond this December, consider registering in the next two weeks to get the $300 discount.

I’ve just announced that much (and possibly all) of my material will be in heavily interactive sessions about modern C++11/C++14 style and idioms, covering the “complete C++11 package” that we’re calling C++14. We know C++14′s shape because as of the Bristol meeting in April it’s now feature-complete, and only international commenting and other fine-tuning remains. We know C++14 is real and that at least two of the major commercial compilers will be implementing all of C++14 by next year. And we know C++14 really does complete C++11, and it’s compelling enough that it’s made me rewrite my Guru of the Week series and Exceptional C++ books, targeting C++14. From the description:

This session will cover modern and current C++ style, focusing on C++14. I’ll demonstrate how major features and idioms from C++98 are now entirely replaced or subsumed and should be used no more; how other major features and idioms have been dramatically improved to the point where you code is cleaner and safer and you’ll even think in a different style; and how pervasive styles as common as variable declarations are changed forever, and not just for style but for serious technical safety and efficiency benefits. For one thing, you’ll never look at ‘auto’ the same way again (there, I said it; bring out the lederhosen and pitchforks! or attend the session and challenge in person to dig deep into the good reasons for the new reality).

Scott has announced one of his prospective talks: “Concurrent Data Structures and Standard C++.” From his description:

Concurrent data structures permit multiple writers to simultaneously modify a single data structure. Used properly, they can avoid scalability bottlenecks in multithreaded systems. Used improperly, they can decrease program performance.

There are no concurrent data structures in the C++98 standard library. The C++11 standard library is similarly bare, and C++14 is unlikely to change that.  Nevertheless, concurrent data structures for C++ developers are widely available from sources such as Microsoft’s PPL, Intel’s TBB, and Boost. In this talk, I’ll examine the motivation and use cases for concurrent data structures, discuss their limitations, survey offerings common to PPL and TBB, and contrast concurrent APIs with those of seemingly similar serial counterparts (e.g., concurrent_vector vs. std::vector). I’ll also explain why writing your own concurrent data structure is much more complicated and error-prone than most people initially imagine.  (If you’re not familiar with the ABA problem, this presentation will explain why you should be.)

For more information, see the C&B blog. I look forward to meeting and re-meeting many of you at C&B. The attendance is limited to 64 this year, and it will be a delightful technical C++-fest in a cozy lodge far from distractions and with lots of fireplaces.

As mentioned in my GotW kickoff post, I’m experimenting with software and a workflow that lets me maintain a single source document and use it to produce the work in multiple targets, in particular to post to the blog here, to produce print books, and to produce e-books.

However, there have been kinks. In particular, on the blog I sometimes repost an update to an already-posted solution, and though it usually correctly updates the existing one, sometimes it chooses to create a new post instead and I have to delete the duplicate. That happened this morning, and I had a remove an accidental duplicate of the GotW #6a solution post.

Unfortunately, people had already left two comments on that duplicate post, and those comments seem to have got lost in the ether. I apologize for that; I think this is the first time I’ve ever lost someone’s comment, but it did happen this morning as I’m still working out kinks in the software. I believe I have a workaround and that it won’t happen again.

const and mutable are powerful tools for writing safer code. Use them consistently.

Problem

Guru Question

In the following code, add or remove const (including minor variants and related keywords) wherever appropriate. Note: Don’t comment on or change the structure of this program. It’s contrived and condensed for illustration only.

For bonus points: In what places are the program’s results undefined or uncompilable due to const errors?

class polygon {
public:
polygon() : area{-1} {}

void add_point( const point pt ) { area = -1;
points.push_back(pt); }

point get_point( const int i ) { return points[i]; }

int get_num_points() { return points.size(); }

double get_area() {
if( area < 0 ) // if not yet calculated and cached
calc_area(); // calculate now
return area;
}

private:
void calc_area() {
area = 0;
vector<point>::iterator i;
for( i = begin(points); i != end(points); ++i )
area += /* some work using *i */;
}

vector<point> points;
double area;
};

polygon operator+( polygon& lhs, polygon& rhs ) {
auto ret = lhs;
auto last = rhs.get_num_points();
for( auto i = 0; i < last; ++i ) // concatenate
ret.add_point( rhs.get_point(i) );
return ret;
}

void f( const polygon& poly ) {
const_cast<polygon&>(poly).add_point( {0,0} );
}

void g( polygon& const poly ) { poly.add_point( {1,1} ); }

void h( polygon* const poly ) { poly->add_point( {2,2} ); }

int main() {
polygon poly;
const polygon cpoly;

f(poly);
f(cpoly);
g(poly);
h(&poly);
}

const and mutable have been in C++ for many years. How well do you know what they mean today?

Problem

JG Question

1. What is a “shared variable”?

Guru Questions

2. What do const and mutable mean on shared variables?

3. How are const and mutable different in C++98 and C++11?

Solution

1. What is a “shared variable”?

A “shared variable” is one that could be accessed from more than one thread at the same time.

This concept is important in the C++ memory model. For example, the C++ memory model (the core of which is described in ISO C++ §1.10) prohibits the invention of a write to a “potentially shared memory location” that would not have been written to in a sequentially consistent execution of the program, and the C++ standard library refers to this section when it prohibits “modify[ing] objects accessible by [other] threads” through a const function, as we will see in #2.

2. What do const and mutable mean on shared variables?

Starting with C++11, const on a variable that is possibly shared means “read-only or as good as read-only” for the purposes of concurrency. Concurrent const operations on the same object are required to be safe without the calling code doing external synchronization.

If you are implementing a type, unless you know objects of the type can never be shared (which is generally impossible), this means that each of your const member functions must be either:

  • truly physically/bitwise const with respect to this object, meaning that they perform no writes to the object’s data; or else
  • internally synchronized so that if it does perform any actual writes to the object’s data, that data is correctly protected with a mutex or equivalent (or if appropriate are atomic<>) so that any possible concurrent const accesses by multiple callers can’t tell the difference.

Types that do not respect this cannot be used with the standard library, which requires that:

“… to prevent data races (1.10). … [a] C++ standard library function shall not directly or indirectly modify objects (1.10) accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function’s non-const arguments, including this.”—ISO C++ §17.6.5.9

Similarly, writing mutable on a member variable means what it has always meant: The variable is “writable but logically const.” Note what this implies:

  • The “logically const” part now means “can be used safely by multiple concurrent const operations.”
  • The “mutable” and “writable” part further means that some const operations may actually be writers of the shared variable, which means it’s inherently got to be correct to read and write concurrently, so it should be protected with a mutex or similar, or made atomic<>.

In general, remember:

Guideline: Remember the “M&M rule”: For a member variable, mutable and mutex (or atomic) go together.

This applies in both directions, to wit:

(1) For a member variable, mutable implies mutex (or equivalent): A mutable member variable is presumed to be a mutable shared variable and so must be synchronized internally—protected with a mutex, made atomic, or similar.

(2) For a member variable, mutex (or similar synchronization type) implies mutable: A member variable that is itself of a synchronization type, such as a mutex or a condition variable, naturally wants to be mutable, because you will want to use it in a non-const way (e.g., take a std::lock_guard<mutex>) inside concurrent const member functions.

We’ll see an example of (2) in Part 2, GotW #6b.

3. How are const and mutable different in C++98 and C++11?

First, let’s be clear: C++98 single-threaded code still works. C++11 has excellent C++98 compatibility, and even though the meaning of const has evolved, C++98 single-threaded code that uses the old “logically const” meaning of const is still valid.

With C++98, we taught a generation of C++ developers that “const means logically const, not physically/bitwise const.” That is, in C++98 we taught that const meant only that the observable state of the object (say, via its non-private member functions) should not change as far as the caller could tell, but its internal bits might change in order to update counters and instrumentation and other data not accessible via the type’s public or protected interface.

That definition is not sufficient for concurrency. With C++11 and onward, which now includes a concurrency memory model and thread safety specification for the standard library, this is now much simpler: const now really does mean “read-only, or safe to read concurrently”—either truly physically/bitwise const, or internally synchronized so that any actual writes are synchronized with any possible concurrent non-const accesses so the callers can’t tell the difference.

Although existing C++98-era types still work just fine in C++98-era single-threaded code for compatibility, those types and any new ones you write today should obey the new stricter requirement if they could be used on multiple threads. The good news is that most existing types already followed that rule, and code that relies on casting away const and/or using mutable data members in single-threaded code has already been generally questionable and relatively rare.

Summary

Don’t shoot yourself (or your fellow programmers) in the foot. Write const-correct code.

Using const consistently is simply necessary for correctly-synchronized code. That by itself is ample reason to be consistently const-correct, but there’s more: It lets you document interfaces and invariants far more effectively than any mere /* I promise not to change this */ comment can accomplish. It’s a powerful part of “design by contract.” It helps the compiler to stop you from accidentally writing bad code. It can even help the compiler generate tighter, faster, smaller code. That being the case, there’s no reason why you shouldn’t use it as much as possible, and every reason why you should.

Remember that the correct use of mutable is a key part of const-correctness. If your class contains a member that could change even for const objects and operations, make that member mutable and protect it with a mutex or make it atomic. That way, you will be able to write your class’ const member functions easily and correctly, and users of your class will be able to correctly create and use const and non-const objects of your class’ type.

It’s true that not all commercial libraries’ interfaces are const-correct. That isn’t an excuse for you to write const-incorrect code, though. It is, however, one of the few good excuses to write const_cast, plus a detailed comment nearby grumbling about the library vendor’s laziness and how you’re looking for a replacement product.

Acknowledgments

Thanks in particular to the following for their feedback to improve this article: mttpd, jlehrer, Chris Vine.

const and mutable have been in C++ for many years. How well do you know what they mean today?

 

Problem

JG Question

1. What is a “shared variable”?

Guru Questions

2. What do const and mutable mean on shared variables?

3. How are const and mutable different in C++98 and C++11?

Virtual functions are a pretty basic feature, but they occasionally harbor subtleties that trap the unwary. If you can answer questions like this one, then you know virtual functions cold, and you’re less likely to waste a lot of time debugging problems like the ones illustrated below.

Problem

JG Question

1. What do the override and final keywords do? Why are they useful?

Guru Question

2. In your travels through the dusty corners of your company’s code archives, you come across the following program fragment written by an unknown programmer. The programmer seems to have been experimenting to see how some C++ features worked.

(a) What could be improved in the code’s correctness or style?

(b) What did the programmer probably expect the program to print, but what is the actual result?

class base {
public:
    virtual void f( int );
    virtual void f( double );
    virtual void g( int i = 10 );
};

void base::f( int ) {
    cout << "base::f(int)" << endl;
}

void base::f( double ) {
    cout << "base::f(double)" << endl;
}

void base::g( int i ) {
    cout << i << endl;
}

class derived: public base {
public:
    void f( complex<double> );
    void g( int i = 20 );
};

void derived::f( complex<double> ) {
    cout << "derived::f(complex)" << endl;
}

void derived::g( int i ) {
    cout << "derived::g() " << i << endl;
}

int main() {
    base    b;
    derived d;
    base*   pb = new derived;

    b.f(1.0);
    d.f(1.0);
    pb->f(1.0);

    b.g();
    d.g();
    pb->g();

    delete pb;
}

Solution

1. What do the override and final keywords do? Why are they useful?

These keywords give explicit control over virtual function overriding. Writing override declares the intent to override a base class virtual function. Writing final makes a virtual function no longer overrideable in further-derived classes, or a class no longer permitted to have further-derived classes.

They are useful because they let the programmer explicitly declare intent in a way the language can enforce at compile time. If you write override but there is no matching base class function, or you write final and a further-derived class tries to implicitly or explicitly override the function anyway, you get a compile-time error.

Of the two, by far the more commonly useful is override; uses for final are rarer.

2. (a) What could be improved in the code’s correctness or style?

First, let’s consider some style issues, and one real error:

1. The code uses explicit new, delete, and an owning *.

Avoid using owning raw pointers and explicit new and delete except in rare cases like when you’re writing the internal implementation details of a low-level data structure.

{
    base*   pb = new derived;

    ...

    delete pb;
}

Instead of new and base*, use make_unique and unique_ptr<base>.

{
    auto pb = unique_ptr<base>{ make_unique<derived>() };

    ...

} // automatic delete here

Guideline: Don’t use explicit new, delete, and owning * pointers, except in rare cases encapsulated inside the implementation of a low-level data structure.

However, that delete brings us to another issue unrelated to how we allocate and manage the lifetime of the object, namely:

2. base’s destructor should be virtual or protected.

class base {
public:
    virtual void f( int );
    virtual void f( double );
    virtual void g( int i = 10 );
};

This looks innocuous, but the writer of base forgot to make the destructor either virtual or protected. As it is, deleting via a pointer-to-base without a virtual destructor is evil, pure and simple, and corruption is the best thing you can hope for because the wrong destructor will get called, derived class members won’t be destroyed, and operator delete will be invoked with the wrong object size.

Guideline: Make base class destructors public and virtual, or protected and nonvirtual.

Exactly one of the following can be true for a polymorphic type:

  • Either destruction via a pointer to base is allowed, in which case the function has to be public and had better be virtual;
  • or else it isn’t, in which case the function has to be protected (private is not allowed because the derived destructor must be able to invoke the base destructor) and would naturally also be nonvirtual (when the derived destructor invokes the base destructor, it does so nonvirtually whether declared virtual or not).

Interlude

For the next few points, it’s important to differentiate three terms:

  • To overload a function f means to provide another function with the same name in the same scope but with different parameter types. When f is actually called, the compiler will try to pick the best match based on the actual parameters that are supplied.
  • To override a virtual function f means to provide another function with the same name and the same parameter types in a derived class.
  • To hide a function f that exists in an enclosing scope (base class, outer class, or namespace) means to provide another function with the same name in an inner scope (derived class, nested class, or namespace), which will hide the same function name in an enclosing scope.

3. derived::f is neither an override nor an overload.

    void derived::f( complex<double> )

derived does not overload the base::f functions, it hides them. This distinction is very important, because it means that base::f(int) and base::f(double) are not visible in the scope of derived.

If the author of derived intended to hide the base functions named f, then this is all right. Usually, however, the hiding is inadvertent and surprising, and the correct way to bring the names into the scope of derived is to write the using-declaration using base::f; inside derived.

Guideline: When providing a non-overridden function with the same name as an inherited function, be sure to bring the inherited functions into scope with a using-declaration if you don’t want to hide them.

4. derived::g overrides base::g but doesn’t say “override.”

    void g( int i = 20 )  /* override */

This function overrides the base function, so it should say override explicitly. This documents the intent, and lets the compiler tell you if you’re trying to override something that’s not virtual or you got the signature wrong by mistake.

Guideline: Always write override when you intend to override a virtual function.

5. derived::g overrides base::g but changes the default argument.

    void g( int i = 20 )

Changing the default argument is decidedly user-unfriendly. Unless you’re really out to confuse people, don’t change the default arguments of the inherited functions you override. Yes, this is legal C++, and yes, the result is well-defined; and no, don’t do it. Further below, we’ll see just how confusing this can be.

Guideline: Never change the default arguments of overridden inherited functions.

We could go one step further:

Guideline: Avoid default arguments on virtual functions in general.

Finally, public virtual functions are great when a class is acting as a pure abstract base class (ABC) that only specifies the virtual interface without implementations, like a C# or Java interface does.

Guideline: Prefer to have a class contain only public virtual functions, or no public virtual functions (other than the destructor which is special).

A pure abstract base class should have only public virtual functions. …

But when a class is both providing virtual functions and their implementations, consider the Non-Virtual Interface pattern (NVI) that makes the public interface and the virtual interface separate and distinct.

… For any other base class, prefer making public member functions non-virtual, and virtual member functions non-public; the former should have any default arguments and can be implemented in terms of the latter.

This cleanly separates the public interface from the derivation interface, lets each follow its natural form best suited for its distinct audience, and avoids having one function exist in tension from doing double duty with two responsibilities. Among other benefits, using NVI will often clarify your class’s design in important ways, including for example that the default arguments which matter to the caller therefore naturally belong on the public interface, not on the virtual interface. Following this pattern means that several classes of potential problems, including this one of virtuals with default arguments, just naturally don’t arise.

The C++ standard library follows NVI nearly universally, and other modern OO languages and environments have rediscovered this principle for their own library design guidelines, such as in the .NET Framework Design Guidelines.

2. (b) What did the programmer probably expect the program to print, but what is the actual result?

Now that we have those issues out of the way, let’s look at the mainline and see whether it does that the programmer intended:

int main() {
    base    b;
    derived d;
    base*   pb = new derived;

    b.f(1.0);

No problem. This first call invokes base::f( double ), as expected.

    d.f(1.0);

This calls derived::f( complex<double> ). Why? Well, remember that derived doesn’t declare using base::f; to bring the base functions named f into scope, and so clearly base::f( int ) and base::f( double ) can’t be called. They are not present in the same scope as derived::f( complex<double> ) so as to participate in overloading.

The programmer may have expected this to call base::f( double ), but in this case there won’t even be a compile error because fortunately(?) complex<double> provides an implicit conversion from double, and so the compiler interprets this call to mean derived::f( complex<double>(1.0) ).

    pb->f(1.0);

Interestingly, even though the base* pb is pointing to a derived object, this calls base::f( double ) because overload resolution is done on the static type (here base), not the dynamic type (here derived). You have a base pointer, you get the base interface.

For the same reason, the call pb->f(complex<double>(1.0)); would not compile, because there is no satisfactory function in the base interface.

    b.g();

This prints 10, because it simply invokes base::g( int ) whose parameter defaults to the value 10. No sweat.

    d.g();

This prints derived::g() 20, because it simply invokes derived::g( int ) whose parameter defaults to the value 20. Also no sweat.

    pb->g();

This prints derived::g() 10.

“Wait a minute!” you might protest. “What’s going on here?” This result may temporarily lock your mental brakes and bring you to a screeching halt until you realize that what the compiler has done is quite proper. (Although, of course, the programmer of derived ought to be taken out into the back parking lot and yelled at.) The thing to remember is that, like overloads, default parameters are taken from the static type (here base) of the object, hence the default value of 10 is taken. However, the function happens to be virtual, and so the function actually called is based on the dynamic type (here derived) of the object. Again, this can be avoided by avoiding default arguments on virtual functions, such as by following NVI and avoiding public virtual functions entirely.

    delete pb;
}

Finally, as noted, this shouldn’t be needed because you should be using unique_ptrs which do the cleanup for you, and base should have a virtual destructor so that destruction via any pointer to base is correct.

Acknowledgments

Thanks in particular to the following for their feedback to improve this article: litb1, KrzaQ, mttpd.

Virtual functions are a pretty basic feature, but they occasionally harbor subtleties that trap the unwary. If you can answer questions like this one, then you know virtual functions cold, and you’re less likely to waste a lot of time debugging problems like the ones illustrated below.

 

Problem

JG Question

1. What do the override and final keywords do? Why are they useful?

Guru Question

2. In your travels through the dusty corners of your company’s code archives, you come across the following program fragment written by an unknown programmer. The programmer seems to have been experimenting to see how some C++ features worked.

(a) What could be improved in the code’s correctness or style?

(b) What did the programmer probably expect the program to print, but what is the actual result?

class base {
public:
virtual void f( int );
virtual void f( double );
virtual void g( int i = 10 );
};

void base::f( int ) {
cout << "base::f(int)" << endl;
}

void base::f( double ) {
cout << "base::f(double)" << endl;
}

void base::g( int i ) {
cout << i << endl;
}

class derived: public base {
public:
void f( complex<double> );
void g( int i = 20 );
};

void derived::f( complex<double> ) {
cout << "derived::f(complex)" << endl;
}

void derived::g( int i ) {
cout << "derived::g() " << i << endl;
}

int main() {
base b;
derived d;
base* pb = new derived;

b.f(1.0);
d.f(1.0);
pb->f(1.0);

b.g();
d.g();
pb->g();

delete pb;
}

Follow

Get every new post delivered to your Inbox.

Join 1,423 other followers