Archive for the ‘GotW’ Category

GotW #96: Oversharing

Following on from #95, let’s consider reasons and methods to avoid mutable sharing in the first place…

 

Problem

Consider the following code from GotW #95's solution, where some_obj is a shared variable visible to multiple threads which then synchronize access to it.

// thread 1
{
    lock_guard hold(mut_some_obj);     // acquire lock
    code_that_reads_from( some_obj );  // passes some_obj by const &
}

// thread 2
{
    lock_guard hold(mut_some_obj);     // acquire lock
    code_that_modifies( some_obj );    // passes some_obj by non-const &
}

 

JG Questions

1. Why do mutable shared variables like some_obj make your code:

(a) more complex?

(b) more brittle?

(c) less scalable?

 

Guru Questions

2. Give an example of how the code that uses a mutable shared variable like some_obj can be changed so that the variable is:

(a) not shared.

(b) not mutable.

3. Let’s say we’re in a situation where we can’t apply the techniques from the answers to #2, so that the variable itself must remain shared and apparently mutable. Is there any way that the internal implementation of the variable can make the variable be physically not shared and/or not mutable, so that the calling code can treat it as a logically shared-and-mutable object yet not need to perform external synchronization? If so, explain. If not, why not?


This GotW was written to answer a set of related frequently asked questions. So here’s a mini-FAQ on “thread safety and synchronization in a nutshell,” and the points we’ll cover apply to thread safety and synchronization in pretty much any mainstream language.

 

Problem

JG Questions

1. What is a race condition, and how serious is it?

2. What is a correctly synchronized program? How do you achieve it? Be specific.

 

Guru Questions

3. Consider the following code, where some_obj is a shared variable visible to multiple threads.

// thread 1 (performs no additional synchronization)
code_that_reads_from( some_obj ); // passes some_obj by const &

// thread 2 (performs no additional synchronization)
code_that_modifies( some_obj ); // passes some_obj by non-const &

If threads 1 and 2 can run concurrently, is this code correctly synchronized if the type of some_obj is:

(a) int?

(b) string?

(c) vector<map<int,string>>?

(d) shared_ptr<widget>?

(e) mutex?

(f) condition_variable?

(g) atomic<unsigned>?

Hint: This is actually a two-part question, not a seven-part question. There are only two unique answers, each of which covers a subset of the cases.

4. External synchronization means that the code that uses/owns a given shared object is responsible for performing synchronization on that object. Answer the following questions related to external synchronization:

(a) What is the normal external synchronization responsibility of code that owns and uses a given shared variable?

(b) What is the “basic thread safety guarantee” that all types must obey to enable calling code to perform normal external synchronization?

(c) What partial internal synchronization can still be required within the shared variable’s implementation?

5. Full internal synchronization (a.k.a. “synchronized types” or “thread-safe types”) means that a shared object performs all necessary synchronization internally within that object, so that calling code does not need to perform any external synchronization. What types should be fully internally synchronized, and why?

 

Solution

Preface

The discussion in this GotW applies not only to C++ but to any mainstream language, with the main difference being that certain races have defined behavior in C# and Java. But the definition of which variables need to be synchronized, the tools we use to synchronize them, and the distinction between external and internal synchronization (and when to use each one) are the same in all mainstream languages. If you’re a C# or Java programmer, everything here applies equally to you, with some minor renaming, such as renaming C++ atomic to C#/Java volatile. Some concepts are harder to express in C#/Java, however, such as identifying the read-only methods on an otherwise mutable shared object: there are readonly fields and “read-only” properties that have get but not set, but they express only a subset of what you can express using C++ const on member functions.

Note: C++ volatile variables (which have no analog in languages like C# and Java) are always beyond the scope of this and any other article about the memory model and synchronization. That’s because C++ volatile variables aren’t about threads or communication at all and don’t interact with those things. Rather, a C++ volatile variable should be viewed as a portal into a different universe beyond the language: a memory location that by definition does not obey the language’s memory model, because that memory location is accessed by hardware (e.g., written to by a daughter card), may have more than one address, or is otherwise “strange” and beyond the language. So C++ volatile variables are universally an exception to every guideline about synchronization: they are always inherently “racy” and unsynchronizable using the normal tools (mutexes, atomics, etc.), and more generally they exist outside all the normal rules of the language and compiler, including that they generally cannot be optimized by the compiler, because the compiler isn’t allowed to know their semantics. A volatile int vi; may not behave anything like a normal int: you can’t assume that code like vi = 5; int read_back = vi; is guaranteed to result in read_back == 5, nor that code like int i = vi; int j = vi; that reads vi twice will result in i == j, which will not be true if vi is a hardware counter, for example. For more discussion, see my article “volatile vs. volatile.”

 

1. What is a race condition, and how serious is it?

A race condition occurs when two threads access the same shared variable concurrently, and at least one of the accesses is a write (non-const operation). Concurrent const operations are valid, and do not race with each other.

Consecutive nonzero-length bitfields count as a single variable for the purpose of defining what a race condition is.
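
For example, here’s a minimal sketch (with hypothetical names) of how the bit-field rule plays out:

struct flags {
    int a : 4;  // a and b are consecutive nonzero-length bit-fields,
    int b : 4;  //   so they occupy a single memory location
    int   : 0;  // a zero-length bit-field starts a new memory location
    int c : 4;  // c is in its own memory location
};

flags fl{};  // a shared variable visible to both threads

// thread 1 (performs no additional synchronization)
fl.a = 1;  // writes the memory location holding a and b

// thread 2 (performs no additional synchronization)
fl.b = 2;  // race: also writes the location holding a and b
fl.c = 3;  // by itself this would not race with thread 1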

Terminology note: Some people use “race” in a different sense: even in a program with no actual race conditions (as defined above), operations on different threads can interleave in different orders in different executions of a correctly synchronized program, depending on how fast the threads happen to execute relative to each other. That’s not a race condition in the sense we mean here; a better term for that might be “timing-dependent code.”

If a race condition occurs, your program has undefined behavior. C++ does not recognize any so-called “benign races”—and in languages that have recognized some races as “benign” the community has gradually learned over time that many of them actually, well, aren’t.

Guideline: Reads (const operations) on a shared object are safe to run concurrently with each other without synchronization.

 

2. What is a correctly synchronized program? How do you achieve it? Be specific.

A correctly synchronized program is one that contains no race conditions. You achieve it by making sure that, for every shared variable, every thread that performs a write (non-const operation) on that variable is synchronized so that no other reads or writes of that variable on other threads can run concurrently with that write.

The shared variable is usually protected in one of these ways (the first two are sketched in the code just below):

  • (commonly) by using a mutex or equivalent;
  • (very rarely) by making it atomic, if that’s appropriate, such as in low-lock code; or
  • (very rarely) for certain types, by performing the synchronization internally, as we will see below.
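
For instance, here’s a minimal sketch of the first two options (the names are hypothetical):

#include <atomic>
#include <mutex>

int        shared_value = 0;   // a writeable shared variable...
std::mutex mut_shared_value;   // ...and the mutex that protects it

void increment_with_mutex() {
    std::lock_guard<std::mutex> hold(mut_shared_value);  // acquire lock
    ++shared_value;   // every read and write goes through the lock
}

std::atomic<int> shared_counter{0};  // (rare) the variable itself is atomic

void increment_with_atomic() {
    ++shared_counter;  // an atomic read-modify-write; no lock needed
}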

 

3. Consider the following code… If threads 1 and 2 can run concurrently, is this code correctly synchronized if the type of some_obj is: (a) int? (b) string? (c) vector<map<int,string>>? (d) shared_ptr<widget>?

No. The code has one thread reading (via const operations) from some_obj, and a second thread writing to the same variable. If those threads can execute at the same time, that’s a race and a direct non-stop ticket to undefined behavior land.

The answer is to synchronize access to the variable, for example using a mutex:

// thread 1
{
    lock_guard hold(mut_some_obj);     // acquire lock
    code_that_reads_from( some_obj );  // passes some_obj by const &
}

// thread 2
{
    lock_guard hold(mut_some_obj);     // acquire lock
    code_that_modifies( some_obj );    // passes some_obj by non-const &
}

Virtually all types, including shared_ptr and vector, are just as thread-safe as int; they’re not special for concurrency purposes. It doesn’t matter whether some_obj is an int, a string, a container, or a smart pointer… concurrent reads (const operations) are safe without synchronization, but if the shared object is writeable, then the code that owns the object has to synchronize access to it.

But when I said this is true for “virtually all types,” I meant all types except for types that are not fully internally synchronized, which brings us to the types that, by design, are special for concurrency purposes…

 

… If threads 1 and 2 can run concurrently, is this code correctly synchronized if the type of some_obj is: (e) mutex? (f) condition_variable? (g) atomic<unsigned>?

Yes. For these types, the code is okay, because these types already perform full internal synchronization and so they are safe to access without external synchronization.

In fact, these types had better be safe to use without external synchronization, because they’re the synchronization primitives you need to use as tools to synchronize other variables! And it turns out that that’s no accident…

Guideline: A type should be fully internally synchronized if and only if its purpose is to provide inter-thread communication (e.g., a message queue) or synchronization (e.g., a mutex).

 

4. External synchronization means that the code that uses/owns a given shared object is responsible for performing synchronization on that object. Answer the following questions related to external synchronization:

(a) What is the normal external synchronization responsibility of code that owns and uses a given shared variable?

The normal synchronization duty of care is simply this: The code that knows about and owns a writeable shared variable has to synchronize access to it. It will typically do that using a mutex or similar (~99.9% of the time), or by making it atomic if that’s possible and appropriate (~0.1% of the time).

Guideline: The code that knows about and owns a writeable shared variable is responsible for synchronizing access to it.

 

(b) What is the “basic thread safety guarantee” that all types must obey to enable calling code to perform normal external synchronization?

To make it possible for the code that uses a shared variable to do the above, two basic things must be true.

First, concurrent operations on different objects must be safe. For example, let’s say we have two X objects x1 and x2, each of which is only used by one thread. Then consider this situation:

// Case A: Using distinct objects

// thread 1 (performs no additional synchronization)
x1.something(); // do something with x1

// thread 2 (performs no additional synchronization)
x2 = something_else; // do something else with x2

This must always be considered correctly synchronized. Remember, we stated that x1 and x2 are distinct objects; they cannot be aliases for the same object, or engage in similar hijinks.

Second, concurrent const operations that are just reading from the same variable x must be safe:

// Case B: const access to the same object

// thread 1 (performs no additional synchronization)
x.something_const(); // read from x (const operation)

// thread 2 (performs no additional synchronization)
x.something_else_const(); // read from x (const operation)

This code too must be considered correctly synchronized, and had better work without external synchronization. It’s not a race, because the two threads are both performing const accesses and reading from the shared object.

This brings us to the case where there might be a combination of internal and external synchronization required…

 

(c) What partial internal synchronization can still be required within the shared variable’s implementation?

In some classes, objects that appear distinct from the outside may still share state under the covers, without the calling code being able to tell that two apparently distinct objects are connected. Note that this is not an exception to the previous guideline—it’s the same guideline!

Guideline: It is always true that the code that knows about and owns a writeable shared variable is responsible for synchronizing access to it. If the writeable shared state is hidden inside the implementation of some class, then it’s simply that class’ internals that are the ‘owning code’ that has to synchronize access to (just) the shared state that only it knows about.

A classic case of “under-the-covers shared state” is reference counting, and the two poster-child examples are std::shared_ptr and copy-on-write. Let’s use shared_ptr as our main example.

A reference-counted smart pointer like shared_ptr keeps a reference count under the covers. Let’s say we have two distinct shared_ptr objects sp1 and sp2, each of which is used by only one thread. Then consider this situation:

// Case A: Using distinct objects

// thread 1 (performs no additional synchronization)
auto x = sp1; // read from sp1 (writes the count!)

// thread 2 (performs no additional synchronization)
sp2 = something_else; // write to sp2 (writes the count!)

This code must be considered correctly synchronized, and had better work as shown without any external synchronization. Okay, fine …

… but what if sp1 and sp2 are pointing to the same object and so share a reference count? If so, that reference count is a writeable shared object, and so it must be synchronized to avoid a race—but it is in general impossible for the calling code to do the right synchronization, because it is not even aware of the sharing! The code we just saw above doesn’t see the count, doesn’t know the count variable’s name, and doesn’t in general know which pointers share counts.

Similarly, consider two threads just reading from the same variable sp:

// Case B: const access to the same object

// thread 1 (performs no additional synchronization)
auto sp3 = sp; // read from sp (writes the count!)

// thread 2 (performs no additional synchronization)
auto sp4 = sp; // read from sp (writes the count!)

This code too must be considered correctly synchronized, and had better work without external synchronization. It’s not a race, because the two threads are both performing const accesses and reading from the shared object. But under the covers, reading from sp to copy it increments the reference count, and so again that reference count is a writeable shared object, and so it must be synchronized to avoid a race—and again it is in general impossible for the calling code to do the right synchronization, because it is not even aware of the sharing.

So to deal with these cases, the code that knows about the shared reference count, namely the shared_ptr implementation, has to synchronize access to the reference count. For reference counting, this is typically done by making the reference count a mutable atomic variable. (See also GotW #6a and #6b.)
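
To make this concrete, here’s a minimal sketch of an internally synchronized reference count. The counted_handle type is hypothetical (the real shared_ptr internals are more involved), but it shows the shape of the technique:

#include <atomic>
#include <utility>

template<class T>
class counted_handle {
    struct control_block {
        T object;
        std::atomic<long> refs;  // the hidden shared state; atomic because
                                 //   the calling code can't synchronize it
    };
    control_block* cb;

public:
    explicit counted_handle( T obj )
        : cb( new control_block{ std::move(obj), {1} } ) { }

    counted_handle( const counted_handle& other ) noexcept
        : cb( other.cb ) {
        // copying the handle is a "read" of the handle, but it writes
        // the shared count, so the count must synchronize itself
        cb->refs.fetch_add(1, std::memory_order_relaxed);
    }

    ~counted_handle() {
        if( cb->refs.fetch_sub(1, std::memory_order_acq_rel) == 1 )
            delete cb;
    }

    counted_handle& operator=( const counted_handle& ) = delete;  // elided for brevity
};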

For completeness, yes, of course external synchronization is still required as usual if the calling code shares a given visible shared_ptr object and makes that same shared_ptr object writeable across threads:

// Case C: External synchronization still required as usual
// for non-const access to same visible shared object

// thread 1
{
    lock_guard hold(mut_sp);  // acquire lock
    auto sp3 = sp;            // read from sp
}

// thread 2
{
    lock_guard hold(mut_sp);  // acquire lock
    sp = something_else;      // modify sp
}

So it’s not like shared_ptr is a fully internally synchronized type; if the caller is sharing an object of that type, the caller must synchronize access to it like it would do for other types, as noted in Question 3(d).

So what’s the purpose of the internal synchronization? It’s only to do the necessary synchronization on the parts that the internals know are shared and that the internals own, but that the caller can’t synchronize because he doesn’t know about the sharing, and shouldn’t need to because he doesn’t own them; the internals do. So in the internal implementation of the type we do just enough internal synchronization to get back to the level where the caller can assume his usual duty of care and, in the usual ways, correctly synchronize any objects that might actually be shared.

The same applies to other uses of reference counting, such as copy-on-write strategies. It also applies generally to any other internal sharing going on under the covers between objects that appear distinct and independent to the calling code.

Guideline: If you design a class where two objects may invisibly share state under the covers, it is your class’ responsibility to internally synchronize access to that mutable shared state (only) that it owns and that only it can see, because the calling code can’t. If you opt for under-the-covers-sharing strategies like copy-on-write, be aware of the duty you’re taking on for yourself and code with care.

For why such internal shared state should be mutable, see GotW #6a and #6b.

 

5. What types should be fully internally synchronized, and why?

There is exactly one category of types which should be fully internally synchronized, so that any object of that type is always safe to use concurrently without external synchronization: Inter-thread synchronization and communication primitives themselves. This includes standard types like mutexes and atomics, but also inter-thread communication and synchronization types you might write yourself such as a message queue (communicating messages from one thread to another), Producer/Consumer active objects (again passing data from one concurrent entity to another), or a thread-safe counter (communicating counter increments and decrements among multiple threads).
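
For example, here’s a minimal sketch of the last of these, a thread-safe counter (a hypothetical type):

#include <atomic>

class thread_safe_counter {
    std::atomic<long> value{0};

public:
    void increment() { ++value; }              // safe to call from any thread
    void decrement() { --value; }              // safe to call from any thread
    long get() const { return value.load(); }  // safe to call from any thread
};

Because the type’s entire purpose is to communicate counter updates across threads, every operation synchronizes internally, and callers never need an external lock.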

If you’re wondering if there might be other kinds of types that should be internally synchronized, consider: The only type for which it would make sense to always internally synchronize every operation is a type where you know every object is going to be both (a) writeable and (b) shared across threads… and that means that the type is by definition designed to be used for inter-thread communication and/or synchronization.

 

Acknowledgments

Thanks in particular to the following for their feedback to improve this article: Daniel Hardman, Casey, Alb, Marcel Wid, ixache.


Now the unnecessary headers have been removed, and avoidable dependencies on the internals of the class have been eliminated. Is there any further decoupling that can be done? The answer takes us back to basic principles of solid class design.

 

Problem

JG Question

1. What is the tightest coupling you can express in C++? And what’s the second-tightest?

Guru Question

2. The Incredible Shrinking Header has now been greatly trimmed, but there may still be ways to reduce the dependencies further. What further #includes could be removed if we made further changes to X, and how?

This time, you may make any changes at all to X as long as they don’t change its public interface, so that existing code that uses X is unaffected. Again, note that the comments are important.

//  x.h: after converting to use a Pimpl to hide implementation details
//
#include <iosfwd>
#include <memory>
#include "a.h"  // class A (has virtual functions)
#include "b.h"  // class B (has no virtual functions)
class C;
class E;

class X : public A, private B {
public:
       X( const C& );
    B  f( int, char* );
    C  f( int, C );
    C& g( B );
    E  h( E );
    virtual std::ostream& print( std::ostream& ) const;

private:
    struct impl;
    std::unique_ptr<impl> pimpl;  // ptr to a forward-declared class
};

std::ostream& operator<<( std::ostream& os, const X& x ) {
    return x.print(os);
}

 

Solution

1. What is the tightest coupling you can express in C++? And what’s the second-tightest?

Friendship and inheritance, respectively.

A friend of a class has access to everything in that class, including all of its private data and functions, and so the code in a friend depends on every detail of the type. Now that’s a close friend!

A class derived from a class Base has access to public and protected members in Base, and depends on the size and layout of Base because it contains a Base subobject. Further, the inheritance relationship means that a derived type is at least by default substitutable for its Base; whether the inheritance is public or nonpublic only changes what other code can see and make use of the substitutability. That’s pretty tight coupling, second only to friendship.
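
To illustrate that last point, here’s a minimal sketch (all names hypothetical):

class Base { /*...*/ };

class PubDerived  : public  Base { };  // substitutability visible to all
class PrivDerived : private Base {     // substitutability visible only inside
    void member() {
        Base& b = *this;  // ok: inside the class, the IS-A still works
    }
};

void caller( PubDerived& pub, PrivDerived& priv ) {
    Base& b1 = pub;     // ok: public inheritance is visible to everyone
    // Base& b2 = priv; // error: the conversion is inaccessible out here
}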

 

2. What further #includes could be removed if we made further changes to X, and how?

Many programmers still seem to march to the “It isn’t OO unless you inherit!” battle hymn, by which I mean that they use inheritance more than necessary. I’ll save the whole lecture for another time, but my bottom line is simply that inheritance (including but not limited to IS-A) is a much stronger relationship than HAS-A or USES-A. When it comes to managing dependencies, therefore, you should always prefer composition/membership over inheritance wherever possible. To paraphrase Einstein: ‘Use as strong a relationship as necessary, but no stronger.’

In this code, X is derived publicly from A and privately from B. Recall that public inheritance should always model IS-A and satisfy the Liskov Substitution Principle (LSP). In this case X IS-A A and there’s naught wrong with it, so we’ll leave that as it is.

But did you notice the curious thing about B‘s virtual functions?

“What?” you might say. “B has no virtual functions.”

Right. That is the curious thing.

B is a private base class of X. Normally, the only reason you would choose private inheritance over composition/membership is to gain access to protected members—which most of the time means “to override a virtual function.” (There are a few other rare and obscure reasons to inherit, but they’re, well, rare and obscure.) Otherwise you wouldn’t choose inheritance, because it’s almost the tightest coupling you can express in C++, second only to friendship.

We are given that B has no virtual functions, so there’s probably no reason to prefer the stronger relationship of inheritance—unless X needs access to some protected function or data in B, of course, but for now I’ll assume that this is not the case. So, instead of having a base subobject of type B, X probably ought to have simply a member object of type B. Therefore, the way to further simplify the header is:

 

(a) Remove unnecessary inheritance from class B.

#include "b.h"  // class B (has no virtual functions)

Because the B member object should be private (it is, after all, an implementation detail), and in order to get rid of the b.h header entirely, this member should live in X‘s hidden pimpl portion.
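
In the implementation file, that might look like the following sketch (assuming the impl members we saw in GotW #7b):

//  x.cpp: where the B subobject ends up
//
#include "x.h"
#include "b.h"  // class B: now needed only by the implementation file
#include "c.h"  // class C
#include "d.h"  // class D
#include <list>

struct X::impl {
    B            b;  // was a private base class; now just a hidden member
    std::list<C> clist;
    D            d;
};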

Guideline: Never inherit when composition is sufficient.

 

This leaves us with header code that’s vastly simplified from where we started in GotW #7a:

//  x.h: after removing unnecessary inheritance
//
#include <iosfwd>
#include <memory>
#include "a.h"  // class A (has virtual functions)
class B;
class C;
class E;

class X : public A {
public:
       X( const C& );
    B  f( int, char* );
    C  f( int, C );
    C& g( B );
    E  h( E );
    virtual std::ostream& print( std::ostream& ) const;

private:
    struct impl;
    std::unique_ptr<impl> pimpl;  // this now quietly includes a B
};

std::ostream& operator<<( std::ostream& os, const X& x ) {
    return x.print(os);
}

 

After three passes of progressively greater simplification, the final result is that x.h is still using other class names all over the place, but clients of X need only pay for three #includes: a.h, memory, and iosfwd. What an improvement over the original!

 

Acknowledgments

Thanks in particular to the following for their feedback to improve this article: juanchopanza, anicolaescu, Bert Rodiers.


Now that the unnecessary headers have been removed, it’s time for Phase 2: How can you limit dependencies on the internals of a class?

 

Problem

JG Questions

1. What does private mean for a class member in C++?

2. Why does changing the private members of a type cause a recompilation?

Guru Question

3. Below is how the header from the previous Item looks after the initial cleanup pass. What further #includes could be removed if we made some suitable changes, and how?

This time, you may make changes to X as long as X‘s base classes and its public interface remain unchanged; any current code that already uses X should not be affected beyond requiring a simple recompilation.

//  x.h: sans gratuitous headers
//
#include <iosfwd>
#include <list>

// None of A, B, C, or D are templates.
// Only A and C have virtual functions.
#include "a.h"  // class A
#include "b.h"  // class B
#include "c.h"  // class C
#include "d.h"  // class D
class E;

class X : public A, private B {
public:
       X( const C& );
    B  f( int, char* );
    C  f( int, C );
    C& g( B );
    E  h( E );
    virtual std::ostream& print( std::ostream& ) const;

private:
    std::list<C> clist;
    D            d;
};

std::ostream& operator<<( std::ostream& os, const X& x ) {
    return x.print(os);
}

 

Solution

1. What does private mean for a class member in C++?

It means that outside code cannot access that member. Specifically, it cannot name it or call it.

For example, given this class:

class widget {
public:
    void f()    { }
private:
    void f(int) { }
    int  i;
};

Outside code cannot use the name of the private members:

int main() {
    auto w = widget{};
    w.f();    // ok
    w.f(42);  // error, cannot access name "f(int)"
    w.i = 42; // error, cannot access name "i"
}

 

2. Why does changing the private members of a type cause a recompilation?

Because private data members can change the size of the object, and private member functions participate in overload resolution.

Note that accessibility is still safely enforced: Calling code still doesn’t get to use the private parts of the class. However, the compiler gets to know all about them at all times, including as it compiles the calling code. This does increase build coupling, but it’s for a deliberate reason: C++ has always been designed for efficiency, and a little-appreciated cornerstone of that is that C++ is designed to by default expose a type’s full implementation to the compiler in order to make aggressive optimization easier. It’s one of the fundamental reasons C++ is an efficient language.
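
Here’s a minimal sketch (with a hypothetical class) of why the compiler must see the privates even though callers can’t use them:

class calc {
public:
    void f( double );
private:
    void f( int );    // still participates in overload resolution
    char buf[16];     // still contributes to sizeof(calc)
};

int main() {
    calc c;
    c.f(3.14);  // ok: the public f(double) is the best match
    c.f(42);    // error: overload resolution picks the private f(int)
                //   first, and only then is access checked; remove that
                //   private overload and this would call f(double)
}

So if a private member changes, the size or overload behavior seen by calling code can change, and the calling code must be recompiled.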

 

3. What further #includes could be removed if we made some suitable changes, and how? … any current code that already uses X should not be affected beyond requiring a simple recompilation.

There are a few things we weren’t able to do in the previous problem:

  • We had to leave a.h and b.h. We couldn’t get rid of these because X inherits from both A and B, and you always have to have full definitions for base classes so that the compiler can determine X‘s object size, virtual functions, and other fundamentals. (Can you anticipate how to remove one of these? Think about it: Which one can you remove, and why/how? The answer will come shortly.)
  • We had to leave list, c.h, and d.h. We couldn’t get rid of these right away because a list<C> and a D appear as private data members of X. Although C appears as neither a base class nor a member, it is being used to instantiate the list member, and some compilers have required that when you instantiate list<C> you be able to see the definition of C. (The standard doesn’t require a definition here, though, so even if the compiler you are currently using has this restriction, you can expect the restriction to go away over time.)

Now let’s talk about the beauty of Pimpls.

 

The Pimpl Idiom

C++ lets us easily encapsulate the private parts of a class from unauthorized access. Unfortunately, because of the header file approach inherited from C, it can take a little more work to encapsulate dependencies on a class’ privates.

“But,” you say, “the whole point of encapsulation is that the client code shouldn’t have to know or care about a class’ private implementation details, right?” Right, and in C++ the client code doesn’t need to know or care about access to a class’ privates (because unless it’s a friend it isn’t allowed any), but because the privates are visible in the header, the client code does have to depend upon any types they mention. This couples the caller to the class’s internal details, both for (re)compilation and for binary layout.

How can we better insulate clients from a class’ private implementation details? One good way is to use a special form of the handle/body idiom, popularly called the Pimpl Idiom because of the intentionally pronounceable pimpl pointer, as a compilation firewall.

A Pimpl is just an opaque pointer (a pointer to a forward-declared, but undefined, helper class) used to hide the private members of a class. That is, instead of writing this:

// file widget.h
//
class widget {
    // public and protected members
private:
    // private members; whenever these change,
    // all client code must be recompiled
};

We write instead:

// file widget.h
//
#include <memory>

class widget {
public:
    widget();
    ~widget();
    // public and protected members
private:
    struct impl;
    std::unique_ptr<impl> pimpl;  // ptr to a forward-declared class
};

// file widget.cpp
//
#include "widget.h"

struct widget::impl {
    // private members; fully hidden, can be
    // changed at will without recompiling clients
};

widget::widget() : pimpl{ std::make_unique<widget::impl>(/*...*/) } { }
widget::~widget() = default;

Every widget object dynamically allocates its impl object. If you think of an object as a physical block, we’ve essentially lopped off a large chunk of the block and in its place left only “a little bump on the side”—the opaque pointer, or Pimpl. If copy and move are appropriate for your type, write those four operations to perform a deep copy that clones the impl state.
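
If widget did need copying, a sketch of those operations might look like this (assuming widget::impl is itself copyable, and that matching declarations are added to the class definition):

// file widget.cpp (continued)
//
widget::widget( const widget& other )
    : pimpl{ std::make_unique<impl>(*other.pimpl) } { }  // clone the impl

widget& widget::operator=( const widget& other ) {
    *pimpl = *other.pimpl;  // deep-copy the state into our own impl
    return *this;
}

widget::widget( widget&& ) noexcept = default;             // just moves the pointer
widget& widget::operator=( widget&& ) noexcept = default;  // ditto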

The major advantages of this idiom come from the fact that it breaks the caller’s dependency on the private details, including breaking both compile-time dependencies and binary dependencies:

  • Types mentioned only in a class’ implementation need no longer be defined for client code, which can eliminate extra #includes and improve compile speeds.
  • A class’ implementation can be changed—that is, private members can be freely added or removed—without recompiling client code. This is a useful technique for providing ABI-safety or binary compatibility, so that the client code is not dependent on the exact layout of the object.

The major costs of this idiom are in performance:

  • Each construction/destruction must allocate/deallocate memory.
  • Each access of a hidden member can require at least one extra indirection. (If the hidden member being accessed itself uses a back pointer to call a function in the visible class, there will be multiple indirections, but it is usually easy to avoid needing a back pointer.)

And of course we’re replacing any removed headers with the <memory> header.

We’ll come back to these and other Pimpl issues in GotW #24. For now, in our example, there were three headers whose definitions were needed simply because they appeared as private members of X. If we instead restructure X to use a Pimpl, we can immediately make several further simplifications:

#include <list>
#include "c.h" // class C
#include "d.h" // class D

One of these headers (c.h) can be replaced with a forward declaration because C is still being mentioned elsewhere as a parameter or return type, and the other two (list and d.h) can disappear completely.

Guideline: For widely-included classes whose implementations may change, or to provide ABI-safety or binary compatibility, consider using the compiler-firewall idiom (Pimpl Idiom) to hide implementation details. Use an opaque pointer (a pointer to a declared but undefined class) declared as struct impl; std::unique_ptr<impl> pimpl; to store private nonvirtual members.

 

Note: We can’t tell from the original code by itself whether or not X had (default) copy or move operations. If it did, then to preserve that we would need to write them again ourselves since the move-only unique_ptr member suppresses the implicit generation of copy construction and copy assignment, and the user-declared destructor suppresses the implicit generation of move construction and move assignment. If we do need to write them by hand, the move constructor and move assignment can be =defaulted, and the copy constructor and copy assignment will need to copy the Pimpl object.

After making that additional change, the header looks like this:

//  x.h: after converting to use a Pimpl
//
#include <iosfwd>
#include <memory>
#include "a.h"  // class A (has virtual functions)
#include "b.h"  // class B (has no virtual functions)
class C;
class E;

class X : public A, private B {
public:
    ~X();  // defined out of line
    // and copy/move operations if X had them before

       X( const C& );
    B  f( int, char* );
    C  f( int, C );
    C& g( B );
    E  h( E );
    virtual std::ostream& print( std::ostream& ) const;

private:
    struct impl;
    std::unique_ptr<impl> pimpl;  // ptr to a forward-declared class
};

std::ostream& operator<<( std::ostream& os, const X& x ) {
    return x.print(os);
}

Without more extensive changes, we still need the definitions for A and B because they are base classes, and we have to know at least their sizes in order to define the derived class X.

The private details go into X‘s implementation file where client code never sees them and therefore never depends upon them:

//  Implementation file x.cpp
//
#include "x.h"
#include <list>
#include "c.h"  // class C
#include "d.h"  // class D
using namespace std;

struct X::impl {
    list<C> clist;
    D       d;
};

X::X( const C& ) : pimpl{ make_unique<X::impl>(/*...*/) } { }
X::~X() = default;

That brings us down to including only four headers, which is a great improvement—but it turns out that there is still a little more we could do, if only we were allowed to change the structure of X more extensively. This leads us nicely into Part 3…

 

Acknowledgments

Thanks to the following for their feedback to improve this article: John Humphrey, thokra, Motti Lanzkron, Marcelo Pinto.


Managing dependencies well is an essential part of writing solid code. C++ supports two powerful methods of abstraction: object-oriented programming and generic programming. Both of these are fundamentally tools to help manage dependencies, and therefore manage complexity. It’s telling that all of the common OO/generic buzzwords—including encapsulation, polymorphism, and type independence—along with most design patterns, are really about describing ways to manage complexity within a software system by managing the code’s interdependencies.

When we talk about dependencies, we usually think of run-time dependencies like class interactions. In this Item, we will focus instead on how to analyze and manage compile-time dependencies. As a first step, try to identify (and root out) unnecessary headers.

Problem

JG Question

1. For a function or a class, what is the difference between a forward declaration and a definition?

Guru Question

2. Many programmers habitually #include many more headers than necessary. Unfortunately, doing so can seriously degrade build times, especially when a popular header file includes too many other headers.

In the following header file, what #include directives could be immediately removed without ill effect? You may not make any changes other than removing or rewriting (including replacing) #include directives. Note that the comments are important.

//  x.h: original header
//
#include <iostream>
#include <ostream>
#include <list>

// None of A, B, C, D or E are templates.
// Only A and C have virtual functions.
#include "a.h"  // class A
#include "b.h"  // class B
#include "c.h"  // class C
#include "d.h"  // class D
#include "e.h"  // class E

class X : public A, private B {
public:
       X( const C& );
    B  f( int, char* );
    C  f( int, C );
    C& g( B );
    E  h( E );
    virtual std::ostream& print( std::ostream& ) const;

  private:
    std::list<C> clist;
    D            d_;
  };

std::ostream& operator<<( std::ostream& os, const X& x ) {
    return x.print(os);
}

Solution

1. For a function or class, what is the difference between a forward declaration and a definition?

A forward declaration of a (possibly templated) function or class simply introduces a name. For example:

class widget;  // "widget" names a class 

widget* p;     // ok: allocates sizeof(widget*) space typed as widget*

widget  w;     // error: wait, what? how big is that? does it have a
               //        default constructor?

Again, a forward declaration only introduces a name. It lets you do things that require only the name, such as declaring a pointer to it—all pointers to objects are the same size and have the same set of operations you can perform on them, and ditto for pointers to nonmember functions, so the name is all you need to make a strongly-typed and fully-usable variable that’s a pointer to class or pointer to function.

What a class forward declaration does not do is tell you anything about what you can do with the type itself, such as what constructors or member functions it has or how big it is if you want to allocate space for one. If you try to create a widget w; with only the above code, you’ll get a compile-time error because widget has no definition yet and so the compiler can’t know how much space to allocate or what functions the type has (including whether it has a default constructor).

A class definition has a body and lets you know the class’s size and know the names and types of its members:

class widget { // "{" means definition
    widget();
    // ...
};

widget* p;     // ok: allocs sizeof(ptr) space typed as widget*

widget  w;     // ok: allocs sizeof(widget) space typed as widget
               //     and calls default constructor

2. In the following header file, what #include directives could be immediately removed without ill effect?

Of the first two standard headers mentioned in x.h, one can be immediately removed because it’s not needed at all, and the second can be replaced with a smaller header:

1. Remove iostream.

#include <iostream>

Many programmers #include <iostream> purely out of habit as soon as they see anything resembling a stream nearby. Class X does make use of streams, that’s true; but it doesn’t mention anything specifically from iostream, which mainly declares the standard stream objects like cout. At the most, X needs ostream alone for its basic_ostream type, and even that can be whittled down as we will see.

Guideline: Never #include unnecessary header files.

2. Replace ostream with iosfwd.

#include <ostream>

Parameter and return types only need to be forward-declared, so instead of the full definition of ostream we really only need its forward declaration.

However, you can’t write the forward declaration yourself using something like class ostream;. First, ostream lives in namespace std, and you aren’t allowed to redeclare existing standard types and objects in that namespace. Second, ostream is an alias for basic_ostream<char>, which you couldn’t reliably forward-declare even if you were allowed to, because library implementations are allowed to do things like add their own extra template parameters beyond those required by the standard, which of course your code wouldn’t know about. That is one of the primary reasons for the rule that programmers aren’t allowed to write their own declarations for things in namespace std.

All is not lost, though: The standard library helpfully provides the header iosfwd, which contains forward declarations for all of the stream templates and their standard aliases, including basic_ostream and ostream. So all we need to do is replace #include <ostream> with #include <iosfwd>.

Guideline: Prefer to #include <iosfwd> when a forward declaration of a stream will suffice.

Incidentally, once you see iosfwd, you might think that the same trick would work for other standard library templates like string and list. There are, however, no comparable “stringfwd” or “listfwd” standard headers. The iosfwd header was created to give streams special treatment for backwards compatibility, to avoid breaking code written in years past for the “old” non-templated version of the iostreams subsystem. It is hoped that a real solution will come in a future version of C++ that supports modules, but that’s a topic for a later time.

There, that was easy. We can now move on to…

… what? “Not so fast!” I hear some of you say. “This header does a lot more with ostream than just mention it as a parameter or return type. The inlined operator<< actually uses an ostream object! So it must need ostream‘s definition, right?”

That’s a reasonable question. Happily, the answer is: No, it doesn’t. Consider again the function in question:

std::ostream& operator<<( std::ostream& os, const X& x ) {
    return x.print(os);
}

This function mentions an ostream& as both a parameter and a return type, which most people know doesn’t require a definition. And it passes its ostream& parameter in turn as a parameter to another function, which many people don’t know doesn’t require a definition either—it’s the same as the pointer case, ostream*, discussed above. As long as that’s all we’re doing with the ostream&, there’s no need for a full ostream definition: we’re not really using an ostream itself at all, such as by calling member functions on it; we’re only using a reference to the type, for which we need to know only the name. Of course, we would need the full definition if we tried to call any member functions, but we’re not doing anything like that here.
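
Here’s the same point as a standalone sketch (all names hypothetical):

class logger;               // forward declaration only

logger& helper( logger& );  // declared elsewhere; no definition needed here

logger& route( logger& lg ) {
    return helper(lg);      // ok: passing the reference along needs only the name
    // lg.write("hello");   // error: calling a member would require the definition
}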

So, as I was saying, we can now move on to get rid of one of the other headers, but only one just yet:

3. Replace e.h with a forward declaration.

#include "e.h"  // class E

Class E is mentioned only as a parameter and return type in the function E h(E), so no definition is required, and x.h shouldn’t be pulling in e.h in the first place: the caller couldn’t even call this function without already having the definition of E, so there’s no point in including it again. (Note this would not be true if E were only a return type, such as if the signature were E h();, because in that case it’s good style to include E’s definition for the caller’s convenience, so he can easily write code like auto val = x.h();.) All we need to do is replace #include “e.h” with class E;.

Guideline: Never #include a header when a forward declaration will suffice.

That’s it.

You may be wondering why we can’t get rid of the other headers yet. It’s because to define class X means you need to know its size in order to know how much space to allocate for an X object, and to know X’s size you need to know at least the size of every base class and data member. So we need the definitions of A and B because they are base classes, and we need the header definitions of list, C, and D because they are used to define the data members. How we can begin to address some of these is the subject of Part 2…

 

Acknowledgments

Thanks to the following for their feedback to improve this article: Gennaro, Sebastien Redl, Emmanuel Thivierge.


Toward correct-by-default, efficient-by-default, and pitfall-free-by-default variable declarations, using “AAA style”… where “triple-A” is both a mnemonic and an evaluation of its value.

Problem

JG Questions

1. What does this code do? What would be a good name for some_function?

template<class Container, class Value>
void some_function( Container& c, const Value& v ) {
    if( find(begin(c), end(c), v) == end(c) )
        c.emplace_back(v); 
    assert( !c.empty() );
}

2. What does “write code against interfaces, not implementations” mean, and why is it generally beneficial?

Guru Questions

3. What are some popular concerns about using auto to declare variables? Are they valid? Discuss.

4. When declaring a new local variable x, what advantages are there to declaring it using auto and one of the two following syntaxes:

(a) auto x = init; when you don’t need to commit to a specific type? (Note: The expression init might include calling a helper that performs partial type adjustment, such as as_signed, while still not committing to a specific type.)

(b) auto x = type{ init }; when you do want to commit to a specific type by naming a type?

List as many as you can. (Hint: Look back to GotW #93.)

5. Explain how using the style suggested in #4 is consistent with, or actively leverages, the following other C++ features:

(a) Heap allocation syntax.

(b) Literal suffixes, including user-defined literal operators.

(c) Named lambda syntax.

(d) Function declarations.

(e) Template alias declarations.

6. Are there any cases where it is not possible to use the style in #4 to declare all local variables?

Solution

1. What does this code do? What would be a good name for some_function?

template<class Container, class Value>
void append_unique( Container& c, const Value& v ) {
    if( find(begin(c), end(c), v) == end(c) )
        c.emplace_back(v); 
    assert( !c.empty() );
}

Let’s call this function append_unique. First, it checks to see whether the value v is already in the container. If not, it appends it at the end. Finally, it asserts that c is not empty, since by now it must contain one copy of the value v.

You probably thought this question was fairly easy.

Maybe too easy.

If so, good. That’s the point of the example. Hold the thought, and we’ll come back to this in Question 3.

2. What does “write code against interfaces, not implementations” mean, and why is it generally beneficial?

It means we should care principally about “what,” not “how.” This separation of concerns applies at all levels in high-quality modern software—hiding code, hiding data, and hiding type. Each increases encapsulation and reduces coupling, which are essential for large-scale and robust software.

Please indulge a little repetition in the following paragraphs. It’s there to make a point about similarity.

Hiding code. With the invention of separately compiled functions and structured programming, we gained “encapsulation to hide code.” The caller knows the signature only—the function’s internal code is not his concern and not accessible programmatically, even if the function is inline and the body happens to be visible in source code. We try hard not to inadvertently leak implementation details, such as internal data structure types. The point is that the caller does not, and should not, commit to knowledge of the current internal code; if he did, it would create interdependencies and make separately compiled libraries impossible.

Hiding data (and code). With object oriented styles (OO), we gained two new manifestations of this separation. First, we got “more encapsulation to hide both code and data.” The caller knows the class name, bases, and member function signatures only—the class’s internal data and internal code are hidden and not accessible programmatically, even though the private class members are lexically visible in the class definition and inline function bodies may also be visible. (In turn, dynamic libraries and the potential future-C++ modules work aim to accomplish the same thing at a still larger scale.) Again we try hard not to inadvertently leak implementation details, and again the point is that the caller does not, and should not, commit to knowledge of the current internal data or code, which would make the class difficult to ever change or to ship on its own as a library.

Hiding type (run-time polymorphism). Second, OO also gave us “separation of interfaces to hide type.” A base class or interface can delegate work to a concrete derived implementation via virtual functions. Now the interface the caller sees and the implementation are actually different types, and the caller knows the base type only—he doesn’t know or care about the concrete type, including even its size. The point, once again, is that the caller does not, and should not, commit to a single concrete type, which would make the caller’s code less general and less able to be reused with new types.

Hiding type (compile-time polymorphism). With templates, we gained a new compile-time form of this separation—and it’s still “separation of interfaces to hide type.” The caller knows an ad-hoc “duck typed” set of operations he wants to perform using a type, and any type that supports those operations will do just fine. The contemplated future C++ concepts feature will allow making this stricter and less ad-hoc, but still avoids committing to a concrete type at all. The whole point is still that the caller does not, and should not, commit to a single concrete type, which would make the caller’s code less generic and less able to be reused with new types.

3. What are some popular concerns about using auto to declare variables? Are they valid? Discuss.

In many languages, not just C++, there are several reasons people commonly give for why they are reluctant to use auto to declare variables (or the equivalent in another language, such as var or let). We could summarize them as: laziness, commitment, and readability. Let’s take them in order.

Laziness and commitment

First, laziness: One common concern is that “writing auto to declare a variable is primarily about saving typing.” However, this is just a misunderstanding of auto. As we saw in GotW #92 and #93 and will see again below, the main reasons to declare variables using auto are for correctness, performance, maintainability, and robustness—and, yes, convenience, but that’s in last place on the list.

Guideline: Remember that preferring auto variables is motivated primarily by correctness, performance, maintainability, and robustness—and only lastly about typing convenience.

Second, commitment: “But in some cases I do want to commit to a specific type, not automatically deduce it, so I can’t use auto.” It’s true that sometimes you do want to commit to a specific type, but you can still use auto. As demonstrated in GotW #92 and #93, not only can you still write declarations of the form auto x = type{ init }; (instead of type x{init};) to commit to a specific type, but there are good reasons for doing so, such as that saying auto means you can’t possibly forget to initialize the variable.

Guideline: Consider declaring local variables auto x = type{ expr }; when you do want to explicitly commit to a type. It is self-documenting to show that the code is explicitly requesting a conversion, it guarantees the variable will be initialized, and it won’t allow an accidental implicit narrowing conversion. Only when you do want explicit narrowing, use ( ) instead of { }.
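To illustrate the guideline, here's a quick sketch:

auto i = int{ 3.14 };    // error: { } rejects the implicit narrowing double-to-int
auto j = int( 3.14 );    // ok: ( ) explicitly requests the narrowing conversion
auto k = int{ 3 };       // ok: no narrowing, and k cannot be left uninitialized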

(Un)readability?

The third and most common argument concerns readability: “My code gets unreadable quickly when I don’t know what exact type my variable is without hunting around to see what that function or expression returns, so I can’t just use auto all the time.” There is truth to this, including losing the ability to search for occurrences of specific types when using the non-typed syntax auto x = expr; in 4(a) below, so this appears at first to be a strong argument. And it’s true that any feature can be overused. However, I think this argument is actually weaker than it first seems for four reasons, two minor and two major.

The two minor counterarguments are:

  • The “can’t use auto” part isn’t actually true, because as we just saw above you can be explicit about your type and still use auto, with good benefit.
  • The argument doesn’t apply when you’re using an IDE, because you can always tell the exact type, for example by hovering over the variable. Granted, this mitigation goes away when you leave the IDE, such as if you print the code.

But we should focus on the two major counterarguments:

  • It reflects a bias to code against implementations, not interfaces. Overcommitting to explicit types makes code less generic and more interdependent, and therefore more brittle and limited. It runs counter to the excellent reasons to “write code against interfaces, not implementations” we saw in Question 2.
  • We (meaning you) already ignore actual types all the time…

“… Wait, what? I do not ignore types all the time,” someone might say. Actually, not only do you do it, but you’re so comfortable and cavalier about it that you may not even realize you’re doing it. Let’s go back to that code in Question 1:

template<class Container, class Value>
void append_unique( Container& c, const Value& v ) {
    if( find(begin(c), end(c), v) == end(c) )
        c.emplace_back(v); 
    assert( !c.empty() );
}

Quick quiz: How many specific types are mentioned in that function? Name as many as you can.

Take a moment to consider that before reading on…

… We can see pretty quickly that the answer is a nice round number: Zero. Zilch. (Pedantic mode: Yes, there’s void, but I’m going to declare that void doesn’t count because it’s to denote “no type,” it’s not a meaningful type.)

Not a single specific type appears anywhere in this code, and the lack of exact types makes it much more powerful and doesn’t significantly harm its readability. Like most people, you probably thought Question 1 felt “easy” when we did it in isolation. Granted, this is generic code, and not all your code will be templates—but the point is that the code isn’t unreadable even though it doesn’t mention specific types, and in fact auto gives you the ability to write generic code even when not writing a template.

So starting with the cases illustrated in this short example, let’s consider some places where we routinely ignore exact types. First, function template parameters:

  • What exact type is Container? We have no idea, and that's great… anything we can call begin, end, emplace_back and empty on and otherwise use as needed by this code will do just fine. In fact, we're glad we don't know anything about the exact type, because it means we're following the Open/Closed Principle and staying open for extension—this append_unique will work fine with a type that won't be written until years from now. Interestingly, the concepts feature currently being proposed for ISO C++ to express template parameter constraints doesn't change how this works at all; it only makes the requirements more convenient to express and check. Note how much more powerful this is compared to OO-style frameworks: where containers have to inherit from a base class or interface, that already induces coupling and limits the ability to just plug in and use arbitrary suitable types. It is important that we can know nothing at all about the type here besides its necessary interface, not even restricting it to types in a particular inheritance hierarchy. We should strongly resist compromising this wonderful and powerful "strictly typed but loosely coupled" genericity.
  • What exact type is Value? Again, we don't know, and we don't want to know… anything we can pass to find and emplace_back is just dandy. At this point some of you may be thinking: "Oh yes we know what type it is, it's the container's value type!" No, it doesn't have to be that—it just has to be convertible, and that's important. For example, we want vector<string> vec; append_unique(vec, "xyzzy"); to work, and "xyzzy" is a const char[6], not a string.
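To see that "strictly typed but loosely coupled" flexibility in action, here's a small usage sketch of append_unique with two containers it was never written to know about:

vector<string> vec;
append_unique( vec, "xyzzy" );   // Value = const char[6], converted to string

list<int> lst;
append_unique( lst, 42 );        // a completely different container works too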

Second, function return values:

  • What type does find return? Some iterator type, the same as begin(c) coughed up, but we don’t know specifically what type it is just from reading this code, and it doesn’t matter. We can look up the signature if we’re feeling really curious, but nobody bothers doing that because anything that’s comparable to end(c) will do.
  • What type does empty return? We don’t even think twice about it. Something testable like a bool… we don’t care much what exactly as long as we can “not” it.

Third, many function parameters:

  • What specific type does emplace_back take? Don’t know; might be the same as v, might not. Really don’t care. Can we pass v to it? Yes? Groovy.

And that’s just in this example. We routinely and desirably ignore types in many other places, such as:

  • Fourth, any temporary object: We never get to name the object, much less name its type, and we may know what the type is but we don’t care about actually spelling out either name in our code.
  • Fifth, any use of a base class: We don’t know the dynamic concrete type we’re actually using, and that’s a benefit, not a bug.
  • Sixth, any call to a virtual function: Ditto; and on top of that, the virtual function's return type may itself be covariant, which adds another layer of "we don't know the dynamic concrete type," because in the presence of covariance we don't know the exact type we're actually getting back.
  • Seventh, any use of function<>, bind, or other type erasure: Just think about how little we actually know, and how happy it makes us. For example, given a function<int(string)>, not only don't we know what specific function or object it's bound to, we don't even know that thing's signature—it might not actually even take a string or return an int, because conversions are allowed in both directions, so it only has to take something a string can be converted to, and return something that can be converted to an int. All we know is that it's something that we can invoke with a string and that gives us back something we can use as an int. Ignorance is bliss. (See the sketch after this list.)
  • Eighth, any use of a C++14 generic lambda function: A generic lambda just means the function call operator is a template, after all, and like any function template it gets stamped out individually for whatever actual argument types you pass each time you use it.

There are probably more.
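Here's a minimal sketch of the seventh point above; the text class and length_of function are hypothetical, made up just to show conversions flowing in both directions:

struct text {                                      // constructible from string
    text( const string& str ) : s(str) { }
    string s;
};

long length_of( text t ) { return t.s.size(); }    // takes text, returns long

function<int(string)> f = length_of;   // neither parameter nor return matches exactly
int n = f( "plugh" );                  // const char* -> string -> text on the way in,
                                       // long -> int on the way out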

Although lack of commitment may be a bad thing in other areas of life, not committing to a specific type is often desirable by default in reusable code.

4. When declaring a new local variable x, what advantages are there to declaring it using auto and one of the two following syntaxes:

Let’s consider the base case first, which has by far the strongest arguments in its favor and is gaining quite a bit of traction in the C++ community.

(a) auto x = init; when you don’t need to commit to a specific type?

GotW #93 offered many concrete examples to support habitually declaring local variables using auto x = expr; when you don’t need to explicitly commit to a type. The advantages include:

  • It guarantees the variable will be initialized. Uninitialized variables are impossible, because once you start by saying auto, the = is required and cannot be forgotten.
  • It is efficient by default and guarantees that no implicit conversions (including narrowing conversions), temporary objects, or wrapper indirections will occur. In particular, prefer using auto instead of function<> to name lambdas unless you need the type erasure and indirection (see the example after this list).
  • It guarantees that you will use the correct exact type now.
  • It guarantees that you will continue to use the correct exact type under maintenance as the code changes, and the variable’s type automatically tracks other functions’ and expressions’ types unless you explicitly said otherwise.
  • It is the simplest way to portably spell the implementation-specific type of arithmetic operations on built-in types, which vary by platform, and ensure that you cannot accidentally get lossy narrowing conversions when storing the result.
  • It is the only good option for hard-to-spell and impossible-to-spell types such as lambdas, binders, detail:: helpers, and template helpers (including expression templates when they should stay unevaluated for performance), short of resorting to repetitive decltype expressions or more-expensive indirections like function<>.
  • It is more symmetric and consistent with other parts of modern C++ (see Question 5).
  • And yes, it is just generally simpler and less typing.
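For instance, here's the lambda-naming point from the list above in code:

auto is_even = []( int i ) { return i % 2 == 0; };
    // exact closure type: no indirection, easily inlined

function<bool(int)> is_odd = []( int i ) { return i % 2 != 0; };
    // type-erased: extra indirection and possible allocation;
    // use only when you actually need the erasure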

See GotW #93 for concrete examples of these cases, where using auto helps eliminate correctness bugs, performance bugs, and silently nonportable code.

As noted in the questions, the expression init might include calling a helper that performs partial type adjustment, such as as_signed, while still not committing to a specific type. As shown in GotW #93, prefer to use auto x = as_signed(integer_expr); or auto x = as_unsigned(integer_expr); to store the result of an integer computation that should be signed or unsigned—these should be viewed as “casts that preserve width,” so we are not casting to a specific type but rather casting an attribute of the type while correctly preserving the other basic characteristics of the type, notably by not forcing it to commit to a particular size.

Using auto together with as_signed or as_unsigned makes code more portable: the variable will both be large enough (thanks to auto) and preserve the required signedness on all platforms. Note that signed/unsigned conversions within integer_expr may still occur and so you may need additional finer-grained as_signed/as_unsigned casts within the expression for full portability.
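As a refresher, the helpers could be implemented along these lines (a sketch only, using C++14's make_signed_t from <type_traits>; see GotW #93 for the real thing, and the usage line assumes hypothetical unsigned counters c1 and c2):

template<class T>
make_signed_t<T> as_signed( T t )         // same width, signed
    { return static_cast<make_signed_t<T>>(t); }

template<class T>
make_unsigned_t<T> as_unsigned( T t )     // same width, unsigned
    { return static_cast<make_unsigned_t<T>>(t); }

auto x = as_signed( c1 - c2 );   // x is signed, and exactly as wide
                                 // as the platform's computation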

(b) auto x = type{ init }; when you do want to commit to a specific type by naming a type?

This is the explicitly typed form, and it still has advantages, but they are not as clearly strong as those of the implicitly typed form. The jury is still out on whether to recommend this one wholesale, but it does offer some advantages, and I suggest you try it for a while and see if it works well for you.

So here’s the recommendation to consider trying out for yourself: Consider declaring local variables auto x = type{ expr }; when you do want to explicitly commit to a type. (Only when you do want to allow explicit narrowing, use ( ) instead of { }.) The advantages of this typed auto declaration style include:

  • It guarantees the variable will be initialized; you can’t forget.
  • It is self-documenting to show that the code is explicitly requesting a conversion.
  • It won’t allow an accidental implicit narrowing conversion.
  • It is more symmetric and consistent, both with the basic auto x = init; form and with other parts of C++…

… which brings us to Question 5.

5. Explain how using the style suggested in #4 is consistent with, or actively leverages, the following other C++ features:

Let’s start off this question with some side-by-side examples that give us a taste of the symmetry we gain when we habitually declare variables using modern auto style. Starting with two examples where we don’t need to commit to a type and then two where we do, we see that the right-hand style is not only more robust and maintainable for the reasons already given (for example, can you spot a subtle difference in the type of s, where the auto style is more correct?), but also arguably cleaner and more regular with the type consistently on the right when it is mentioned:

// Classic C++ declaration order     // Modern C++ style

const char* s = "Hello";             auto s = "Hello";
widget w = get_widget();             auto w = get_widget();

employee e{ empid };                 auto e = employee{ empid };
widget w{ 12, 34 };                  auto w = widget{ 12, 34 };

Now consider the (dare we say elegant) symmetry with each of the following.

(a) Heap allocation syntax.

When allocating heap variables, did you notice that the type name is already on the right naturally anyway? And since it’s there, we don’t want to have to repeat it. (I’ll show the raw “new” form for completeness, but prefer make_unique and make_shared in that order for allocation in modern code, resorting to raw new only well-encapsulated inside the implementation of low-level data structures.)

// Classic C++ declaration order     // Modern C++ style

widget* w = new widget{};            /* auto w = new widget{}; */
unique_ptr<widget> w                 auto w = make_unique<widget>();
  = make_unique<widget>();

(b) Literal suffixes, including user-defined literal operators.

Using auto declaration style not only works naturally with built-in literal suffixes like ul for unsigned long and with user-defined literals, including standard ones now in draft C++14, but actively encourages using them:

// Classic C++ declaration order     // Modern C++ style

int x = 42;                          auto x = 42;
float x = 42.;                       auto x = 42.0f;
unsigned long x = 42;                auto x = 42ul;
std::string x = "42";                auto x = "42"s;   // C++14
chrono::nanoseconds x{ 42 };         auto x = 42ns;    // C++14

Based on the examples so far, which do you think is more regular? But wait, there’s more…

(c) Named lambda syntax.
(d) Function declarations.

Lambdas have unutterable types, and auto is the best way to capture them exactly and efficiently. But because their declarations are now so similar, let’s consider lambdas and (other) functions together, and in the last two lines of this example also use C++14 return type deduction:

// Classic C++ declaration order     // Modern C++ style

int f( double );                     auto f (double) -> int;
…                                    auto f (double) { /*...*/ };
…                                    auto f = [=](double) { /*...*/ };

(e) Template alias declarations.

Modern C++ frees us from the tyranny of un-template-able typedef:

// Classic C++ workaround            // Modern C++ style

typedef set<string> dict;            using dict = set<string>;

template<class T> struct myvec {       template<class T>
  typedef vector<T,myalloc<T>> type;   using myvec = vector<T,myalloc<T>>;
};

An observation

Have you noticed that the C++ world is moving to a left-to-right declaration style everywhere, of the form

category name = type and/or initializer ;

where “category” can be auto or using?

Take a moment to re-skim the two columns of examples above. Even ignoring correctness and performance advantages, do you find the right-hand column to be most consistent, and most readable?

6. Are there any cases where it is not possible to use the style in #4 to declare all local variables?

There is one case I know of where this style cannot be followed, and it applies to the type-specific auto x = type{ init }; form. In that form, type has to be moveable (even though the move operation will be routinely elided by compilers), so these won’t work:

auto lock = lock_guard<mutex>{ m };  // error, not moveable
auto ai   = atomic<int>{};           // error, not moveable

(Aside: For at least some of these cases, an argument could be made that this is actually more of a defect in the type itself, in particular that perhaps atomic<int> should be moveable.)
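In these cases, just fall back to the classic direct form, which involves no move at all:

lock_guard<mutex> lock{ m };   // ok: direct initialization, nothing moved
atomic<int>       ai{};        // ok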

Having said that, there are three other cases I know of that you might encounter that may at first look like they don’t work with this auto style, but actually do. Let’s consider those for completeness.

First, the basic form auto x = init; will exactly capture an initializer_list or a proxy type, such as an expression template. This is a feature, not a bug, because you have a convenient way to spell both "capture the list or proxy" and "resolve the computation" depending on which you mean, and the default syntax goes to the more efficient one: If you want to efficiently capture the list or proxy, use the basic form, which gives you performance by default, and if you mean to force the proxy to resolve the computation, specify the explicit type to ask for the conversion you want. For example:

auto i1 = { 1 };                       // initializer_list<int>
auto i2 = 1;                           // int

auto a = matrix{...}, b = matrix{...}; // some type that does lazy eval
auto ab = a * b;                       // to capture the lazy-eval proxy
auto c = matrix{ a * b };              // to force computation

Second, here is a rare case that you may discover now that we have auto: Due to the mechanics of the C++ grammar, you can’t legally write a multi-word type like long long or class widget in the place where type goes in the auto x = type{ init }; form. However, note that this affects only those two cases:

  • The multi-word built-in types like long long, where you’re better off anyway writing a known-width type alias or using a literal.
  • Elaborated type specifiers like class widget, where the "class" part is already redundant. The "class widget" syntax is allowed as a compatibility holdover from C, which liked seeing struct widget everywhere unless you typedef'd the struct part away.

So just avoid the multi-word form and use the better alternative instead:

auto x = long long{ 42 };            // error
auto x = int64_t{ 42 };              // ok, better 
auto x = 42LL;                       // ok, better 

auto y = class X{1,2,3};             // error
auto y = X{1,2,3};                   // ok

Summary

We already ignore explicit and exact types much of the time, including with temporary objects, virtual functions, templates, and more. This is a feature, not a bug, because it makes our code less tightly coupled, and more generic, flexible, reusable, and future-proof.

Declaring variables using auto, whether or not we want to commit to a type, offers advantages for correctness, performance, maintainability, and robustness, as well as typing convenience. Furthermore, it is an example of how the C++ world is moving to a left-to-right declaration style everywhere, of the form

category name = type and/or initializer ;

where “category” can be auto or using, and we can get not only correctness and performance but also consistency benefits by using the style to consistently declare local variables (including using literals and user-defined literals), function declarations, named lambdas, aliases, template aliases, and more.

Acknowledgments

Thanks in particular to Scott Meyers and Andrei Alexandrescu for their time and insights in reviewing and discussing drafts of this material. Both helped generate candidate names for this idiom; it was Alexandrescu who suggested the name “AAA (almost always auto)” which I merged with the best names I’d thought of to that point (“auto style” or “auto (+type) style”) to get “AAA Style (almost always auto).” Thanks also to the following for their feedback to improve this article: Adrian, avjewe, mttpd, ned, zadecn, noniussenior, Marcel Wid, J Guy Davidson, Mark Garcia.

Read Full Post »
