My //build/ talk on Friday @ noon PDT (webcast)

The session schedule for this week’s //build/ conference in San Francisco has now been posted.

I have a talk on Friday at noon Pacific time, titled “The Future of C++.” Note this is a Microsoft conference, so the talk is specifically about the future of the Visual C++ product, but nevertheless it’s all about Standard C++ because I’ll start with a short update on ISO goings-on and the bulk of the talk will be an update on standards conformance in Visual C++ and explaining a number of the most modern ISO C++ features.

On Friday, you can watch my talk live at Channel 9. In the meantime, you can get the keynotes and some other major sessions at the same link all day today, tomorrow, and Friday… as I write this, a cool guy named Steve has the camera and just gave away thousands of nifty 8″ tablets. I’ll be in the same place in two days to talk C++.

If you’re in San Francisco for //build/ and care about C++, you want to be in South Hall: Gateway Ballroom for session 2-306.

If you’re not at the conference but use Visual C++ as one of your C++ compilers, you’ll want to watch the talk live in the webcast or on demand about 24-48 hours after the talk ends.

Even if you don’t use Visual C++ a lot right now, you might find some of the ISO C++ standards context and updates interesting.

Stay tuned.

GotW #94 Special Edition: AAA Style (Almost Always Auto)

Toward correct-by-default, efficient-by-default, and pitfall-free-by-default variable declarations, using “AAA style”… where “triple-A” is both a mnemonic and an evaluation of its value.

 

Problem

JG Questions

1. What does this code do? What would be a good name for some_function?

template<class Container, class Value>
void some_function( Container& c, const Value& v ) {
    if( find(begin(c), end(c), v) == end(c) )
        c.emplace_back(v);
    assert( !c.empty() );
}

2. What does “write code against interfaces, not implementations” mean, and why is it generally beneficial?

Guru Questions

3. What are some popular concerns about using auto to declare variables? Are they valid? Discuss.

4. When declaring a new local variable x, what advantages are there to declaring it using auto and one of the two following syntaxes:

(a) auto x = init; when you don’t need to commit to a specific type? (Note: The expression init might include calling a helper that performs partial type adjustment, such as as_signed, while still not committing to a specific type.)

(b) auto x = type{ init }; when you do want to commit to a specific type by naming a type?

List as many as you can. (Hint: Look back to GotW #93.)

5. Explain how using the style suggested in #4 is consistent with, or actively leverages, the following other C++ features:

(a) Heap allocation syntax.

(b) Literal suffixes, including user-defined literal operators.

(c) Named lambda syntax.

(d) Function declarations.

(e) Template alias declarations.

6. Are there any cases where it is not possible to use the style in #4 to declare all local variables?

GotW #93 Solution: Auto Variables, Part 2

Why prefer declaring variables using auto? Let us count some of the reasons why…

Problem

JG Question

1. In the following code, what actual or potential pitfalls exist in each labeled piece of code? Which of these pitfalls would using auto variable declarations fix, and why or why not?

// (a)
void traverser( const vector<int>& v ) {
    for( vector<int>::iterator i = begin(v); i != end(v); i += 2 )
        // ...
}

// (b)
vector<int> v1(5);
vector<int> v2 = 5;

// (c)
gadget get_gadget();
// ...
widget w = get_gadget();

// (d)
function<void(vector<int>)> get_size
    = [](const vector<int>& x) { return x.size(); };

Guru Question

2. Same question, subtler examples: In the following code, what actual or potential pitfalls exist in each labeled piece of code? Which of these pitfalls would using auto variable declarations fix, and why or why not?

// (a)
widget w;

// (b)
vector<string> v;
int size = v.size();

// (c) x and y are of some built-in integral type
int total = x + y;

// (d) x and y are of some built-in integral type
int diff = x - y;
if(diff < 0) { /*...*/ }

// (e)
int i = f(1,2,3) * 42.0;

Solution

As you worked through these cases, perhaps you noticed a pattern: The cases are mostly very different, but what they have in common is that they illustrate reason after reason motivating why (and how) to use auto to declare variables. Let’s dig in and see.

1. In the following code, what actual or potential pitfalls exist, which would using auto variable declarations fix, and why or why not?

(a) will not compile

// (a)
void traverser( const vector<int>& v ) {
    for( vector<int>::iterator i = begin(v); i != end(v); i += 2 )
        // ...
}

With (a), the most important pitfall is that the code doesn’t compile. Because v is const, you need a const_iterator. The old-school way to fix this is to write const_iterator:

vector<int>::const_iterator i = begin(v)     // ok + requires thinking

However, that requires thinking to remember, “ah, v is a reference to const, I better remember to write const_ in front of its iterator type… and take it off again if I ever change v to be a reference to non-const… and also change the “vector” part of i‘s type if v is some other container type…”

Not that thinking is a bad thing, mind you, but this is really just a tax on your time when the simplest and clearest thing to write is auto:

auto i = begin(v)                           // ok, best

Using auto is not only correct and clear and simpler, but it stays correct if we change the type of the parameter to be non-const or pass some other type of container, such as if we make traverser into a template in the future.

Guideline: Prefer to declare local variables using auto x = expr; when you don’t need to explicitly commit to a type. It is simpler, guarantees that you will use the correct type, and guarantees that the type stays correct under maintenance.

Although our focus is on the variable declaration, there’s another independent bug in the code: The += 2 increment can zoom you off the end of the container. When writing a strided loop, check your iterator increment against end on each increment (best to write it once as a checked_next(i,end) helper that does it for you), or use an indexed loop something like for( auto i = 0L; i < v.size(); i += 2 ) which is more natural to write correctly.
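Here is a minimal sketch of such a helper—the name checked_next comes from the text above, but the signature and body are assumptions for illustration, not part of the article:

template<class Iter>
Iter checked_next( Iter it, Iter last, int n = 1 ) {
    while( n-- > 0 && it != last )   // never advance past 'last'
        ++it;
    return it;
}

// the strided loop from (a), now stopping safely at the end
for( auto i = begin(v); i != end(v); i = checked_next(i, end(v), 2) )
    // ...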

(b) and (c) rely on implicit conversions

// (b)
vector<int> v1(5);     // 1
vector<int> v2 = 5;    // 2

Line 1 performs an explicit conversion and so can call vector‘s explicit constructor that takes an initial size.

Line 2 doesn’t compile because its syntax won’t call an explicit constructor. As we saw in GotW #1, it really means “convert 5 to a temporary vector<int>, then move-construct v2 from that,” so line 2 only works for types where the conversion is not explicit.

Some people view the asymmetry between 1 and 2 as a pitfall, at least conceptually, for several reasons: First, the syntaxes are not quite the same and so learning when to use each can seem like finicky detail. Second, some people like line 2’s syntax better but have to switch to line 1 to get access to explicit constructors. Finally, with this syntax, it’s easy to forget the (5) or = 5 initializer, and then we’re into case 2(a), which we’ll get to in a moment.

If we use auto, we have a single syntax that is always obviously explicit:

auto v2 = vector<int>(5);

Next, case (c) is similar to (b):

// (c)
gadget get_gadget();
// ...
widget w = get_gadget();

This works, assuming that gadget is implicitly convertible to widget, but creates a temporary object. That’s a potential performance pitfall, as the creation of the temporary object is not at all obvious from reading the call site alone in a code review. If we can use a gadget just as well as a widget in this calling code and so don’t explicitly need to commit to the widget type, we could write the following which guarantees there is no implicit conversion because auto always deduces the basic type exactly:

// better, if you don't need an explicit type
auto w = get_gadget();

Guideline: Prefer to declare local variables using auto x = expr; when you don’t need to explicitly commit to a type. It is efficient by default and guarantees that no implicit conversions or temporary objects will occur.

By the way, if you’ve been wondering whether that “=” in auto x = expr; causes a temporary object plus a move or copy, wonder no longer: No, it constructs x directly. (See GotW #1.)

Now, what if we said widget here because we know about the conversion and really do want to deal with a widget? Then writing auto is still more self-documenting:

// better, if you do need to commit to an explicit type
auto w = widget{ get_gadget() };

Guideline: Consider declaring local variables auto x = type{ expr }; when you do want to explicitly commit to a type. It is self-documenting to show that the code is explicitly requesting a conversion.

Note that this last version technically requires a move operation, but compilers are explicitly allowed to elide that and construct w directly—and compilers routinely do that, so there is no performance penalty in practice.

(d) creates an indirection, and commits to a single type

// (d)
function<void(vector<int>)> get_size
    = [](const vector<int>& x) { return x.size(); };

Case (d) has two problems, and auto can help with both of them. (Bonus points if you noticed that a form of “auto” is actually already helping in a third way.)

First, the lambda object is converted to a function<>. That can be appropriate when passing or returning the lambda to a function, but it costs an indirection because function<> has to erase the actual type and create a wrapper around its target to hold it and invoke it. In this case, we appear to be using the lambda locally, and so the correct default way to capture it is using auto, which binds to the exact (compiler-generated and otherwise-unutterable-by-you) type of the lambda and so doesn’t incur an indirection:

// partly improved
auto get_size = [](const vector<int>& x) { return x.size(); };

Guideline: Prefer to use auto name = to name a lambda function object. Use std::function</*…*/> name = only when you need to rebind it to another target or pass it to another function that needs a std::function<>.

Second, the lambda commits to a specific argument type—it only works with vector<int>, and not with vector<double> or set<string> or anything else that is also able to report a .size(). The way to fix that is to write another auto:

// best
auto get_size = [](const auto& x) { return x.size(); };

// yes, you could use this "too cute" variation for slightly less typing
//              [](auto&& x) { return x.size(); };
// but you'll also get less const-enforcement and that isn't a good deal

This still creates just a single object, but with a templated function call operator so that it can be invoked with different types of arguments, and so it will work with any type of container that supports calling .size().

Guideline: Prefer to use auto lambda parameter types. They are just as efficient as explicit parameter types, and allow you to call the same lambda with different argument types.

… and did you notice the “third auto” that was there all along? Even in the original example, we were already using automatic type deduction in a third place: we let the lambda deduce its return type. With the fully generic “best” version of the code, that return type will always be exactly whatever .size() returns for whatever kind of object we’re calling it on, which can be different for different argument types. All in all, that’s pretty nifty.

Guideline: Prefer to use implicit return type deduction for lambda functions.

2. Same question, subtler examples: In the following code, what actual or potential pitfalls exist, which would using auto variable declarations fix, and why or why not?

(a) might leave the variable uninitialized.

// (a)
widget w;

This creates an object of type widget. However, we can’t tell just by looking at this line whether it’s initialized or contains garbage values. As noted in GotW #1, if widget is a built-in type or an aggregate type without initializers, the object (or its members) won’t get initialized. Uninitialized variables should be avoided by default, and only used deliberately in cases where you really want to start with an uninitialized memory region for performance reasons—notably when you have a large object, such as an array, that is expensive to zero-initialize and is immediately going to be overwritten anyway, such as if it’s being used as an “out” parameter.

Guideline: Always initialize variables, except only when you can prove garbage values are okay, typically because you will immediately overwrite the contents.
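For illustration only, here is a sketch of the kind of rare exception the guideline allows, using a hypothetical read_into API:

char buffer[64 * 1024];                                 // deliberately uninitialized: expensive to zero
auto bytes_read = read_into( buffer, sizeof(buffer) ); // and immediately overwritten anyway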

Would auto help here? Indeed it would:

auto w = widget{};    // guaranteed to be initialized

One of the key benefits of declaring a local variable using auto is that the “=” is required—there’s no way to declare the variable without setting an initial value. Further, this is explicit and clear just from reading the above variable declaration on its own during a code review, without having to go inquire in the type’s header about the exact details of the type and poll the neighborhood for character references who will swear it’s not now, and is even under maintenance never likely to become, an aggregate.

Guideline: Prefer to declare local variables using auto. It guarantees that you cannot accidentally leave the variable uninitialized.

(b) might perform a silent narrowing conversion.

// (b)
vector<string> v;
int size = v.size();

This will compile, run, and sometimes lose information because it uses an implicit narrowing conversion. Not the safest route to a happy weekend when the bug report from the field comes in on Friday night—normally from a large and important customer, because the bug will be exercised only with larger data sizes.

Here’s why: The return type of vector<string>::size() is vector<string>::size_type, but what’s that? It depends on your implementation, because the standard leaves it implementation-defined. But one thing I guarantee you is that “it ain’t no int“—for at least two reasons, which lead to at least two ways this can lose information by silent narrowing:

  • Sign:
    size_type is required to be an unsigned integer value, so this code is asking to convert it to a signed value. That’s bad enough even if sizeof(size_type) == sizeof(int) and it throws away the high bit—and with it the upper half of the representable values—to make room for the sign bit. It’s worse than that if sizeof(size_type) > sizeof(int), which brings us to the second problem, because that’s actually likely…
  • Size:
    size_type basically needs to be the same size as a pointer, since it may have to represent any offset in a vector<char> that is larger than half the machine’s address space. In 64-bit code, 64-bit pointers mean 64-bit size_types. However, if on the same system an int is still 32 bits for compatibility (and this is common), then size_type is bigger than int, and converting to int throws away not just the high-order bit, but over half of the bits and the vast majority of the representable values.

Of course, you won’t notice on small vectors as long as .size() < 2^(CHAR_BIT*sizeof(int)-1). That doesn’t mean it’s not a bug; it just means it’s a latent bug.
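To make the size problem concrete, here is a sketch assuming a typical 64-bit platform where int is 32 bits and size_type is 64 bits (the 5-billion-element vector is purely illustrative):

vector<char> big( 5000000000ULL );   // about 5 billion elements
int size = big.size();               // on typical platforms, silently truncated to 705032704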

Does auto help? Yes indeed:

auto size = v.size();    // exact type, guaranteed no narrowing

Guideline: Prefer to declare local variables using auto. It guarantees that you get the exact type and cannot accidentally get narrowing conversions.

(c), (d), and (e) have potential narrowing and signedness issues.

// (c) x and y are of some built-in integral type
int total = x + y;

In case (c), we might also have a narrowing conversion. The simplest way to see this is that if either x or y is larger than int, which is what we’re trying to store the result into, then we’ve definitely got a silent narrowing conversion here, with the same issues as already described in (b). And even if x and y are ints today, if under maintenance the type of one later changes to something like long or size_t, the code silently becomes lossy—and possibly only on some platforms, if it changes to long and that’s the same size as int on some platforms you target but larger than int on others.

Note that, even if you know the exact types of x and y, you will get different types for x+y on different platforms, particularly if one is signed and one is unsigned. If both x and y are signed, or both are unsigned, and one’s type has more bits than the other, that’s the type of the result. If one is signed and the other is unsigned then other rules kick in, and the size and signedness of the result can vary on different platforms depending on the relative actual sizes and the signedness of x and y on that platform. (This is one of the consequences of C and C++ not standardizing the sizes of the built-in types; for example, we know a long is guaranteed to be at least as big as an int, but we don’t know how many bits each is, and the answer varies by compiler and platform.)

Does auto help here? Almost always “yes,” but in one case “yes with a little help you really want to reach for anyway.”

By default, write for correctness, clarity, and portability first: To avoid lossy narrowing conversions, auto is your portability pal and you should use it by default. Writing auto is much better than writing the result type out by hand as std::common_type<decltype(x), decltype(y)>::type.

auto total = x + y;    // exact type, guaranteed no narrowing

Guideline: Prefer to declare local variables using auto. It guarantees that you get the exact type and so is the simplest way to portably spell the implementation-specific type of arithmetic operations on built-in types, which vary by platform, and ensure that you cannot accidentally get narrowing conversions when storing the result.

However, what if, in rare cases, this code is in a tight loop where performance matters, and auto would select a wider type than you know you need to store all possible values? For example, in some cases performing arithmetic using uint64_t instead of uint32_t could be twice as slow. If you first prove that this actually matters using hard profiler data, and then further prove by other validation that you won’t (or won’t care if you do) encounter results that would lose value by narrowing, then go ahead and commit to an explicit type—but prefer to do it using the following style:

// rare cases: use auto + <cstdint> type
auto total = uint64_t{ x+y };       // total is an unsigned 64-bit value
             // ^ see note [1]

// or use auto + size-preserving signed/unsigned helper [2]
auto total = as_unsigned( x+y );    // total is unsigned and size of x+y

  • Still use auto to naturally make this more self-documenting and make the code review easy, because auto syntax makes it explicit that you’re performing a conversion.
  • Use a portable sized type name from the standard <cstdint> header, because you almost certainly care about size and this makes the size portable.[1]

    Guideline: Prefer using the <cstdint> type aliases in code that cares about the size of your numeric variables. Avoid relying on what your current platform(s) happen to do.

    Guideline: Consider declaring local variables auto x = type{ expr }; when you do want to explicitly commit to a type. It is self-documenting to show that the code is explicitly requesting a conversion, and won’t allow an accidental implicit narrowing conversion. Only when you do want explicit narrowing, use ( ) instead of { }.

Case (d) is similar:

// (d) x and y are of some built-in integral type
int diff = x - y;
if(diff < 0) { /*...*/ }

This time, we’re doing a subtraction. No matter whether x and y are signed or not, putting the answer in a signed variable like this is the right thing to do—the result could be negative, after all.

However, we have two issues. The first, again, is that int may not be big enough to avoid truncating the result, so we might lose information if x - y produces something larger than an int. Using auto can help with that.

The second is that x - y might give a strange answer, which isn’t the programmer’s fault but is something you want to remember about arithmetic in C and C++. Consider this code:

unsigned long x    = 42;
signed short  y    = 43;
auto          diff = x - y;   // one actual result: 18446744073709551615
if(diff < 0) { /*...*/ }      // um, oops – branch won't be taken

“Wait, what?” you ask. On nearly all platforms, an unsigned long is bigger than a signed short, and because of the promotion rules the type of x - y, and therefore of diff, will be… unsigned long. Which is, well, not very signed. So depending on the types of x and y, and depending on your actual platform, it may be that the branch won’t be taken, which clearly isn’t the same as the original code.

Guideline: Combine signed and unsigned arithmetic carefully.

Before you say, “then I always want signed!” remember that if you overflow then unsigned arithmetic wraps, which can be valid for your use, whereas signed arithmetic has undefined behavior, which is quite unlikely to be useful. Sometimes you really need signed, and sometimes you really need unsigned, even though often you won’t care.

From observing auto‘s effect in case (d), it might seem like auto has helped one problem… but was it at the expense of creating another?

Yes, on the one hand, auto did indeed help us: Using auto ensured we could write portable and correct code where the result wasn’t needlessly narrowed. If we didn’t care about signedness, which is often true, that’s quite sufficient.

On the other hand, using auto might not preserve signedness in a computation like x – y that’s supposed to return something with a sign, or it might not preserve unsignedness when that’s desirable. But this isn’t so much an issue with auto itself as that we have to be careful when combining signed and unsigned arithmetic, and by binding to an exact type auto is exposing this issue with some code that might potentially be already nonportable, or have corner cases the developer wasn’t aware of when he wrote it.

So what’s a good answer? Consider using auto together with the as_signed or as_unsigned conversion helper we saw before, which is used in lieu of a cast to a specific type; the helper is written out more fully in the endnotes. [2] Then we get the best of both worlds—we don’t commit to an explicit type, but we ensure the basic size and signedness in portable code that will work as intended on many different compilers and platforms.

Guideline: Prefer to use auto x = as_signed(integer_expr); or auto x = as_unsigned(integer_expr); to store the result of an integer computation that should be signed or unsigned. Using auto together with as_signed or as_unsigned makes code more portable: the variable will both be large enough and preserve the required signedness on all platforms. (Signed/unsigned conversions within integer_expr may still occur.)
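Applied to the earlier snippet, a sketch of what this looks like (the numeric result assumes a typical two’s-complement platform):

unsigned long x    = 42;
signed short  y    = 43;

auto diff = as_signed( x - y );   // diff is signed (long); value -1 on typical platforms
if(diff < 0) { /*...*/ }          // branch taken as intended

Note that, per the guideline’s parenthetical, the unsigned wraparound still happens inside x - y; as_signed only converts the result back to a signed type of the same size.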

Finally, case (e) brings floating point into the picture:

// (e)
int i = f(1,2,3) * 42.0;

Here we have our by-now-yawnworthy-typical narrowing—and an easy case because it isn’t even hiding, it’s saying int and 42.0 right there in the same breath, which is narrowing almost regardless of what type f returns.

Does auto help? Yes, in making our code self-documenting and more reviewable, as we noted before. If we follow the auto x = type{expr}; declaration style, we would be (happily) forced to write the conversion explicitly, and when we initially use { } we get an error that in fact it’s a narrowing conversion, which we acknowledge (again explicitly) by switching to ( ):

auto i = int( f(1,2,3) * 42.0 );

This code is now free of implicit conversions, including implicit narrowing conversions. If our team’s coding style says to use auto x = expr; or auto x = type{expr}; wherever possible, then in a code review just seeing the ( ) parens can immediately connote explicit narrowing; adding a comment doesn’t hurt either.

But for floating point calculations, can using auto by itself hurt? Consider this example, contributed by Andrei Alexandrescu:

float f1 = /*...*/, f2 = /*...*/;

auto   f3 = f1 + f2;   // correct, but on some compilers/platforms...
double f4 = f1 + f2;   // ... this might keep more bits of precision

As Alexandrescu notes: “Machines are free to do intermediate calculations in a larger precision than the target, and in many cases (and traditionally in C) calculations are done in double precision. So for f3 we have a sum done in double precision, which is then truncated down to float. For f4, the sum is preserved at full precision.”

Does this mean using auto creates a potential flaw here? Not really. In the language, the type of f1 + f2 is still float, and the naked auto maintains that exact type for us. However, if we do want to follow the pattern of switching to double early in a complex computation, we can and should say so:

float f1 = /*...*/, f2 = /*...*/;

auto f5 = double{f1} + f2;

Summary

We’ve seen a number of reasons to prefer to declare variables using auto, optionally with an explicit type if you do want to commit to a specific type.

If you’ve observed a pattern in this GotW’s Guidelines, you’ll already have a sense of what’s coming in GotW #94… a Special Edition on, you guessed it, auto style.

Notes

[1] Another reason to prefer the <cstdint> typedef names is that, due to a quirk in the C++ language grammar, only a single-word type name is allowed where uint64_t appears in this example. That’s fine nearly always because it’s all you need for class types and all typedef and using alias names and most built-in types, but you can’t directly name arrays or the multi-word built-in types like unsigned int or long long in that position; for the latter, use the uintNN_t-style typedef names instead. The exact ones, such as uint64_t, are “optional” in the standard, but they are in the standard and expected to be widely implemented so I used them. The “least” and “fast” ones are required, so if you don’t have uint64_t you can use uint_least64_t or uint_fast64_t.
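For instance, a quick sketch of that grammar quirk (assuming <cstdint> is included):

auto a = unsigned int{ 42 };   // error: multi-word built-in type name not allowed here
auto b = unsigned{ 42 };       // ok: a single word
auto c = uint32_t{ 42 };       // ok: the <cstdint> alias is a single word
using uint = unsigned int;     // or introduce your own single-word alias
auto d = uint{ 42 };           // ok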

[2] The helpers preserve the size of the type while changing only the signedness. Thanks to Andrei Alexandrescu for this basic idea; any errors are mine, not his. The C++98 way is to provide a set of overloads for each type, but a modern version might look something like the following which uses the C++11 std::make_signed/make_unsigned facilities.

// C++11 version
//
template<class T>
typename make_signed<T>::type as_signed(T t)
    { return typename make_signed<T>::type(t); }

template<class T>
typename make_unsigned<T>::type as_unsigned(T t)
    { return typename make_unsigned<T>::type(t); }

Note that with C++14 this gets even sweeter, using auto return type deduction to eliminate typename and repetition, and the _t alias to replace ::type:

// C++14 version, option 1
//
template<class T> auto as_signed  (T t){ return make_signed_t  <T>(t); }
template<class T> auto as_unsigned(T t){ return make_unsigned_t<T>(t); }

or you can equivalently write these function templates as named lambdas:

// C++14 version, option 2
//
auto as_signed   =[](auto x){ return make_signed_t  <decltype(x)>(x); };
auto as_unsigned =[](auto x){ return make_unsigned_t<decltype(x)>(x); };

Sweet, isn’t it? Once you have a compiler that supports these features, pick whichever suits your fancy.

Acknowledgments

Thanks in particular to Scott Meyers and Andrei Alexandrescu for their time and insights in reviewing and discussing drafts of this material. Thanks also to the following for their feedback to improve this article: mttpd, Jim Park, Yuri Khan, Arne, rhalbersma, Tom, Martin Ba, John, Frederic Dumont, Sebastian.

GotW #93: Auto Variables, Part 2

Why prefer declaring variables using auto? Let us count some of the reasons why…

 

Problem

JG Question

1. In the following code, what actual or potential pitfalls exist in each labeled piece of code? Which of these pitfalls would using auto variable declarations fix, and why or why not?

// (a)
void traverser( const vector<int>& v ) {
    for( vector<int>::iterator i = begin(v); i != end(v); i += 2 )
        // ...
}

// (b)
vector<int> v1(5);
vector<int> v2 = 5;

// (c)
gadget get_gadget();
// ...
widget w = get_gadget();

// (d)
function<void(vector<int>)> get_size
    = [](const vector<int>& x) { return x.size(); };

Guru Question

2. Same question, subtler examples: In the following code, what actual or potential pitfalls exist in each labeled piece of code? Which of these pitfalls would using auto variable declarations fix, and why or why not?

// (a)
widget w;

// (b)
vector<string> v;
int size = v.size();

// (c) x and y are of some built-in integral type
int total = x + y;

// (d) x and y are of some built-in integral type
int diff = x - y;
if(diff < 0) { /*...*/ }

// (e)
int i = f(1,2,3) * 42.0;

GotW #92 Solution: Auto Variables, Part 1

What does auto do on variable declarations, exactly? And how should we think about auto? In this GotW, we’ll start taking a look at C++’s oldest new feature.

 

Problem

JG Questions

1. What is the oldest C++11 feature? Explain.

2. What does auto mean when declaring a local variable?

Guru Questions

3. In the following code, what is the type of variables a through k, and why? Explain.

int         val = 0;
auto a = val;
auto& b = val;
const auto c = val;
const auto& d = val;

int& ir = val;
auto e = ir;

int* ip = &val;
auto f = ip;

const int ci = val;
auto g = ci;

const int& cir = val;
auto h = cir;

const int* cip = &val;
auto i = cip;

int* const ipc = &val;
auto j = ipc;

const int* const cipc = &val;
auto k = cipc;

4. In the following code, what type does auto deduce for variables a and b, and why? Explain.

int val = 0;

auto a { val };
auto b = { val };

 

Solution

1. What is the oldest C++11 feature? Explain.

auto x = something; to declare a new local variable whose type is deduced from something, and isn’t just always int.

Bjarne Stroustrup likes to point out that auto for deducing the type of local variables is the oldest feature added in the 2011 release of the C++ standard. He implemented it in C++ 28 years earlier, in 1983—which incidentally was the same year the language’s name was changed to C++ from C with Classes (the new name was unveiled publicly on January 1, 1984), and the same year Stroustrup added other fundamental features including const (later adopted by C), virtual functions, & references, and BCPL-style // comments.

Alas, Stroustrup was forced to remove auto because of compatibility concerns with C’s then-existing implicit int rule, which has since been abandoned in C. We’re glad auto is now back and here to stay.

2. What does auto mean when declaring a local variable?

It means to deduce the type from the expression used to initialize the new variable. In particular, type deduction for auto local variables is exactly the same as type deduction for the parameters of function templates—by specification, the rule for auto variables says “do what function templates are required to do”—plus an auto variable can additionally deduce an initializer_list. For example:

template<class T> void f( T ) { }

int val = 0;

f( val ); // deduces T == int, calls f<int>( val )
auto x = val; // deduces T == int, x is of type int

When you’re new to auto, the key thing to remember is that you really are declaring your own new local variable. That is, “what’s on the left” is my new variable, and “what’s on the right” is just its initial value:

auto my_new_variable = its_initial_value;

You want your new variable to be just like some existing variable or expression over there, and be initialized from it, but that only means that you want the same basic type, not necessarily that other variable’s own personal secondary attributes such as top-level const- or volatile-ness and &/&& reference-ness, which are per-variable. For example, just because he’s const doesn’t mean you’re const, and vice versa.

It’s kind of like being identical twins: Andy may be genetically just like his brother Bobby and is part of the same family, but he’s not the same person; he’s a distinct person and can make his own choice of clothes and/or jewelry, go to be seen on the scene in different parts of town, and so forth. So your new variable will be just like that other one and be part of the same type family, but it’s not the same variable; it’s a distinct variable with its own choice of whether it wants to be dressed with const, volatile, and/or a & or && reference, may be visible to different threads, and so forth.

Remembering this will let us easily answer the rest of our questions.

3. In the following code, what is the type of variables a through k, and why? Explain.

Quick reminder: auto means “take exactly the type on the right-hand side, but strip off top-level const/volatile and &/&&.” Armed with that, these are mostly pretty easy.

For simplicity, these examples use const and &. The rules for adding or removing const and volatile are the same, and the rules for adding or removing & and && are the same.

int         val = 0;
auto a = val;
auto& b = val;
const auto c = val;
const auto& d = val;

For a through d, the type is what you get from replacing auto with int: int, int&, const int, and const int&, respectively. The same ability to add const applies to volatile, and the same ability to add & applies to &&. (Note that && will be what Scott Meyers calls a universal reference, just as with templates, and does in some cases bring across the const-ness if it’s binding to something const.)

Now that we’ve exercised adding top-level const (or volatile) and & (or &&) on the left, let’s consider how they’re removed on the right. Note that the left hand side of a through d can be used in any combination with the right hand side of e through k.

int&        ir  = val;
auto e = ir;

The type of e is int. Because ir is a reference to val, which makes ir just another name for val, it’s exactly the same as if we had written auto e = val; here.

Remember, just because ir is a reference (another name for the existing variable val) doesn’t have any bearing on whether we want e to be a reference. If we wanted e to be a reference, we would have said auto& as we did in case b above, and it would have been a reference irrespective of whether ir happened to be a reference or not.

int*        ip  = &val; 
auto f = ip;

The type of f is int*.

const int   ci  = val;
auto g = ci;

The type of g is int.

Remember, just because ci is const (read-only) doesn’t have any bearing on whether we want g to be const. It’s a separate variable. If we wanted g to be const, we would have said const auto as we did in case c above, and it would have been const irrespective of whether ci happened to be const or not.

const int&  cir = val;
auto h = cir;

The type of h is int.

Again, remember we just drop top-level const and & to get the basic type. If we wanted h to be const and/or &, we could just add it as shown with b, c, and d above.

const int*  cip = &val;
auto i = cip;

The type of i is const int*.

Note that this isn’t a top-level const, so we don’t drop it. We pronounce cip‘s declaration right to left: The type of cip is “pointer to const int,” not “const pointer to int.” What’s const is not cip, but rather *cip, the int it’s pointing to.

int* const  ipc = &val;
auto j = ipc;

The type of j is int*. This const is a top-level const, and ipc‘s being const is immaterial to whether we want j to be const.

const int* const cipc = &val;
auto k = cipc;

The type of k is const int*.

4. In the following code, what type does auto deduce for variables a and b, and why? Explain.

As we noted in #2, the only place where an auto variable deduces anything different from a template parameter is that auto deduces an initializer_list. This brings us to the final cases:

int val = 0;

auto a { val };
auto b = { val };

The type of both a and b is std::initializer_list<int>.

That’s the only difference between auto variable deduction and template parameter deduction—by specification, because auto deduction is defined in the standard as “follow those rules over there in the templates clause, plus deduce initializer_list.”

If you’re familiar with templates and curious how auto deduction and template deduction map to each other, the table below lists the main cases and shows the equivalent syntax between the two features. For the left column, I’ll put the variable and the initialization on separate lines to emphasize how they correspond to the separated template parameter and call site on the right.

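For the main cases, the correspondence looks like this (a sketch, assuming val is an int lvalue as in the earlier examples):

// auto variable:               // corresponding template deduction:
auto x =                        // template<class T> void f( T x );
    val;                        // f( val );

auto& x =                       // template<class T> void f( T& x );
    val;                        // f( val );

const auto& x =                 // template<class T> void f( const T& x );
    val;                        // f( val );

auto&& x =                      // template<class T> void f( T&& x );
    val;                        // f( val );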
Not only are the cases equivalent in expressive power, but you might even find that some of the auto versions feel slicker than their template counterparts.

Summary

Having auto variables really brings a feature we already had (template deduction) to an even wider audience. But so far we’ve only seen what auto does. The even more interesting question is how to use it. Which brings us to our next GotW…

Acknowledgments

Thanks in particular to the following for their feedback to improve this article: davidphilliposter, Phil Barila, Ralph Tandetzky, Marcel Wild.

GotW #92: Auto Variables, Part 1

What does auto do on variable declarations, exactly? And how should we think about auto? In this GotW, we’ll start taking a look at C++’s oldest new feature.

 

Problem

JG Questions

1. What is the oldest C++11 feature? Explain.

2. What does auto mean when declaring a local variable?

Guru Questions

3. In the following code, what is the type of variables a through k, and why? Explain.

int         val = 0;
auto a = val;
auto& b = val;
const auto c = val;
const auto& d = val;

int& ir = val;
auto e = ir;

int* ip = &val;
auto f = ip;

const int ci = val;
auto g = ci;

const int& cir = val;
auto h = cir;

const int* cip = &val;
auto i = cip;

int* const ipc = &val;
auto j = ipc;

const int* const cipc = &val;
auto k = cipc;

4. In the following code, what type does auto deduce for variables a and b, and why? Explain.

int val = 0;

auto a { val };
auto b = { val };

GotW #91 Solution: Smart Pointer Parameters

NOTE: Last year, I posted three new GotWs numbered #103-105. I decided leaving a gap in the numbers wasn’t best after all, so I am renumbering them to #89-91 to continue the sequence. Here is the updated version of what was GotW #105.

How should you prefer to pass smart pointers, and why?

Problem

JG Question

1. What are the performance implications of the following function declaration? Explain.

void f( shared_ptr<widget> );

Guru Questions

2. What are the correctness implications of the function declaration in #1? Explain with clear examples.

3. A colleague is writing a function f that takes an existing object of type widget as a required input-only parameter, and trying to decide among the following basic ways to take the parameter (omitting const):

void f( widget* );              (a)
void f( widget& );              (b)
void f( unique_ptr<widget> );   (c)
void f( unique_ptr<widget>& );  (d)
void f( shared_ptr<widget> );   (e)
void f( shared_ptr<widget>& );  (f)

Under what circumstances is each appropriate? Explain your answer, including where const should or should not be added anywhere in the parameter type.

(There are other ways to pass the parameter, but we will consider only the ones shown above.)

Solution

1. What are the performance implications of the following function declaration? Explain.

void f( shared_ptr<widget> );

A shared_ptr stores strong and weak reference counts (see GotW #89). When you pass by value, you have to copy the argument (usually) on entry to the function, and then destroy it (always) on function exit. Let’s dig into what this means.

When you enter the function, the shared_ptr is copy-constructed, and this requires incrementing the strong reference count. (Yes, if the caller passes a temporary shared_ptr, you move-construct and so don’t have to update the count. But: (a) it’s quite rare to get a temporary shared_ptr in normal code, other than taking one function’s return value and immediately passing that to a second function; and (b) besides as we’ll see most of the expense is on the destruction of the parameter anyway.)

When exiting the function, the shared_ptr is destroyed, and this requires decrementing its internal reference count.

What’s so bad about a “shared reference count increment and decrement?” Two things, one related to the “shared reference count” and one related to the “increment and decrement.” It’s good to be aware of how this can incur performance costs for two reasons: one major and common, and one less likely in well-designed code and so probably more minor.

First, the major reason is the performance cost of the “increment and decrement”: Because the reference count is an atomic shared variable (or equivalent), incrementing and decrementing it are internally-synchronized read-modify-write shared memory operations.

Second, the less-likely minor reason is the potentially scalability-bustingly contentious nature of the “shared reference count”: Both increment and decrement update the reference count, which means that at the processor and memory level only one core at a time can be executing such an instruction on the same reference count because it needs exclusive access to the count’s cache line. The net result is that this causes some contention on the count’s cache line, which can affect scalability if it’s a popular cache line being touched by multiple threads in tight loops—such as if two threads are calling functions like this one in tight loops and accessing shared_ptrs that own the same object. “So don’t do that, thou heretic caller!” we might righteously say. Well and good, but the caller doesn’t always know when two shared_ptrs used on two different threads refer to the same object, so let’s not be quick to pile the wood around his stake just yet.

As we will see, an essential best practice for any reference-counted smart pointer type is to avoid copying it unless you really mean to add a new reference. This cannot be stressed enough. This directly addresses both of these costs and pushes their performance impact down into the noise for most applications, and especially eliminates the second cost because it is an antipattern to add and remove references in tight loops.

At this point, we will be tempted to solve the problem by passing the shared_ptr by reference. But is that really the right thing to do? It depends.

2. What are the correctness implications of the function declaration in #1?

The only correctness implication is that the function advertises in a clear type-enforced way that it will (or could) retain a copy of the shared_ptr.

That this is the only correctness implication might surprise some people, because there would seem to be one other major correctness benefit to taking a copy of the argument, namely lifetime: Assuming the pointer is not already null, taking a copy of the shared_ptr guarantees that the function itself holds a strong refcount on the owned object, and that therefore the object will remain alive for the duration of the function body, or until the function itself chooses to modify its parameter.

However, we already get this for free—thanks to structured lifetimes, the called function’s lifetime is a strict subset of the calling function’s call expression. Even if we passed the shared_ptr by reference, our function would as good as hold a strong refcount because the caller already has one—he passed us the shared_ptr in the first place, and won’t release it until we return. (Note this assumes the pointer is not aliased. You have to be careful if the smart pointer parameter could be aliased, but in this respect it’s no different than any other aliased object.)

Guideline: Don’t pass a smart pointer as a function parameter unless you want to use or manipulate the smart pointer itself, such as to share or transfer ownership.

Guideline: Prefer passing objects by value, *, or &, not by smart pointer.
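For example, a sketch of the difference (use_widget, sp, and the loops are illustrative, not from the article):

void use_widget( shared_ptr<widget> w );   // copies the shared_ptr: atomic ++/-- on every call

for( int i = 0; i < 1000000; ++i )
    use_widget( sp );                      // reference count churn on a hot path

void use_widget( const widget& w );        // observes the widget: no reference count traffic

for( int i = 0; i < 1000000; ++i )
    use_widget( *sp );                     // safe: sp outlives each call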

If you’re saying, “hey, aren’t raw pointers evil?”, that’s excellent, because we’ll address that next.

3. A colleague is writing a function f that takes an existing object of type widget as a required input-only parameter, and trying to decide among the following basic ways to take the parameter (omitting const). Under what circumstances is each appropriate? Explain your answer, including where const should or should not be added anywhere in the parameter type.

(a) and (b): Prefer passing parameters by * or &.

void f( widget* );              (a)
void f( widget& );              (b)

These are the preferred way to pass normal object parameters, because they stay agnostic of whatever lifetime policy the caller happens to be using.

Non-owning raw * pointers and & references are okay to observe an object whose lifetime we know exceeds that of the pointer or reference, which is usually true for function parameters. Thanks to structured lifetimes, by default arguments passed to f in the caller outlive f‘s function call lifetime, which is extremely useful (not to mention efficient) and makes non-owning * and & appropriate for parameters.

Pass by * or & to accept a widget independently of how the caller is managing its lifetime. Most of the time, we don’t want to commit to a lifetime policy in the parameter type, such as requiring the object be held by a specific smart pointer, because this is usually needlessly restrictive. As usual, use a * if you need to express null (no widget), otherwise prefer to use a &; and if the object is input-only, write const widget* or const widget&.
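In signature form, a sketch of these defaults (the function names are illustrative):

void observe( const widget& w );   // required widget, input-only
void inspect( const widget* w );   // optional widget (may be null), input-only
void modify ( widget& w );         // required widget, may be modified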

(c) Passing unique_ptr by value means “sink.”

void f( unique_ptr<widget> );   (c)

This is the preferred way to express a widget-consuming function, also known as a “sink.”

Passing a unique_ptr by value is only possible by moving the object and its unique ownership from the caller to the callee. Any function like (c) takes ownership of the object away from the caller, and either destroys it or moves it onward to somewhere else.

Note that, unlike some of the other options below, this use of a by-value unique_ptr parameter actually doesn’t limit the kind of object that can be passed to those managed by a unique_ptr. Why not? Because any pointer can be explicitly converted to a unique_ptr. If we didn’t use a unique_ptr here we would still have to express “sink” semantics, just in a more brittle way such as by accepting a raw owning pointer (anathema!) and documenting the semantics in comments. Using (c) is vastly superior because it documents the semantics in code, and requires the caller to explicitly move ownership.

Consider the major alternative:

// Smelly 20th-century alternative
void bad_sink( widget* p );  // will destroy p; PLEASE READ THIS COMMENT

// Sweet self-documenting self-enforcing modern version (c)
void good_sink( unique_ptr<widget> p );

And how much better (c) is:

// Older calling code that calls the new good_sink is safer, because
// it's clearer in the calling code that ownership transfer is going on
// (this older code has an owning * which we shouldn't do in new code)
//
widget* pw = ... ; 

bad_sink ( pw );             // compiles: remember not to use pw again!

good_sink( pw );             // error: good
good_sink( unique_ptr<widget>{pw} );  // need explicit conversion: good

// Modern calling code that calls good_sink is safer, and cleaner too
//
unique_ptr<widget> pw = ... ;

bad_sink ( pw.get() );       // compiles: icky! doesn't reset pw
bad_sink ( pw.release() );   // compiles: must remember to use this way

good_sink( pw );             // error: good!
good_sink( move(pw) );       // compiles: crystal clear what's going on

Guideline: Express a “sink” function using a by-value unique_ptr parameter.

Because the callee will now own the object, there is usually no reason to put const on the parameter; the const would be largely irrelevant.

(d) Passing unique_ptr by reference is for in/out unique_ptr parameters.

void f( unique_ptr<widget>& );  (d)

This should only be used to accept an in/out unique_ptr, when the function is supposed to actually accept an existing unique_ptr and potentially modify it to refer to a different object. It is a bad way to just accept a widget, because it is restricted to a particular lifetime strategy in the caller.

Guideline: Use a non-const unique_ptr& parameter only to modify the unique_ptr.
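A sketch of the legitimate use—a function that re-seats the caller’s unique_ptr (replace_widget is a hypothetical name):

void replace_widget( unique_ptr<widget>& w ) {
    w = make_unique<widget>( /*...*/ );   // modifies the unique_ptr itself, destroying the old widget
}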

Passing a const unique_ptr<widget>& is strange because it can accept only either null or a widget whose lifetime happens to be managed in the calling code via a unique_ptr, and the callee generally shouldn’t care about the caller’s lifetime management choice. Passing widget* covers a strict superset of these cases and can accept “null or a widget” regardless of the lifetime policy the caller happens to be using.

Guideline: Don’t use a const unique_ptr& as a parameter; use widget* instead.

I mention widget* because that doesn’t change the (nullable) semantics; if you’re tempted to pass a const unique_ptr<widget>&, what you really meant was widget*, which expresses the same information. If you additionally know it can’t be null, though, of course use widget&.

(e) Passing shared_ptr by value implies taking shared ownership.

void f( shared_ptr<widget> );   (e)

As we saw in #2, this is recommended only when the function wants to retain a copy of the shared_ptr and share ownership. In that case, a copy is needed anyway so the copying cost is fine. If the local scope is not the final destination, just std::move the shared_ptr onward to wherever it needs to go.
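For example, a sketch of a function that stores its argument (widget_cache and widgets_ are hypothetical):

void widget_cache::add( shared_ptr<widget> w ) {
    widgets_.push_back( std::move(w) );   // move the by-value copy into its final home
}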

Guideline: Express that a function will store and share ownership of a heap object using a by-value shared_ptr parameter.

Otherwise, prefer passing a * or & (possibly to const) instead, since that doesn’t restrict the function to only objects that happen to be owned by shared_ptrs.

(f) Passing shared_ptr& is useful for in/out shared_ptr manipulation.

void f( shared_ptr<widget>& );  (f)

Similarly to (d), this should mainly be used to accept an in/out shared_ptr, when the function is supposed to actually modify the shared_ptr itself. It’s usually a bad way to accept a widget, because it is restricted to a particular lifetime strategy in the caller.

Note that per (e) we pass a shared_ptr by value if the function will share ownership. In the special case where the function might share ownership, but doesn’t necessarily take a copy of its parameter on a given call, then pass a const shared_ptr& to avoid the copy on the calls that don’t need it, and take a copy of the parameter if and when needed.
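A sketch of that special case (should_cache and cache_ are hypothetical):

void maybe_cache( const shared_ptr<widget>& w ) {
    if( should_cache(*w) )
        cache_.push_back(w);   // copy, and thus a new shared reference, only when actually needed
}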

Guideline: Use a non-const shared_ptr& parameter only to modify the shared_ptr. Use a const shared_ptr& as a parameter only if you’re not sure whether or not you’ll take a copy and share ownership; otherwise use widget* instead (or if not nullable, a widget&).

Acknowledgments

Thanks in particular to the following for their feedback to improve this article: mttpd, zahirtezcan, Jon, GregM, Andrei Alexandrescu.