
Archive for the ‘GotW’ Category

Toward correct-by-default, efficient-by-default, and pitfall-free-by-default variable declarations, using “AAA style”… where “triple-A” is both a mnemonic and an evaluation of its value.

 

Problem

JG Questions

1. What does this code do? What would be a good name for some_function?

template<class Container, class Value>
void some_function( Container& c, const Value& v ) {
    if( find(begin(c), end(c), v) == end(c) )
        c.emplace_back(v);
    assert( !c.empty() );
}

2. What does “write code against interfaces, not implementations” mean, and why is it generally beneficial?

Guru Questions

3. What are some popular concerns about using auto to declare variables? Are they valid? Discuss.

4. When declaring a new local variable x, what advantages are there to declaring it using auto and one of the two following syntaxes:

(a) auto x = init; when you don’t need to commit to a specific type? (Note: The expression init might include calling a helper that performs partial type adjustment, such as as_signed, while still not committing to a specific type.)

(b) auto x = type{ init }; when you do want to commit to a specific type by naming a type?

List as many as you can. (Hint: Look back to GotW #93.)

5. Explain how using the style suggested in #4 is consistent with, or actively leverages, the following other C++ features:

(a) Heap allocation syntax.

(b) Literal suffixes, including user-defined literal operators.

(c) Named lambda syntax.

(d) Function declarations.

(e) Template alias declarations.

6. Are there any cases where it is not possible to use the style in #4 to declare all local variables?

Read Full Post »

Why prefer declaring variables using auto? Let us count some of the reasons why…

Problem

JG Question

1. In the following code, what actual or potential pitfalls exist in each labeled piece of code? Which of these pitfalls would using auto variable declarations fix, and why or why not?

// (a)
void traverser( const vector<int>& v ) {
    for( vector<int>::iterator i = begin(v); i != end(v); i += 2 )
        // ...
}

// (b)
vector<int> v1(5);
vector<int> v2 = 5;

// (c)
gadget get_gadget();
// ...
widget w = get_gadget();

// (d)
function<void(vector<int>)> get_size
    = [](const vector<int>& x) { return x.size(); };

Guru Question

2. Same question, subtler examples: In the following code, what actual or potential pitfalls exist in each labeled piece of code? Which of these pitfalls would using auto variable declarations fix, and why or why not?

// (a)
widget w;

// (b)
vector<string> v;
int size = v.size();

// (c) x and y are of some built-in integral type
int total = x + y;

// (d) x and y are of some built-in integral type
int diff = x - y;
if(diff < 0) { /*...*/ }

// (e)
int i = f(1,2,3) * 42.0;

Solution

As you worked through these cases, perhaps you noticed a pattern: The cases are mostly very different, but what they have in common is that they illustrate reason after reason motivating why (and how) to use auto to declare variables. Let’s dig in and see.

1. In the following code, what actual or potential pitfalls exist, which would using auto variable declarations fix, and why or why not?

(a) will not compile

// (a)
void traverser( const vector<int>& v ) {
    for( vector<int>::iterator i = begin(v); i != end(v); i += 2 )
        // ...
}

With (a), the most important pitfall is that the code doesn’t compile. Because v is const, you need a const_iterator. The old-school way to fix this is to write const_iterator:

vector<int>::const_iterator i = begin(v)     // ok + requires thinking

However, that requires thinking to remember: “ah, v is a reference to const, I’d better remember to write const_ in front of its iterator type… and take it off again if I ever change v to be a reference to non-const… and also change the vector part of i’s type if v is some other container type…”

Not that thinking is a bad thing, mind you, but this is really just a tax on your time when the simplest and clearest thing to write is auto:

auto i = begin(v)                           // ok, best

Using auto is not only correct and clear and simpler, but it stays correct if we change the type of the parameter to be non-const or pass some other type of container, such as if we make traverser into a template in the future.

Guideline: Prefer to declare local variables using auto x = expr; when you don’t need to explicitly commit to a type. It is simpler, guarantees that you will use the correct type, and guarantees that the type stays correct under maintenance.

Although our focus is on the variable declaration, there’s another, independent bug in the code: the += 2 increment can zoom you right off the end of the container. When writing a strided loop, check your iterator increment against end on each step (best written once as a checked_next(i,end) helper that does it for you), or use an indexed loop, something like for( auto i = 0L; i < v.size(); i += 2 ), which is more natural to write correctly.
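The checked_next helper mentioned above might be sketched like this (checked_next and count_strided are illustrative names, not standard functions):

```cpp
#include <iterator>
#include <vector>

// Hypothetical helper named in the text: advance an iterator by n,
// stopping at 'last' instead of running past it.
template<class It>
It checked_next( It first, It last,
                 typename std::iterator_traits<It>::difference_type n = 1 ) {
    while( n-- > 0 && first != last )
        ++first;
    return first;
}

// A stride-2 traversal that cannot overshoot, even for odd-sized ranges;
// returns how many elements were visited.
int count_strided( const std::vector<int>& v ) {
    int visited = 0;
    for( auto i = begin(v); i != end(v); i = checked_next(i, end(v), 2) )
        ++visited;
    return visited;
}
```

Note that checked_next stops exactly at end, so the loop’s i != end(v) test remains valid no matter the stride.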

(b) and (c) rely on implicit conversions

// (b)
vector<int> v1(5);     // 1
vector<int> v2 = 5;    // 2

Line 1 performs an explicit conversion and so can call vector‘s explicit constructor that takes an initial size.

Line 2 doesn’t compile because its syntax won’t call an explicit constructor. As we saw in GotW #1, it really means “convert 5 to a temporary vector<int>, then move-construct v2 from that,” so line 2 only works for types where the conversion is not explicit.

Some people view the asymmetry between 1 and 2 as a pitfall, at least conceptually, for several reasons: First, the syntaxes are not quite the same, and so learning when to use each can seem like finicky detail. Second, some people like line 2’s syntax better but have to switch to line 1 to get access to explicit constructors. Finally, with this style, it’s easy to forget the (5) or = 5 initializer, and then we’re into case 2(a), which we’ll get to in a moment.

If we use auto, we have a single syntax that is always obviously explicit:

auto v2 = vector<int>(5);
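As a quick sanity check, the auto form reaches the same explicit size constructor (a sketch; make_five is a hypothetical name):

```cpp
#include <vector>

// The single, always-explicit auto syntax calls the same explicit
// size constructor as "vector<int> v1(5);" does:
std::vector<int> make_five() {
    auto v = std::vector<int>(5);   // five value-initialized ints
    return v;
}
```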

Next, case (c) is similar to (b):

// (c)
gadget get_gadget();
// ...
widget w = get_gadget();

This works, assuming that gadget is implicitly convertible to widget, but it creates a temporary object. That’s a potential performance pitfall: the creation of the temporary is not at all obvious from reading the call site alone in a code review. If we can use a gadget just as well as a widget in this calling code, and so don’t explicitly need to commit to the widget type, we can write the following, which guarantees there is no implicit conversion because auto always deduces the basic type exactly:

// better, if you don't need an explicit type
auto w = get_gadget();

Guideline: Prefer to declare local variables using auto x = expr; when you don’t need to explicitly commit to a type. It is efficient by default and guarantees that no implicit conversions or temporary objects will occur.

By the way, if you’ve been wondering whether that “=” in auto x = expr; causes a temporary object plus a move or copy, wonder no longer: No, it constructs x directly. (See GotW #1.)

Now, what if we said widget here because we know about the conversion and really do want to deal with a widget? Then writing auto is still more self-documenting:

// better, if you do need to commit to an explicit type
auto w = widget{ get_gadget() };

Guideline: Consider declaring local variables auto x = type{ expr }; when you do want to explicitly commit to a type. It is self-documenting to show that the code is explicitly requesting a conversion.

Note that this last version technically requires a move operation, but compilers are explicitly allowed to elide that and construct w directly—and compilers routinely do that, so there is no performance penalty in practice.

(d) creates an indirection, and commits to a single type

// (d)
function<void(vector<int>)> get_size
    = [](const vector<int>& x) { return x.size(); };

Case (d) has two problems, and auto can help with both of them. (Bonus points if you noticed that a form of “auto” is actually already helping in a third way.)

First, the lambda object is converted to a function<>. That can be appropriate when passing or returning the lambda to a function, but it costs an indirection because function<> has to erase the actual type and create a wrapper around its target to hold it and invoke it. In this case, we appear to be using the lambda locally, and so the correct default way to capture it is using auto, which binds to the exact (compiler-generated and otherwise-unutterable-by-you) type of the lambda and so doesn’t incur an indirection:

// partly improved
auto get_size = [](const vector<int>& x) { return x.size(); };

Guideline: Prefer to use auto name = to name a lambda function object. Use std::function</*…*/> name = only when you need to rebind it to another target or pass it to another function that needs a std::function<>.

Second, the lambda commits to a specific argument type—it only works with vector<int>, and not with vector<double> or set<string> or anything else that is also able to report a .size(). The way to fix that is to write another auto:

// best
auto get_size = [](const auto& x) { return x.size(); };

// yes, you could use this "too cute" variation for slightly less typing
//              [](auto&& x) { return x.size(); };
// but you'll also get less const-enforcement and that isn't a good deal

This still creates just a single object, but with a templated function call operator, so that it can be invoked with different types of arguments and will work with any type of container that supports calling .size().

Guideline: Prefer to use auto lambda parameter types. They are just as efficient as explicit parameter types, and allow you to call the same lambda with different argument types.

… and did you notice the “third auto” that was there all along? Even in the original example, we’ve been implicitly using automatic type deduction in a third place, by allowing the lambda to deduce its return type. With the fully generic “best” version of the code, that return type will always be exactly whatever .size() returns for the kind of object we’re calling it on, which can differ for different argument types. All in all, that’s pretty nifty.

Guideline: Prefer to use implicit return type deduction for lambda functions.
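All three autos can be exercised together: the same lambda object works for any container or string that has a .size(), and the deduced return type tracks each argument’s own size type (a sketch):

```cpp
#include <set>
#include <string>
#include <vector>

// The "best" version from above: one object with a templated call
// operator; the parameter type and return type are both deduced.
auto get_size = [](const auto& x) { return x.size(); };
```

Calling get_size with a vector<int>, a string, or a set<int> all works, with no function<> indirection in sight.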

2. Same question, subtler examples: In the following code, what actual or potential pitfalls exist, which would using auto variable declarations fix, and why or why not?

(a) might leave the variable uninitialized.

// (a)
widget w;

This creates an object of type widget. However, we can’t tell just looking at this line whether it’s initialized or contains garbage values. As noted in GotW #1, if widget is a built-in type or aggregate type, its members won’t get initialized. Uninitialized variables should be avoided by default, and only used deliberately in cases where you really want to start with an uninitialized memory region for performance reasons—notably when you have a large object, such as an array, that is expensive to zero-initialize and is immediately going to be overwritten anyway, such as if it’s being used as an “out” parameter.

Guideline: Always initialize variables, except only when you can prove garbage values are okay, typically because you will immediately overwrite the contents.

Would auto help here? Indeed it would:

auto w = widget{};    // guaranteed to be initialized

One of the key benefits of declaring a local variable using auto is that the “=” is required—there’s no way to declare the variable without setting an initial value. Further, this is explicit and clear just from reading the above variable declaration on its own during a code review, without having to go inquire in the type’s header about the exact details of the type and poll the neighborhood for character references who will swear it’s not now, and is even under maintenance never likely to become, an aggregate.

Guideline: Prefer to declare local variables using auto. It guarantees that you cannot accidentally leave the variable uninitialized.
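For example, with a hypothetical aggregate widget (a sketch; make_widget is an illustrative name):

```cpp
// Hypothetical aggregate for illustration:
struct widget { int id; double price; };

// "widget w;" would leave id and price indeterminate; the auto style
// value-initializes them instead:
widget make_widget() {
    auto w = widget{};    // guaranteed: id == 0, price == 0.0
    return w;
}
```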

(b) might perform a silent narrowing conversion.

// (b)
vector<string> v;
int size = v.size();

This will compile, run, and sometimes lose information because it uses an implicit narrowing conversion. Not the safest route to a happy weekend when the bug report from the field comes in on Friday night—normally from a large and important customer, because the bug will be exercised only with larger data sizes.

Here’s why: The return type of vector<string>::size() is vector<string>::size_type, but what’s that? It depends on your implementation, because the standard leaves it implementation-defined. But one thing I guarantee you is that “it ain’t no int”—for at least two reasons, which lead to at least two ways this can lose information by silent narrowing:

  • Sign:
    size_type is required to be an unsigned integer type, so this code is asking to convert it to a signed value. That’s bad enough even if sizeof(size_type) == sizeof(int): it throws away the high bit—and with it the upper half of the representable values—to make room for the sign bit. It’s worse than that if sizeof(size_type) > sizeof(int), which brings us to the second problem, because that’s actually likely…
  • Size:
    size_type basically needs to be the same size as a pointer, since it may have to represent any offset in a vector<char> that is larger than half the machine’s address space. In 64-bit code, 64-bit pointers mean 64-bit size_types. However, if on the same system an int is still 32 bits for compatibility (and this is common), then size_type is bigger than int, and converting to int throws away not just the high-order bit, but over half of the bits and the vast majority of the representable values.

Of course, you won’t notice on small vectors, as long as .size() < 2^(CHAR_BIT*sizeof(int)-1). That doesn’t mean it’s not a bug; it just means it’s a latent bug.

Does auto help? Yes indeed:

auto size = v.size();    // exact type, guaranteed no narrowing

Guideline: Prefer to declare local variables using auto. It guarantees that you get the exact type and cannot accidentally get narrowing conversions.
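A quick way to convince yourself: the deduced type can be checked at compile time with static_assert (a sketch; exact_size is a hypothetical name):

```cpp
#include <string>
#include <type_traits>
#include <vector>

// auto picks up the exact, implementation-defined size_type,
// so there is nowhere for a narrowing conversion to hide:
std::vector<std::string>::size_type
exact_size( const std::vector<std::string>& v ) {
    auto size = v.size();
    static_assert( std::is_same<decltype(size),
                                std::vector<std::string>::size_type>::value,
                   "auto deduces the exact size_type" );
    return size;
}
```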

(c), (d), and (e) have potential narrowing and signedness issues.

// (c) x and y are of some built-in integral type
int total = x + y;

In case (c), we might also have a narrowing conversion. The simplest way to see this is that if either x or y is larger than int, which is what we’re trying to store the result into, then we’ve definitely got a silent narrowing conversion here, with the same issues as already described in (b). And even if x and y are ints today, if under maintenance the type of one later changes to something like long or size_t, the code silently becomes lossy—and possibly only on some platforms, if it changes to long and that’s the same size as int on some platforms you target but larger than int on others.

Note that, even if you know the exact types of x and y, you will get different types for x+y on different platforms, particularly if one is signed and one is unsigned. If both x and y are signed, or both are unsigned, and one’s type has more bits than the other, that’s the type of the result. If one is signed and the other is unsigned then other rules kick in, and the size and signedness of the result can vary on different platforms depending on the relative actual sizes and the signedness of x and y on that platform. (This is one of the consequences of C and C++ not standardizing the sizes of the built-in types; for example, we know a long is guaranteed to be at least as big as an int, but we don’t know how many bits each is, and the answer varies by compiler and platform.)
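The promotion rules above can be observed directly with static_assert. A sketch (these particular combinations happen to be portable; mixed-sign results with differently sized operands are the ones that vary by platform):

```cpp
#include <type_traits>

// Usual arithmetic conversions, observed via decltype:
static_assert( std::is_same<decltype(1  + 1L ), long>::value,
               "both signed: the wider type wins" );
static_assert( std::is_same<decltype(1u + 1ul), unsigned long>::value,
               "both unsigned: the wider type wins" );
static_assert( std::is_same<decltype(1  + 1u ), unsigned int>::value,
               "mixed sign, same rank: unsigned wins" );
```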

Does auto help here? Almost always “yes,” but in one case “yes with a little help you really want to reach for anyway.”

By default, write for correctness, clarity, and portability first: To avoid lossy narrowing conversions, auto is your portability pal and you should use it by default. Writing auto is much better than writing it out by hand as std::common_type< decltype(x), decltype(y) >::type.

auto total = x + y;    // exact type, guaranteed no narrowing

Guideline: Prefer to declare local variables using auto. It guarantees that you get the exact type and so is the simplest way to portably spell the implementation-specific type of arithmetic operations on built-in types, which vary by platform, and ensure that you cannot accidentally get narrowing conversions when storing the result.

However, what if in rare cases this code may be in a tight loop where performance matters, and auto may select a wider type than you know you need to store all possible values? For example, in some cases performing arithmetic using uint64_t instead of uint32_t could be twice as slow. If you first prove that this actually matters using hard profiler data, and then further prove by performing other validation that you won’t (or won’t care if you do) encounter results that would lose value by narrowing, then go ahead and commit to an explicit type—but prefer to do it using the following style:

// rare cases: use auto + <cstdint> type
auto total = uint_fast64_t{ x+y };  // total is an unsigned 64-bit value
             // ^ see note [1]

// or use auto + size-preserving signed/unsigned helper [2]
auto total = as_unsigned( x+y );    // total is unsigned and size of x+y
  • Still use auto to naturally make this more self-documenting and make the code review easy, because auto syntax makes it explicit that you’re performing a conversion.
  • Use a portable sized type name from the standard <cstdint> header, because you almost certainly care about size and this makes the size portable.[1]

    Guideline: Prefer using the <cstdint> type aliases in code that cares about the size of your numeric variables. Avoid relying on what your current platform(s) happen to do.

    Guideline: Consider declaring local variables auto x = type{ expr }; when you do want to explicitly commit to a type. It is self-documenting to show that the code is explicitly requesting a conversion, and won’t allow an accidental implicit narrowing conversion. Only when you do want explicit narrowing, use ( ) instead of { }.

Case (d) is similar:

// (d) x and y are of some built-in integral type
int diff = x - y;
if(diff < 0) { /*...*/ }

This time, we’re doing a subtraction. No matter whether x and y are signed or not, putting the answer in a signed variable like this is the right thing to do—the result could be negative, after all.

However, we have two issues. The first, again, is that int may not be big enough to avoid truncating the result, so we might lose information if x - y produces something larger than an int. Using auto can help with that.

The second is that x - y might give a strange answer, which isn’t the programmer’s fault but is something you want to remember about arithmetic in C and C++. Consider this code:

unsigned long x    = 42;
signed short  y    = 43;
auto          diff = x - y;   // one actual result: 18446744073709551615
if(diff < 0) { /*...*/ }      // um, oops – branch won't be taken

“Wait, what?” you ask. On nearly all platforms, an unsigned long is bigger than a signed short, and because of the promotion rules the type of x - y, and therefore of diff, will be… unsigned long. Which is, well, not very signed. So depending on the types of x and y, and depending on your actual platform, it may be that the branch won’t be taken, which clearly isn’t what the original code intended.

Guideline: Combine signed and unsigned arithmetic carefully.

Before you say, “then I always want signed!” remember that if you overflow then unsigned arithmetic wraps, which can be valid for your use, whereas signed arithmetic has undefined behavior, which is quite unlikely to be useful. Sometimes you really need signed, and sometimes you really need unsigned, even though often you won’t care.

From observing auto‘s effect in case (d), it might seem like auto has helped one problem… but was it at the expense of creating another?

Yes, on the one hand, auto did indeed help us: Using auto ensured we could write portable and correct code where the result wasn’t needlessly narrowed. If we didn’t care about signedness, which is often true, that’s quite sufficient.

On the other hand, using auto might not preserve signedness in a computation like x – y that’s supposed to return something with a sign, or it might not preserve unsignedness when that’s desirable. But this isn’t so much an issue with auto itself as that we have to be careful when combining signed and unsigned arithmetic, and by binding to an exact type auto is exposing this issue with some code that might potentially be already nonportable, or have corner cases the developer wasn’t aware of when he wrote it.

So what’s a good answer? Consider using auto together with the as_signed or as_unsigned conversion helper we saw before, which is used in lieu of a cast to a specific type; the helper is written out more fully in the endnotes. [2] Then we get the best of both worlds—we don’t commit to an explicit type, but we ensure the basic size and signedness in portable code that will work as intended on many different compilers and platforms.

Guideline: Prefer to use auto x = as_signed(integer_expr); or auto x = as_unsigned(integer_expr); to store the result of an integer computation that should be signed or unsigned. Using auto together with as_signed or as_unsigned makes code more portable: the variable will both be large enough and preserve the required signedness on all platforms. (Signed/unsigned conversions within integer_expr may still occur.)
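Putting the guideline to work, here is the note-[2] helper together with a use that restores the intended signedness (a sketch; diff_is_negative is a hypothetical name, and it relies on the usual two’s-complement unsigned-to-signed conversion):

```cpp
#include <type_traits>

// The as_signed helper from note [2], C++11 style.
template<class T>
typename std::make_signed<T>::type as_signed( T t )
    { return typename std::make_signed<T>::type(t); }

// Even when x and y are unsigned, the difference comes out signed,
// so the "diff < 0" test behaves as intended:
bool diff_is_negative( unsigned long x, unsigned long y ) {
    auto diff = as_signed( x - y );   // diff: long, same size as x - y
    return diff < 0;
}
```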

Finally, case (e) brings floating point into the picture:

// (e)
int i = f(1,2,3) * 42.0;

Here we have our by-now-yawnworthy-typical narrowing—and an easy case because it isn’t even hiding, it’s saying int and 42.0 right there in the same breath, which is narrowing almost regardless of what type f returns.

Does auto help? Yes, in making our code self-documenting and more reviewable, as we noted before. If we follow the auto x = type{expr}; declaration style, we would be (happily) forced to write the conversion explicitly, and when we initially use { } we get an error that in fact it’s a narrowing conversion, which we acknowledge (again explicitly) by switching to ( ):

auto i = int( f(1,2,3) * 42.0 );

This code is now free of implicit conversions, including implicit narrowing conversions. If our team’s coding style says to use auto x = expr; or auto x = type{expr}; wherever possible, then in a code review just seeing the ( ) parens can immediately connote explicit narrowing; adding a comment doesn’t hurt either.
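For instance (a sketch: since f isn’t defined in the snippet, a literal stands in for its result, and narrowed is a hypothetical name):

```cpp
// The ( ) syntax documents a deliberate, explicit narrowing;
// with { } this initialization would be rejected as narrowing.
int narrowed() {
    auto i = int( 2.5 * 42.0 );   // 105.0 explicitly narrowed to 105
    return i;
}
```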

But for floating point calculations, can using auto by itself hurt? Consider this example, contributed by Andrei Alexandrescu:

float f1 = /*...*/, f2 = /*...*/;

auto   f3 = f1 + f2;   // correct, but on some compilers/platforms...
double f4 = f1 + f2;   // ... this might keep more bits of precision

As Alexandrescu notes: “Machines are free to do intermediate calculations in a larger precision than the target, and in many cases (and traditionally in C) calculations are done in double precision. So for f3 we have a sum done in double precision, which is then truncated down to float. For f4, the sum is preserved at full precision.”

Does this mean using auto creates a potential flaw here? Not really. In the language, the type of f1 + f2 is still float, and the naked auto maintains that exact type for us. However, if we do want to follow the pattern of switching to double early in a complex computation, we can and should say so:

float f1 = /*...*/, f2 = /*...*/;

auto f5 = double{f1} + f2;

Summary

We’ve seen a number of reasons to prefer to declare variables using auto, optionally with an explicit type if you do want to commit to a specific type.

If you’ve observed a pattern in this GotW’s Guidelines, you’ll already have a sense of what’s coming in GotW #94… a Special Edition on, you guessed it, auto style.

Notes

[1] Another reason to prefer the <cstdint> typedef names is that, due to a quirk in the C++ language grammar, only a single-word type name is allowed where uint_fast64_t appears in this example. That’s fine nearly always, because a single word is all you need for class types, for all typedef and using alias names, and for most built-in types; but you can’t directly name arrays or the multi-word built-in types like unsigned int or long long in that position, so for the latter use the uintNN_t-style typedef names instead. The exact-width ones, such as uint64_t, are “optional” in the standard, but they are in the standard and expected to be widely implemented, so I used them. The “least” and “fast” ones are required, so if you don’t have uint64_t you can use uint_least64_t or uint_fast64_t.

[2] The helpers preserve the size of the type while changing only the signedness. Thanks to Andrei Alexandrescu for this basic idea; any errors are mine, not his. The C++98 way is to provide a set of overloads for each type, but a modern version might look something like the following which uses the C++11 std::make_signed/make_unsigned facilities.

// C++11 version
//
template<class T>
typename make_signed<T>::type as_signed(T t)
    { return typename make_signed<T>::type(t); }

template<class T>
typename make_unsigned<T>::type as_unsigned(T t)
    { return typename make_unsigned<T>::type(t); }

Note that with C++14 this gets even sweeter, using auto return type deduction to eliminate typename and repetition, and the _t alias to replace ::type:

// C++14 version, option 1
//
template<class T> auto as_signed  (T t){ return make_signed_t  <T>(t); }
template<class T> auto as_unsigned(T t){ return make_unsigned_t<T>(t); }

or you can equivalently write these function templates as named lambdas:

// C++14 version, option 2
//
auto as_signed   =[](auto x){ return make_signed_t  <decltype(x)>(x); };
auto as_unsigned =[](auto x){ return make_unsigned_t<decltype(x)>(x); };

Sweet, isn’t it? Once you have a compiler that supports these features, pick whichever suits your fancy.

Acknowledgments

Thanks in particular to Scott Meyers and Andrei Alexandrescu for their time and insights in reviewing and discussing drafts of this material. Thanks also to the following for their feedback to improve this article: mttpd, Jim Park, Yuri Khan, Arne, rhalbersma, Tom, Martin Ba, John, Frederic Dumont, Sebastian.

Read Full Post »


What does auto do on variable declarations, exactly? And how should we think about auto? In this GotW, we’ll start taking a look at C++’s oldest new feature.

 

Problem

JG Questions

1. What is the oldest C++11 feature? Explain.

2. What does auto mean when declaring a local variable?

Guru Questions

3. In the following code, what is the type of variables a through k, and why? Explain.

int         val = 0;
auto        a = val;
auto&       b = val;
const auto  c = val;
const auto& d = val;

int&        ir = val;
auto        e = ir;

int*        ip = &val;
auto        f = ip;

const int   ci = val;
auto        g = ci;

const int&  cir = val;
auto        h = cir;

const int*  cip = &val;
auto        i = cip;

int* const  ipc = &val;
auto        j = ipc;

const int* const cipc = &val;
auto        k = cipc;

4. In the following code, what type does auto deduce for variables a and b, and why? Explain.

int val = 0;

auto a { val };
auto b = { val };

 

Solution

1. What is the oldest C++11 feature? Explain.

It’s auto: writing auto x = something; declares a new local variable x whose type is deduced from something, and isn’t just always int.

Bjarne Stroustrup likes to point out that auto for deducing the type of local variables is the oldest feature added in the 2011 release of the C++ standard. He implemented it in C++ 28 years earlier, in 1983—which incidentally was the same year the language’s name was changed to C++ from C with Classes (the new name was unveiled publicly on January 1, 1984), and the same year Stroustrup added other fundamental features including const (later adopted by C), virtual functions, & references, and BCPL-style // comments.

Alas, Stroustrup was forced to remove auto because of compatibility concerns with C’s then-existing implicit int rule, which has since been abandoned in C. We’re glad auto is now back and here to stay.

2. What does auto mean when declaring a local variable?

It means to deduce the type from the expression used to initialize the new variable. In particular, type deduction for auto local variables works exactly like type deduction for function template parameters—by specification, the rule for auto variables says “do what function templates are required to do”—except that auto can additionally deduce initializer_list. For example:

template<class T> void f( T ) { }

int val = 0;

f( val ); // deduces T == int, calls f<int>( val )
auto x = val; // deduces T == int, x is of type int

When you’re new to auto, the key thing to remember is that you really are declaring your own new local variable. That is, “what’s on the left” is my new variable, and “what’s on the right” is just its initial value:

auto my_new_variable = its_initial_value;

You want your new variable to be just like some existing variable or expression over there, and be initialized from it, but that only means that you want the same basic type, not necessarily that other variable’s own personal secondary attributes such as top-level const- or volatile-ness and &/&& reference-ness, which are per-variable. For example, just because he’s const doesn’t mean you’re const, and vice versa.

It’s kind of like being identical twins: Andy may be genetically just like his brother Bobby and is part of the same family, but he’s not the same person; he’s a distinct person and can make his own choice of clothes and/or jewelry, go to be seen on the scene in different parts of town, and so forth. So your new variable will be just like that other one and be part of the same type family, but it’s not the same variable; it’s a distinct variable with its own choice of whether it wants to be dressed with const, volatile, and/or a & or && reference, may be visible to different threads, and so forth.

Remembering this will let us easily answer the rest of our questions.

3. In the following code, what is the type of variables a through k, and why? Explain.

Quick reminder: auto means “take exactly the type on the right-hand side, but strip off top-level const/volatile and &/&&.” Armed with that, these are mostly pretty easy.

For simplicity, these examples use const and &. The rules for adding or removing const and volatile are the same, and the rules for adding or removing & and && are the same.

int         val = 0;
auto        a = val;
auto&       b = val;
const auto  c = val;
const auto& d = val;

For a through d, the type is what you get from replacing auto with int: int, int&, const int, and const int&, respectively. The same ability to add const applies to volatile, and the same ability to add & applies to &&. (Note that && will be what Scott Meyers calls a universal reference, just as with templates, and does in some cases bring across the const-ness if it’s binding to something const.)

Now that we’ve exercised adding top-level const (or volatile) and & (or &&) on the left, let’s consider how they’re removed on the right. Note that the left-hand side of a through d can be used in any combination with the right-hand side of e through k.

int&        ir  = val;
auto e = ir;

The type of e is int. Because ir is a reference to val, which makes ir just another name for val, it’s exactly the same as if we had written auto e = val; here.

Remember, just because ir is a reference (another name for the existing variable val) doesn’t have any bearing on whether we want e to be a reference. If we wanted e to be a reference, we would have said auto& as we did in case b above, and it would have been a reference irrespective of whether ir happened to be a reference or not.

int*        ip  = &val; 
auto f = ip;

The type of f is int*.

const int   ci  = val;
auto g = ci;

The type of g is int.

Remember, just because ci is const (read-only) doesn’t have any bearing on whether we want g to be const. It’s a separate variable. If we wanted g to be const, we would have said const auto as we did in case c above, and it would have been const irrespective of whether ci happened to be const or not.

const int&  cir = val;
auto h = cir;

The type of h is int.

Again, remember we just drop top-level const and & to get the basic type. If we wanted h to be const and/or &, we could just add it as shown with b, c, and d above.

const int*  cip = &val;
auto i = cip;

The type of i is const int*.

Note that this isn’t a top-level const, so we don’t drop it. We pronounce cip‘s declaration right to left: The type of cip is “pointer to const int,” not “const pointer to int.” What’s const is not cip, but rather *cip, the int it’s pointing to.

int* const  ipc = &val;
auto j = ipc;

The type of j is int*. This const is a top-level const, and ipc‘s being const is immaterial to whether we want j to be const.

const int* const cipc = &val;
auto k = cipc;

The type of k is const int*.
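As with a through d, all of the e through k deductions can be checked at compile time. A compact sketch:

```cpp
#include <type_traits>

int val = 0;

int&             ir   = val;   auto e = ir;    // int:        & dropped
int*             ip   = &val;  auto f = ip;    // int*:       pointer kept
const int        ci   = val;   auto g = ci;    // int:        top-level const dropped
const int&       cir  = val;   auto h = cir;   // int:        const and & both dropped
const int*       cip  = &val;  auto i = cip;   // const int*: low-level const kept
int* const       ipc  = &val;  auto j = ipc;   // int*:       top-level const dropped
const int* const cipc = &val;  auto k = cipc;  // const int*: only the top-level const dropped

static_assert(std::is_same<decltype(e), int>::value,        "");
static_assert(std::is_same<decltype(f), int*>::value,       "");
static_assert(std::is_same<decltype(g), int>::value,        "");
static_assert(std::is_same<decltype(h), int>::value,        "");
static_assert(std::is_same<decltype(i), const int*>::value, "");
static_assert(std::is_same<decltype(j), int*>::value,       "");
static_assert(std::is_same<decltype(k), const int*>::value, "");
```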

4. In the following code, what type does auto deduce for variables a and b, and why? Explain.

As we noted in #2, the only place where an auto variable deduces anything different from a template parameter is that auto deduces an initializer_list. This brings us to the final cases:

int val = 0;

auto a { val };
auto b = { val };

The type of b is std::initializer_list<int>. Under the original C++11 rules, a was also deduced as std::initializer_list<int>; note, however, that N3922 (adopted for C++17, and applied by major compilers in earlier language modes as well) changed the single-element direct-list-initialization form auto a { val }; to deduce plain int instead.

That’s the only difference between auto variable deduction and template parameter deduction—by specification, because auto deduction is defined in the standard as “follow those rules over there in the templates clause, plus deduce initializer_list.”

If you’re familiar with templates and curious how auto deduction and template deduction map to each other, the table below lists the main cases and shows the equivalent syntax between the two features. For the left column, I’ll put the variable and the initialization on separate lines to emphasize how they correspond to the separated template parameter and call site on the right.

Not only are the cases equivalent in expressive power, but you might even feel that some of the auto versions feel even slicker to you than their template counterparts.
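For instance, here is a minimal sketch of that correspondence (not the original table; the deduce_* helper templates are hypothetical names for illustration). Each auto declaration deduces exactly what the matching function template parameter would:

```cpp
#include <type_traits>

// Hypothetical helper templates, one per deduction style:
template<class T> T        deduce_value(T x)        { return x; }  // like: auto x = init;
template<class T> T&       deduce_ref  (T& x)       { return x; }  // like: auto& x = init;
template<class T> const T& deduce_cref (const T& x) { return x; }  // like: const auto& x = init;

int val = 0;

auto        x1 = val;
auto&       x2 = val;
const auto& x3 = val;

// Each auto variable's type equals the corresponding template's deduced result:
static_assert(std::is_same<decltype(x1), decltype(deduce_value(val))>::value, "");
static_assert(std::is_same<decltype(x2), decltype(deduce_ref(val))>::value,   "");
static_assert(std::is_same<decltype(x3), decltype(deduce_cref(val))>::value,  "");
```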

Summary

Having auto variables really brings a feature we already had (template deduction) to an even wider audience. But so far we’ve only seen what auto does. The even more interesting question is how to use it. Which brings us to our next GotW…

Acknowledgments

Thanks in particular to the following for their feedback to improve this article: davidphilliposter, Phil Barila, Ralph Tandetzky, Marcel Wild.

Read Full Post »

What does auto do on variable declarations, exactly? And how should we think about auto? In this GotW, we’ll start taking a look at C++’s oldest new feature.

 

Problem

JG Questions

1. What is the oldest C++11 feature? Explain.

2. What does auto mean when declaring a local variable?

Guru Questions

3. In the following code, what is the type of variables a through k, and why? Explain.

int         val = 0;
auto a = val;
auto& b = val;
const auto c = val;
const auto& d = val;

int& ir = val;
auto e = ir;

int* ip = &val;
auto f = ip;

const int ci = val;
auto g = ci;

const int& cir = val;
auto h = cir;

const int* cip = &val;
auto i = cip;

int* const ipc = &val;
auto j = ipc;

const int* const cipc = &val;
auto k = cipc;

4. In the following code, what type does auto deduce for variables a and b, and why? Explain.

int val = 0;

auto a { val };
auto b = { val };

Read Full Post »

NOTE: Last year, I posted three new GotWs numbered #103-105. I decided leaving a gap in the numbers wasn’t best after all, so I am renumbering them to #89-91 to continue the sequence. Here is the updated version of what was GotW #105.

How should you prefer to pass smart pointers, and why?

Problem

JG Question

1. What are the performance implications of the following function declaration? Explain.

void f( shared_ptr<widget> );

Guru Questions

2. What are the correctness implications of the function declaration in #1? Explain with clear examples.

3. A colleague is writing a function f that takes an existing object of type widget as a required input-only parameter, and trying to decide among the following basic ways to take the parameter (omitting const):

void f( widget* );              (a)
void f( widget& );              (b)
void f( unique_ptr<widget> );   (c)
void f( unique_ptr<widget>& );  (d)
void f( shared_ptr<widget> );   (e)
void f( shared_ptr<widget>& );  (f)

Under what circumstances is each appropriate? Explain your answer, including where const should or should not be added anywhere in the parameter type.

(There are other ways to pass the parameter, but we will consider only the ones shown above.)

Solution

1. What are the performance implications of the following function declaration? Explain.

void f( shared_ptr<widget> );

A shared_ptr stores strong and weak reference counts (see GotW #89). When you pass by value, you have to copy the argument (usually) on entry to the function, and then destroy it (always) on function exit. Let’s dig into what this means.

When you enter the function, the shared_ptr is copy-constructed, and this requires incrementing the strong reference count. (Yes, if the caller passes a temporary shared_ptr, you move-construct and so don’t have to update the count. But: (a) it’s quite rare to get a temporary shared_ptr in normal code, other than taking one function’s return value and immediately passing that to a second function; and (b) besides, as we’ll see, most of the expense is on the destruction of the parameter anyway.)

When exiting the function, the shared_ptr is destroyed, and this requires decrementing its internal reference count.

What’s so bad about a “shared reference count increment and decrement?” Two things, one related to the “shared reference count” and one related to the “increment and decrement.” It’s good to be aware of how this can incur performance costs for two reasons: one major and common, and one less likely in well-designed code and so probably more minor.

First, the major reason is the performance cost of the “increment and decrement”: Because the reference count is an atomic shared variable (or equivalent), incrementing and decrementing it are internally-synchronized read-modify-write shared memory operations.

Second, the less-likely minor reason is the potentially scalability-busting contention on the “shared reference count”: Both increment and decrement update the reference count, which means that at the processor and memory level only one core at a time can be executing such an instruction on the same reference count, because it needs exclusive access to the count’s cache line. The net result is that this causes some contention on the count’s cache line, which can affect scalability if it’s a popular cache line being touched by multiple threads in tight loops—such as if two threads are calling functions like this one in tight loops and accessing shared_ptrs that own the same object. “So don’t do that, thou heretic caller!” we might righteously say. Well and good, but the caller doesn’t always know when two shared_ptrs used on two different threads refer to the same object, so let’s not be quick to pile the wood around his stake just yet.

As we will see, an essential best practice for any reference-counted smart pointer type is to avoid copying it unless you really mean to add a new reference. This cannot be stressed enough. This directly addresses both of these costs and pushes their performance impact down into the noise for most applications, and especially eliminates the second cost because it is an antipattern to add and remove references in tight loops.

At this point, we will be tempted to solve the problem by passing the shared_ptr by reference. But is that really the right thing to do? It depends.

2. What are the correctness implications of the function declaration in #1?

The only correctness implication is that the function advertises in a clear type-enforced way that it will (or could) retain a copy of the shared_ptr.

That this is the only correctness implication might surprise some people, because there would seem to be one other major correctness benefit to taking a copy of the argument, namely lifetime: Assuming the pointer is not already null, taking a copy of the shared_ptr guarantees that the function itself holds a strong refcount on the owned object, and that therefore the object will remain alive for the duration of the function body, or until the function itself chooses to modify its parameter.

However, we already get this for free—thanks to structured lifetimes, the called function’s lifetime is a strict subset of the calling function’s call expression. Even if we passed the shared_ptr by reference, our function would as good as hold a strong refcount because the caller already has one—he passed us the shared_ptr in the first place, and won’t release it until we return. (Note this assumes the pointer is not aliased. You have to be careful if the smart pointer parameter could be aliased, but in this respect it’s no different than any other aliased object.)

Guideline: Don’t pass a smart pointer as a function parameter unless you want to use or manipulate the smart pointer itself, such as to share or transfer ownership.

Guideline: Prefer passing objects by value, *, or &, not by smart pointer.

If you’re saying, “hey, aren’t raw pointers evil?”, that’s excellent, because we’ll address that next.

3. A colleague is writing a function f that takes an existing object of type widget as a required input-only parameter, and trying to decide among the following basic ways to take the parameter (omitting const). Under what circumstances is each appropriate? Explain your answer, including where const should or should not be added anywhere in the parameter type.

(a) and (b): Prefer passing parameters by * or &.

void f( widget* );              (a)
void f( widget& );              (b)

These are the preferred way to pass normal object parameters, because they stay agnostic of whatever lifetime policy the caller happens to be using.

Non-owning raw * pointers and & references are okay to observe an object whose lifetime we know exceeds that of the pointer or reference, which is usually true for function parameters. Thanks to structured lifetimes, by default arguments passed to f in the caller outlive f‘s function call lifetime, which is extremely useful (not to mention efficient) and makes non-owning * and & appropriate for parameters.

Pass by * or & to accept a widget independently of how the caller is managing its lifetime. Most of the time, we don’t want to commit to a lifetime policy in the parameter type, such as requiring the object be held by a specific smart pointer, because this is usually needlessly restrictive. As usual, use a * if you need to express null (no widget), otherwise prefer to use a &; and if the object is input-only, write const widget* or const widget&.

(c) Passing unique_ptr by value means “sink.”

void f( unique_ptr<widget> );   (c)

This is the preferred way to express a widget-consuming function, also known as a “sink.”

Passing a unique_ptr by value is only possible by moving the object and its unique ownership from the caller to the callee. Any function like (c) takes ownership of the object away from the caller, and either destroys it or moves it onward to somewhere else.

Note that, unlike some of the other options below, this use of a by-value unique_ptr parameter actually doesn’t limit the kind of object that can be passed to those managed by a unique_ptr. Why not? Because any pointer can be explicitly converted to a unique_ptr. If we didn’t use a unique_ptr here we would still have to express “sink” semantics, just in a more brittle way such as by accepting a raw owning pointer (anathema!) and documenting the semantics in comments. Using (c) is vastly superior because it documents the semantics in code, and requires the caller to explicitly move ownership.

Consider the major alternative:

// Smelly 20th-century alternative
void bad_sink( widget* p );  // will destroy p; PLEASE READ THIS COMMENT

// Sweet self-documenting self-enforcing modern version (c)
void good_sink( unique_ptr<widget> p );

And how much better (c) is:

// Older calling code that calls the new good_sink is safer, because
// it's clearer in the calling code that ownership transfer is going on
// (this older code has an owning * which we shouldn't do in new code)
//
widget* pw = ... ; 

bad_sink ( pw );             // compiles: remember not to use pw again!

good_sink( pw );             // error: good
good_sink( unique_ptr<widget>{pw} );  // need explicit conversion: good

// Modern calling code that calls good_sink is safer, and cleaner too
//
unique_ptr<widget> pw = ... ;

bad_sink ( pw.get() );       // compiles: icky! doesn't reset pw
bad_sink ( pw.release() );   // compiles: must remember to use this way

good_sink( pw );             // error: good!
good_sink( move(pw) );       // compiles: crystal clear what's going on

Guideline: Express a “sink” function using a by-value unique_ptr parameter.

Because the callee will now own the object, usually there should be no const on the parameter because the const should be irrelevant.

(d) Passing unique_ptr by reference is for in/out unique_ptr parameters.

void f( unique_ptr<widget>& );  (d)

This should only be used to accept an in/out unique_ptr, when the function is supposed to actually accept an existing unique_ptr and potentially modify it to refer to a different object. It is a bad way to just accept a widget, because it is restricted to a particular lifetime strategy in the caller.

Guideline: Use a non-const unique_ptr& parameter only to modify the unique_ptr.

Passing a const unique_ptr<widget>& is strange because it can accept only either null or a widget whose lifetime happens to be managed in the calling code via a unique_ptr, and the callee generally shouldn’t care about the caller’s lifetime management choice. Passing widget* covers a strict superset of these cases and can accept “null or a widget” regardless of the lifetime policy the caller happens to be using.

Guideline: Don’t use a const unique_ptr& as a parameter; use widget* instead.

I mention widget* because that doesn’t change the (nullable) semantics; if you’re being tempted to pass const unique_ptr<widget>&, what you really meant was widget*, which expresses the same information. If you additionally know it can’t be null, though, of course use widget&.

(e) Passing shared_ptr by value implies taking shared ownership.

void f( shared_ptr<widget> );   (e)

As we saw in #2, this is recommended only when the function wants to retain a copy of the shared_ptr and share ownership. In that case, a copy is needed anyway so the copying cost is fine. If the local scope is not the final destination, just std::move the shared_ptr onward to wherever it needs to go.

Guideline: Express that a function will store and share ownership of a heap object using a by-value shared_ptr parameter.

Otherwise, prefer passing a * or & (possibly to const) instead, since that doesn’t restrict the function to only objects that happen to be owned by shared_ptrs.

(f) Passing shared_ptr& is useful for in/out shared_ptr manipulation.

void f( shared_ptr<widget>& );  (f)

Similarly to (d), this should mainly be used to accept an in/out shared_ptr, when the function is supposed to actually modify the shared_ptr itself. It’s usually a bad way to accept a widget, because it is restricted to a particular lifetime strategy in the caller.

Note that per (e) we pass a shared_ptr by value if the function will share ownership. In the special case where the function might share ownership, but doesn’t necessarily take a copy of its parameter on a given call, then pass a const shared_ptr& to avoid the copy on the calls that don’t need it, and take a copy of the parameter if and when needed.

Guideline: Use a non-const shared_ptr& parameter only to modify the shared_ptr. Use a const shared_ptr& as a parameter only if you’re not sure whether or not you’ll take a copy and share ownership; otherwise use widget* instead (or if not nullable, a widget&).

Acknowledgments

Thanks in particular to the following for their feedback to improve this article: mttpd, zahirtezcan, Jon, GregM, Andrei Alexandrescu.

Read Full Post »

NOTE: Last year, I posted three new GotWs numbered #103-105. I decided leaving a gap in the numbers wasn’t best after all, so I am renumbering them to #89-91 to continue the sequence. Here is the updated version of what was GotW #105.

 

How should you prefer to pass smart pointers, and why?

 

Problem

JG Question

1. What are the performance implications of the following function declaration? Explain.

void f( shared_ptr<widget> );

Guru Questions

2. What are the correctness implications of the function declaration in #1? Explain with clear examples.

3. A colleague is writing a function f that takes an existing object of type widget as a required input-only parameter, and trying to decide among the following basic ways to take the parameter (omitting const):

void f( widget* );              (a)
void f( widget& );              (b)
void f( unique_ptr<widget> );   (c)
void f( unique_ptr<widget>& );  (d)
void f( shared_ptr<widget> );   (e)
void f( shared_ptr<widget>& );  (f)

Under what circumstances is each appropriate? Explain your answer, including where const should or should not be added anywhere in the parameter type.

(There are other ways to pass the parameter, but we will consider only the ones shown above.)

Read Full Post »

NOTE: Last year, I posted three new GotWs numbered #103-105. I decided leaving a gap in the numbers wasn’t best after all, so I am renumbering them to #89-91 to continue the sequence. Here is the updated version of what was GotW #104.

 

What should factory functions return, and why?

 

Problem

While spelunking through the code of a new project you recently joined, you find the following factory function declaration:

widget* load_widget( widget::id desired );

JG Question

1. What’s wrong with this return type?

Guru Questions

2. Assuming that widget is a polymorphic type, what is the recommended return type? Explain your answer, including any tradeoffs.

3. You’d like to actually change the return type to match the recommendation in #2, but at first you worry about breaking source compatibility with existing calling code; recompiling existing callers is fine, but having to go change them all is not. Then you have an “aha!” moment, realizing that this is a fairly new project and all of your calling code is written using modern C++ idioms, and you go ahead and change the return type without fear, knowing it will require few or no code changes to callers. What makes you so confident?

4. If widget is not a polymorphic type, what is the recommended return type? Explain.

 

Solution

1. What’s wrong with this return type?

First, what can we know or reasonably expect from the two-line problem description?

We’re told load_widget is a factory function. It produces an object by “loading” it in some way and then returning it to the caller. Since the return type is a pointer, the result might be null.

The caller will reasonably expect to use the object, whether by calling member functions on it or passing it to other functions or in some other way. That isn’t safe unless the caller owns the object to ensure it is alive—either the caller gets exclusive ownership, or it gets shared ownership if the factory also maintains an internal strong or weak reference.

Because the caller has or shares ownership, he has to do something when the object is no longer needed. If the ownership is exclusive, he should destroy the object somehow. Otherwise, if the ownership is shared, he should decrement some shared reference count.

Unfortunately, returning a widget* has two major problems. First, it’s unsafe by default, because the default mode of operation (i.e., what happens when the caller does nothing extra) is to leak a widget:

// Example 1: Leak by default. Really, this is just so 20th-century...
//
widget* load_widget( widget::id desired );

:::

load_widget( some_id ); // oops

The Example 1 code compiles cleanly, runs, and (un)happily leaks the widget.

Guideline: Don’t use explicit new, delete, and owning * pointers, except in rare cases encapsulated inside the implementation of a low-level data structure.

Second, the signature conveys absolutely no information other than “a widget? sure, here you go! enjoy.” The documentation may state that (and how) the caller is to own the object, but the function declaration doesn’t—it’s either exclusive ownership or shared ownership, but which? Read and remember the function’s documentation, because the function declaration isn’t telling. The signature alone doesn’t even say whether the caller shares in ownership at all.

2. Assuming that widget is a polymorphic type, what is the recommended return type? Explain your answer, including any tradeoffs.

If widget is a type that’s intended to be used polymorphically and held by pointer or reference, the factory should normally return a unique_ptr to transfer ownership to the caller, or a shared_ptr if the factory shares ownership by retaining a strong reference inside its internal data structures.

Guideline: A factory that produces a reference type should return a unique_ptr by default, or a shared_ptr if ownership is to be shared with the factory.

This solves both problems: safety, and self-documentation.

First, consider how this immediately solves the safety problem in Example 1:

// Example 2: Clean up by default. Much better...
//
unique_ptr<widget> load_widget( widget::id desired );

:::

load_widget( some_id ); // cleans up

The Example 2 code compiles cleanly, runs, and happily cleans up the widget. But it’s not just correct by default—it’s correct by construction, because there’s no way to make a mistake that results in a leak.

Aside: Someone might say, “but can’t someone still write load_widget(some_id).release()?” Of course they can, if they’re pathological; the correct answer is, “don’t do that.” Remember, our concern is to protect against Murphy, not Machiavelli—against bugs and mistakes, not deliberate crimes—and such pathological abuses fall into the latter category. That’s no different from, and doesn’t violate type safety any more than, explicitly calling Dispose early in a C# using block or explicitly calling close early in a Java try-with-resources block.

What if the cleanup should be done by something other than a plain delete call? Easy: Just use a custom deleter. The icing on the cake is that the factory itself knows which deleter is appropriate and can state it at the time it constructs the return value; the caller doesn’t need to worry about it, especially if he takes the result using an auto variable.

Second, this is self-documenting: A function that returns a unique_ptr or a value clearly documents that it’s a pure “source” function, and one that returns a shared_ptr clearly documents that it’s returning shared ownership and/or observation.

Finally, why prefer unique_ptr by default, if you don’t need to express shared ownership? Because it’s the Right Thing To Do for both performance and correctness, as noted in GotW #89, and leaves all options open to the caller:

  • Returning unique_ptr expresses returning unique ownership, which is the norm for a pure “source” factory function.
  • unique_ptr can’t be beat in efficiency—moving one is about as cheap as moving/copying a raw pointer.
  • If the caller wants to manage the produced object’s lifetime via shared_ptr, they can easily convert to shared_ptr via an implicit move operation—no need to say std::move because the compiler already knows that the returned value is a temporary object.
  • If the caller is using any other arbitrary method of maintaining the object’s lifetime, he can convert to a custom smart pointer or other lifetime management scheme simply by calling .release(). This can be useful, and isn’t possible with shared_ptr.

Here it is in action:

// Example 2, continued
//

// Accept as a unique_ptr (by default)
auto up = load_widget(1);

// Accept as a shared_ptr (if desired)
auto sp = shared_ptr<widget>{ load_widget(2) };

// Accept as your own smart pointer (if desired)
auto msp = my::smart_ptr<widget>{ load_widget(3).release() };

Of course, if the factory retains some shared ownership or observation, whether via an internal shared_ptr or weak_ptr, return shared_ptr. The caller will be forced to continue using it as a shared_ptr, but in that case that’s appropriate.

3. […] you go ahead and change the return type without fear, knowing it will require few or no code changes to callers. What makes you so confident?

Modern portable C++ code uses unique_ptr, shared_ptr, and auto. Returning unique_ptr works with all three; returning shared_ptr works with the last two.

If the caller accepts the return value in an auto variable, such as auto w = load_widget(whatever);, then the type will just naturally be correct, normal dereferencing will just work, and the only source ripple will be if the caller tries to explicitly delete the result (if so, the delete line can rather appropriately be deleted) or tries to store it into a non-local object of a different type.

Guideline: Prefer declaring variables using auto. It’s shorter, and helps to insulate your code from needless source ripples due to minor type changes.

Otherwise, if the caller isn’t using auto, then it’s likely already using the result to initialize a unique_ptr or shared_ptr, because modern C++ calling code does not traffic in raw pointers for non-parameter variables (more on this next time). In either case, returning a unique_ptr just works: A unique_ptr can be seamlessly moved into either of those types. And if the semantics are to return shared ownership, then the caller should already be using a shared_ptr and things will again work just fine (only probably better than before, because for the original return by raw pointer to work correctly the widget type was probably forced to jump through the enable_shared_from_this hoop, which isn’t needed if we just return a shared_ptr explicitly).

4. If widget is not a polymorphic type, what is the recommended return type? Explain.

If widget is not a polymorphic type, which typically means it’s a copyable value type or a move-only type, the factory should return a widget by value. But what kind of value?

In C++98, programmers would often resort to returning a large object by pointer just to avoid the penalty of copying its state:

// Example 4(a): Obsolete convention: return a * just to avoid a copy 
//
/*BAD*/ vector<gadget>* load_gadgets() {
    vector<gadget>* ret = new vector<gadget>();
    // ... populate *ret ...
    return ret;
}

// Obsolete calling code (note: NOT exception-safe)
vector<gadget>* p = load_gadgets();
if(p) use(*p);
delete p;

This has all of the usability and fragility problems discussed in #1. Today, normally we should just return by value, because we will incur only a cheap move operation, not a deep copy, to hand the result to the caller:

// Example 4(b): Default recommendation: return the value
//
vector<gadget> load_gadgets() {
    vector<gadget> ret;
    // ... populate ret ...
    return ret;
}

// Calling code (exception-safe)
auto v = load_gadgets();
use(v);

Most of the time, return movable objects by value. That’s all there is to it, if the only reason for the pointer on the return type was to avoid the copy.

There could be one additional reason the function might have returned a pointer, namely to return nullptr to indicate failure to produce an object. Normally it’s better to throw an exception to report an error if we fail to load the widget. However, if not being able to load the widget is normal operation and should not be considered an error, return an optional<widget>, and probably make the factory noexcept if no errors need to be reported other than those communicated well by returning an empty optional<widget>.

// Example 4(c): Alternative if not returning an object is normal
//
optional<vector<gadget>> load_gadgets() noexcept {
    vector<gadget> ret;
    // ... populate ret ...
    if( success )           // return vector (might be empty)
        return move(ret);   // note: move() here to avoid a silent copy
    else
        return {};          // not returning anything
}

// Calling code (exception-safe)
auto v = load_gadgets();
if(v) use(*v);

Guideline: A factory that produces a non-reference type should return a value by default, and throw an exception if it fails to create the object. If not creating the object can be a normal result, return an optional<> value.

Coda

By the way, see that test for if(v) in the last line of Example 4(c)? It calls a cool function of optional<T>, namely operator bool. What makes it so cool? In part because of how many C++ features it exercises. Here is its declaration… just think of what this lets you safely do, including at compile time! Enjoy.

constexpr explicit optional<T>::operator bool() const noexcept;

Acknowledgments

Thanks in particular to the following for their feedback to improve this article: Johannes Schaub, Leo, Vincent Jacquet.

Read Full Post »

GotW #90: Factories

NOTE: Last year, I posted three new GotWs numbered #103-105. I decided leaving a gap in the numbers wasn’t best after all, so I am renumbering them to #89-91 to continue the sequence. Here is the updated version of what was GotW #104.

 

What should factory functions return, and why?

 

Problem

While spelunking through the code of a new project you recently joined, you find the following factory function declaration:

widget* load_widget( widget::id desired );

JG Question

1. What’s wrong with this return type?

Guru Questions

2. Assuming that widget is a polymorphic type, what is the recommended return type? Explain your answer, including any tradeoffs.

3. You’d like to actually change the return type to match the recommendation in #2, but at first you worry about breaking source compatibility with existing calling code; recompiling existing callers is fine, but having to go change them all is not. Then you have an “aha!” moment, realizing that this is a fairly new project and all of your calling code is written using modern C++ idioms, and you go ahead and change the return type without fear, knowing it will require few or no code changes to callers. What makes you so confident?

4. If widget is not a polymorphic type, what is the recommended return type? Explain.

Read Full Post »

NOTE: Last year, I posted three new GotWs numbered #103-105. I decided leaving a gap in the numbers wasn’t best after all, so I am renumbering them to #89-91 to continue the sequence. Here is the updated version of what was GotW #103.

 

There’s a lot to love about standard smart pointers in general, and unique_ptr in particular.

 

Problem

JG Question

1. When should you use shared_ptr vs. unique_ptr? List as many considerations as you can.

Guru Question

2. Why should you almost always use make_shared to create an object to be owned by shared_ptrs? Explain.

3. Why should you almost always use make_unique to create an object to be initially owned by a unique_ptr? Explain.

4. What’s the deal with auto_ptr?

 

Solution

1. When should you use shared_ptr vs. unique_ptr?

When in doubt, prefer unique_ptr by default, and you can always later move-convert to shared_ptr if you need it. If you do know from the start you need shared ownership, however, go directly to shared_ptr via make_shared (see #2 below).

There are three major reasons to say “when in doubt, prefer unique_ptr.”

First, use the simplest semantics that are sufficient: Choose the right smart pointer to most directly express your intent, and what you need (now). If you are creating a new object and don’t know that you’ll eventually need shared ownership, use unique_ptr, which expresses unique ownership. You can still put it in a container (e.g., vector<unique_ptr<widget>>) and do most other things you want to do with a raw pointer, only safely. If you later need shared ownership, you can always move-convert the unique_ptr to a shared_ptr.

Second, a unique_ptr is more efficient than a shared_ptr. A unique_ptr doesn’t need to maintain reference count information and a control block under the covers, and is designed to be just about as cheap to move and use as a raw pointer. When you don’t ask for more than you need, you don’t incur overheads you won’t use.

Third, starting with unique_ptr is more flexible and keeps your options open. If you start with a unique_ptr, you can always later convert to a shared_ptr via move, or to another custom smart pointer (or even to a raw pointer) via .get() or .release().

Guideline: Prefer to use the standard smart pointers, unique_ptr by default and shared_ptr if sharing is needed. They are the common types that all C++ libraries can understand. Use other smart pointer types only if necessary for interoperability with other libraries, or when necessary for custom behavior you can’t achieve with deleters and allocators on the standard pointers.

2. Why should you almost always use make_shared to create an object to be owned by shared_ptrs? Explain.

Note: If you need to create an object using a custom allocator, which is rare, you can use allocate_shared. Note that even though its name is slightly different, allocate_shared should be viewed as “just the flavor of make_shared that lets you specify an allocator,” so I’m mainly going to talk about them both as make_shared here and not distinguish much between them.

There are two main cases where you can’t use make_shared (or allocate_shared) to create an object that you know will be owned by shared_ptrs:

  • If you need a custom deleter, such as when using shared_ptrs to manage a non-memory resource or an object allocated in a nonstandard memory area: you can’t use make_shared because it doesn’t support specifying a deleter.
  • If you are adopting a raw pointer to an object being handed to you from other (usually legacy) code: you would construct a shared_ptr from that raw pointer directly.

Guideline: Use make_shared (or, if you need a custom allocator, allocate_shared) to create an object you know will be owned by shared_ptrs, unless you need a custom deleter or are adopting a raw pointer from elsewhere.

So, why use make_shared (or, if you need a custom allocator, allocate_shared) whenever you can, which is nearly always? There are two main reasons: simplicity, and efficiency.

First, with make_shared the code is simpler. Write for clarity and correctness first.

Second, using make_shared is more efficient. The shared_ptr implementation has to maintain housekeeping information in a control block shared by all shared_ptrs and weak_ptrs referring to a given object. In particular, that housekeeping information has to include not just one but two reference counts:

  • A “strong reference” count to track the number of shared_ptrs currently keeping the object alive. The shared object is destroyed (and possibly deallocated) when the last strong reference goes away.
  • A “weak reference” count to track the number of weak_ptrs currently observing the object. The shared housekeeping control block is destroyed and deallocated (and the shared object is deallocated if it was not already) when the last weak reference goes away.

If you allocate the object separately via a raw new expression, then pass it to a shared_ptr, the shared_ptr implementation has no alternative but to allocate the control block separately, as shown in Example 2(a) and Figure 2(a).

// Example 2(a): Separate allocation
auto sp1 = shared_ptr<widget>{ new widget{} };
auto sp2 = sp1;

Figure 2(a): Approximate memory layout for Example 2(a).

We’d like to avoid doing two separate allocations here. If you use make_shared to allocate the object and the control block all in one go, then the implementation can fold them together in a single allocation, as shown in Example 2(b) and Figure 2(b).

// Example 2(b): Single allocation
auto sp1 = make_shared<widget>();
auto sp2 = sp1;

Figure 2(b): Approximate memory layout for Example 2(b).

Note that combining the allocations has two major advantages:

  • It reduces allocation overhead, including memory fragmentation. First, the most obvious way it does this is by reducing the number of allocation requests, which are typically more expensive operations. This also helps reduce contention on allocators (some allocators don’t scale well). Second, using only one chunk of memory instead of two reduces the per-allocation overhead. Whenever you ask for a chunk of memory, the system must give you at least that many bytes, and often gives you a few more because of using fixed-size pools or tacking on housekeeping information per allocation. So by using a single chunk of memory, we tend to reduce the total extra overhead. Finally, we also naturally reduce the number of “dead” extra in-between gaps that cause fragmentation.
  • It improves locality. The reference counts are frequently used with the object, and for small objects are likely to be on the same cache line, which improves cache performance (as long as there isn’t some thread copying the smart pointer in a tight loop; don’t do that).

As always, when you can express more of what you’re trying to achieve as a single function call, you’re giving the system a better chance to figure out a way to do the job more efficiently. This is just as true when inserting 100 elements into a vector using a single range-insert call to v.insert( v.end(), first, last ) instead of 100 calls to v.insert( v.end(), value ) as it is when using a single call to make_shared instead of separate calls to new widget() and shared_ptr( widget* ).

There are two more advantages: Using make_shared avoids explicit new and avoids an exception safety issue. Both of these also apply to make_unique, so we’ll cover them under #3.

3. Why should you almost always use make_unique to create an object to be initially owned by a unique_ptr? Explain.

As with make_shared, there are two main cases where you can’t use make_unique to create an object that you know will be owned (at least initially) by a unique_ptr: if you need a custom deleter, or if you are adopting a raw pointer.

Otherwise, which is nearly always, prefer make_unique.

Guideline: Use make_unique to create an object that isn’t shared (at least not yet), unless you need a custom deleter or are adopting a raw pointer from elsewhere.

Besides symmetry with make_shared, make_unique offers at least two other advantages. First, you should prefer to use make_unique<T>() instead of the more-verbose unique_ptr<T>{ new T{} } because you should avoid explicit new in general:

Guideline: Don’t use explicit new, delete, and owning * pointers, except in rare cases encapsulated inside the implementation of a low-level data structure.

Second, it avoids some known exception safety issues with naked new. Here’s an example:

void sink( unique_ptr<widget>, unique_ptr<gadget> );

sink( unique_ptr<widget>{new widget{}},
      unique_ptr<gadget>{new gadget{}} ); // Q1: do you see the problem?

Briefly, if you allocate and construct the new widget first, then get an exception while allocating or constructing the new gadget, the widget is leaked. You might think: “Well, I could just change new widget{} to make_unique<widget>() and this problem would go away, right?” To wit:

sink( make_unique<widget>(),
      unique_ptr<gadget>{new gadget{}} ); // Q2: is this better?

The answer is no, because C++ leaves the order of evaluation of function arguments unspecified, so either the new widget or the new gadget could be performed first. If the new gadget is allocated and constructed first, and then make_unique<widget> throws, we have the same problem: the gadget is leaked.

But while just changing one of the arguments to use make_unique doesn’t close the hole, changing them both to make_unique really does completely eliminate the problem:

sink( make_unique<widget>(), make_unique<gadget>() );  // exception-safe

This exception safety issue is covered in more detail in GotW #56.

Guideline: To allocate an object, prefer to write make_unique by default, and write make_shared when you know the object’s lifetime is going to be managed by using shared_ptrs.

4. What’s the deal with auto_ptr?

auto_ptr is most charitably characterized as a valiant attempt to create a unique_ptr before C++ had move semantics. auto_ptr is now deprecated, and should not be used in new code.

If you have auto_ptr in an existing code base, when you get a chance try doing a global search-and-replace of auto_ptr to unique_ptr; the vast majority of uses will work the same, and it might expose (as a compile-time error) or fix (silently) a bug or two you didn’t know you had.

Acknowledgments

Thanks in particular to the following for their feedback to improve this article: celeborn2bealive, Andy Prowl, Chris Vine, Marek.

