GotW #3 Solution: Using the Standard Library (or, Temporaries Revisited)

Effective reuse is an important part of good software engineering. To demonstrate how much better off you can be by using standard library algorithms instead of handcrafting your own, let’s reconsider the previous question to demonstrate how many of the problems could have been avoided by simply reusing what’s already available in the standard library.

 

Problem

JG Question

1. What is the most widely used C++ library?

Guru Question

2. How many of the pitfalls in GotW #2 could have been avoided in the first place, if only the programmer had replaced the explicit iterator-based for loop with:

(a) a range-based for loop?

(b) a standard library algorithm call?

Demonstrate. (Note: As with GotW #2, don’t change the semantics of the function, even though they could be improved.)

To recap, here is the mostly-fixed function:

string find_addr( const list<employee>& emps, const string& name ) {
for( auto i = begin(emps); i != end(emps); ++i ) {
if( i->name() == name ) {
return i->addr;
}
}
return "";
}

 

Solution

1. What is the most widely used C++ library?

The C++ standard library, with its implementations on every platform.

 

2. (a) How many of the pitfalls in GotW #2 could have been avoided with a range-based for loop?

Astute readers of GotW #2 will have been champing at the bit to say: “Why aren’t you using a range-based for loop?” Indeed, why not? That would solve several of the temporaries, never mind be easier to write.

Compare the original unimproved explicit iterator loop:

    for( auto i = begin(emps); i != end(emps); i++ ) {
if( *i == name ) {
return i->addr;
}
}

with the range-based for loop (bonus points if you remembered to write the const auto&):

    for( const auto& e : emps ) {
if( e == name ) {
return e.addr;
}
}

The expressions e == name and return e.addr; are unchanged in terms of their possible or actual temporaries. But the questions in the naked loop code about whether or not the = causes a temporary (recall: it doesn’t), whether or not end() recalculation matters and should be hoisted (recall: probably not, but maybe), and whether or not i++ should be rewritten ++i (recall: it should) all simply don’t arise in the range-for code. Such is the power of clear code, and using a higher level of abstraction.

A key advantage is that using the range-based for loop has increased our level of abstraction, the information density in our code. Consider: What can you say about the following two pieces of code without reading what comes next?

    for( auto i = begin(emps); i != end(emps); i++ ) {   // A

for( const auto& e : emps ) { // B

At first it might seem that lines A and B convey the same information, but they don’t. When you see A, all you know is that there’s a loop of some sort that uses an iterator over emps. Granted, we’re so used to A that our eye’s peripheral vision tends to “autocomplete” it in our heads into “a loop that visits the elements of emps in order” and our autocomplete is often correct—except when it isn’t: was that a ++, or a s+= 2 in a strided loop? is the index modified inside the body? Our peripheral vision might be wrong.

On the other hand, B conveys more information to the reader. When you see B, you know for certain without inspecting the body of the loop that it is a loop that visits the element of emps in order. What’s more, you’ve simplified the loop control because there’s no need for an iterator indirection. Both of these are raising the level of abstraction of our code, and that’s a good thing.

Note that, as discussed in GotW #2, the naked for loop didn’t naturally allow consolidating to a single return statement without resorting to making the code more complex by adding an additional variable and performing extra computation (a default construction followed by an assignment, instead of just a construction). That’s still true of the range-based for loop form, because it still has the two return statements in different scopes.

2. (b) … with a standard library algorithm call?

With no other changes, simply using the standard find algorithm could do everything the range-based for loop did to avoid needless temporaries (and questions about them):

// Better (focusing on internals)
//
string find_addr( /*...*/ ) {
const auto i = find( begin(emps), end(emps), name ); // TFTFY
return i != end(emps) ? i->addr : "";
}

This naturally eliminates the same temporaries as the range-for version, and it further increases our level of abstraction. As with the range-based for loop, we can see at a glance and for certain that the loop will visit the elements of emps in order, but on top of that we also know we’re trying to find something and will get back an iterator to the first matching element if one exists. We do still have an iterator indirection, but only a single-use iterator object and no iterator arithmetic as in the original naked iterator for loop.

Further, we have eliminated a loop nested scope entirely and flattened out the function to a single scope which can simplify this calling function in ways even the range-for couldn’t. To demonstrate still more just how fundamental this point is, note that what else the flattening out of the body buys us: Now, because the return statements are in the same scope (possible only because we eliminated the loop scope), we have the option of naturally combining them. You could still write if( i != end(emps) ) return i->addr; else return “”; here, on one or two or four lines, but there’s no need to. To be clear, the point here is not that reducing return statements should be a goal in itself—it shouldn’t be, and “single exit” thinking has always been flawed as we already saw in GotW #2. Rather, the point is that using an algorithm often simplifies our code more than an explicit loop, even a range-for loop, can do—not only directly by removing extra indirections and extra variables and a loop nested scope, but often also by permitting additional simplifications in nearby code.

The above code might still cause a temporary when comparing an employee with a string, and we can eliminate even that temporary if we go one small step further and use find_if with a custom comparison that compares e.name() == name to avoid a possible conversion, assuming something like a suitable employee::name() is available as we did in GotW #2. Combining this with the other fixes to pass parameters by reference, we get:

// Better still (complete)
//
string find_addr( const list<employee>& emps, const string& name ) {
const auto i = find_if( begin(emps), end(emps),
[&](const auto& e) { return e.name() == name; } );
return i != end(emps) ? i->addr : "";
}

 

Summary

Prefer algorithm calls over explicit loops, when you have or can write a suitable algorithm that does what you want. They raise the level of abstraction and the clarity of our code. Scott Meyers’ advice in Effective STL is still true, and more applicable than even now that lambdas make algorithms much more usable than before:

Guideline: Prefer algorithm calls to explicit loops. Algorithm calls are often clearer and reduce complexity. If no suitable algorithm exists, why not write it? You’ll use it again.

Prefer reusing existing library code to handcrafting your own. The more widely used the library, the more likely it is to come well-designed, pre-debugged, and pre-optimized for many common requirements. And what library is more widely used than the standard library? In your C++ program, your standard library implementation is the most widely used library code you’re likely to use. This helps you both in the library’s design and its implementation: It’s full of code that’s intended to be used and reused, and to that end a lot of thought and care has gone into the design of its features, including its standard algorithms like find and sort. Implementers have also spent hours sweating over efficiency details, and usability details, and all sorts of other considerations so that you don’t have to—including performing optimizations you should almost never resort to in application-level code, such as using nonportable OS- and CPU-target specific optimizations.

So, always prefer to reuse code, especially algorithms and especially the standard library, and escape the trap of “I’ll-write-my-own-just-’cause-I-can.”

Guideline: Reuse code—especially standard library code—instead of handcrafting your own. It’s faster, easier, and safer.

Acknowledgments

Thanks in particular to the following for their feedback to improve this article: Olaf ven der Spek, Sam Kramer.

32 thoughts on “GotW #3 Solution: Using the Standard Library (or, Temporaries Revisited)

  1. Hello.
    >If you mean auto as a lambda parameter type

    yes exactly! Thanks for the answer, I have to refresh my knowledge of auto now :)

  2. Are you sure you can use auto in lambda like this?
    I can not compile the code and I’m pretty sure auto does not work here.

  3. Sorry for the digression, but from that concepts@isocpp thread it seems a big part of the objections is the ambiguity whereby [](e) { … } could mean a function taking an unnamed parameter of type e. Which looks like a fair point.

    I was wondering whether it had been considered to have a specific exception for ref and const-ref parameters, so that you could write

    [](&e) { use(e) }

    but if you wanted pass-by-value, or if you wanted something more complicated like a parameter of type function-taking-auto-returning-bool, you’d have to use the auto placeholder.

    I’d have thought that reference parameters are more useful than by-value parameters in a generic context, so this doesn’t seem to lose much.

  4. Herb, if Bjarne’s expectations for C++14 (https://www.informit.com/articles/article.aspx?p=2080042) are fulfilled wouldn’t the STL provide an overload like the following (if it would work):

    template<Container Cont, Predicate Pred>
    requires Input_iterator<Iterator<Cont>>()
    Iterator<Cont> find_if(Cont& c, Pred p);
    

    If then the fact could be taken into account that a Predicate is not supposed to change its argument, the type of e could default to const auto&. Then one could write:

    const auto i = find_if(emps, [&](e) { return e.name() == name; });
    
  5. With lambdas now available, I use the built in functions as much as I can. I often find I cannot because relationships between elements in a list are often important (remember the index to something, or whatever).

    But the code is still so frustrating to type, and ugly. We have ranges now, wouldn’t it be nice to add it into the standard library for all things that take ranges? We also have auto, it would be nice if we didn’t have to keep typing the names of types that we already know. Consider:

    const auto i = find_if (e:emps, [&](auto) {return e.name() == name});

    I’m sure that syntax would break something in C++, but this is along the lines of what I want in a modern language. Herb said something in a talk about C++11 not being your dad’s C++ (or a metaphor to that effect). Well, it still is, albeit better. I don’t want to type begin(x), end(x) anymore. I don’t want to type types for a lambda when they can be figured out. More importantly, I don’t want to parse through tons of boilerplate code that somebody else typed (or that I typed 6 months ago) to try to figure out what is being done.I find myself moving back to using simple for loops, I find them clearer to read and a heck of a lot easier to type. I don’t want to have to keep in my mental stack the iterator that the find_if introduces. I don’t want to have to parse the ? operator code to see which value is returned in each case. Range based for is readable, the last solution is just a mess.

    Actually, i would love to write something like:

    string find_addr (emps,name) {
       return e.addr if e.name == name in e:emps;
       return "";
    }
    

    and just have the compiler figure out things like name can be passed by const reference, that emps needs to be something that supports ranges and has addr and name fields, and that the auto is implied. That’s true information density.

    C++ has become so unreadable. ‘const std::shared_ptr<std::vector>&’ is a typical signature for a single parameter in my code. (I would love a GotW that focused on using managed pointers in a readable way, btw).

    Sorry if this comes off as a rant, but I think conciseness and readability is far more important than removing any possible creation of a temporary variable, and more important than following a (generally good) rule like use the standard library. Try reading code where a default parameter like less is used, and search through all the headers, and headers included by headers, to figure out what version of less is being used by the standard library call for a user defined struct. A for loop makes it all explicit, often with far fewer characters. Explicit is good. Readability is good. I would not want my engineers writing code like Herb’s last example unless this function was so critical that efficiency came before anything else. It’s horrible, ugly code IMO.

  6. Herb, can you please elaborate on the underneath work that is done in ranged-based loops? As getting higher in the abstraction we also loose contact with the ground. And though we KNOW we are on the safe side and performance is taken into consideration, still, can you please explain more on what the standard says to implementers of range based loops?

  7. Since employee supports find(…, name), instead of e.name() == name the lambda could contain e == name. If name() returns a reference to const string, then I suspect it won’t matter. If name() returns a string, then I suspect we are paying price greater than e == name (on each iteration). Writing the lambda to avoid the potential temporaries yeidlds:

    string find_addr( const list<employee>& emps, const string& name ) {
        const auto i = find_if( begin(emps), end(emps),
                          [&](const auto& e) { return e == name; } );
        return i != end(emps) ? i->addr : "";
    }
    

    On the compiler that I checked, if name() returns a const string&, then e == name optimizes to the same code as e.name() == name. No penalty here. However, if name() returns a string, then the impact is significant (the complexity prevented inlining and temporaries were created).

  8. Dave, I asked the strlen question in GoTW #1, and Herb assures me that it’s not there is a “good” string library. It checks for the first character being 0 before it calls strlen.

    While equality may feel simpler than inequality, I generally prefer to check for success rather than failure when the two options are basically equivalent.

  9. Am I the only person who would prefer:

    return i == end(emps) ? "" : i->addr;
    

    Equality feels slightly simpler than inequality. I’d also prefer to avoid converting the c-string to a std::string:

    return i == end(emps) ? string() : i->addr;
    

    The conversion probably won’t allocate memory, but it’s still doing unnecessary work – possibly including a call to strlen(), for example.

  10. @Olaf: OK, clarified (in GotW #2) along the lines that “I’m assuming something like .name() is available and .name() == name has the same semantics.”

    @Sam: Good point, noted.

    @Adrian: A program that does that has undefined behavior and could achieve the same effect by dereferencing a random int cast to a pointer. Within defined behavior, the range-for can iterate only one way, from the front in order visiting every element exactly once, and the only possibly variation in behavior is that it could stop early if there’s a break or return inside vs. visiting all elements.

  11. “When you see B [the range-based for loop], you know for certain without inspecting the body of the loop that it is a loop that visits the element of emps in order.” Certainly there are some things the body _might_ do that would contradict that. For example, if the body modifies the container in a way that would invalidate outstanding iterators, you no longer can be sure that the loop will proceed normally. Consider:

    for (const auto & e : emps) {
      if (e.name == "Herb") {
        emps.push_front(employee("Bjarne"));
      }
    }
    

    Obviously, this isn’t an advisable thing to do in the loop, but the compiler isn’t likely to notice. Thus a range-based for loop might give you more confidence that it visits each element in order, but, unless emps is const, you cannot be absolutely certain without looking inside the body and without knowing the type of container and that container type’s iterator invalidation rules. Right?

  12. The astute reader will have noticed that the guidance from this tip is derived directly from the book Effective STL, Item 43: Prefer algorithm calls to hand-written loops. It’s good to know that best practices from C++98 are still largely relevant in C++11.

  13. This extra branch;

    return i != end(emps) ? i->addr : "";

    Find just returns an iterator that you still have to check. The range-for version just returns.

  14. One important thing is that abstraction does not always equate to better or more readable code. The use of more familiar constructs is often a better overall solution, errors are more likely to be seen by the casual reader etc. Although the examples are fairly trivial some are certainly more comfortable and easier to read and Jay gets it about right with his comment. The important thing to remember is that code is just a means to an end! If we are thinking at this level of trivia is there something else wrong?

  15. @Herb Sure. IMO you disregarded this note: “Note: As with GotW #2, don’t change the semantics of the function, even though they could be improved.”
    Whether employee == string is valid and has the same semantics as employee::name() == string isn’t given, so you shouldn’t assume it is.

  16. I dont understand what does an extra branch mean in the find_if version. Does everyone refer to a function call? I guess algorithms like find_if would be most likely be inlined as it takes template arguments.

  17. Yes, in this specific example the fact that we’re returning a string by-value dwarfs the extra branch after the find(), but I am more interested in the general case, where we do call some function lots of times, and its return is very light (say it returns a pointer to the found element or 0).

  18. @Jay imao both of your code blocks are uglier than Herbs find version because you have std function that is designed to (tadaaaam) find element in a range. Period. :)
    Beside I would guesstimate all this talk about perf is unrealistic… I mean if you really call this method like a mad man you probably want to go to the dark side and return a reference, also kill std::list with fire . :)

  19. Two issues. One is that I think Barry’s point is relevant even though, in this case, an extra branch is swamped out by dynamic memory allocation. Similarly to using prefix increment, doing things in a consistent manner every time that yields better performance in a subset of cases is a Good Thing if it doesn’t make readability worse (though sometimes being consistent makes for better readability even if it’s slightly more complex).

    That brings us to the second issue, which is that I can’t in good conscience say that

    string find_addr( const list<employee>& emps, const string& name ) {
        const auto i = find_if( begin(emps), end(emps),
                          [&](const auto& e) { return e.name() == name; } );
        return i != end(emps) ? i->addr : "";
    }
    

    is more readable than

    string find_addr( const list<employee>& emps, const string& name ) {
        for (const auto& e : emps) {
            if (e.name() == name) {
                return e.addr;
            }
        }
        return string(); 
    }
    

    Granted it’s fewer lines, but I can reason very intuitively about the bottom. The difference is small, so if the find_if case were faster then I’d use it. But an extra branch /and/ not as readable?

  20. find_if is worse for readability than find because beside being linger find_if says im checking for predicate(aka name.size()>123) not value… btw I wonder if there is any diff in IRL perf between find and find_if

  21. @Barry find_addr returns a string (that needs to be constructed and likely requires dynamic memory allocation), so the extra branch doesn’t matter even if find_if isn’t inlined.

  22. When using find or find_if, doesn’t that incur an extra branch over the normal range-for loop implementation? I’m not sure the algorithm solution is necessarily cleaner enough to merit the hit.

Comments are closed.