GotW #2: Temporary Objects

Unnecessary and/or temporary objects are frequent culprits that can throw all your hard work — and your program’s performance — right out the window. How can you spot them and avoid them?

 

Problem

JG Question

1. What is a temporary object?

Guru Question

2. You are doing a code review. A programmer has written the following function, which uses unnecessary temporary objects in at least three places. How many can you identify, and how should the programmer fix them?

string find_addr( list<employee> emps, string name ) {
for( auto i = begin(emps); i != end(emps); i++ ) {
if( *i == name ) {
return i->addr;
}
}
return "";
}

Do not change the operational semantics of this function, even though they could be improved.

56 thoughts on “GotW #2: Temporary Objects

  1. @Robert,
    No, we don’t want to write code that depends on the caller knowing that they have to store the data differently than how it is returned. That way lies madness.

    There is no reason not to return by value. We have move semantics, we have compiler optimizations if move semantics is not enabled. Anything other than return by value is unsafe.

  2. The return value should never be const std::string& as so many has suggested. In C++11 we have move constructors. The data will be “moved” so there is no copying. All data should as of C++11 be returned by value, as long as you have implemented a move constructor. Also the parameter should be const reference.

  3. Tested with VS2012CTP:

    string find_addr( const list<employee> & emps, const string & name ) {
        for ( auto && i : emps ) {
            if ( i.name == name ) {
                return {i.addr};
            }
        }
        return {""};
    }
    
  4. Keeping original semantic and behavior intact :

    string find_addr( const list<employee>& emps, const string& name ) {
        for( auto i = begin(emps), e = end(emps); i != e; ++i )
          if( i->name == name )
            return i->addr;
        return string();
    }
  5. Taking the parameters by const reference and using boost::find_if solves all the “sought” problems:

    auto find_addr(const list<employee>& emps, const string& name) {
        const auto pos = find_if(emps, [&](const auto& e){ return e.name == name; });
        return pos != end(emps) ? pos->addr : string();
    }
    

    In code review I would probably complain that with the find_addr function you get an empty string if:
    – the employee does not exist, _and_
    – the employee exists but its adress has not been set yet (e.g. addr was default constructed to an empty string).

    If the function is required I’d recommend using boost::optional for the return type. However I would _strongly_ recommend that the best way to improve this function is not to write it in the first place. Is just a one-liner call to find_if. And by using find_if directly the semantics of not found are crystal clear.

  6. Since const is nowhere to be found, it’s possible that the implementation of operator==(Employee,string) takes non-const refs as args. In that case, there isn’t much room for improvement.

    Assuming operator== is better than the sample code, then I see 5 temporaries: the two args, the return val, the value returned by end() and the expression i++.

    I’d fix the args and the 2nd return statement, leaving the rest as-is.

  7. Clarification: returning a std::string() may create the object in-place if the function is called in the declaration/initialization – i.e. auto res = find_string(my_strings, lookup_string);

  8. Please note: If you return string(), you are creating a temporary object.
    returning “” will invoke either the const char* assignment operator or the [CODE]const char*[CODE] constructor – i.e. doesn’t create any temporary object. Also, returning something different than the original “” might seem innocuous, but you should never make assumptions – let’s say in the future an uninitialized std::string is not the same as empty string/zero-length string, or if you decide to replace std::string for an [CODE]std::optional[/CODE], or (very common) decide to use a better string manipulation class that treats empty strings differently…

    If you use the old [b]for[/b], you need a temporary iterator. Use the C++ range [b]for[/b] with a reference, i.e. for( auto& refval : emps) instead.

    Although the original code suggests there is a [CODE]bool employee::operator==(const std::string&)[/CODE] prefer a more explicit code: i.e. [CODE]emp.name == lookup_name[/CODE] (Isn’t that much more obvious?) comparing a string to a user defined object might be confusing – i.e. lead the reviewer to make wrong assumptions or check the implementation of the object’s class to try to understand what is really doing.

  9. Here’s my take.

    string find_addr( list emps, string name ) { // pass by value - 2 temporaries
    for( auto i = begin(emps); i != end(emps); i++ ) { // auto instead of auto& (1 temp), end(emps) possibly emps.size() temps, in practice probably one; i++ possibly emps.size() temps, probably none.
    if( *i == name ) { // Here we don't know enough details. (maybe there's operator==(const employee& , const string&)). In any case, if temporary is created, it can't be avoided

    return i->addr; // same here, if addr is not string (or const char*) temporary cannot be avoided.
    }
    }
    return "";
    }

    My solution

    string find_addr( const list& emps, const string& name ) {
    for( auto& emp: emps) {
    if( emp == name ) {
    return emp.addr;
    }
    }
    return string{};
    }

  10. string find_addr(const list<employee>& emps, const string& name ) {
    	auto it = find_if(begin(emps), end(emps), [&](const employee& emp) {
    		return emp.name == name;
    	});
    	return it != end(emps) ? it->name : string();
    }
    
  11. @Roger: True. The solution for that is to *store* the result as a value, what is not-so-tightly coupled with the return type being value or ref.

    My problem with returning a value is that you have to pay for the copy even if you don’t need it. If you return reference, you still have the choice to save the result in a (non-ref) string, avoiding that error you told, but if you know you are not changing the container, you have a faster code. Yes, you can shoot yourself in the foot if you are reckless. But the whole function almost the same as a find_if and just using it->addr after (tbh, I would drop the whole function because of this!). And if you are using iterators, you are aware they can/will be invalidated if you change the container, aren’t you.

    Question is what do you prefer: Want best performance+possibility of error for reckless users, or absolute safety+performance penalty (necessary evil note: why don’t you write Java or C# or .. then?)

  12. @Herb:Sutter: Have you considered creating a Google+ Community page? (or facebook if you don’t the spam) :)

    (Probably your boss wouldn’t like that, right?)

  13. Thanks for the tip David! :) (Unfortunatelly I don’e find a way to edit my last post, so…)

    // assuming Herb's intention was:
    // employee{std::string name; std::string address};
    
    //....
    string find_addr( const list<employee>& emps, const string& name ) {
       for( const auto& emp : emps) {
           if( emp.name == name ) {
               return emp.addr;
           }
       }
       return “”;
    }
    
  14. Assume struct employee{std::string name; std::string address};

    string find_addr( const list& emps, const string& name ) {
    for( const auto& emp : emps) {
    if( emp.name == name ) {
    return emp.addr;
    }
    }
    return “”;
    }

  15. string& find_addr( const list& emps, const string& name ) {
    for( auto i = begin(emps); i != end(emps); i++ ) {
    if( *i == name ) {
    return i->addr;
    }
    }
    return std::string{ “” };
    }
    Changed:
    * : The return type to a reference
    * : The parameters to be a const reference
    * : The last return statement to return a constant.

  16. @Robert, the concern about returning a reference is not limited to the case where the list is a temporary. The concern is that the caller can store and attempt to use the returned value far past the lifetime of the object:

    const string& addr = find_addr (emp, name);
    …. sometime later….
    cout << addr; //BOOM if emp has been deleted.

  17. @brandon

    end(emps) is valid C++11 code, and is the preferred idiom. Why? Because it is a template function which you can specialize for any type that does not provide an end() member function.

    I personally see no reason to create a local variable to hold end(emps). Any compiler will optimize that call away anyways, and getting ‘clever’ like this just adds more code, makes things less readable, and makes it more likely that you confuse the compiler’s optimization stage. And, of course the find_if solutions also optimize the multiple calss.

  18. 1.) A temporary is a variable that is implicitly created during execution of a function. one example this happens in

    is when returning by value.

    2.) The function parameters should be passed by const reference to eliminate two potentially large variables (although they are not “temporaries” strictly speaking ) to preventing deep copying of the list or string.

    end(emps) (I presume you meant emps.end() ) returns an iterator object into a temporary for comparison with i. It could be put into a const named variable instead of building that temporary each iteration.

    return “” will possibly construct a string object (implicitly) and then assign to the return temporary, unless the compiler optimizes. This can be explicitly avoided by returning string() ( or string(“”) if you prefer ). This utilizes the “return value optimization”

    string find_addr( const list &emps, const string &name ) {
    const auto the_end = emps.end();
    for( auto i = emps.begin(); i != the_end; i++ ) {
    if( *i == name ) {
    return i->addr;
    }
    }
    return string(“”);
    }

  19. //	first temporary list<employee> emps the biggest one	
    //	second temporary string name --	| -----------
    //									\/			\/
    string find_addr( list<employee> emps, string name ) {
    	for( auto i = begin(emps); i != end(emps); i++ ) {
    		// third temporary employee 
    		// if we assume then employee::employee(const string &name) exist
    		// and we only have bool employee::operator==(const employee& other)
    		//			\/
    		if( *i == name ) {
    			return i->addr;
    		}
    	}
    	return "";
    }
    

    Now the question is if bool employee::operator==(const employee& other) compares every employee member and not only name then this code will not find anything at all.

    @Chris: you are returning address of local variable string() !
    I would change it like this but think that range based for loop is still easier to read.

    string find_addr(const list<employee>& emps, const string& name)
    {
    	const auto found = find(cbegin(emps), cend(emps), name); //temporal employee object !
    	if(found == cend(emps)) return string();
    	else return found->addr;
    }
    

    So I would still prefer to use this one.
    Assuming name is not protected.

    string find_addr(const list<employee> &emps, const string &name) {
    	for(const auto& emp : emps) {
    		if( emp.name == name ) { return emp.addr; }
    	}
    	return string();
    }
    
  20. I would use `std::find_if` on the grounds that it’s more immediately obvious what you’re trying to do with your loop. I agree with all the folks who object to returning a `const string &`; it’s asking for trouble and is pretty pointless, performance-wise, in the presence of RVO.

    string find_addr(const list & emps, const string & name)
    {
    auto has_same_name = [&name] (const employee & e) {
    return e.name == name;
    };

    auto found = find_if(begin(emps), end(emps), has_same_name);

    return found != end(emps) ? found->addr : string{};
    }

  21. Well, it might violate “do not change the operational semantics of this function” rule, but… how about this:

    const string& find_addr(const list<employee>& emps, const string& name)
    {
        const auto found = find(cbegin(emps), cend(emps), name);
        return (found == cend(emps)) ? string() : found->addr;
    }
    
  22. @Roger I am not sure exactly what this explains. Isn’t the case we have here totally different as it is about function arguments, not function return type? Or maybe I was not clear that I was talking about the arguments?

  23. Still no solution with std::find_if ? Come on people! That’s the first thing that fails when I do the review, even before temporaries.

    Otherwise – if employee passed code review, this code didn’t even compile :) : What does operator==(employee, string) mean? Or, why isn’t employee::employee(string) explicit? This is a problem borderline to changing operational semantics, but it actually creates a temporary employee object, so..

    namespace {
            static string empty_string;
    };
    const string& find_addr( const list<employee>& emps, const string& name ) {
        auto end = end(emps);
        auto it = std::find_if(begin(emps), end, [&name](const employee& emp) {
            return name == emp.name;
        });
        if (it == end)
        {
            return empty_string;
        }
        return it->addr;
    }
    

    There was some concerns about returning a reference instead of a value (because we have no control over the list’s lifetime here – well, if the first parameter is a temporary, it will still exist when we return: [quote]A temporary bound to a reference parameter in a function call (5.2.2) persists until the completion of the full expression containing the call.[/quote]

  24. Just putting my upper comment to a proper code-blocks (I just didn’t know how to do it).

    string find_addr(const list& emps, const string& name ) {
        for( const auto& emp : emps) 
            if( emp.name == name ) return emp.addr;
        static const string empty;
        return empty;
    }
    

    return {}; seems much better to me now, but inserting it would not be “just putting previous code inside code-blocks”.

  25. In the loop it should have been
    if( i == name ) instead of if ( *i == name )
    and i.addr instead of i->addr

  26. 2) Easy question first:

    “Do not change the operational semantics of this function” -> I’ll try but I am a bit drunk (2am over here, but Ive never trusted myself to answer a guru question here so thanks alcohol, ducks!).

    const string& find_addr( const list& emps, const string& name ) { // #1 #3
    for( const auto&& i : emps ) { // #2
    if( *i == name ) {
    return i->addr; // #3
    }
    }
    return “”;
    }

    #1
    – taking a list by value is a _bad_ idea: it will be copied if its not a temporary, and list of employees sounds like something potentially pretty big.
    – taking the name by value it’s ok I guess (I wouldn’t kill that on code review). If it’s a temporary it will be moved (so no problem). If it’s not a temporary it will be copied. Anyways it’s just a string, if the function is not a hotspot it’ll be ok. I would use const tho, it expresses intent. The lvalue reference might save a copy which would help if the function is a hotspot. So let’s not pessimize and make it const lvalue reference.

    #2 end() was being called at every iteration, and the “i” string was being copied at every iteration.

    #3 Const lvalue reference avoid a copy in the return type: addr is a member of the list, its not a temporary, it will trigger a copy. “” was fine I guess. If it constructs a string it’ll just trigger a move or some fancy compiler optimization later.

    So that makes hopefully three of them, if only I could count the amount of beers I’ve had…

    1) Now the hard question What is a temporary object?
    – My answer: you don’t want to know. As a rule of thumb:
    – if you cannot refer to an object by a name, then it’s a temporary.
    – If you can refer to an object by it’s name, and the name was declared with const& or T&& then it might be a temporary.
    – otherwise, it _probably_ isn’t a temporary (i’m actually not too sure about the probably).

    If you want the truth, I guess the real truth is that no-one should need to know about glvalues, rvalues, lvalues, xvalues, prvalues… I’d can recommend http://stackoverflow.com/questions/3106110/what-are-move-semantics There is a nice answer by FredOverflow reviewed by Stephan that is pretty detailed and one that gives an overview. Read the overview, fight through the detailed one. Then watch Scott’s videos about universal references. Then read the detailed one again. And if you ever feel lost with the terminology here is a nice reference: http://stackoverflow.com/questions/3601602/what-are-rvalues-lvalues-xvalues-glvalues-and-prvalues and http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3055.pdf

  27. string find_addr(const list<employee> &emps,const string &name ) {
        for(const auto& emp : emps) {
            if( emp == name ) { return emp.addr; } //copy of addr will be made.
        }
        return {}; // use copy-list-initialization, resolves like direct-initialization.
    }
    
  28. I’m with Tom, just an slight detail:

    string find_addr(const list &emps,const string &name ) {
    for(const auto& emp : emps) {
    if( emp == name ) { return emp.addr; }
    }
    return {}; // use copy-list-initialization, resolves like direct-initialization.
    }

  29. Here is the proper way to use range based for loop.
    emp need to be reference to avoid unnecessary copy.

    string find_addr(const list<employee> &emps,const string &name ) {
    	for(const auto& emp : emps) {
    		if( emp == name ) { return emp.addr; } //copy of addr will be made.
    	}
    	return string(); //default empty constructor will be called.
    }
    

    Please use “[“code”]” “[“/code”]” tags.

  30. Wow, seems my code was broken (probably due to the <).
    here again:
    const string& find_addr(const list& emps, const string& name )
    {
    for(int i = 0; emps.size() > i; ++i) {
    if( emps[i] == name ) { //here no copy happens
    return i->addr;
    }
    }
    return “”;
    }

  31. Something few people seems to note is the copy who happens when i is dereferenced :

    string find_addr( list emps, string name ) {
    for( auto i = begin(emps); i != end(emps); i++ ) {
    if( *i == name ) { //here *i will create a copy of the employe contained in the vector.
    return i->addr;
    }
    }
    return “”;
    }

    Using a const iterator could fix this, but I suspect that this depend of the implementation of std::vector.
    Another more reliable solution is to use the [] operator. Since it returns a reference, we are sure that no extra copies will be done.
    So, the code becomes:

    const string& find_addr(const list& emps, const string& name ) {
    for( int i = 0; i addr;
    }
    }
    return “”;
    }

    Also Note that I do return a reference because I consider that it is up to the caller to know about the life time of the parameters he gives me and the return value, so it’s up to him to know if a copy of the returned value is necessary or not.

  32. How about this?

    http://rise4fun.com/Vcpp/1F

    string find_addr(const list<employee> &emps,const string &name ) {
    	for( auto i = emps.cbegin(); i != emps.cend(); ++i ) {
    		if( *i == name ) { return i->addr; }
    	}
    	return string();
    }
    

    The funny thing is that this code call string destructor for every iteration at least if using VS2012CTP.

    string find_addr(const list<employee> &emps,const string &name ) {
    	for(auto emp : emps) {
    		if( emp == name ) { return emp.addr; }
    	}
    	return string();
    }
    
  33. JG1:
    Used by older ANSI C89 zealots to prove that C++ language is bad and only C (better asm) never does tricks behind the scenes.
    2:
    The list and the string should be passed with const reference. ANSI C89 zealots (when you force them to use STL) will use a pointer here because references are evil.

    The loop iteration could be better. prefix increment would help. Making iterator const with cbegin will help. Range based for loop is the recommended solution.

    It is possible (see GotW#1) to use
    return {};
    to return default constructed string. This avoids running constructor that accepts const char*. This will aggravate ANSI C89 zealots seriously enough to make them never touch your C++14 project again.

  34. string find_addr(const list& emps, const string& name ) {
    for( auto i = begin(emps); i != end(emps); ++i ) {
    if( *i == name )
    {
    return i->addr;
    }
    }
    return “”;
    }

    My vote also goes to passing const refs (not returning one), and simply turning the post-increment into a pre-increment (although the impact there should really be minimal).

  35. @Klaim,

    consider:
    const string& address = find_addr (employees, “joe”);
    fire_employee (employees, “joe”); // deletes joe from employees
    print_final_check (“joe”, address); // KABOOM

    The problem is that what ‘address’ is referring no longer exists after fire_employee is called, so it is no longer valid to refer to it.

    Besides, all decent compilers implement return copy optimization (http://en.wikipedia.org/wiki/Return_value_optimization), so in practice no temporary will be created.

  36. Why is nobody suggesting using a for-range-loop? AFAIK it’s the exact same semantic but will make sure to avoid unnecessary copies.

    Also, I am not sure about the const refs because in multi-threaded context it might be voluntary that the semantic of the function is to copy the list before working on it. I suppose Herb would answer that const suggests that the access is safe, but still I’m not sure it doesn’t change the semantic.

  37. OK, one that REALLY keeps the operational semantics and DOES NOT produce any new temporary objects)))

    string find_addr(const list& emps, const string& name ) {
    for( const auto& emp : emps) {
    if( emp.name == name ) {
    return emp.addr;
    }
    }
    static const string empty; // I usually have one of these already somewhere in util.hpp
    return empty;
    }

    I would also suggest a version based on find_if and for_each, but I’m really lazy.

  38. OK, one that REALLY keeps the operational semantics

    string find_addr(const list& emps, const string& name ) {
    for( auto emp : emps) {
    if( emp.name == name ) {
    return emp.addr;
    }
    }
    static const string empty; // I usually have one of these already somewhere in util.hpp
    return empty;
    }

    I would also suggest a version based on find_if and for_each, but I’m really lazy.

  39. Guys, please … dont return by reference from this function. It DOES change operational semantics of this function and _will_ lead to UB.

  40. 1) i++
    2) employee == string (see why below)
    3) return value can be written {“”}
    4) const reference to the arguments
    5) why not just use for(auto& employee : emps) instead of for(begin, !=end, ++) to avoid temp iterators

    see code example at: http://ideone.com/d54ARz

    Explanation of operator ==
    —————————————————————————————
    This depends a little on employee: if it for example is defined as

    struct employee {
    string addr;
    // no explicit keyword makes the implicit conversion possible
    employee(std::string str) : addr(str) {}
    bool operator==(const employee& other) const { return (addr.compare(other.addr) == 0); }
    };

    In this case the == operator will trigger an implicit conversion to a temporary employee object

  41. Returning const string& seems like a very bad idea. We have no guarantee on the lifetime of the emps data that is passed in. Returning a reference could end up with the caller trying to use a reference to data that has been deleted.

  42. I’m not sure that returning a const ref to a static string is a good idea: it removes the temporary value that is returned from the function but also introduce a test in the function to ensure that the string is initialized only the first time the function is called, but I maybe wrong…
    I would write:

    string find_addr(const list& emps, const string& name) {
    for(auto emp: emps) {
    if(emp.name == name) {
    return emp.addr;
    }
    }
    return string();
    }

  43. Caching the end() iterator or returning a static string seem like things that are of dubious value. Any decent compiler will recognize that end() cannot change since no non-const functions on emps are called within the loop, and emps is not volatile.
    Similarly returning a string by value, even a C++98 compiler will optimize out a copy by RVO

    The only semantically guaranteed improvements are by using const& and preincrement.

    If any compiler makes a foreach loop faster than a regular loop, the compiler writers need to change their line of work.

  44. struct employee {
    	string name;
    	string addr;
    };
    const string& find_addr(const list<employee>& emps, const string& name ) {
    	auto emps_end   = emps.cend();
    	for( auto i = emps.cbegin(); i != emps_end; ++i ) {
    		if( i->name == name ) { return i->addr; }
    	}
    	static const string s_empty_string;
    	return s_empty_string;
    }
    
  45. ok another try:

    struct employee {
    string name;
    string addr;
    };

    const string& find_addr(const list& emps, const string& name ) {
    auto emps_end = cend(emps);
    for( auto i = emps.cbegin(); i != emps_end; ++i ) {
    if( i->name == name ) { return i->addr; }
    }
    static const string s_empty_string;
    return s_empty_string;
    }

  46. const string& find_addr(const list& emps, cont string& name ) {
    for( auto i = cbegin(emps); i != cend(emps); i++ ) {
    if( *i == name ) { return i->addr; }
    }
    static const string s_empty_string;
    return s_empty_string;
    }

  47. 1. A temporary object is an object that is created implicitly to be destroyed immerdiatelly after the reason for its creation has faded. The examples are: tos created by implicit conversions, by receiving a function argument by value, by returning object from function by value (usually thrown out by the compiler). They can also be created explicitly by the programmer by creating an unnamed object.
    2.
    1) accepting emps by value;
    2) accepting name by value;
    3) i++ returns and immediatelly discards a temporary;
    4) *i == name creates a temporary employee from string (or maybe a string from employee?);
    5) employee::addr is not a string, returning it results in an additional temp object;
    6) a temporary object created by returning the char[1] implicitly converted to string;
    7) the very string that is returned by value creates a temp object at the place the func is called;
    8) calling begin() and end() creates temp iterators, but they are designed for this so the effect is miniscure;

    If the contract of the function can accept returning reference (it almost always can as it is immediatelly copied to a variable):
    const string& find_addr(const list& emps, const string& name ) {
    for( auto emp : emps) {
    if( emp.name == name ) {
    return emp.addr;
    }
    }
    static const string empty; // I usually have one of these already somewhere in util.hpp
    return empty;
    }

    or, for people allergic to “for each” loop:

    const string& find_addr(const list& emps, const string& name ) {
    auto eend = emps.end();
    auto ebegin = emps.begin();
    for( auto i =ebegin ; i !=eend ;++i ) {
    if( i->name == name ) {
    return i->addr;
    }
    }
    static const string empty; // I usually have one of these already somewhere in util.hpp
    return empty;
    }

  48. string find_addr( list emps, string name ) {
    for( auto i = begin(emps); i != end(emps); i++ ) {
    if( *i == name ) {
    return i->addr;
    }
    }
    return “”;
    }

    (1) Pass parameters by const ref
    (2) cache end iterator
    (3) ++i
    (4) return {} <- likely trivial difference, if any

    However, best yet might be to just use std::find_if with a lambda.

  49. string const &find_addr(list const &emps, string const &name)
    {
    auto const last = end(emps);
    for (auto i = begin(emps); last != i; ++i) {
    if (name == *i) {
    return i->addr;
    }
    }
    static string const empty_string;
    return empty_string;
    }

  50. The two parameters should be passed by const reference, for efficiency as well as specifying that the function does not change anything.

    i++ may result in a temporary, ideally we write ++i although I think most compilers will realize the temporary return value from the postincrement version is unused and remove it.

    Both these should be reflex action for a C++ programmer, always be eager to use const and always eager to use const references. Always use pre-increment and pre-decrement unless you specifically need the old value.

    It might make sense to

    return string();

    rather than

    return “”;

    if this function is being called billions of times with unknown peoples names.

  51. const string& find_addr(const list& emps, cont string& name ) {
    for( auto i = begin(emps); i != end(emps); i++ ) {
    if( *i == name ) {
    return i->addr;
    }
    }
    static const string empty_string;
    return empty_string;
    }

Comments are closed.