GotW #91: Smart Pointer Parameters

NOTE: Last year, I posted three new GotWs numbered #103-105. I decided leaving a gap in the numbers wasn’t best after all, so I am renumbering them to #89-91 to continue the sequence. Here is the updated version of what was GotW #105.

 

How should you prefer to pass smart pointers, and why?

 

Problem

JG Question

1. What are the performance implications of the following function declaration? Explain.

void f( shared_ptr<widget> );

Guru Questions

2. What are the correctness implications of the function declaration in #1? Explain with clear examples.

3. A colleague is writing a function f that takes an existing object of type widget as a required input-only parameter, and trying to decide among the following basic ways to take the parameter (omitting const):

void f( widget* );              (a)
void f( widget& ); (b)
void f( unique_ptr<widget> ); (c)
void f( unique_ptr<widget>& ); (d)
void f( shared_ptr<widget> ); (e)
void f( shared_ptr<widget>& ); (f)

Under what circumstances is each appropriate? Explain your answer, including where const should or should not be added anywhere in the parameter type.

(There are other ways to pass the parameter, but we will consider only the ones shown above.)

14 thoughts on “GotW #91: Smart Pointer Parameters

  1. Even if i’m away a few years of C++ hardcore programming, i’d guess the first one (#1) has implications due fact that it’s a template, thus you do not now (first) how many cocmplex stuff is created in the templated class’ constructor, second because its a parameter-by-value, resulting in bad stuff in the C++ universe, depending on deep/flat copy of the object?
    Nontheless, a good interview questsion :-)

  2. Sorry for off topic.
    But what about VS 2013 and C++11 support?
    Does this mean that Nov. CTP C++ compiler is now part of VS 2013 and thre will be no update for VS 2012 ?
    *dreaming* Or will VS 2013 finally get full C++11 support ?

  3. @David Thornley: “If the idea is to make sure the Widget goes away, it might be just as useful to have the unique_ptr in the calling function, and call f using (a) or (b), since this will save the small amount of time to copy the unique_ptr.”

    If you do that, you delay the destruction of widget after the end of the calling scope, what might not be what you wanted, and also, you save nothing, as a unique_ptr is just a zero overhead wrapper around a plain old pointer, copying (or rather, moving/creating, as the whole idea is to disable copy) costs exactly the same as copying the plain pointer. You might say that “but I also need to set the source to nullptr on move!” Well, check the disassembly after an optimized build, you probably will see this being skipped.

  4. 3. (a) and (b) are the simplest ways to pass the Widget in. Since this is a required read-only parameter, the ability of (a)

    void f(const Widget const *)

    to pass

    nullptr

    in is irrelevant, and its only advantage is making it very obvious that there’s indirection going on, which is usually not all that important. It might be useful as a reminder if the programmer wants to be reminded to microoptimize memory references, but this is almost always pointless. One potential disadvantage is that somebody could put a

    delete widget;

    in the function, but if the shop uses the coding standard that a raw pointer never owns, that should be caught in code review. (b)

    void f(const & Widget)

    reads better, captures the fact that the Widget is required, and doesn’t lend itself to accidental deletion.

    I have been unable to imagine any use for (d) and (f). Semantically, a reference is another name for something, and (without bothering to try figuring it out) it seems to me that having two names for a unique_ptr or a shared_ptr is a bad idea. Over the years, I’ve learned to stay out of the dark corners of the C++ standard, because things lurk there that bite.

    (c)

    void f(unique_ptr<const Widget>)

    unequivocally transfers ownership. This is the right thing to use if and only if that’s what’s desired. For example, we may want to create a Widget and transfer it to a wrapper that is the Widget’s only link to the outside world, possibly if we’re writing a multiplatform interface or otherwise need consistent calls to inconsistent Widgets. If the idea is to make sure the Widget goes away, it might be just as useful to have the unique_ptr in the calling function, and call f using (a) or (b), since this will save the small amount of time to copy the unique_ptr.

    (e)

     void f(shared_ptr<const Widget>)

    gives f shared ownership. This is useful if and only if the Widget might go away while the Widget is still in use. The obvious example would be f copying the Widget to a class data member, since the class may well outlive the calling function.

    If the program is single-threaded, or the calling function calls f in a thread that it later necessarily joins, f’s execution is going to be over before the calling function returns, and there’s no real need to pass a unique_ptr or shared_ptr unless f transfers its ownership share to a longer-lasting variable or object. If f is called in a thread that may outlive the calling function, then it needs at least partial ownership, and so (c) or (e) is necessary.

    Since the Widget is an input parameter, it should not be changed, and therefore every example of Widget should be replaced by

    const Widget

    or

    Widget const

    . This means that the calling function can pass a const Widget to f, and serves as a reminder that the Widget should not be changed. In (a), the pointer itself can be a const, and that establishes the pointer itself as input-only.

  5. It would be (a) or (b) for no other reason than the need for compatibility with as many c++ compilers as possible. Maybe not the correct Guru/cool answer but one that is a practical real world answer!

    Even just targeting latest there is still the advantage of keeping things simple and familar unless there are overriding reasons for extra complexity. The KISS principle should be stamped on every engineers forehead…..

  6. 1. Performance: There’s a little performance overhead of calling the function with

     shared_ptr<widget> [\code] compared to using [code] const widget & 

    as parameter type. The function call is probably the most expensive thing for the caller here. However, passing a

     shared_ptr 

    requires a thread-safe counter increment and a decrement at destruction which is definite overhead. Also, if the caller has a widget reference, but no shared_ptr to that widget, then a shared_ptr with a null deleter must be created unnecessarily. This requires another memory allocation for the deleter and reference counter, which can be expensive on the calling side.

    2. Correctness: Usually a function with this signature would only need a

     shared_ptr 

    , if it takes shared ownership of it or needs to pass shared ownership to someone else. This would only be useful, if the function stores a

     shared_ptr 

    or a

     weak_ptr 

    in some kind of global variable. If that isn't the case (and it most often isn't), the the function should prefer to take a

     widget & 

    or

     const widget & 

    . Even if there is a factory function like

    shared_ptr<widget> load_widget( widget::id id );
    

    then a const reference would still make sense, since

    f( *load_widget( id ) );
    

    would still work. The code would also still work and be correct, if the return type of the

     load_widget() 

    function would be changed to

     unique_ptr<widget> 

    .

    3. Since the parameter widget is an input-only parameter and hence should not be modified, the type

     widget 

    should be replaced by

     const widget 

    in all cases, including template argument types. Now to the question what each declaration would mean:

     
    void f( const widget* );              (a)
    void f( const widget& );              (b)
    

    This should be prefered, since the function does not need to take ownership of the widget (at least most typically). Still (a) and (b) differ in the fact that (a) can be passed a null pointer to. Yes, it is possible to pass null references with code like this

    widget * p = nullptr;
    f( *p );
    

    but normally people don't do that. Remember, we should write code that is easy to use correctly and hard to use incorrectly. Also (b) has the advantage, that a temporary object can be passed. For example

    f( create_widget() );
    

    which wouldn't be possible with (a).

    Now let's consider

    void f( unique_ptr<const widget> );   (c)
    void f( unique_ptr<const widget>& );  (d)
    

    In case (c) the function

     f 

    will eat the widget. This unnessesarily restricts the caller, since the caller cannot do anything with the widget afterwards. The same effect could be accomplished with the function declaration (b) by writing

    unique_ptr<widget> p = ...;
    /* ... */
    f( *p );
    p.reset();
    

    The only advantage is that the callee (c) could use the unique_ptr to the const widget somehow where only a

     unique_ptr<const widget> 

    can be used. However this is highly unlikely. The same applies to (d) only that it MIGHT eat the unique_ptr to the const widget. That would be even worse, since it's not clear from the declaration when this happens. The most surprising thing yet is that the callee can change the pointer so that it points to another object afterwards. Ouch! Hence we need to add another const to the declaration to prevent this:

    void f( const unique_ptr<const widget>& );
    

    The unique_ptr cannot be moved then. But there's no advantage of passing a unique_ptr then over variant (a). It just makes the code a little more obscure.

    Last but not least, let's consider

    void f( shared_ptr<const widget> );   (e)
    void f( shared_ptr<const widget>& );  (f)
    

    I already discussed (e) in 2. In (f) incrementing the reference counter can be prevented. However, as in (d), the callee may change the pointer, hence we should add another const:

     
    void f( const shared_ptr<const widget>& );
    

    Now this is almost the same as (e). Again, shared_ptr implies that shared ownership is being used. That is almost never necessary for simple input arguments. It's unnecessarily demanding to the caller to create a shared_ptr, if the shared ownership isn't even used. The parameter type is much more complicated compared to (a) and (b) and there's no significant advantage to (a). And (b) is probably even better than (a).

    Bottom line: Prefer plain pointers and references as function paramter types unless unique or shared ownership must be transferred. If ownership should be transfered pass a unique_ptr or shared_ptr by value. This clearly states the intent in the declaration and it is easy to read and efficient.

    I would prefer to use (b) as a function declaration, if the parameter is not optional. Otherwise, I would use (a) and pass a nullptr, if I want to omit the argument.

  7. There’s extra shared count manipulation in

    void f( shared_ptr<widget> );

    . C++98 way would be to use

    void f( const shared_ptr<widget>& );

    . I’m still not familiar with C++11, but I suspect it will be something with two ampersands :-), like

    void f( shared_ptr<widget>&& );

    . That would handle possible extra copy and shared count update in

     f( make_shared<widget>() ) 

    I would think of

    void f( shared_ptr<widget> );

    variants as equivalent to

     void f( widget* );

    whereas

    void f( unique_ptr<widget> );

    is equivalent to

    void f( widget*& pw ) {...; delete pw; pw=nullptr;} ;

    . So, for the question 3, I would use

     f( unique_ptr<widget> ); 

    if f is a sink, ie we will not use that pointer again, and variant from question 1 otherwise. Also, I think that it only makes sense passing unique_ptr parameter by value.

  8. Maybe this should be renamed to GotD (for Guru of the Day). I can’t almost keep up with so much good material. Thanks!

  9. The difference between (c) and (d) is that (c) MUST take ownership of the object (or possibly destroy it), whereas (d) might not touch it. Use (d) if the function needs to think about whether or not to take ownership.

    There is a similar difference between (e) and (f). (e) says that f will own a copy, and (f) says it might or might not.

  10. 1.

    void f( shared_ptr&lt;widget&gt; );

    This will make a copy of the shared_ptr, which will only increment the reference counter, which is, unless I’m wrong, an atomic variable (to make sure two threads could safely copy shared_ptrs pointing to the same object, as they would copy raw pointers).

    The thing is, we usually don’t even want to pay this cost, so, as we know the object is owned by the previous stack level, we can instead pass the shared_ptr object by const reference, which changes (almost… see below) nothing to the semantics, and makes the parameter passing costless.

    2. What are the correctness implications of the function declaration in #1? Explain with clear examples.

    The pointed widget object is owned by the stack at two levels: f, and its caller, which can be overkill in most codes.

    Also, as f has shared ownership, it means it is ok for this function to share it again with other parts of the code (which opens the way to ownership cycles if not monitored).

    3.

    void f( widget* ); (a)
    void f( widget&amp; ); (b)
    void f( unique_ptr&lt;widget&gt; ); (c)
    void f( unique_ptr&lt;widget&gt;&amp; ); (d)
    void f( shared_ptr&lt;widget&gt; ); (e)
    void f( shared_ptr&lt;widget&gt;&amp; ); (f)

    (a) and (b) are cases where f is not given ownership to the widget parameter, so it can’t, in turn, give or share it.

    (a) means the parameter being nullptr is a valid value, so f is expected to behave.
    (b) means the parameter is supposed to always exist, so in case of playing with pointers, the caller is supposed to verify that before dereferencing the pointer and calling f.

    (c) has a pure ownership transfer semantic: f will take over the resource, no negociations. The caller must know that its own unique_ptr (or resource) won’t be owned by it anymore the moment f is called.

    (d) has conditional ownhership transfer IF the unique_ptr is not const. That is, the function fa can decide (or not) to take ownership of the resource, and thus empty the pointer. IF the unique_ptr is const, then this is just passing a smart pointer parameter, without transfer of ownerhisp (similar to case a)

    (e) has ownership sharing semantics. This means that both the caller and f will share the resource… And potentially other parts of the code, as f can freely share it, too. Having this kind of prototypes means your object is out in the wild… Isn’t that where we should wonder about potential ownership cycles that will leak memory? (and where weak_ptr could help break them?) IF the shared_ptr is const, it changes nothing (I see right now the copy constructor of shared_ptr accepts “const shared_ptr &”). Unless I’m wrong, the shared_ptr could be constructed from another shared_ptr of a compatible type (derived-to-base conversion)

    (f) IF the shared_ptr is const, this is like e, but we avoid one counter increment. The const reference means that, the shared_ptr have again derived-to-base conversion. If the parameter is passed as non-const reference, then no conversion is possible… and we fall back into e’s case.

  11. (a) and (b) are likely to cause holy wars. If there is any kind of transfer of ownership (including passing off to other threads), then (a) and (b) are not appropriate. If (a) and (b) don’t change ownership, then either can be fine. Many people prefer (a) to (b), because it makes it more obvious at the call site that a level of indirection is involved. Other people will prefer (b), because it makes it very difficult to have NULL pointers.

    If f() is a “sink” function, then (c) is appropriate. By “sink”, I mean a function that accepts a resource, and destroys it as part of its execution. const unique_ptr ref doesn’t buy much, but if your code stores the resource as a unique_ptr, and you have lots of calls to f(), then you can avoid some syntactic overhead by passing const unique_ptr & instead of widget*.

    Similar arguments for (c) and (d) apply to (e) and (f), except instead of discussing “sinks”, we get to discuss extra owners. If f() needs to become an additional owner for the widget, then (e) is appropriate. Taking the shared_ptr by value here is preferable to taking a const shared_ptr ref and copy constructing a shared_ptr, as the value parameter may get to take advantage of move semantics (perhaps from a make_shared call). (f) has roughly the same semantics as (a) and (b), and like const unique_ptr &, it can make sense when you have lots of callers of your function. (f) makes a little more sense than (d) though, as you can decide several layers down that you need a copy / additional owner of your widget, and (f) permits that. With a unique_ptr, you can’t really decide several layers down that you should move from your const unique_ptr &.

    One of the downsides of (d) and (f) is that these signatures often overspecify your needs. (a) and (b) can accept widgets from unique_ptr and shared_ptr, but (d) can only (easily) accept widgets from unique_ptr. (f) can accept a widget from a unique_ptr, but it “steals” the widget, so it only works well with other shared_ptrs.

    There was a “dumb_ptr” spec at one point, but I haven’t followed standardization enough to know if it got accepted to C++14. I’m guessing not, as it would probably be mentioned by now if it had.

Comments are closed.