GotW #90: Factories

NOTE: Last year, I posted three new GotWs numbered #103-105. I decided leaving a gap in the numbers wasn’t best after all, so I am renumbering them to #89-91 to continue the sequence. Here is the updated version of what was GotW #104.

 

What should factory functions return, and why?

 

Problem

While spelunking through the code of a new project you recently joined, you find the following factory function declaration:

widget* load_widget( widget::id desired );

JG Question

1. What’s wrong with this return type?

Guru Questions

2. Assuming that widget is a polymorphic type, what is the recommended return type? Explain your answer, including any tradeoffs.

3. You’d like to actually change the return type to match the recommendation in #2, but at first you worry about breaking source compatibility with existing calling code; recompiling existing callers is fine, but having to go change them all is not. Then you have an “aha!” moment, realizing that this is a fairly new project and all of your calling code is written using modern C++ idioms, and you go ahead and change the return type without fear, knowing it will require few or no code changes to callers. What makes you so confident?

4. If widget is not a polymorphic type, what is the recommended return type? Explain.

16 thoughts on “GotW #90: Factories

  1. 1. The problem is it’s a raw pointer. The pointer is, presumably, dynamically allocated and it is not clear who is responsible for its deletion.

    2. Because it’s a widget I would say std::shared_ptr is the best. This way we could pass it to other classes, functions, which we couldn’t do with std::unique_ptr. If we weren’t meant to pass it around std::unique_ptr would be preferable.

    3. Because the callers are all using

    auto w = load_widget(some_ID)

    And because std::unique_ptr and std::shared_ptr both are written to be used like a pointer would it wouldn’t make a difference anywhere (except we couldn’t use delete on them!).

    4. Probably the recommended type would be a straight up object, and we could move it out of the function. We could make the load_widget be a template function over the type of widget we wish to create:

    template <typename WidgetType>
    WidgetType load_widget()

    Because the type is not polymorphic there is no reason to bind it to a pointer to a base object.

  2. Even though I brought it up earlier, why is using auto the preferred style in this situation? I prefer to stick with auto only when dealing with nested dependent typenames (in which cases writing out the type probably doesn’t add any clarity, if it’s even possible to do) and lambdas.

    Plus don’t you WANT to be made aware if somebody changed the underlying return type from a raw pointer to a smart pointer? Especially since (as other’s pointed out), you probably should have a delete somewhere…

  3. Pretty good answers sofar, regarding polymorphism and ownership. I would add a question :

    Does the absence of “noexcept” tell the caller that failure management is performed through throwing an exception? In that case, all “widget*” solutions should be turned into “widget&” since the normal course of things guarantees that an object will be returned. And the smart pointers cannot be null, so they can be used safely without checking.

    IOW, any widget* solution should also say “noexcept”.

  4. I think Ben Craig’s initial instinct was more right.
    If callers currently use unique_ptr, code just keeps working.
    If callers currently use auto, then the previous version would deduce a raw pointer for that, and callers would have an explicit delete somewhere, which doesn’t work on unique_ptr. So you’d still have to change ever caller.

  5. I Agree with Ben except for point 3.
    #3.

    load_widtget

    must return

    std::unique_ptr<widget>

    where widget is an pure abstract class.

  6. For #3 if callers are using auto, they are using delete as well and you still have to remove them all.

  7. 1. It. Returns. A pointer. Nothing here says if this pointer is to some internal data, or static variable, or a new-allocated structure or (even!) to a malloc-allocated structure. What are we supposed to do with it? We are supposed to read the source code. That breaks the entire encapsulation thing factories were originally ment to bring. And with all the variants I supposed, none of them really needs a pointer returned.

    Too bad LLVM exploits exactly this approach a lot.
    2. If it is a pointer to some static, global or internal state (that is not really what a factory should do, but it happens), pass it by reference. It is the single unambigious way to do it without any tradeoffs. Use a reference to const if you don’t want users to change it.

    If it is a real newly-allocated polymorphic pointer, use std::unique_ptr. It has minimal tradeoffs (you just actually need the new-allocation) and fully expresses your intent. Even if some guy comes buy and calls release(), it’s not your fault. If someone needs to share it, he can always create a shared_ptr manually.

    If you need to share the pointer (for example, you keep track of all values elsewhere or your method is supposed to return an exact same pointer for a sequence of calls), use the shared_ptr. You can also consider using this if you are stuck with a need to create a uniform interface for both new-allocated, static and need-to-be-deleted-by-other-means values (that actually happened to me) and provide a dynamic deleter for each with a lambda. unique_ptr cannot do that as its deleter is statically-bounded.

    4. If it is not really a polymorphic type, just return it by value and let compiler handle the rest unless you know for sure that the type cannot be efficiently move-constructed. Then we are back at #3.

  8. Re #4, the recommendation of returning Widget by value assumes that (N)RVO will kick-in or Widget’s move semantics are more optimized than its copy semantics. However, if Widget is composed of non-movable members (e.g. an aggregate/POD-type) then move will be no better than copy, so if RVO does not apply then the cost of copying Widget may be greater than new-ing up a Widget* and returning it via unique_ptr.

  9. WRT #4, I have had plenty of cases where a type was not polymorphic but did have an inheritance tree. The object factory returns a shared_ptr to the base for the reasons I described.

    If there is no sublcass of Widget, then, yes, I’d probably return a simple object.

  10. For #4 you just return a Widget. If the class in non-polymorphic there is very little reason to go through the expense of a new and delete.

  11. After reading other people’s comments, I’ll agree with what I said, except for (3) :). Auto is the more likely answer, with unique_ptr being a runner up.

  12. for #4 you return a shared_ptr. This is because the shared_ptr holds a pointer to the destruction function which is determined at construction time. So, if the code creates a WidgetSubclass, then when the shared_ptr reference count goes to 0, the destructor function is called, and that function is ~WidgetSubclass, even if widget* is not polymorphic.

  13. For a hierarchy of widget classes, the overridden load_widget function can return the specific widget type. The requirements for a return type by overridden functions is that they return covariant types (see n3242.pdf section 10.3 #7)

  14. I agree with most what Ben Craig said, except with answer to 3.

    3. What makes me so confident is the fact that in most places callers use `auto` instead of explicit types.

  15. I think Ben has it right, though I wonder if for (3) Herb meant that callers would be using:

    auto my_widget = load_widget(42);
    

    And that’s why it wouldn’t matter.

  16. 1. Using a raw pointer doesn’t express the ownership semantics of the widget resource. Owning raw pointers should be avoided.

    2. If widget is polymorphic, then

    unique_ptr<widget>

    is preferred. unique_ptr is preferable to a raw pointer or raw reference because of the automatic cleanup, but you still need some kind of pointer-like type for virtual functions to work.

    3. If all callers were using unique_ptr already, then changing the return type from widget* to unique_ptr will not require source changes in client code. The unique_ptr constructor can take a returned unique_ptr just fine thanks to the move constructor.

    4. If widget isn’t polymorphic, you should just return a widget value, and let move constructors do their thing. This lets you avoid the heap.

Comments are closed.