GotW #100: Compilation Firewalls


JG Questions

1. What is the Pimpl Idiom, and why is it useful?

Guru Questions

2. What is the best way to express the basic Pimpl Idiom in C++11?

3. What parts of the class should go into the impl object? Some potential options include:

  • put all private data (but not functions) into impl;
  • put all private members into impl;
  • put all private and protected members into impl;
  • put all private nonvirtual members into impl;
  • put everything into impl, and write the public class itself as only the public interface, each implemented as a simple forwarding function (a handle/body variant).

What are the advantages/drawbacks of each? How would you choose among them?

4. Does the impl require a back pointer to the public object? If yes, what is the best way to provide it? If not, why not?

23 thoughts on “GotW #100: Compilation Firewalls”

  1. If memory allocations are a bottleneck, you can use a custom allocator in the internal unique_ptr.

  2. Simple comments (I am sorry, I don’t have the time for a full-blown answer):

    1) I would note that the compilation firewall goal is different than the ABI compatibility goal. We are ready to pay the cost of compilation when adding a virtual method in the former case, even if it would run the risk of breaking the ABI.

    2) Implementation using a `unique_ptr`, with out of line definition of all the special members (= default is awesome)

    3) Anything that is private, *apart from virtual functions* goes into the Impl. protected elements and virtual elements are logically part of the interface as far as derived classes are concerned, and “lifting” them does not solve anything. Of course, they should still delegate to the Impl class.

    4) A back pointer (if necessary) should not be embedded, it cripples the automated move generation. It’s simpler to have the main class pass a pointer to itself during forwarding when it’s actually necessary.

  3. 2) unique_ptr.
    Interestingly enough, C++11 allows you to do this:

    struct foo { ~foo(); };
    foo::~foo() = default;

    The interesting part is that this “body” is semantically equivalent and longer to type than just {}. (Yes, it really is fully semantically equivalent.)

    3+4) Putting only the data in the impl has the advantage that you never need a back pointer, since all code runs with ‘this’ being the outer object. The disadvantage is that adding private members to the outer class will force a recompile on clients (even though binary compatibility would be preserved).
    Putting private functions in the impl has the advantage that you can add as many as you want without forcing a recompile; it has the disadvantage of needing a back pointer when I want to call some public function that expects the type (or a supertype) of the outer class, as well as when I want to call a public function or a virtual function on the outer class.
    Putting all functions in the impl with the outer class only containing wrappers is strictly better (except for the repetitive code) than putting only the private functions there, since it means you can call other public functions without a back pointer.
    This leaves back pointers necessary in two situations:
    1) When passing ‘this’ to somewhere else.
    2) When calling virtual functions.
    Note that putting virtual functions into the impl class is not an option. Overriding classes would have to override the impl class for this to work, and you really, really don’t want to expose the impl class to overriders.

  4. The question of how to handle protected members is an interesting one. Pimpl and inheritance are a mix that I’ve never had to use. I suspect that the “right” answer depends highly on context. Random thoughts:

    1. If you find yourself wanting to do inheritance in the “handle” class, using an abstract base class could be a better solution.
    2. If you find yourself wanting to do inheritance in the “body” class, consider that this might be an envelope/letter situation rather than handle/body.
    3. If you find yourself wanting to do both, this might be a bridge pattern.

  5. The answers to this question (and the whole pimpl idiom) are strongly related to runtime in-memory characteristics of objects (and vtables).

    #1 ) Let’s actually answer #1, as the right answer helps with the other ones. The pimpl idiom is a workaround for the fact that using a class by value requires knowing the object size. While this helps with recompilation, the main purpose is to maintain stable ABIs for libraries.

    #2 ) Best way to express the idiom: “unique_ptr<PIMPLTYPE> pimpl;”. PIMPLTYPE is forward declared in the header, but defined in the source file.

    #3 )What should go into the pimpl object:
    – First of all, private virtual functions make no sense (unless you have friend declarations).
    – Non-virtual functions are not called through the vtable (but through normal symbol resolution) so adding one anywhere in a class does not matter.
    – You may not want to expose private methods to usage outside the unit of compilation. As doing so through declaring them on the pimpl also avoids unneeded indirection (pimpl->member) that is a straightforward approach.
    – However, using non-member private functions with the pimpl as the first parameter in the anonymous namespace would also be a solution. (This needs more “friend” declarations though, unless you declare the members of the pimpl as public, which should not be an issue.)
    – Private implementation means private, not protected. While it may be possible (with pain) to make the private implementation protected that is a whole other can of worms and probably requires multiple header files, so protected members have no place in a pimpl (as the pimpl itself should be private, and no child could access the pimpl to get to the protected members).
    – As pimpls are private, they should not have virtual members.
    – For inheritance, each child that wanted to add members would have its own pimpl. The child would use the class through the public part only, and only call public or protected (virtual) methods on the public class. For public (or protected) methods, they are part of the API (and ABI) and their implementation never is. Therefore there is no need to forward their implementation to the pimpl (if optimizing make a local variable containing the pimpl, so you don’t need this->pimpl->member, just lPimpl->member).
    – Forwarding (public / protected) methods to the pimpl is possible but induces errors in case of inheritance.

    #4 ) Back pointers.
    – Back pointers are needed for classes with public/protected virtual methods. As any child can only implement them on the public class whenever their functionality is needed, they are called by the implementation in the public parent class (even if they just forward to the pimpl, that’s why you don’t want two methods to choose from, one in the pimpl and one in the public class).
    – Any method on the pimpl that needs to invoke a method on the public class (any virtual method) must do so through a pointer (or pointer wrapper) to the public class.
    – As said by others, there are two ways to get a pointer to the public class. The first one is as a member attribute of the pimpl (with corresponding possible issues regarding move semantics, stale pointers etc.). The second one is as a parameter to any method on the pimpl that needs to (indirectly) call such a virtual method on the public class.
    – If the pimpl does not contain methods, only attributes (which is completely acceptable from the point of view of what a pimpl does, as private nonvirtual methods do not change ABI) then obviously there is no need for back pointers (or related headaches) at all.
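The non-member helper approach from the #3 list above could be sketched roughly as follows. This is a single-file illustration under stated assumptions, not code from the thread: `Account`, `AccountImpl`, and `add_positive` are hypothetical names, and the header/source split is marked only by comments. Forward declaring the impl at namespace scope, rather than as a private nested type, is one possible way to avoid the extra friend declarations.

```cpp
#include <cassert>
#include <memory>

// ---- account.h (hypothetical header) ----
struct AccountImpl;                  // forward declared; clients never see the definition

class Account {
public:
    Account();
    ~Account();                      // defined in the source file, where AccountImpl is complete
    void deposit(int cents);
    int balance() const;
private:
    std::unique_ptr<AccountImpl> pimpl_;
};

// ---- account.cpp (hypothetical source) ----
struct AccountImpl {
    int cents = 0;                   // public member: the type is invisible to clients anyway
};

namespace {
// Private helper with internal linkage; takes the impl as its first parameter.
void add_positive(AccountImpl& impl, int cents) {
    if (cents > 0) impl.cents += cents;   // non-positive deposits are ignored
}
} // namespace

Account::Account() : pimpl_(new AccountImpl) {}
Account::~Account() = default;
void Account::deposit(int cents) { add_positive(*pimpl_, cents); }
int Account::balance() const { return pimpl_->cents; }

void usage_example() {
    Account a;
    a.deposit(150);
    assert(a.balance() == 150);
}
```

Adding or changing helpers like `add_positive` touches only the source file, so clients of the header would not need to recompile.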

  6. I haven’t looked at the new language spec so may be talking out of turn here.

    In good ol’ c++ use of smart pointers for pimpl ran the risk of emitting a ‘delete p’ before the type of p had been defined which is either undefined or defined not to do anything (can’t remember which). This is often caught at compile time by methods such as boost’s checked_delete().

    Has this been fixed in the new C++?

  7. I see what you mean Ben. I was kinda assuming that whenever you involve pointers in your class, you pretty much have to supply your own copy/move constructor/assignment. Thanks for the explanation.

  8. And now I go back and see that all of my template brackets have been eaten :(

    Hopefully the intent is clear enough without them.

  9. Re: Roman Kutlak
    Say I have this setup:
    class Inner;
    class Outer {
        std::unique_ptr<Inner> pimpl_;
    };
    class Inner {
        std::weak_ptr<Outer> backPtr_;
    };

    Then, I have some code like this:
    Outer make_outer()
    {
        Outer old_outer;
        return old_outer;
    }
    //…
    current_outer = make_outer();

    That code will use the move assignment operator. When backPtr was initially created, it pointed at old_outer. After the assignment to current_outer, the backPtr is now pointing at a “dead” object. This can be fixed by writing your own move constructor and move assignment operator, and having those operations dig into the Inner implementation to “patch up” backPtr, but it is more effort. You can’t just use the defaults anymore.
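The "patch up" step described above might look something like the sketch below. It assumes a raw, non-owning back pointer rather than `std::weak_ptr` (a `weak_ptr` would additionally require the `Outer` to be owned by a `shared_ptr`). The member names `back` and `back_pointer_is_valid` are hypothetical and exist only to make the behavior observable.

```cpp
#include <cassert>
#include <memory>
#include <utility>

class Outer;

struct Inner {
    Outer* back = nullptr;           // non-owning back pointer to the current owner
};

class Outer {
public:
    Outer() : pimpl_(new Inner) { pimpl_->back = this; }

    // Hand-written moves: steal the Inner, then re-point it at the new owner.
    Outer(Outer&& other) noexcept : pimpl_(std::move(other.pimpl_)) {
        if (pimpl_) pimpl_->back = this;
    }
    Outer& operator=(Outer&& other) noexcept {
        pimpl_ = std::move(other.pimpl_);
        if (pimpl_) pimpl_->back = this;
        return *this;
    }

    // True when this object owns an Inner whose back pointer refers to it.
    bool back_pointer_is_valid() const { return pimpl_ && pimpl_->back == this; }

private:
    std::unique_ptr<Inner> pimpl_;
};

Outer make_outer() {
    Outer old_outer;
    return old_outer;
}

void usage_example() {
    Outer current_outer;
    current_outer = make_outer();    // move assignment re-points the back pointer
    assert(current_outer.back_pointer_is_valid());
}
```

With defaulted move operations instead, `back` would keep the address of the moved-from object, which is the "dead" pointer problem the comment describes.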

  10. Hello guys, I am a newbie so this answer is basically just testing my understanding of the concepts. Any comments highly appreciated!

    1. My understanding of pimpl was that its main purpose is to protect the client from recompilation of the code when you change some class’s internal working. The indirection (pointer to implementation) shields the clients of the class from internal changes. It is not supposed to be a replacement for a pure virtual interface (not implying that it can’t be).

    2. I would use std::unique_ptr.

    3. This in turn suggests that private (and protected I guess) members should be in the pimpl and only the public interface can be in the actual class. I think that for the firewall to work as intended the class using the pimpl should provide forwarding functions (for the public part) that call the actual function implemented in the pimpl and the pimpl class should essentially be the intended class (i.e. implement all functions).

    I am not sure how this would work with the inheritance. Is it why one would need the back pointer?

    4. If you need pointer to the main class, use std::weak_ptr. (Not sure I understood Ben Craig’s argument about “dead” wrapper object – the pimpl will get destroyed when the object dies so this should not happen, right?)

  11. 2. I prefer unique_ptr, but that does not compile cleanly on warning level 4 for classes exported from a dll. Quite annoying.

    3.
    a) put all private data (but not functions) into impl;
    + Functions are declared in a single class. You do not have to look in two different classes to find function implementation.
    – Need to expose function parameter types in the header file. This may cause you to include more header files than necessary.
    – Implementation details are exposed.

    b) put all private members into impl;
    + Gives the cleanest class signature, hiding everything that can be hidden.
    + Gives the lowest number of header dependencies in the class header.
    – Template function pattern can not be implemented.

    c) put all private and protected members into impl;
    – Protected members have to be exposed in the class declaration, or they are not accessible by derived types.
    (+) Avoids use of protected members.

    d) put all private nonvirtual members into impl;
    + See b)
    + Template function pattern can be implemented.

    e) put everything into impl, and write the public class itself as only the public interface, each implemented as a simple forwarding function (a handle/body variant).
    + Can be nice if you want to isolate/highlight certain functionality of a class. Implementation of the monitor object can be done by putting the lock acquisition/release in the “forwarding” function, and forward everything else to the pimpl.
    – Duplicates every function, produces a lot of code that does not give any value.
    – Makes refactoring cumbersome.

    As a side note, what made the difference for me was to call the impl object “m”, so that instead of accessing member variables by impl->foo, i just type m->foo. Then it looks very similar to how I usually spell member variables: m_foo.
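The monitor-object idea mentioned under option e) could be sketched along these lines, with the impl member named `m` as suggested in the side note. `Counter` and its members are hypothetical, and the whole thing is a single-file illustration. One trade-off worth noting: keeping the `std::mutex` in the public class pulls `<mutex>` into the header, so the firewall is slightly less tight.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// ---- counter.h (hypothetical header) ----
class Counter {
public:
    Counter();
    ~Counter();
    void increment();
    int value() const;
private:
    struct Impl;
    std::unique_ptr<Impl> m;         // "m->count" reads much like "m_count"
    mutable std::mutex mutex_;       // taken by every forwarding function
};

// ---- counter.cpp (hypothetical source) ----
struct Counter::Impl {
    int count = 0;
    void increment() { ++count; }    // no locking here; the forwarders handle it
    int value() const { return count; }
};

Counter::Counter() : m(new Impl) {}
Counter::~Counter() = default;

void Counter::increment() {
    std::lock_guard<std::mutex> lock(mutex_);
    m->increment();
}

int Counter::value() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return m->value();
}

void usage_example() {
    Counter c;
    c.increment();
    assert(c.value() == 1);
}
```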

  12. I am with Dave here: One should decide whether an interface (pure abstract class) or a pimpl/forwarder combination suits better. As a rule of thumb I am using the first whenever binary safety and/or seamless interoperability with other platforms is required. Clearly, some sort of infrastructure (factories, kind of ‘IUnknown’, etc) is required then. I am using the latter, whenever I just want to hide implementation details inside of a library or inside of a statically linked program, where virtual function calls and factories would be unnecessary overhead.

  13. Is the pimpl idiom something that should be encouraged?
    Why not use a pure virtual class, as an interface?

    I’ve never seen a good use of this idiom; when it’s needed, it’s probably a code smell.

  14. (2) TLDR; std::unique_ptr

    (3) TLDR; I move private non-virtual methods+data.

    I don’t use protected members and try to minimize protected methods.
    Private non-virtual methods can (should?) be detached and written in an anonymous namespace anyway (there are exceptions) so adding them to the Impl class is just for convenience (so you don’t need to befriend or pass references around).
    Any time I use public data, I’m not using pimpl.
    I wouldn’t ever put everything into the pimpl — and when I use handle/body idiom that way, I don’t call it Pimpl since the body is not “mine”. I would call that proxy or delegate. (And I do think it would be interesting to have an easier way to implement proxy classes in c++..)

    (4) TLDR; I pass back pointer as the first argument to those methods that need it (hopefully rarely).

    First off, I don’t waste space storing a back pointer.
    Second, and more importantly, I don’t need to write a custom move constructor for my “Handle” classes this way. If you store back pointer, you have to manage when your Handle moves (yuk) (and if you are using shared_ptr (for whatever reason), this should help too)!
    Either way, in my experience it is rare to need back pointers — unless your classes have grown monolithic ;)
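Passing the back pointer as an argument, rather than storing it, might look roughly like this. `Shape`, `Circle`, `describe`, and `kind` are hypothetical names chosen for illustration. The virtual hook stays on the public class, and the impl receives a reference to the outer object only for the duration of the call.

```cpp
#include <cassert>
#include <memory>
#include <string>

// ---- shape.h (hypothetical header) ----
class Shape {
public:
    Shape();
    virtual ~Shape();
    std::string describe() const;            // public forwarding function
protected:
    virtual std::string kind() const;        // overridable hook, part of the interface
private:
    struct Impl;
    std::unique_ptr<Impl> pimpl_;
};

// ---- shape.cpp (hypothetical source) ----
struct Shape::Impl {
    std::string prefix = "a ";
    // The outer object arrives as a parameter, so nothing goes stale on a move.
    std::string describe(const Shape& self) const {
        return prefix + self.kind();         // virtual dispatch through the outer object
    }
};

Shape::Shape() : pimpl_(new Impl) {}
Shape::~Shape() = default;
std::string Shape::describe() const { return pimpl_->describe(*this); }
std::string Shape::kind() const { return "shape"; }

// A derived class overrides the hook without ever seeing Impl.
class Circle : public Shape {
protected:
    std::string kind() const override { return "circle"; }
};

void usage_example() {
    Circle c;
    assert(c.describe() == "a circle");
}
```

Because `Impl` is a nested class of `Shape`, it can call the protected `kind()` without any friend declaration.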

  15. I am now realizing that back pointers with a std::unique_ptr have the same problem as those with a std::shared_ptr. The back pointer could point to a moved from, and already destroyed object. I think that a const std::unique_ptr would be the pointer of choice in situations where you want a back pointer. I don’t think the language allows a const std::unique_ptr to be moved from.

  16. On 3/4, I think there’s a nice dovetailing of hardline pimpl (everything but forwarders in the impl) and what used to be called the “non-virtual interface” idiom. (I haven’t seen it mentioned for years, so maybe it’s called something else now.)

    All virtual functions are non-public, and can thus be put into the impl, probably removing any need for a backreference. The public forwarder functions do argument/precondition/postcondition checking, tidily removing the risk of duplication or inconsistency across base and override implementations.

    In practice I’m not wild about pimpl, mostly because of the extra allocations involved. In an ideal world, better encapsulation wouldn’t cost you performance.

  17. On further reflection, I am pretty sure you can leave all of the “automatic” copy and move operations alone, and just write an out of line, trivial destructor, like so:

    PublicInterface::~PublicInterface() {}

    You need the out of line destructor because the destructor eventually calls delete, and you don’t want to do that on an incomplete type. You don’t need the move operations because those just twiddle pointers, and that is safe to do on an incomplete type.
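A minimal sketch of the out-of-line destructor arrangement follows, written as a single file with the header/source split marked by comments. `Widget` and `name` are hypothetical. Note one assumption that differs from the comment above: the sketch declares the move operations explicitly and defaults them in the source file, because a user-declared destructor suppresses the implicitly generated move operations.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// ---- widget.h (hypothetical header) ----
class Widget {
public:
    explicit Widget(std::string name);
    ~Widget();                               // declared only; Impl is incomplete here
    Widget(Widget&&) noexcept;               // declared here, defaulted in the source file
    Widget& operator=(Widget&&) noexcept;
    const std::string& name() const;
private:
    struct Impl;
    std::unique_ptr<Impl> pimpl_;
};

// ---- widget.cpp (hypothetical source) ----
struct Widget::Impl {
    std::string name;
};

Widget::Widget(std::string name) : pimpl_(new Impl{std::move(name)}) {}
Widget::~Widget() = default;                 // Impl is complete here, so delete is safe
Widget::Widget(Widget&&) noexcept = default;
Widget& Widget::operator=(Widget&&) noexcept = default;
const std::string& Widget::name() const { return pimpl_->name; }

void usage_example() {
    Widget a("one");
    Widget b(std::move(a));                  // only the pointer changes hands
    assert(b.name() == "one");
}
```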

  18. Skipping the JG question, Herb already has chapters on that :)

    2. I’m not sure how “basic” basic is. I see two main options.
    a) std::unique_ptr
    b) std::shared_ptr
    Most of my classes would use unique_ptr. With shared_ptr, I had to “out of line” all of my automatic functions (copy ctor, assignment op). I suspect I have to do the same thing for unique_ptr, except I would probably delete the copy variants and default the move variants. So some basic code:

    class PublicInterfaceImpl;

    class PublicInterface {
    public:
        PublicInterface();
        PublicInterface(const PublicInterface &) = delete;
        PublicInterface &operator =(const PublicInterface &) = delete;
        PublicInterface(PublicInterface &&) = default; //maybe? I don’t recall if you can do an =default on the definition in a cpp file.
        PublicInterface &operator =(PublicInterface &&) = default; //same concern here
        ~PublicInterface() = default;
    private:
        std::unique_ptr<PublicInterfaceImpl> pimpl_;
    };
    //cpp
    class PublicInterfaceImpl { /* … */ };
    PublicInterface::PublicInterface() : pimpl_(new PublicInterfaceImpl) {}

    3. I almost always put everything into the impl except for the public interface, and write a bunch of forwarding functions. This gives maximum protection. One downside is the amount of boilerplate. It also doesn’t play nice with protected members, but I’m not a huge fan of protected members other than destructors. You have one extra indirection with this scheme, but your compiler should still be able to inline a lot of it away within your cpp.

    a) Putting private data into the pimpl is easy and straightforward, and you get “size” protection, but you can still cause recompiles because of new helper private methods.
    b) Only putting private stuff in the impl often causes awkward splits in functionality, but you cause fewer recompiles this way. Also, you can’t get rid of the private virtual functions in the outer / wrapper class.
    c) There isn’t a good reason to put protected members in the pimpl
    d) The “template” Gang of Four pattern likes having private virtual functions. One way or another, virtual functions are part of the public interface, even if they aren’t in “public” visibility. So you can’t get rid of the private virtual functions. You still get some of the awkward splits that I mentioned in b) above though.
    e) This is my preferred solution.

    4. When implementing the GoF “template” pattern as a pimpl, you generally need a back pointer to actually call the virtual function. If my pimpl pointer is a std::unique_ptr, I would use a non-owning raw pointer as my back pointer. If my pimpl pointer is a std::shared_ptr, I would probably make a sad face. I can’t use a std::shared_ptr as that creates a cycle. I don’t want to use a std::weak_ptr or raw pointer, because it could point at a “dead” wrapper object, even when there are other “live” object wrappers. So yeah, std::shared_ptr pimpls and back pointers probably don’t mix.

  19. The name “compilation firewall” tells you that it has to do with tucking away some ugly things, such as the zillions of macros in Microsoft’s , or say a C header that uses the name “class”. Thus what to put in the implementation, while often correlating with the public/private distinction, is fundamentally an independent issue. In the implementation class, put anything that depends on (for example) the ugly header, then add whatever’s natural.

    By the way, nice that you’re doing GOTW’s again, and in particular this one; there’s less chance of recommending “auto_ptr” this time since it’s now deprecated. :-)

    Instead of a smart pointer I’d just make the visible class non-copyable, and implement constructor and destructor.

    Cheers & hth.

    – Alf
