Now that the unnecessary headers have been removed, it’s time for Phase 2: How can you limit dependencies on the internals of a class?
Problem
JG Questions
1. What does private mean for a class member in C++?
2. Why does changing the private members of a type cause a recompilation?
Guru Question
3. Below is how the header from the previous Item looks after the initial cleanup pass. What further #includes could be removed if we made some suitable changes, and how?
This time, you may make changes to X as long as X’s base classes and its public interface remain unchanged; any current code that already uses X should not be affected beyond requiring a simple recompilation.
// x.h: sans gratuitous headers
//
#include <iosfwd>
#include <list>
// None of A, B, C, or D are templates.
// Only A and C have virtual functions.
#include "a.h" // class A
#include "b.h" // class B
#include "c.h" // class C
#include "d.h" // class D
class E;
class X : public A, private B {
public:
X( const C& );
B f( int, char* );
C f( int, C );
C& g( B );
E h( E );
virtual std::ostream& print( std::ostream& ) const;
private:
std::list<C> clist;
D d;
};
inline std::ostream& operator<<( std::ostream& os, const X& x ) {
return x.print(os);
}
Solution
1. What does private mean for a class member in C++?
It means that outside code cannot access that member. Specifically, it cannot name it or call it.
For example, given this class:
class widget {
public:
void f() { }
private:
void f(int) { }
int i;
};
Outside code cannot use the name of the private members:
int main() {
auto w = widget{};
w.f(); // ok
w.f(42); // error, cannot access name "f(int)"
w.i = 42; // error, cannot access name "i"
}
2. Why does changing the private members of a type cause a recompilation?
Because private data members can change the size of the object, and private member functions participate in overload resolution.
Note that accessibility is still safely enforced: Calling code still doesn’t get to use the private parts of the class. However, the compiler gets to know all about them at all times, including as it compiles the calling code. This does increase build coupling, but it’s for a deliberate reason: C++ has always been designed for efficiency, and a little-appreciated cornerstone of that design is that C++ by default exposes a type’s full implementation to the compiler in order to make aggressive optimization easier. It’s one of the fundamental reasons C++ is an efficient language.
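To make both reasons concrete, here is a minimal sketch. The classes slim, padded, and gadget are invented for illustration and do not come from the example header. It shows a private data member changing the object size and a private overload winning overload resolution before the access check is applied.

```cpp
#include <cassert>

// Hypothetical classes with the same public interface but different
// private data.
class slim {
public:
    int get() const { return value; }
private:
    int value = 0;
};

class padded {
public:
    int get() const { return value; }
private:
    int value = 0;
    double cache[8] = {};   // private, yet it enlarges every padded object
};

// A caller that creates a padded on the stack must reserve sizeof(padded)
// bytes, so it has to be recompiled whenever the private members change.
static_assert( sizeof(padded) > sizeof(slim), "private data affects object size" );

class gadget {
public:
    int f( double ) { return 1; }
private:
    int f( int ) { return 2; }   // private, but still an overload candidate
};

int call_with_double() {
    gadget g;
    // g.f( 42 );       // error: overload resolution selects the private
    //                  // f(int), and only then is access checked
    return g.f( 42.0 ); // ok: exact match for the public f(double)
}

void demo() {
    assert( call_with_double() == 1 );
}
```

Uncommenting the g.f( 42 ) line gives an access error rather than a silent conversion to double, which is why the compiler has to see the private overload while compiling the caller.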
3. What further #includes could be removed if we made some suitable changes, and how? … any current code that already uses X should not be affected beyond requiring a simple recompilation.
There are a few things we weren’t able to do in the previous problem:
- We had to leave a.h and b.h. We couldn’t get rid of these because X inherits from both A and B, and you always have to have full definitions for base classes so that the compiler can determine X’s object size, virtual functions, and other fundamentals. (Can you anticipate how to remove one of these? Think about it: Which one can you remove, and why/how? The answer will come shortly.)
- We had to leave list, c.h and d.h. We couldn’t get rid of these right away because a list<C> and a D appear as private data members of X. Although C appears as neither a base class nor a member, it is being used to instantiate the list member, and some compilers have required that when you instantiate list<C> you be able to see the definition of C. (The standard doesn’t require a definition here, though, so even if the compiler you are currently using has this restriction, you can expect the restriction to go away over time.)
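As a quick reminder of what a forward declaration does and does not allow, here is a small sketch. The functions find_c, id_of, and twice and the struct holder are hypothetical, and C is a stand-in defined later in the same file only so that the example links.

```cpp
#include <cassert>

class C;                       // forward declaration: C is incomplete from here on

// All of these are fine while C is incomplete:
C*  find_c();                  // pointer to C
int id_of( const C& );         // reference to C
C   twice( C );                // by-value parameter and return (declaration only)
struct holder { C* ptr = nullptr; };   // pointer member

// These would need the full definition and fail to compile at this point:
// class derived : public C { };   // base class: size and virtuals unknown
// struct owner { C c; };          // by-value member: size unknown

// Only now does the definition appear, as it would in c.h.
class C {
public:
    explicit C( int i ) : id{ i } { }
    int id;
};

// Function bodies that create, copy, or inspect a C need the definition.
int id_of( const C& c ) { return c.id; }
C   twice( C c )        { return C{ c.id * 2 }; }
C*  find_c()            { static C instance{ 7 }; return &instance; }

void demo() {
    holder h;
    h.ptr = find_c();
    assert( id_of( *h.ptr ) == 7 );
    assert( twice( *h.ptr ).id == 14 );
}
```

This is why C can be forward-declared wherever it appears only in function signatures, while base classes and by-value data members pin their headers in place.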
Now let’s talk about the beauty of Pimpls.
The Pimpl Idiom
C++ lets us easily encapsulate the private parts of a class from unauthorized access. Unfortunately, because of the header file approach inherited from C, it can take a little more work to encapsulate dependencies on a class’ privates.
“But,” you say, “the whole point of encapsulation is that the client code shouldn’t have to know or care about a class’ private implementation details, right?” Right, and in C++ the client code doesn’t need to know or care about access to a class’ privates (because unless it’s a friend it isn’t allowed any), but because the privates are visible in the header the client code does have to depend upon any types they mention. This coupling between the caller and the class’s internal details creates dependencies on both (re)compilation and binary layout.
How can we better insulate clients from a class’ private implementation details? One good way is to use a special form of the handle/body idiom, popularly called the Pimpl Idiom because of the intentionally pronounceable pimpl pointer, as a compilation firewall.
A Pimpl is just an opaque pointer (a pointer to a forward-declared, but undefined, helper class) used to hide the private members of a class. That is, instead of writing this:
// file widget.h
//
class widget {
// public and protected members
private:
// private members; whenever these change,
// all client code must be recompiled
};
We write instead:
// file widget.h
//
#include <memory>
class widget {
public:
widget();
~widget();
// public and protected members
private:
struct impl;
std::unique_ptr<impl> pimpl; // ptr to a forward-declared class
};
// file widget.cpp
//
#include "widget.h"
struct widget::impl {
// private members; fully hidden, can be
// changed at will without recompiling clients
};
widget::widget() : pimpl{ std::make_unique<widget::impl>(/*...*/) } { }
widget::~widget() =default;
Every widget object dynamically allocates its impl object. If you think of an object as a physical block, we’ve essentially lopped off a large chunk of the block and in its place left only “a little bump on the side”: the opaque pointer, or Pimpl. If copy and move are appropriate for your type, write those four operations yourself: the copy operations perform a deep copy that clones the impl state, while the move operations only need to transfer the pointer.
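Here is one way those four operations might look, as a sketch rather than a definitive implementation. The label member and its accessors are invented for illustration, and both "files" are shown together so the example is self-contained.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// ---- widget.h (sketch) ----
class widget {
public:
    widget();
    ~widget();
    widget( const widget& );
    widget& operator=( const widget& );
    widget( widget&& ) noexcept;
    widget& operator=( widget&& ) noexcept;

    void set_label( std::string );        // hypothetical public interface
    const std::string& label() const;
private:
    struct impl;
    std::unique_ptr<impl> pimpl;
};

// ---- widget.cpp (sketch) ----
struct widget::impl {
    std::string label;
};

widget::widget() : pimpl{ std::make_unique<impl>() } { }
widget::~widget() = default;

// Copy: clone the hidden state (a moved-from source has a null pimpl).
widget::widget( const widget& other )
    : pimpl{ other.pimpl ? std::make_unique<impl>( *other.pimpl ) : nullptr } { }

widget& widget::operator=( const widget& other ) {
    widget tmp{ other };                  // copy-and-swap
    pimpl.swap( tmp.pimpl );
    return *this;
}

// Move: transferring the pointer is enough, so the defaults work.
widget::widget( widget&& ) noexcept = default;
widget& widget::operator=( widget&& ) noexcept = default;

void widget::set_label( std::string s ) { pimpl->label = std::move( s ); }
const std::string& widget::label() const { return pimpl->label; }

void demo() {
    widget a;
    a.set_label( "first" );
    widget b{ a };                        // deep copy
    b.set_label( "second" );
    assert( a.label() == "first" );       // a is unaffected by changes to b
    assert( b.label() == "second" );
    widget c{ std::move( a ) };           // a is left with a null pimpl
    assert( c.label() == "first" );
}
```

Note that all six special members are defined in the implementation file, where impl is a complete type. In this sketch a moved-from widget may only be destroyed, assigned to, or copied.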
The major advantages of this idiom come from the fact that it breaks the caller’s dependency on the private details, including breaking both compile-time dependencies and binary dependencies:
- Types mentioned only in a class’ implementation need no longer be defined for client code, which can eliminate extra #includes and improve compile speeds.
- A class’ implementation can be changed—that is, private members can be freely added or removed—without recompiling client code. This is a useful technique for providing ABI-safety or binary compatibility, so that the client code is not dependent on the exact layout of the object.
The major costs of this idiom are in performance:
- Each construction/destruction must allocate/deallocate memory.
- Each access of a hidden member can require at least one extra indirection. (If the hidden member being accessed itself uses a back pointer to call a function in the visible class, there will be multiple indirections, but it is usually easy to avoid needing a back pointer.)
And of course we’re replacing any removed headers with the <memory> header.
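One common way to avoid storing a back pointer, shown here as a sketch with a hypothetical counter class, is to pass the visible object to the hidden function as a parameter whenever the hidden code needs it.

```cpp
#include <cassert>
#include <memory>

// ---- counter.h (hypothetical class) ----
class counter {
public:
    counter();
    ~counter();
    int step() const { return 2; }   // visible member the hidden code needs
    int advance();                   // forwards to the hidden implementation
private:
    struct impl;
    std::unique_ptr<impl> pimpl;
};

// ---- counter.cpp ----
struct counter::impl {
    int total = 0;
    // The visible object arrives as a parameter, so impl stores no back pointer.
    int advance( const counter& self ) {
        total += self.step();
        return total;
    }
};

counter::counter() : pimpl{ std::make_unique<impl>() } { }
counter::~counter() = default;

int counter::advance() { return pimpl->advance( *this ); }

void demo() {
    counter c;
    assert( c.advance() == 2 );
    assert( c.advance() == 4 );
}
```

The forwarding call through pimpl is the one extra indirection mentioned above; passing *this avoids adding a second one.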
We’ll come back to these and other Pimpl issues in GotW #24. For now, in our example, there were three headers whose definitions were needed simply because they appeared as private members of X. If we instead restructure X to use a Pimpl, we can immediately make several further simplifications:
#include <list>
#include "c.h" // class C
#include "d.h" // class D
One of these headers (c.h) can be replaced with a forward declaration because C is still being mentioned elsewhere as a parameter or return type, and the other two (list and d.h) can disappear completely.
Guideline: For widely-included classes whose implementations may change, or to provide ABI-safety or binary compatibility, consider using the compiler-firewall idiom (Pimpl Idiom) to hide implementation details. Use an opaque pointer (a pointer to a declared but undefined class) declared as struct impl; std::unique_ptr<impl> pimpl; to store private nonvirtual members.
Note: We can’t tell from the original code by itself whether or not X had (default) copy or move operations. If it did, then to preserve that we would need to write them again ourselves since the move-only unique_ptr member suppresses the implicit generation of copy construction and copy assignment, and the user-declared destructor suppresses the implicit generation of move construction and move assignment. If we do need to write them by hand, the move constructor and move assignment can be =defaulted, and the copy constructor and copy assignment will need to copy the Pimpl object.
After making that additional change, the header looks like this:
// x.h: after converting to use a Pimpl
//
#include <iosfwd>
#include <memory>
#include "a.h" // class A (has virtual functions)
#include "b.h" // class B (has no virtual functions)
class C;
class E;
class X : public A, private B {
public:
~X(); // defined out of line
// and copy/move operations if X had them before
X( const C& );
B f( int, char* );
C f( int, C );
C& g( B );
E h( E );
virtual std::ostream& print( std::ostream& ) const;
private:
struct impl;
std::unique_ptr<impl> pimpl; // ptr to a forward-declared class
};
inline std::ostream& operator<<( std::ostream& os, const X& x ) {
return x.print(os);
}
Without more extensive changes, we still need the definitions for A and B because they are base classes, and we have to know at least their sizes in order to define the derived class X.
The private details go into X’s implementation file where client code never sees them and therefore never depends upon them:
// Implementation file x.cpp
//
#include "x.h"
#include <list>
#include "c.h" // class C
#include "d.h" // class D
using namespace std;
struct X::impl {
list<C> clist;
D d;
};
X::X( const C& ) : pimpl{ make_unique<X::impl>(/*...*/) } { }
X::~X() =default;
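To see the pieces working together, here is a single-file sketch. A, B, C, and D are empty stand-ins invented for illustration (the real ones are not shown in this article), X is reduced to its constructor and print(), and what the constructor does with its argument is an assumption.

```cpp
#include <cassert>
#include <list>      // in a real build, only x.cpp would include this
#include <memory>
#include <ostream>
#include <sstream>
#include <string>

// ---- stand-ins for a.h and b.h (hypothetical) ----
class A { public: virtual ~A() = default; };
class B { };

// ---- x.h, reduced to the constructor and print() ----
class C;                                   // forward declaration is enough here
class X : public A, private B {
public:
    X( const C& );
    ~X();                                  // defined where impl is complete
    virtual std::ostream& print( std::ostream& ) const;
private:
    struct impl;
    std::unique_ptr<impl> pimpl;
};
inline std::ostream& operator<<( std::ostream& os, const X& x ) {
    return x.print( os );
}

// ---- stand-ins for c.h and d.h, seen only by x.cpp ----
class C { public: int id = 0; };
class D { };

// ---- x.cpp ----
struct X::impl {
    std::list<C> clist;
    D d;
};
X::X( const C& c ) : pimpl{ std::make_unique<impl>() } {
    pimpl->clist.push_back( c );           // assumed use of the argument
}
X::~X() = default;
std::ostream& X::print( std::ostream& os ) const {
    return os << "X holding " << pimpl->clist.size() << " C object(s)";
}

// ---- client code ----
std::string describe( const X& x ) {
    std::ostringstream out;
    out << x;
    return out.str();
}

void demo() {
    C c;
    X x{ c };
    assert( describe( x ) == "X holding 1 C object(s)" );
}
```

Everything between the x.h marker and the stand-ins for c.h and d.h compiles with C still incomplete, which is exactly what lets the header drop list, c.h, and d.h.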
That brings us down to including only four headers, which is a great improvement—but it turns out that there is still a little more we could do, if only we were allowed to change the structure of X more extensively. This leads us nicely into Part 3…
Acknowledgments
Thanks to the following for their feedback to improve this article: John Humphrey, thokra, Motti Lanzkron, Marcelo Pinto.