Now that the unnecessary headers have been removed, it’s time for Phase 2: How can you limit dependencies on the internals of a class?
Problem
JG Questions
1. What does private mean for a class member in C++?
2. Why does changing the private members of a type cause a recompilation?
Guru Question
3. Below is how the header from the previous Item looks after the initial cleanup pass. What further #includes could be removed if we made some suitable changes, and how?
This time, you may make changes to X as long as X‘s base classes and its public interface remain unchanged; any current code that already uses X should not be affected beyond requiring a simple recompilation.
// x.h: sans gratuitous headers
//
#include <iosfwd>
#include <list>
// None of A, B, C, or D are templates.
// Only A and C have virtual functions.
#include "a.h" // class A
#include "b.h" // class B
#include "c.h" // class C
#include "d.h" // class D
class E;
class X : public A, private B {
public:
X( const C& );
B f( int, char* );
C f( int, C );
C& g( B );
E h( E );
virtual std::ostream& print( std::ostream& ) const;
private:
std::list<C> clist;
D d_;
};
std::ostream& operator<<( std::ostream& os, const X& x ) {
return x.print(os);
}
d_ and clist are not part of the public interface or part of X’s base classes. These members are unreferenced in the example. We can conclude that these members are unused. Therefore we can forward declare C, remove these unused members, and eliminate , “c.h” and “d.h”.
That is unless you consider the size of an object to be part of that object’s interface.
It seems that everyone seems to ignore that class A has virtual members. Class X also has virtual members anyway (the print function), and therefore needs a *virtual* destructor. This should also solve the unique_ptr pimpl issue and allow a pimpl to reduce implementation detail dependencies (all private members can be hidden).
Additionally the private inheritance of B can be replaced by encapsulation instead (inside the pimpl)
void X::swap_with( X& other )
{
swap( *this, other );
swap( *this, other );
swap( p_impl_, other.p_impl_ );
}
And having corrected that it now occurred to me,
what if
A
andB
have a common virtual base class?In that case, since the above code swaps the virtual base sub-object an even number of times, an extra swap of it needs to be added to bring the total number of swaps of this sub-object to an odd number:
void X::swap_with( X& other )
{
swap( *this, other );
swap( *this, other );
swap( *this, other );
swap( p_impl_, other.p_impl_ );
}
But if that swap is done via copy assignments and is costly, then this is needlessly inefficient.
For, instead – and I didn’t think of this earlier – to cater for the possibility of a common virtual base class for
A
andB
, theX
assignment operator can be expressed in terms of assignment, like this:auto X::operator=( X const& other )
-> X
I don’t think I’d do that optimization, but then, I don’t think I’d use PIMPL to get rid of an
#include
in the first place.Oh well. Hm. @Thibaud reminded me to add an
operator=
, which I did (the only answer so far to do that crucial thing), but I did that with an incomplete implementation ofswap_with
: theA
andB
base class sub-objects were not swapped in that code. I intended to, but I forgot. I think I forgot it mainly because I do such things by testing the code, and with just empty dummyA
andB
classes the code worked nicely…So, here’s an implementation using
std::swap
, assuming thatA
andB
are no-throw swappable:If the no-throw swappable assumption does not hold, then a
noexcept
would give the compiler incorrect information, so I have removed it here.In the case where swapping throws one can end up with an inconsistent state, which is the same general effect as with the compiler-generated assignment for the original class
X
. In more detail, the compiler generated assignment can yield an inconsistent state when copy assignment of a sub-object throws, and the above can yield an inconsistent state when swapping throws, which is most likely due to assignment to that sub-object throwing. Anyway, depending on the properties of the classes one can get an inconsistency.Summing up so far, to implement a class
X
copy assignment operator, as opposed to letting the compiler generate one, one has to assume that• classes
A
,B
,C
andD
are copy-constructible, and• the base classes
A
andB
are swappable (possibly just by providing copy assignment, possibly by specializingstd::swap
), but not necessarily non-throwing swappable.If instead of a user-defined
X
copy assignment operator one uses a cloning smart-pointer forp_impl_
, then it’s not strictly necessary to know anything aboutA
.andB
. For then the compiler does the job of considering the properties ofA
.andB
when it attempts to generateX
copy constructor and copy assignment operator, However, in the case where not all the classes are copy-constructible there is not much point in the cloning smart pointer (it will never be doing its thing), so the code will in practice depend on the properties of the classes.thefatredguy: in the current standard, a trailing return type is mandated if you use auto as a function’s return type. This is not because return type deduction isn’t possible, but simply because there were multiple concerns among the core working group that couldn’t be resolved in time for C++11.
Therefore, return type deduction has been proposed for C++14 and partially implemented in g++-4.8 by Jason Merrill. However, doing so will emit a warning that you’re not providing a trailing return type and thus write non-standard (C++11) code which will either not compile or only ne available as a vendor specific compiler extension. g++-4.7 or less properly emit a compile-time error.
The corresponding paper is here: http://isocpp.org/files/papers/N3638.html
All this insanity because C++ is still in the dark ages when it comes to organizing/compiling source code.
I see that a lot of people write functions like this
What would be the advantage over the more traditional way of doing it
?
One additional thought that I have not seen up to now in another post is that private inheritance can be considered as “is implemented in terms of”. That means we could add a B member in the Impl class and remove the
private B
inheritance. Then we could remove the#include "b.h"
and replace it with the proper forward declarationclass B;
. All that can only be done, if we don’t need access to protected parts of B (only a look at the implementation could tell us). Another reason to have to stick to inheritance would be the need to override a virtual function, but the comment says that B has none.As I write this I see that we could just let the internal class Impl inherit publicly from B. This way we would also have access to B’s protected members. Impl also would not need the mentioned B member.
But I would also object against ‘pimpl’ing the class. As it is now, the class has valid copy and move constructors and valid copy and move assignment operators. If we change the implementation to a pimpl based one, we would have to provide them manually to keep it externally unchanged. I don’t think this is justified just to be able to remove some header files.
@Thibaud:
“1. You can now remove #include [missing] in “x.h” and put it in the implementation file.”
Assuming that you meant
#include <list<>
it is in the implementation file (unfortunately the line in the header avoided deletion, it’s redundant).“2. You could also put B in Impl and replace #include “b.h” with a forward declaration. Doing this, B and X’s public interface remain unchanged.””
No.
The requirement is “as long as X‘s base classes and its public interface remain unchanged”. Which means: not only the public interface, for whatever interpretation of “public”, but also the base classes.
Private base classes can with some effort be regarded as part of the effective public interface because (1) they participate in name look-up, and (2) the old C-style cast is explicitly permitted to cast to an inaccessible base class (it’s the only cast that can be used for this with well-defined effect).
I find that interpretation a bit unnatural, but it or a similar practical consideration might be the reason why Herb wants to preserve base classes in addition to the normal public interface.
“3. I think operator= has to be defined?”
Right, I forgot that.
It is however rather difficult to do correctly with the information given, because we don’t know whether the original public interface provided copy constructor, or copy assignment operator, for that depends crucially on the nature of classes
C
andD
.But somehow I don’t think that Herb meant to withhold the copyability information!
So, just assuming that
C
andD
are fully copyable, and making the new methodswap_with
protected in order to preserve the public interface unchanged, the copying-supporting header can go like this:The implementation is now
“4. I don’t understand your first argument of not using std::unique_ptr. When I forget to define the destructor, the compiler tells me so (incomplete type) and give me an error.”
Sorry, mindless editing. :-) The argument was originally about smart pointers in general, such as now deprecated
auto_ptr
. Forstd::default_delete
(used byunique_ptr
) the code is “ill formed” (by C++11 §20.7.1.1.2/4) if the type is incomplete, so that with a C++11-conforming compiler one does get a diagnostic – for this particular smart pointer..Now I hope there are no more details to fix! <g>
Also, how exactly do you expect your pimpl_ptr to implement deep copying or destruction without the definition of Impl? That’s simply impossible.
And are you sure you want to force “my pimpl is null” as a moved-from state on every class using the pointer? Because that’s the only way to implement moving without the Impl definition, and the only way to implement nothrow move when Impl itself doesn’t have it.
Face it, if you use PIMPL, you *have* to define your own special members out of line. That’s part of how the compilation barrier works. Luckily you can use =default in the implementation file, so the duplication stays manageable.
unique_ptr is safe to use for PIMPL: the default deleter’s operator () will not compile if the type is incomplete.
1. Private means that only member functions and friends (including methods in friend classes) may access a member directly. Overridden functions in derived classes do not have this privilege.
2. Because the file defining the class is changed and therefore the file and all files #including it need to be recompiled. The reason for recompilations is the way C++ code is organized.
But there’s another reason that shows why this cannot be changed that easily. Changing private members affects the object size of a class. Moreover, some private member might have customized default constructors, copy or move constructors or assignment or destructor. If the respective member function of the class containing it is implicitly defined or defaulted, then every compilation unit that uses any of these member functions calls the member function *inline*. The behavior of these inlined functions can already change when only the order of private data members is changed. A prominent example where this happens is when a data member of a scoped thread type should be listed as last data member, but isn’t. In that case some important other data members that might be accessed by the thread get destroyed first in the destructor while the thread is still executing. Kaboom!
Therefore it makes sense to recompile code like this when the order of private data members changes.
3. The solution it to use the pimpl idiom. It’s actually possible to remove all the header except two: iosfwd and a.h. It’s not necessary to keep
B
as a base ofX
since it is not part of the public or protected interface of the class. You can put that base class into theImpl
class and (re)implement any inherited virtual functions fromB
there. I’ve used this a lot in some of my code. It has proven to work wonderfully, especially when a class would otherwise inherit a big bunch of interfaces privately. (The use case of implementing private interfaces is that the class itself can implicitly cast itself to such an interface when connecting to code it uses.)If the answer should be conforming to the exact exercise, then X must inherit privately from B
Instead of
std::unique_ptr
other more suitable smart pointer types could be used that are specifically crafted for being pimpl pointers. In particular this smart pointer type should support– propagating constness
– a deep copying copy constructor and copy assignment
– a destructor which calls the correct destructor of the held object
– cheap move constructor and move assignment with nothrow guarantee
– nothrow swap
Furthermore, the said copy and move constructors and assignments and the destructor should work without the definition of the
Impl
class. Otherwise the class containing the pimpl pointer will have to declare these operations for itself and implement them in a standard way in the implementation file which is a lot of code duplication and simply annoying. It would be really nice to have a standard solution to this, but neither boost nor the standard library provide one (as far as I know).@ Alf P. Steinbach :
1. You can now remove
#include
in “x.h” and put it in the implementation file.2. You could also put B in
Impl
and replace#include "b.h"
with a forward declaration. Doing this, B and X’s public interface remain unchanged.3. I think operator= has to be defined?
4. I don’t understand your first argument of not using std::unique_ptr. When I forget to define the destructor, the compiler tells me so (incomplete type) and give me an error.
An oversight: in the PIMPL’ed header the
#include <list>
directive should be removed. It’s in the implementation file. Somehow the original directive managed to avoid deletion.A solution to just the technical aspects of the guru question.
The original header as given above suffers from some coding errors:
• No include guard or
#pragma once
• Declaration of class
E
missing.• No
inline
for theoperator<<
(in practice this means a linking error when the header is included in morethan one translation unit).
The original header with the above defects fixed (
#pragma once
is a de facto standard, supported by all compilers that I care about, which includes Visual C++ and g++):First, since base classes have to be left as-is, classes
A
andB
must be complete.Classes
C
and
E
are used for arguments and function results, so declarations of these classes are needed.Class D can however be completely removed from this header by moving the state to an implementation object, the PIMPL idiom.
A good way to implement the PIMPL idiom is to use a raw pointer and `new` in each `X` constructor, and `delete` in the `X` destructor.
A bad way to implement the PIMPL idiom is to use `std::unique_ptr`, because:
• It’s easy to get W R O N G. For if it’s not done painstakingly correct, then the default implementation will rely on a C++ “feature” where it can
delete
a pointer to incomplete type, by just assuming that it has a trivial do-nothing destructor. Then the implementation object destructor is not invoked.• It brings in an extra header include, namely
<memory>
, just in order to get rid of a header include…A simple way to do a
unique_ptr
-based implementation correct is to declare the `X` destructor, and define it in the implementation file, which brings the instantiation of `default_delete` to a point where the implementation object class is complete.However, rather than rely on such subtlety, and rather than bringin in an extra header, just use a raw pointer for this, whence the header can look like …
And the implementation file can look like this:
I have commented earlier on why I don’t consider it a generally good idea to use PIMPL to get rid of header includes, mainly that it introduces an inefficiency (dynamic allocation) and subtlety (such as inadvertently introducing UB by incorrect use of
unique_ptr
) just in order to treat a symptom, whose main causes should ideally instead be addressed. Also, the money might be better spent on hardware for build-servers than on programmers’s salaries. But sometimes one may not have a choice about it.I found some web references for the source concatenation technique (which addresses the archaic tool problem):
* http://en.wikipedia.org/wiki/Single_Compilation_Unit
* http://gamesfromwithin.com/physical-structure-and-c-part-2-build-times
Now I see people downvoting comments here. A downvote is an anonymous social argument, which only makes sense in a social community such as SO (e.g., I very much doubt that Herb will be influenced by the votes: the voting functionality just an artifact of the WordPress blogging system). Please, in order to express disagreement or opinion or perceived fact,, write it up as a comment so that readers can know or guess what it’s about.
Little correction, it’s supposed to be
std::shared_ptr d_;
– for the same reason it’s used as the template parameter for the list.Although the intent of the author obviously wasn’t to allow sharing of
d_
, IMHO it’s not important. D is completely internal (i.e. neither part of a public interface, nor accessible via friends) and except for slightly increased memory usage and the indirection, it solves the problem of wanting to get rid of the include.Again, including
d.h
might not be a problem in the first place …I’d go as far as saying that using a PIMPL simply to be able to throw out some includes is kind of a premature optimization. Get hard numbers in terms of compile times and really compare two implementations (with and w/o the PIMPLs) – if you get a significant speedup you can go ahead and find a way to deal with the run-time/maintenance caveats already mentioned by Alf. Also, the indentation of the “public” access specifier is a little off. ;)
I’m not sure if this was mentioned in [More] Exc. C++ or an older GotW, but one purpose of a PIMPL (or whatever you wanna call it) tends to be overlooked in favor of this uncertain compile-time advantage: helping binary compatibility. Still, employing a PIMPL for that purpose is a completely different design choice than using the PIMPL merely to reduce compile-time dependencies – and using it just because you get both “treats” at the same time simply isn’t the way to go …
Also, for
D d_;
there is absolutely no reason why this cannot be declared asAlso, for the purpose of reducing dependencies, why wouldn’t a
std::list
ofstd::shared_ptr
(sincestd::unique_ptr
tends to be a pain when it comes to incomplete types, i.e. default-deleter … ) not be a viable alternative?Yes, we get an additional indirection on access (just like when using a PIMPL) but unless binary compat. matters, it’s IMHO much easier to read, understand and maintain than a private class. and boilerplate delegation code.
As John Humphrey noted the “class E” declaration appears to have gone missing.
Also, there’s still no include guard or
.
Also, the
is still not declared
.
The above are coding errors, ungood in example code (not to mention production code)..
And with this new part II question (I’m happy that I didn’t overlook anything for part I) this GOTW is no longer about fixing things. It is now about introducing inefficiences (for PIMPL) and/or a maintainance nightmare (for duplication) alleviate symptoms of ungoodness, which IMHO is misguided.
The ungoodness: ungood physical packaging of the code (distribution of code in files), and ungood tools (archaic compiler technology, e.g. with Visual C++ and g++).
The physical packaging is too course grained and/or the software design does not properly support physical packaging, iif the headers drag in enough stuff to make a successful build too slow.
The tools, well we can’t do much about them, but a reasonable modern approach would be for a compiler to by default compile several C++ translation units in one go, checking each header (except for [assert.h]) only once, Also, for a compiler to not waste time on storing context info in order to be able to continue after error with an avalanche of misleading and outright incorrect messages. nstead, as (IIRC) Turbo C++ did, it should stop at first error, and be taking advantage of that so that it’s fast about it.
The archaic tool problem here is so much a pain in the a[censored][censored] that people have, seriously!, proposed to merge all headers and source code into one big file before compiling. They have even described this approach as an “idiom’, which presumably means that in some shops or projects it’s regularly used, and it has a name, which alas I can’t recall ATM. Given that the tools should better be fixed, so that people don’t have to do hazardious things like that!
Introducting inefficiences and/or maintainance problems, as this part II of the GOTW is about, is of course an alternative to just suffering build times and/or doing the risky source concatenation thing, but employing these part II techniques negates much of the purpose of using C++ in the first place…
Here’s my take on it :
1. When a member (either data or function) is declared as private, only
member and friend functions of that class have access to it.
2. Because of the #include model. Even though the private part of the class definition
is not accessible to clients, its still visible to anyone including the header
(basically the implementation details of the class end up leaking into client code).
3. We move the implementation details out of the class interface
(the PIMPL idiom – I first saw this in Exceptional C++, a few years ago) :
In the implementation file :
Its not without any drawbacks though. There’s an extra allocation for each
object of class X and, from the performace point of view,
another level of indirection.
1) private means nothing except methods (static or non-static) of the class itself, and befriended functions may access the member. befriended functions include free functions declared as friends, as well as methods of classes declared as friends.
2) changing *any* member, including private members, might cause a change in the memory layout of a class, or in the vtable, function overload resolution and a number of other things that affect compilation of nonfriends, thereore any entity using the class has to be recompiled.
3) b.h, c.h and d.h can be gotten rid of by using the pimpl idiom: declare a struct in the header and let the only data member be a pointer to that struct. move all implementation details (private data members and privatly inherited classes) to the impl-struct. Define that struct in the .cpp.
The logic of the class’ methods can either stay in the class’ implementation (pure data pimpl) or be forwarded completely to the pimpl (handle body idiom) or a mixture of both
The “class E” declaration appears to have gone missing.