GotW #7c: Minimizing Compile-Time Dependencies, Part 3

Herb Sutter C++ 2013-12-31 1 Minute

Now the unnecessary headers have been removed, and avoidable dependencies on the internals of the class have been eliminated. Is there any further decoupling that can be done? The answer takes us back to basic principles of solid class design.

Problem

JG Question

1. What is the tightest coupling you can express in C++? And what’s the second-tightest?

Guru Question

2. The Incredible Shrinking Header has now been greatly trimmed, but there may still be ways to reduce the dependencies further. What further #includes could be removed if we made further changes to X, and how?

This time, you may make any changes at all to X as long as they don’t change its public interface, so that existing code that uses X is unaffected. Again, note that the comments are important.

//  x.h: after converting to use a Pimpl to hide implementation details
//
#include <iosfwd>
#include <memory>
#include "a.h"  // class A (has virtual functions)
#include "b.h"  // class B (has no virtual functions)
class C;
class E;

class X : public A, private B {
public:
       X( const C& );
    B  f( int, char* );
    C  f( int, C );
    C& g( B );
    E  h( E );
    virtual std::ostream& print( std::ostream& ) const;

private:
    struct impl;
    std::unique_ptr<impl> pimpl;   // ptr to a forward-declared class
};

std::ostream& operator<<( std::ostream& os, const X& x ) {
    return x.print(os);
}

Published by Herb Sutter

Herb Sutter is an author and speaker, and a technical fellow at Citadel Securities. He serves as chair of the Standard C++ Foundation and its conference CppCon, and served as chair of the ISO C++ standards committee from 2002 to 2025.

Published 2013-12-31

14 thoughts on “GotW #7c: Minimizing Compile-Time Dependencies, Part 3”

@Roman M:
Timing can be enabled by setting:
Tools->Options->Projects and Solutions->VC++ Project Settings->Build Timing
(IIRC at least since VS 2010, but might be even earlier)

Output looks like

12>Project Performance Summary:
12>    326932 ms  C:\Dan\Klimax-tools\tools\net\EmuleMorphXT\MorphXT\emule100.vcxproj   1 calls
12>              326932 ms  Rebuild                                    1 calls

12>Target Performance Summary:
12>        0 ms  AfterRebuild                               1 calls
12>        0 ms  ResolveReferences                          1 calls
12>        0 ms  SelectClCompile                            1 calls
12>        0 ms  _CheckForCompileOutputs                    1 calls
12>        0 ms  GetResolvedWinMD                           1 calls
12>        0 ms  CleanPublishFolder                         1 calls
12>        0 ms  CleanReferencedProjects                    1 calls
12>        0 ms  _PrepareForBuild                           1 calls
12>        0 ms  _Midl                                      1 calls
12>        0 ms  BuildCompile                               1 calls
12>        0 ms  ComputeManifestGeneratedLinkerInputs       1 calls
12>        0 ms  BeforeClean                                1 calls
12>        0 ms  CreateCustomManifestResourceNames          1 calls
12>        0 ms  AfterCppClean                              1 calls
12>        0 ms  BeforeLink                                 1 calls
12>        0 ms  AfterBuild                                 1 calls
12>        0 ms  AfterBuildCompileEvent                     1 calls
12>        0 ms  _BuildLinkAction                           1 calls
12>        0 ms  BeforeResourceCompile                      1 calls
12>        0 ms  AfterMidl                                  1 calls
12>        0 ms  PreLinkEvent                               1 calls
12>        0 ms  _ClCompile                                 1 calls
12>        0 ms  AfterClean                                 1 calls
12>        0 ms  SelectCustomBuild                          1 calls
12>        0 ms  AfterClCompile                             1 calls
12>        0 ms  _CopySourceItemsToOutputDirectory          1 calls
12>        0 ms  AfterResourceCompile                       1 calls
12>        0 ms  ComputeCustomBuildOutput                   1 calls
12>        0 ms  _Xsd                                       1 calls
12>        0 ms  AfterBuildGenerateSources                  1 calls
12>        0 ms  Rebuild                                    1 calls
12>        0 ms  MakeDirsForBscMake                         1 calls
12>        0 ms  BeforeRebuild                              1 calls
12>        0 ms  MakeDirsForResourceCompile                 1 calls
12>        0 ms  _ResourceCompile                           1 calls
12>        0 ms  ComputeLinkInputsFromProject               1 calls
12>        0 ms  BeforeCppClean                             1 calls
12>        0 ms  _Link                                      1 calls
12>        0 ms  BuildLinkTraverse                          1 calls
12>        0 ms  PrepareForRun                              1 calls
12>        0 ms  _XdcMake                                   1 calls
12>        0 ms  CreateSatelliteAssemblies                  1 calls
12>        0 ms  MakeDirsForManifest                        1 calls
12>        0 ms  SelectResourceCompile                      1 calls
12>        0 ms  ExpandSDKReferences                        1 calls
12>        0 ms  BuildCompileTraverse                       1 calls
12>        0 ms  _GenerateSatelliteAssemblyInputs           1 calls
12>        0 ms  _Deploy                                    1 calls
12>        0 ms  ComputeMIDLGeneratedCompileInputs          1 calls
12>        0 ms  CppClean                                   1 calls
12>        0 ms  PrepareResourceNames                       1 calls
12>        0 ms  AfterBuildGenerateSourcesEvent             1 calls
12>        0 ms  _SplitProjectReferencesByFileExistence     1 calls
12>        0 ms  BeforeClCompile                            1 calls
12>        0 ms  _SelectedFiles                             1 calls
12>        0 ms  AfterResolveReferences                     1 calls
12>        0 ms  BeforeResolveReferences                    1 calls
12>        0 ms  AfterLink                                  1 calls
12>        0 ms  _ALink                                     1 calls
12>        0 ms  PreBuildEvent                              1 calls
12>        0 ms  MakeDirsForMidl                            1 calls
12>        0 ms  _BscMake                                   1 calls
12>        0 ms  ResolveSDKReferences                       1 calls
12>        0 ms  _BuildGenerateSourcesAction                1 calls
12>        0 ms  _Appverifier                               1 calls
12>        0 ms  ComputeMASMOutput                          1 calls
12>        0 ms  Clean                                      1 calls
12>        0 ms  ComputeRCGeneratedLinkInputs               1 calls
12>        0 ms  BuildGenerateSourcesTraverse               1 calls
12>        0 ms  MakeDirsForXdcMake                         1 calls
12>        0 ms  Build                                      1 calls
12>        0 ms  BeforeBuildGenerateSources                 1 calls
12>        0 ms  ResolvedXDCMake                            1 calls
12>        0 ms  _PrepareForRebuild                         1 calls
12>        0 ms  GetInstalledSDKLocations                   1 calls
12>        0 ms  _BuildCompileAction                        1 calls
12>        0 ms  BuildLink                                  1 calls
12>        0 ms  ComputeLegacyManifestEmbedding             1 calls
12>        0 ms  BuildGenerateSources                       1 calls
12>        1 ms  PGInstrumentedClean                        1 calls
12>        1 ms  ResolveAssemblyReferences                  1 calls
12>        1 ms  CoreClean                                  1 calls
12>        1 ms  ComputeLinkSwitches                        1 calls
12>        1 ms  FinalizeBuildStatus                        1 calls
12>        1 ms  CopyFilesToOutputDirectory                 1 calls
12>        1 ms  AssignProjectConfiguration                 1 calls
12>        1 ms  ComputeRCOutputs                           1 calls
12>        1 ms  ResolveProjectReferences                   1 calls
12>        1 ms  ComputeReferenceCLInput                    1 calls
12>        1 ms  ComputeCLCompileGeneratedSbrFiles          1 calls
12>        1 ms  GetCopyToOutputDirectoryXamlAppDefs        1 calls
12>        1 ms  _PrepareForClean                           1 calls
12>        1 ms  _CheckForInvalidConfigurationAndPlatform   1 calls
12>        1 ms  ComputeManifestInputsTargets               1 calls
12>        1 ms  GetFrameworkPaths                          1 calls
12>        1 ms  GetReferenceAssemblyPaths                  1 calls
12>        1 ms  GetCopyToOutputDirectoryItems              1 calls
12>        1 ms  PlatformPrepareForBuild                    1 calls
12>        1 ms  SplitResourcesByCulture                    1 calls
12>        2 ms  _PrepareForReferenceResolution             1 calls
12>        2 ms  MakeDirsForCl                              1 calls
12>        2 ms  ComputeCLInputPDBName                      1 calls
12>        2 ms  DoLinkOutputFilesMatch                     1 calls
12>        2 ms  MakeDirsForLink                            1 calls
12>        3 ms  ComputeCLGeneratedLinkInputs               1 calls
12>        3 ms  ComputeCLCompileGeneratedXDCFiles          1 calls
12>        3 ms  CheckInstalledVCLibsIPP                    1 calls
12>        3 ms  SetBuildDefaultEnvironmentVariables        1 calls
12>        3 ms  InitializeBuildStatus                      1 calls
12>        4 ms  ComputeLinkImportLibraryOutputsForClean    1 calls
12>        5 ms  RegisterOutput                             1 calls
12>        5 ms  AssignTargetPaths                          1 calls
12>        8 ms  SetCABuildNativeEnvironmentVariables       1 calls
12>       10 ms  ComputeCLOutputs                           1 calls
12>       16 ms  PrepareForBuild                            1 calls
12>       53 ms  WarnCompileDuplicatedFilename              1 calls
12>      154 ms  PostBuildEvent                             1 calls
12>      295 ms  CoreCppClean                               1 calls
12>      301 ms  Manifest                                   1 calls
12>      302 ms  _Manifest                                  1 calls
12>      565 ms  ResourceCompile                            1 calls
12>      651 ms  _MASM                                      2 calls
12>     4252 ms  Link                                       1 calls
12>    320559 ms  ClCompile                                  1 calls

Output is per-project.

Roman M. says:

2014-01-06 at 3:43 am

@Loïc Joly: I am aware of the tool you mentioned. Our codebase is Visual C++ only. Some time ago I tried it and couldn’t get it running because of Microsoft’s implementation of the STL. The current “README.txt” of the “Include What You Use” project states:

“IWYU, like Clang, does not yet handle some of the non-standard constructs in Microsoft’s STL headers.”

On the other hand, I found a recent message on the Clang mailing list (http://lists.cs.uiuc.edu/pipermail/cfe-dev/2013-October/032629.html) stating that support for the Microsoft STL has improved, so maybe now is the time to give it another try. I just wish I had something similar from Microsoft.
Loïc Joly says:

2014-01-05 at 1:41 pm

@Roman “- we have absolutely _no_ tool support, so a developer have to _guess_ which header has greatest impact and is worth her time;”

There is a tool whose purpose is to automate, to some extent, the removal of header files: http://code.google.com/p/include-what-you-use/ I’ve not used it myself, so I cannot say how good it is. It has some quite complex heuristics to select which files should be included and which files should not (it does not try to reach the absolute minimum). This article describes some of them: http://code.google.com/p/include-what-you-use/wiki/WhyIWYUIsDifficult.
Roman M. says:

2014-01-04 at 5:30 am

@Jussi: nice script you have there! I am not aware of anything similar for Visual C++. Unfortunately, your blog post didn’t reveal whether the good compilation throughput is due to Pimpl or just because those CPPs define base classes and naturally have few or no dependencies. The mere fact that compile throughput varies so much across diverse projects suggests that solving compile-time issues is hard. Your measurement, while offering interesting insights, lacks information on where it makes sense to introduce Pimpl: the closer to the class hierarchy root we get, the bigger the impact that can be expected. One would need some kind of directed header include graph weighted by the compilation time of the corresponding CPP.
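
One possible reading of such a weighted include graph, as a rough sketch only: for each header, add up the compile times of every .cpp file that includes it directly or transitively. The type alias, function names, file names, and timings below are all made up for illustration; real input would have to come from the build system.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// file -> files it includes directly (hypothetical input format)
using IncludeGraph = std::map<std::string, std::vector<std::string>>;

// Collect every file reachable from `file` through #include edges.
// The `seen` set also protects against include cycles.
void collect_includes(const IncludeGraph& g, const std::string& file,
                      std::set<std::string>& seen) {
    const auto it = g.find(file);
    if (it == g.end()) return;
    for (const auto& inc : it->second)
        if (seen.insert(inc).second)
            collect_includes(g, inc, seen);
}

// For each header, sum the compile times of all .cpp files that pull it in.
std::map<std::string, double> header_weights(
        const IncludeGraph& g,
        const std::map<std::string, double>& cpp_seconds) {
    std::map<std::string, double> weight;
    for (const auto& [cpp, seconds] : cpp_seconds) {
        std::set<std::string> seen;
        collect_includes(g, cpp, seen);
        for (const auto& header : seen) weight[header] += seconds;
    }
    return weight;
}

// Usage with invented data: base.h is reached from both .cpp files.
void demo_header_weights() {
    const IncludeGraph g{
        {"a.cpp", {"x.h"}},
        {"b.cpp", {"x.h", "y.h"}},
        {"x.h",   {"base.h"}},
    };
    const std::map<std::string, double> seconds{{"a.cpp", 2.0}, {"b.cpp", 3.0}};
    auto w = header_weights(g, seconds);
    assert(w["base.h"] == 5.0);   // included by both translation units
    assert(w["y.h"] == 3.0);      // included by b.cpp only
}
```

A header with a large weight near the root of the graph would be the first candidate to look at, though the weight says nothing about how much of that time the header itself is responsible for.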

On a side note: in my projects about 60% of build time is spent by the linker, but that’s another story ;)
Jussi Pakkanen says:

2014-01-03 at 5:47 am

@roman, the difference in compile speeds between pimpl’d and non-pimpl’d code bases can be huge. Not 10%. Not even 50%. The difference can be an order of magnitude. I made a measurement of different code bases a while ago:

http://voices.canonical.com/jussi.pakkanen/2013/08/23/comparing-build-speeds-of-different-code-bases/

This speed increase alone has massive customer value: programmers can get stuff done faster so you can create better products faster, fix bugs faster and so on. Once you have experienced near-instantaneous incremental build times in C++ you never want to go back to spending minutes at a time watching scrolling text in a terminal.
Roman M. says:

2014-01-02 at 7:50 am

Hello Herb,
in my experience, following your advice in large projects is, at the current state of the language, unfortunately utopian and rarely pays off:

– dependency rules are often not obvious, so a conservative developer will rather include too much instead of risking a broken build;
– we have to sacrifice performance for a compile-time issue which has arguably zero customer value;
– we have absolutely _no_ tool support, so a developer has to _guess_ which header has the greatest impact and is worth her time;
– even if some issues were somehow located and fixed using PIMPL, a regression is just a matter of time;
– templates make the situation much worse;

Some time ago I created an issue on UserVoice (http://visualstudio.uservoice.com/forums/121579-visual-studio/suggestions/2680137-make-header-include-refactoring-possible). I understand that creating a good solution for compile-time issues is far from trivial, but currently I would probably invest in better hardware instead of wasting my time on PIMPL.
Phil Nash says:

2014-01-02 at 4:47 am

@Herb. This is not a response to your questions, but is pertinent to the topic.

I’ve been following this series with interest to see if I can pick up any more tips in my war against build-time-sapping dependencies. While it’s all been good stuff and covered well, I’ve not seen anything new – yet I still suffer from productivity-killing compile times in several projects I’m working on (to be clear, these are mostly legacy code bases where it’s too late for the large-scale refactoring necessary to take advantage of some of these techniques on enough of the code base – or the runtime cost is too high – e.g. with the pImpl idiom).
We’ve had some success with pre-compiled headers – but also constantly have problems with them (VS2008 – has it finally got any better in more recent incarnations?).

In the first post you alluded to modules – which I think should be a huge win in this area. But when I asked Bjarne about it earlier this year he seemed to think they weren’t even on the radar for C++17 at this point. Is that still the case? Can you give any more insight here? Will you really be covering it in an upcoming post?
Phil Nash says:

2014-01-02 at 4:39 am

@Matthieu (& @herb) the fixed-size block of storage with placement new (and an explicit destructor call) is the usual solution. As Herb says, adding a bit of extra headroom is usually a good idea.
Another good idea is to put a static assert in the implementation to make sure the compiler tells you if that size ever becomes too small.

However, it may also be worth considering using a pool allocator, or some other more efficient allocator (I’ve found tbb’s scalable_allocator to be a very good general purpose allocator) to keep the size dynamic (and thus always correct), but with a much smaller overhead. You can’t beat the near zero overhead of the obtrusive storage block but a decent allocator may be good enough for a lot of cases where the std allocator isn’t.
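
A minimal sketch of the fixed-size storage approach described above, assuming C++17 (for std::launder). The class name Widget, the 64-byte capacity, and the members of impl are invented for illustration; the two "files" are shown in one listing so that it compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <utility>

// ---- widget.h (Widget is a hypothetical class used only for illustration) ----
class Widget {
public:
    explicit Widget(std::string name);
    ~Widget();
    Widget(const Widget&) = delete;             // copying omitted to keep the sketch short
    Widget& operator=(const Widget&) = delete;

    const std::string& name() const;

private:
    struct impl;                                 // defined only in the .cpp file

    // Assumed capacity: the expected size of impl plus some headroom.
    static constexpr std::size_t impl_size  = 64;
    static constexpr std::size_t impl_align = alignof(std::max_align_t);

    impl*       self();
    const impl* self() const;

    alignas(impl_align) unsigned char storage_[impl_size];
};

// ---- widget.cpp ----
struct Widget::impl {
    std::string name;
    int         hits;
};

Widget::Widget(std::string name) {
    // The compiler reports it if impl ever outgrows the reserved block.
    static_assert(sizeof(impl)  <= impl_size,  "impl_size is too small");
    static_assert(alignof(impl) <= impl_align, "impl_align is too small");
    ::new (static_cast<void*>(storage_)) impl{ std::move(name), 0 };
}

Widget::~Widget() { self()->~impl(); }           // explicit destructor call

Widget::impl* Widget::self() {
    return std::launder(reinterpret_cast<impl*>(storage_));
}
const Widget::impl* Widget::self() const {
    return std::launder(reinterpret_cast<const impl*>(storage_));
}

const std::string& Widget::name() const { return self()->name; }

// Usage: no heap allocation is made for the impl object itself.
void demo_widget() {
    Widget w("gadget");
    assert(w.name() == "gadget");
    assert(sizeof(Widget) >= 64);
}
```

The trade-off is the one raised in the question: changing impl_size changes sizeof(Widget), so clients must be recompiled whenever the headroom runs out.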
PiotrP says:

2014-01-01 at 9:45 am
1) I think the tightest coupling in C++ is friendship (it gives unrelated classes and functions access to private members). Inheritance is second.

2) The relationship with B can be moved to impl, since B has no virtual functions and is a private base class of X.
This modification doesn’t change public interface of X.
```
//x.h

#include <iosfwd>   // std::ostream (forward declaration)
#include <memory>   // std::unique_ptr
#include "a.h"  // class A (has virtual functions)
class B;  // class B (has no virtual functions) - forward declared
class C;
class E;

class X : public A{
public:
       X( const C& );
    B  f( int, char* );
    C  f( int, C );
    C& g( B );
    E  h( E );
    ~X();
    virtual std::ostream& print( std::ostream& ) const;

private:
    struct impl;
    std::unique_ptr<impl> pimpl;   // ptr to a forward-declared class
};
```
```
//x.cpp
#include "x.h"
#include "b.h"
struct X::impl: public B{
    /* ... */
};
X::X( const C& ) : pimpl{ std::make_unique<X::impl>(/*...*/) } { }
X::~X() = default;   // defined here, where impl is a complete type
```
After the changes, all calls to B’s member functions in X must be redirected to impl.
For this to work, B should be a public base of X::impl. If B has protected members, calls to them should be wrapped in public functions of impl so that X can reach them. Another way is to make X a friend of X::impl (B can then remain a private base of X::impl).
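
A small sketch of the wrapper variant, with everything in one listing so that it compiles on its own. The members of B and the function X::bump are hypothetical stand-ins; only the shape (impl publicly derives from B and exposes a protected member through a public function) follows the description above.

```cpp
#include <cassert>
#include <memory>

// Stand-in for b.h: no virtual functions; the members are invented for the example.
class B {
public:
    int value() const { return value_; }
protected:
    void set_value(int v) { value_ = v; }
private:
    int value_ = 0;
};

// Stand-in for x.h, reduced to the relevant parts. B is not mentioned here.
class X {
public:
    X();
    ~X();
    int bump();                      // hypothetical member that used to call into B
private:
    struct impl;
    std::unique_ptr<impl> pimpl;
};

// Stand-in for x.cpp.
struct X::impl : public B {
    // Public wrapper so that X can reach B's protected member through impl.
    void store(int v) { set_value(v); }
};

X::X() : pimpl{ std::make_unique<impl>() } {}
X::~X() = default;

int X::bump() {
    pimpl->store(pimpl->value() + 1);   // was: set_value( value() + 1 )
    return pimpl->value();
}

// Usage
void demo_bump() {
    X x;
    assert(x.bump() == 1);
    assert(x.bump() == 2);
}
```

A using-declaration (`using B::set_value;` in the public part of impl) would be an alternative to the hand-written wrapper.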
Herb Sutter says:

2014-01-01 at 7:31 am

@Matthieu: Yes, that’s basically what you’d need to do to avoid the allocation overhead in Pimpl — have a block of storage in the class of sufficient size, perhaps with extra space so that you can add some extra members in the future without causing an ABI breakage or recompilation.

Edited to add: BTW, I covered this topic in GotW #28 back in 1997. That one will (eventually) get updated too.
mjklaim says:

2014-01-01 at 5:55 am

1. Inheritance.
2. X::impl could inherit from B in the cpp file, which would allow us to move #include "b.h" into the cpp file.
This is because B is not part of the public interface, so it’s an implementation detail, AND it doesn’t have virtual functions, which would be a kind of externally available interface.
For the same reasons, we can’t remove A from the interface, so its include needs to stay here.
Matthieu M. says:

2014-01-01 at 5:45 am

JG: Friendship is the tightest coupling, followed by inheritance; thus the guideline, prefer composition over inheritance.

Guru: Given the big clue (read the comments) and the previous appetizer, it seems obvious that X need not inherit from B. Since we are already using PIMPL, all uses of B should thus move to the source file and therefore the include of B can be removed. B will still need to be forward declared since it is mentioned in the interface.

Question: I see the use of unique_ptr here, which requires a separate allocation. I have wondered a couple of times how to get rid of this separate allocation (it slows down construction and hurts cache friendliness), and I could only come up with using aligned_storage for a raw block of memory (of predetermined size) in X. However, the need to size this block suitably raises the issue of ABI breakage if the size ever needs to change, so this is less stable ABI-wise (but still insulates the client from most headers). Any thoughts?
Matt Fioravante says:

2013-12-31 at 10:00 pm

Hi Herb, this is a great topic. I’ve submitted a proposal to further reduce compile times and increase encapsulation by allowing the programmer to define additional non-virtual private methods outside of class scope. The current restriction is an artificial limit that just makes class design less flexible and requires unnecessary recompilation.

https://github.com/fmatthew5876/stdcxx-privext

Do you have any opinion about this idea?
germinolegrand says:

2013-12-31 at 7:23 pm
1) Inheritance is the tightest coupling you can ever find in C++ (inheritance with virtual overriding really is the most complicated dependency to hide).
The second-tightest is membership. But you already talked about it in part #7b: we can get rid of it with the Pimpl idiom if it’s private.

These are the ones that determine the size of the class, which is what the compiler always needs. Interestingly, these are also the ones that can’t be recursive, which makes it a solvable problem (no cyclic inclusion) without any indirection.

2) The private inheritance from class B, which has no virtual functions, can be transformed into a private member. This has no consequence on the size unless B is an empty class (as a base it can take up no space thanks to the empty base optimization, but as a member it must occupy at least 1 byte, if I am not mistaken). Anyway, once it is a private member, it can become a member of the pimpl and simply disappear from class X.
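
The size remark can be checked with a few lines. This is only an illustration with made-up types (Empty, AsBase, AsMember); the exact numbers depend on the implementation, and the comments describe what mainstream compilers typically do.

```cpp
#include <cstddef>

struct Empty {};                             // no data members, no virtual functions

struct AsBase : private Empty { int n; };    // Empty as a private base
struct AsMember { Empty e; int n; };         // Empty as a data member

// Even an empty class has a non-zero size when it is a complete object.
static_assert(sizeof(Empty) >= 1, "complete objects occupy at least one byte");

// As a member, Empty needs its own address, so the enclosing class grows.
static_assert(sizeof(AsMember) > sizeof(int), "the member takes up space");

// As a base, mainstream compilers apply the empty base optimization, so
// sizeof(AsBase) is typically equal to sizeof(int).
constexpr std::size_t size_as_base   = sizeof(AsBase);
constexpr std::size_t size_as_member = sizeof(AsMember);
```

Since the B subobject ends up inside impl either way, the difference only affects the size of impl, not the size of X.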

Another way of making class B disappear from class X while still inheriting from it (perhaps to access some protected members, or to keep its zero size) is simply to make struct impl publicly inherit from class B (so that it can be accessed from class X, though struct impl then needs some proxy functions to expose the protected part of class B).

Once either of those two solutions is applied, the #include "b.h" can be removed from x.h and added to x.cpp.

It’s worth mentioning that the order in which class X inherits, first from class A and then from class B, makes it possible to preserve the construction order while sending class B to the pimpl. If class B really needed to be constructed before (or destroyed after) class A, this wouldn’t be possible. It’s so easy to mess up the construction/destruction order by playing with dependencies.
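
The ordering argument can be checked with a small trace. In this sketch A and B are simplified stand-ins that only log their construction and destruction, and the names Before, After, trace and trace_lifetime are invented for the demonstration.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Records the order of construction and destruction.
std::string& trace() { static std::string t; return t; }

struct A { A() { trace() += "A "; } ~A() { trace() += "~A "; } };
struct B { B() { trace() += "B "; } ~B() { trace() += "~B "; } };

// Original shape: bases are constructed in declaration order, A then B.
class Before : public A, private B {};

// After moving B into the Pimpl: the base A is still constructed first,
// then the pimpl member, whose impl object contains the B subobject.
class After : public A {
public:
    After();
    ~After();
private:
    struct impl;
    std::unique_ptr<impl> pimpl;
};

struct After::impl : public B {};
After::After() : pimpl{ std::make_unique<impl>() } {}
After::~After() = default;

// Create and destroy one T, returning what was logged.
template <class T>
std::string trace_lifetime() {
    trace().clear();
    { T obj; (void)obj; }
    return trace();
}

void demo_order() {
    assert(trace_lifetime<Before>() == "A B ~B ~A ");
    assert(trace_lifetime<After>()  == "A B ~B ~A ");   // same order as before
}
```

Had X been declared as `class X : private B, public A`, the original order would have been B before A, and the Pimpl version could not reproduce it, because base classes are always constructed before data members.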

The code for x.h now looks like this (a forward-declaration to class B is needed for the f function):
```
//  x.h: after converting to use a Pimpl to hide implementation details
//
#include <iosfwd>
#include <memory>
#include "a.h"  // class A (has virtual functions)
class B;
class C;
class E;

class X : public A {
public:
       X( const C& );
      ~X();             // defined in x.cpp, where impl is a complete type
    B  f( int, char* );
    C  f( int, C );
    C& g( B );
    E  h( E );
    virtual std::ostream& print( std::ostream& ) const;

private:
    struct impl;
    std::unique_ptr<impl> pimpl;   // ptr to a forward-declared class
};

std::ostream& operator<<( std::ostream& os, const X& x ) {
    return x.print(os);
}
```
and x.cpp like this:
```
//  Implementation file x.cpp
//
#include "x.h"
#include <list>
#include "b.h"  // class B (has no virtual functions)
#include "c.h"  // class C
#include "d.h"  // class D
using namespace std;

struct X::impl: public B {
    std::list<C> clist;
    D            d;
};

// Signature matches the constructor declared in x.h.
X::X( const C& ) : pimpl{ make_unique<X::impl>(/*...*/) } { }
X::~X() = default;
```
The last trick is not needed for the GotW question, but might be interesting. What if class B had virtual functions overridden in class X? The solution may have an extra cost, with cross-dependencies between class X and struct impl.

First, the inheritance is changed as before: instead of class X, struct impl inherits from class B. The overriding functions are located in struct impl.

Then, everything depends on what is in the overriding functions. If the scope of struct impl is enough to implement them (for example it only needs to play with the members of the pimpl idiom), all is fine, nothing to do, no extra cost.

If, on the contrary, they need some access to class A or class X (e.g. to call a function in the (virtual) interface, or to access a member that is not in struct impl), a dependency on class X becomes necessary. An extra member has to be added to impl: a reference to an X (the X constructor passes *this to the impl constructor). If needed, a friend class impl; declaration can be added to class X so that impl can access the private parts of class X it needs (since C++11 a nested class already has that access, so the declaration matters only for older compilers).

One of the drawbacks (not often a problem) is that a reference to a B can no longer be dynamically cast to a reference to an A or an X.
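
A compact sketch of that arrangement, again in a single listing. The classes A and B here are hypothetical stand-ins with one virtual function each, and label, describe, report, secret and b_part_casts_to_x are names invented for the example.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins for a.h and b.h.
class A {
public:
    virtual ~A() = default;
    virtual std::string label() const { return "A"; }
};

class B {
public:
    virtual ~B() = default;
    std::string report() const { return "[" + describe() + "]"; }
protected:
    virtual std::string describe() const { return "B"; }
};

// Stand-in for x.h: B no longer appears in the class definition.
class X : public A {
public:
    X();
    ~X();
    std::string label() const override { return "X"; }
    std::string report() const;
    bool b_part_casts_to_x() const;   // only here to demonstrate the drawback
private:
    struct impl;
    std::unique_ptr<impl> pimpl;
    int secret = 42;
};

// Stand-in for x.cpp: impl overrides B's virtual function and keeps a
// reference back to its owner so that the override can use X's interface.
struct X::impl : public B {
    explicit impl(const X& owner) : owner_(owner) {}
    std::string describe() const override {
        // A nested class may access the private members of X.
        return owner_.label() + ":" + std::to_string(owner_.secret);
    }
    const X& owner_;
};

X::X() : pimpl{ std::make_unique<impl>(*this) } {}
X::~X() = default;

std::string X::report() const { return pimpl->report(); }

bool X::b_part_casts_to_x() const {
    const B* b = pimpl.get();
    return dynamic_cast<const X*>(b) != nullptr;   // the B subobject is not part of an X
}

void demo_back_reference() {
    X x;
    assert(x.report() == "[X:42]");
    assert(!x.b_part_casts_to_x());
}
```

The back-reference is safe here only because X is neither copyable nor movable; a movable X would have to re-seat the reference (for example by storing a pointer instead) after every move.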

I wish you all a happy new year =).