GotW #7a: Minimizing Compile-Time Dependencies, Part 1
Managing dependencies well is an essential part of writing solid code. C++ supports two powerful methods of abstraction: object-oriented programming and generic programming. Both of these are fundamentally tools to help manage dependencies, and therefore manage complexity. It’s telling that all of the common OO/generic buzzwords—including encapsulation, polymorphism, and type independence—along with the lion’s share of design patterns, are really about describing ways to manage complexity within a software system by managing the code’s interdependencies.
When we talk about dependencies, we usually think of run-time dependencies like class interactions. In this Item, we will focus instead on how to analyze and manage compile-time dependencies. As a first step, try to identify (and root out) unnecessary headers.
Problem
JG Question
1. For a function or a class, what is the difference between a forward declaration and a definition?
Guru Question
2. Many programmers habitually #include many more headers than necessary. Unfortunately, doing so can seriously degrade build times, especially when a popular header file includes too many other headers.
In the following header file, what #include directives could be immediately removed without ill effect? You may not make any changes other than removing or rewriting #include directives. Note that the comments are important.
// x.h: original header
//
#include <iostream>
#include <ostream>
#include <list>
// None of A, B, C, D or E are templates.
// Only A and C have virtual functions.
#include "a.h" // class A
#include "b.h" // class B
#include "c.h" // class C
#include "d.h" // class D
#include "e.h" // class E
class X : public A, private B {
public:
X( const C& );
B f( int, char* );
C f( int, C );
C& g( B );
E h( E );
virtual std::ostream& print( std::ostream& ) const;
private:
std::list<C> clist;
D d_;
};
std::ostream& operator<<( std::ostream& os, const X& x ) {
return x.print(os);
}
@Gennaro: Yes, I’ll add a note to be clear it’s okay to replace #include with a forward declaration.
All: Remember to use for code blocks, or escape those pesky characters using “& l t ;” and “& g t ;”. Sorry, it’s a WordPress thing.
It appears something is drastically removing things from people’s comments.
A lot of them are saying remove #include and nothing after.
like this
To the includes:
#include
#include
Guru question #2.
* Replace iostream and ostream include with an include of iosfwd. I forgot where I read this, but this neat trick avoids including the heavy iostream header.
* Remove c.h and e.h, as previously stated. These can be replaced with forward declarations.
* d.h and list includes can be removed if one uses Herb’s own Private Implementation Pattern, but from the wording of the question, this seems out of scope of the question. It would require a change in the class itself.
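For readers who have not met the Private Implementation (Pimpl) idiom mentioned above, here is a minimal sketch of the idea, not the article's solution. The class `Catalog`, its members, and the use of `std::unique_ptr` are assumptions made for illustration only. The point is that the header part no longer needs `<list>` or the definition of any member type, because they move behind a forward-declared `Impl`.

```cpp
// --- what would live in a hypothetical catalog.h ---
#include <memory>

class Catalog {
public:
    Catalog();
    ~Catalog();                 // must be defined where Impl is complete
    void add(int value);
    int total() const;
private:
    struct Impl;                // forward declaration only
    std::unique_ptr<Impl> impl_;
};

// --- what would live in a hypothetical catalog.cpp ---
#include <list>

// The heavy members are visible only to the implementation file.
struct Catalog::Impl {
    std::list<int> items;
};

Catalog::Catalog() : impl_(new Impl) {}
Catalog::~Catalog() = default;

void Catalog::add(int value) { impl_->items.push_back(value); }

int Catalog::total() const {
    int sum = 0;
    for (int v : impl_->items) sum += v;
    return sum;
}
```

As the comment notes, this changes the class itself, so it goes beyond what the question allows.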
There’s another detail for class “C”. Since “A” has virtual functions, and we don’t know which, we must assume that “A::g( B )” might be virtual. If so, in the worst case we need to include “c.h”, because the return type of the overridden function in “A” might differ from “C&” (covariant return types, if C is derived from the class that A::g returns).
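The covariance concern can be sketched as follows. All names here (`CBase`, `C`, `A`, `X`) are hypothetical stand-ins, and the sketch assumes the worst case described above: the base class function returns a reference to a base of `C`. To accept the override, the compiler has to check that `C` derives from `CBase`, which it can only do if `C` is a complete type at that point.

```cpp
struct CBase {
    virtual ~CBase() {}
    virtual int id() const { return 1; }
};

// C must be fully defined before X::g is declared below.
struct C : CBase {
    int id() const override { return 2; }
};

struct A {
    virtual ~A() {}
    virtual CBase& g() { static CBase base; return base; }
};

struct X : A {
    // Covariant override: the return type changes from CBase& to C&.
    // With only "class C;" visible, this declaration would be rejected.
    C& g() override { static C derived; return derived; }
};
```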
You can get rid of all of them if you parameterise the entire class.
Sorry, in my first sentence, I wanted to say: we remove header <iostream>, we leave header <ostream>.
First, i assume all header files use proper include guards.
*Here, the header <iostream> can be removed, because the function std::ostream& operator<<( std::ostream& os, const X& x ) uses a std::ostream object (typedef basic_ostream<char> ostream, actually), so we can leave only the header <ostream>.
*From the class E, the header x.h uses the result and argument type; therefore its header file can be removed.
*A virtual table is a global variable declared implicitly for each class with at least one virtual function. So i think we should leave there the header “c.h”, otherwise the virtual table can’t be created.
*We definitely need the header file <list>, because of the private object clist.
*The same applies for the header “d.h” (we need it, because we construct the object d_ inside class X).
*Class A has at least one virtual function. As I understand it, it has one function declaration in it: virtual std::ostream& print( std::ostream& ) const; which is overridden by the function std::ostream& class X::print( std::ostream& ) const;
Moreover, function std::ostream& operator<<( std::ostream& os, const X& x ) calls function: print( std::ostream& ); and object x can be of type either "class X" or "class A", so i would leave the header a.h
Agree too many includes is a bad thing but does the use of precompiled headers containing a lot of headers perhaps favour more than less? i.e you might have a lot of includes precompiled but is this better than fewer but not precompiled. Spending a lot of time working out what to remove is also costly in time.
Guru Question: In the following header file, what #include directives could be immediately removed without ill effect? You may not make any changes other than removing or rewriting #include directives. Note that the comments are important.
About std headers:
1. Including <iostream> automatically also includes <ios>, <streambuf>, <istream>, and <ostream>. So, we don’t need both <iostream> and <ostream>; <iostream> alone is enough. However, there are only references to ostream classes in function declarations, so we don’t need the complete ostream type here. (Remember, these are function declarations.) Just use the forward declarations from <iosfwd> instead of <iostream> and <ostream>.
2. Including <list> is needed – we use std::list as a data member, so the compiler needs complete type information here.
About custom headers:
1. Class X is derived from A and B. So, the compiler needs complete type information here; leave “a.h” and “b.h” as is.
2. Class C is a great candidate for removal, but it’s used in a function declaration as a parameter passed by value. It’s sad, but the compiler needs to know what class C is right here, so the complete type is needed, and we leave #include “c.h” in the code. Additionally, it is strange that class C has virtual functions but is passed by value. Schedule a meeting with this code’s author :)
3. Class D is a data member. So, we need to leave “d.h”
4. Class E is used as a return type in function declarations only. It’s strange, but the compiler doesn’t need the complete type here. AFAIR, it was a side effect of the C calling convention on x86 compilers. Maybe I’m wrong about why it’s not needed (please correct me?), but I think we could change #include “e.h” to a forward declaration of class E;
The only thing preventing us from this simple optimization is this exercise statement: We may not make any changes other than removing or rewriting #include directives. So, leave as is.
So, there is result:
I think, another optimization will result in redesign, so we can’t omit more headers right now.
Guru Question #2. Many programmers habitually #include many more headers than necessary. Unfortunately, doing so can seriously degrade build times, especially when a popular header file includes too many other headers.
1. Including <iostream> automatically also includes <ios>, <streambuf>, <istream>, and <ostream>.
So, <ostream> is not needed. However, let’s not fool ourselves – <iostream> is excessive. Actually, <ostream> is enough; don’t include <iostream>.
2. Next step about io streams usage: there are only references to streams. The compiler knows the size of a reference to any class, so a forward declaration is enough here. Replace the streams implementation header with the forward declarations: <iosfwd>.
That’s all about std headers – unfortunately, we use std::list as a member, so the full std::list class definition is needed. Leave as is.
Class X is derived from A and B, so we need to know what A and B is – leave “a.h” and “b.h”.
Class C is a candidate for further header removal. Unfortunately, it’s passed into function declarations by value, so the compiler needs to know what C is, what constructors it has, etc. So, draw a sigh and leave “c.h”.
Class D is a data member – the compiler needs D’s declaration. Leave “d.h”.
Class E is used in a return type only – that’s good news. The return type may be an incomplete type. This may be a surprise; AFAIR, it’s a good side effect of C calling conventions. I may be wrong in describing why it may be omitted (please correct me?), but I think we can remove “e.h”.
So, we have:
JQ #1: For a function or a class, what is the difference between a forward declaration and a definition?
For function:
Forward declaration specifies the function’s return type and signature (parameters, cv-qualifiers if a member, template parameters if any). It tells the compiler how to call this function, so parameter sizes should be known. Parameter size is known for any pointers, references, or complete types.
Function definition may be treated as declaration, but additionally should contain function body: zero or more statements which will be executed during function call.
For class:
Class forward declaration provides incomplete type, including class name and template parameters (if any). Note, it doesn’t provide information about inheritance from base classes.
Class definition provides complete type, including instance size (so, data member sizes and definition of base class are needed), declaration of all class methods and template parameters. So, definition provides all necessary information to compiler about what class is and how to use it, with or without function definitions.
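The distinction described above can be shown in a short sketch. The names `Gadget`, `area`, and `name_length` are hypothetical and chosen only for illustration. The forward declarations at the top introduce names; the definitions further down supply the class layout and the function bodies.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Forward declarations: names and signatures only.
class Gadget;                               // incomplete type: size and members unknown
int area(int width, int height);            // no body yet
std::size_t name_length(const Gadget& g);   // a reference to an incomplete type is fine

// Class definition: Gadget is now a complete type.
class Gadget {
public:
    explicit Gadget(std::string name) : name_(std::move(name)) {}
    const std::string& name() const { return name_; }
private:
    std::string name_;
};

// Function definitions: each is also a declaration, plus a body.
int area(int width, int height) { return width * height; }

std::size_t name_length(const Gadget& g) { return g.name().size(); }
```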
// x.h: original header
//
#ifndef X_H
#define X_H
#if defined(_MSC_VER) && (_MSC_VER >= 1020)
#pragma once
#endif
#include <iosfwd>
#include <list>
// None of A, B, C, D or E are templates.
// Only A and C have virtual functions.
#include "a.h" // class A
#include "b.h" // class B
#include "c.h" // class C
#include "d.h" // class D
class E;
class X : public A, private B {
public:
X(const C&);
B f(int, char*);
C f(int, C);
C& g(B);
E h(E);
virtual std::ostream& print(std::ostream&) const;
private:
std::list<C> clist;
D d_;
};
inline std::ostream& operator<<(std::ostream& os, const X& x) {
return x.print(os);
}
#endif // !defined(X_H)
@Gennaro rewriting #include directives includes their replacement by forward declarations.
@Alf I am not 100% sure, but wasn’t the requirement on the completely defined template argument type relaxed for auto_ptr? It is for shared_ptr and unique_ptr in C++11
To the includes:
#include <iostream>
#include <ostream>
–> replace them by #include <iosfwd>
#include “a.h” // class A
#include “b.h” // class B
#include “d.h” // class D
–> these are needed, as definitions of base classes and direct member variables are always needed (direct as compared to pointer/reference members and instantiated template members that have the class in their argument list)
#include “e.h” // class E
–> rewrite this one as a forward declaration, class E;, as it’s only used in function parameters and return types
#include “c.h” // class C
–> needed. Formally, because the standard demands so for std::list template arguments. Technically at least, because X’s destructor is not declared and thus will be implicitly defined by the compiler, which will instantiate the destructor for the std::list, which in turn will destroy its elements, needing access to C::~C
In accordance with $3.2/4 –
-‘A’ and ‘B’ need to be complete types.
-Classes C and E need not be complete, because X does not define (as opposed to declaring) any of its member functions.
-D needs to be complete because class ‘X’ has a member object of type ‘D’.
-iostream can also be removed as we already include ostream
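The point about declaring versus defining can be sketched like this. `Holder` is a hypothetical stand-in for X, and this `E` is a toy class, not the one from e.h. A function declaration may name an incomplete class type by value, both as a parameter and as a return type. Only the function's definition, and any call to it, needs the complete type.

```cpp
class E;                 // forward declaration only: E is incomplete here

class Holder {
public:
    E h(E);              // OK: a declaration may use incomplete E by value
    E& same(E&);         // references never need the complete type
};

// The complete type is needed from here on, where h is defined and called.
class E {
public:
    explicit E(int v) : value_(v) {}
    int value() const { return value_; }
private:
    int value_;
};

E Holder::h(E e) { return E(e.value() + 1); }
E& Holder::same(E& e) { return e; }
```

In the header from the question this would mean that client code calling h still has to include e.h itself.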
Remove #include <iostream>, it’s not needed.
Remove #include <ostream>. Replace with #include <iosfwd>.
Forward declare class C.
I’m probably missing a lot of this.
Hi Herb, a little note just in case there’s an unintended limitation:
> You may not make any changes other than removing or rewriting #include directives
Is the prohibition to forward declare intentional or was the wording just changed at the latest moment resulting too strict?
2. With only modifying the #include statements the only obvious inclusion I can get rid of is iostream (alt. can get rid of ostream, but I like including ostream instead of iostream because it’s more specific, and this doesn’t rely on iostream including ostream internally).
With forward declarations of class C and class E, c.h and e.h can be safely moved to x.cpp.
Also, shouldn’t the operator<< overload be extern with the actual implementation defined in x.cpp (alt.: declare it inline)?
Remove #include <iostream> – removes (often costly) instantiation of cin, cout, cerr and so on, and ostream is enough here.
Replace #include <ostream> with #include <iosfwd> – removes the definition of ostream, since only the declaration is needed here.
Replace #include “e.h” with class E; – E is only used as an argument and return type.
A, B and D are needed to calculate the size of X.
C is needed since instantiation of std::list where C is an incomplete type is undefined behaviour.(17.6.4.8§2)
(operator<< should be inline to avoid ODR violations)
In C++03 the <ostream> header could not be formally removed if one used e.g. std::endl, or the << operator. However, in practice one could use just <iostream> (as all examples in the C++03 standard erroneously did), and with C++11 that’s also formally supported. In the above code it’s not an issue, so <ostream> can just be removed. I would also replace full <iostream> with <iosfwd>.
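A minimal sketch of the <iosfwd> replacement suggested above, combined with the inline operator<< that other commenters recommend. The class `Token` and the function `render` are hypothetical. The header part only passes `std::ostream` around by reference, so the forward declaration from <iosfwd> is enough there; the full stream headers are needed only where output is actually produced.

```cpp
// --- what would live in a hypothetical token.h ---
#include <iosfwd>        // forward-declares std::ostream, no definitions

class Token {
public:
    explicit Token(int v) : value_(v) {}
    std::ostream& print(std::ostream& os) const;   // declaration only
private:
    int value_;
};

// inline, so that including the header in several source files does not
// produce multiple definitions of the same function.
inline std::ostream& operator<<(std::ostream& os, const Token& t) {
    return t.print(os);
}

// --- what would live in a hypothetical token.cpp ---
#include <ostream>
#include <sstream>
#include <string>

std::ostream& Token::print(std::ostream& os) const {
    return os << "Token(" << value_ << ")";
}

// Helper for demonstration: formats a Token into a string.
std::string render(const Token& t) {
    std::ostringstream out;
    out << t;
    return out.str();
}
```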
Class E is only used as a result and argument type, and so its header can be removed. The situation for class C is more complicated. The C++03 standard (I’m not 100% sure of C++11 and don’t have the time to check now, sorry) required the item type of a standard library container, such as the std::list in this code, to be a complete type. However, this formal requirement was regularly sinned against with e.g. std::auto_ptr, by ensuring that the code conformed to the actual requirements of the particular C++ implementation. The most infamous case of this lets-be-practical-about-it approach was one of your own PIMPL GOTWs.. :-) But anyway, at least formally, class C’s header is needed, since C is used as a container item type, so only the header for E can be omitted.
I would put a de facto standard #pragma once at the top of that header.