GotW-ish: The ‘clonable’ pattern

Yesterday, I received this question from a distinguished C++ expert who served on the ISO C++ committee for many years. The email poses a decades-old question that still has the classic roll-your-own answer in C++ Core Guidelines #C.130, and basically asks whether we’ve made significant progress toward automating this pattern in modern C++ compared to what we had back in the 1990s and 2000s.

Before I present my own answer, I thought I would share just the question and give all of you readers the opportunity to propose your own candidate answers first — kind of retro GotW-style, except that I didn’t write the question myself. Here is the email, unedited except to fix one typo…


In trying to wrap my mind around all the new stuff for C++20, I see that there is one idea that has been around for quite a while but I still don’t see implemented. I’m wondering whether I’m missing something obvious, or whether it’s still not there.

Suppose B is a base class and I have a shared_ptr<B> that might point to a B or to any class that might be derived from B. Assuming that B and all of its derived classes are CopyConstructible, I would like to create a new shared_ptr<B> bound to a newly created object that is a copy of the object to which my original shared_ptr points. I cannot write this:

shared_ptr<B> b1 = /* whatever */;
shared_ptr<B> b2 = make_shared<B>(*b1);    //NO

because that will bind b2 to a copy of the B part of *b1.

I think I can make this work by putting a member function in B along the following lines:

class B {
    shared_ptr<B> clone() const { return make_shared<B>(*this); }
    // ...
};

and then for each derived class, write something like

class D: public B {
public:
    shared_ptr<B> clone() const {
        return make_shared<D>(*this);   // not make_shared<B>
    }
    // ...
};

and then I can write

shared_ptr<B> b1 = /* as before */;
shared_ptr<B> b2 = b1->clone();

and b2 will now point to a shared_ptr<B> that is bound to an object with the same dynamic type as *b1.

However, this technique requires me to insert a member function into every class derived from B, with ugly bugs resulting from failure to do so.

So my question is whether there some way of accomplishing this automatically that I’ve missed?


So here’s your challenge:

JG Question

1. Describe as many approaches as you can think of that could let us semi- or fully-automate this pattern, over just writing it by hand every time as recommended in C++ Core Guidelines #C.130. What are each approach’s advantages and drawbacks?

Guru Question

2. Show a working Godbolt.org link that shows how class authors can write as close as possible to this code with the minimum possible additional boilerplate code:

class B {
};

class C : public B {
};

class D : public C {
};

and that still permits the class’ users to write exactly the following:

shared_ptr<B> b1 = make_shared<D>();
shared_ptr<B> b2 = b1->clone();

Hints: Feel free to use any/all mechanism that some production or experimental compiler supports (if it’s somewhere on Godbolt, it’s legal to try), including but not limited to:

or any combination thereof.

Q&A: Does string::data() return a pointer valid for size() elements, or capacity() elements?

A reader asked:

In C++17, for std::string::data(), is the returned buffer valid for the range [data(), data() + size()), or is it valid for [data(), data + capacity())?

The latter seems more intuitive and what I think most people would expect reserve() to create given the non-const version of data() since C++17.

… and then helpfully included the answer, but in fairness clearly they were wondering whether cppreference.com was correct:

Relevant quote from cppreference.com: … “Returns a pointer to the underlying array serving as character storage. The pointer is such that the range [data(); data() + size()) is valid and the values in it correspond to the values stored in the string.”

Yes, cppreference.com is correct. Here’s the quote from the current draft standard:

  • 2 A specialization of basic_string is a contiguous container (22.2.1).
  • 3 In all cases, [data(), data() + size()] is a valid range, data() + size() points at an object with value charT() (a “null terminator”), and size() <= capacity() is true.

Regarding this potential alternative:

or is it valid for [data(), data + capacity())?

No, that would be strange, because it would mean intentionally supporting reading uninitialized characters in any extra raw memory at the end of the string’s current memory block.

Note that the first part of the above quote from the standard hints at the consistency issue: A string is a container, and we want containers to be consistent. We certainly wouldn’t want vector<widget>::data() to behave that way and let callers see raw memory with unconstructed objects.

The latter [… is …] what I think most people would expect reserve() to create

c/reserve/resize/ and I’ll agree :)

Any container’s size()/resize() is about the data you stored in it and that it’s holding for you. Any container’s capacity()/reserve() is about the underlying raw memory buffer just to let you help the container optimize its raw memory management, but it isn’t intended to give you access to the allocated-but-unused memory.

My favorite work-week of 2019

I just can’t get enough of this short video, combining interviews shot at last year’s CppCon with shots of our new “home” location that we’ll be enjoying for the first time two weeks from now.

Please enjoy it — this is an excellent representation of what CppCon is like.

Those of you who know me personally know that I’m a homebody, and that two of my least-liked things are airplanes and hotel rooms. But even so, I wish it was already the afternoon of 15 September, and that I was sitting in a cramped plane seat on the way to Colorado.

If you’ve been waffling on whether to register for this year, you know what to do. I’m looking forward to seeing many of you there soon.