Reader Q&A: Generic lambdas

Tim just added this comment on the GotW #3 Solution blog post from last year:

Are you sure you can use auto in lambda like this?
I cannot compile the code and I’m pretty sure auto does not work here.

If you mean auto as a lambda parameter type, such as

[](auto& s){ use(s); }

then yes, it’s (now) legal: that’s a new feature in the currently-being-finalized C++14 standard, and it’s called “generic lambdas.” It means that the compiler-generated closure object’s

operator()

is a template, so you can call the same closure object multiple times with different types and get the templated operator stamped out for each set of types it’s called with.
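
For example, here is a minimal sketch (my own illustration; it needs one of the C++14-capable compilers mentioned below) of one generic lambda being called with three different argument types, each call instantiating the templated operator() for that type:

    #include <iostream>
    #include <string>

    int main() {
        auto print = [](const auto& x) { std::cout << x << '\n'; };   // generic lambda

        print(42);                     // instantiates operator() for int
        print(3.14);                   // instantiates operator() for double
        print(std::string("hello"));   // instantiates operator() for std::string
    }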

Major compilers are now adding support for this. As of this writing, all of GCC, Clang, and Visual C++ have implemented the basic feature and you can get it in CTP/preview/alpha releases of each, such as GCC or Clang trunk, or Visual C++ November 2013 CTP. I can’t remember offhand which of those compilers have shipped an official release since adding it (VC++ has not) but they’ll all have it in their next released versions.

By the way, isn’t it wonderful that, for the first time in the history of C++, multiple major compilers are in pretty good sync like this, both with each other and with the standard? I think that’s awesome.

Build talk tomorrow: Modern C++ — What you need to know

If you’re at Build in San Francisco tomorrow afternoon, I invite you to swing by and spend an hour with us in session 2-661:

Modern C++: What you need to know

by Herb Sutter

Build 2014, Room 2005
2:30-3:30 pm, Thursday April 3, 2014

If you’re new to C++, this talk is aimed directly at you. I was asked to give a “foundational talk” about C++, and I decided that meant I should focus on addressing two questions that I get a lot these days:

  • FAQ #1 (1-2 slides): When should I use C++ compared to another language – on all platforms in general, and on Microsoft platforms in particular?
  • FAQ #2 (lots of slides): What should I know about C++ if I’m a {Java|C#|JavaScript|Python|…} developer?

Even if you’re a seasoned C++ developer, there are some nuggets and data points in the middle of the talk that I think you will find useful in your own work, and I hope that the talk as a whole will be helpful to you in providing a way to explain C++’s value proposition and give (or link to) an answer when someone asks you FAQ #2.

I think it will be recorded, and will post a link here when the recording is available.

I look forward to seeing many of you there tomorrow afternoon.

CppCon 2014 Call for Submissions

More news about the first annual CppCon that was announced last week:

CppCon 2014 Call for Submissions

CppCon is the annual, week-long face-to-face gathering for the entire C++ community. The conference is organized by the C++ community for the community and so we invite you to present.

Have you learned something interesting about C++, maybe a new technique possible in C++11? Or perhaps you have implemented something cool related to C++, maybe a new C++ library? If so, consider sharing it with other C++ enthusiasts by giving a talk at CppCon 2014. Submissions deadline is May 15 with decisions sent by June 13. For topic ideas, possible formats, and submission instructions, see the Submissions page.

Note that speakers get free registration to attend the whole conference.

I strongly encourage you to present – not “even if” you’ve never presented before, but “especially if” you haven’t. At 5 days x ~5 tracks x ~6 full-length talks per day, this is a big conference with a lot of room for half-length (30 min), full-length (60 min) and multi-hour formal talks (this is in addition to Lightning Talks, which will be arranged later).

For an idea of talks, from the Submissions page:

We are open to any topic that will be of interest to a mainstream C++ audience. Below are some ideas.

  • C++11
  • C++ libraries and frameworks of general interest
  • C++14 and new standardization proposals
  • Parallelism/multi-processing
  • Concepts and generic programming
  • Functional programming
  • High performance computing
  • Software development tools, techniques, and processes for C++
  • Practical experiences using C++ in real-world applications
  • Industry-specific perspectives: mobile and embedded systems, game development, high performance trading, scientific programming, robotics, etc.

I know a number of people who are already planning to submit talks, and I am certain we will get talks on all these topics, and likely more.

As for me, I’m going to go propose a talk on lock-free programming now… everyone should have fun with lock-free mail slots and linked lists, and know when to worry about the ABA problem (and know how to solve it in portable C++11 code).

What should you do next?

If you have a talk idea, run, don’t walk, to submit it – but don’t register for the conference yet, because you will get free registration if your talk is accepted.

Otherwise, register today! The first 100 to register get the Super Early Bird rate of $695 for the whole conference, and registration has gotten off to a strong start since it opened last week – a good number of first-100 places are still available. This is the coolest and most informative event for C++ in nearly 20 years, and whether you’re a C++ novice or an expert you are going to have a great time and learn a lot of practical information and skills you can use on your project today.

We have CppCon…

I’m really excited about this event!

Note that the first 100 registrations get a big discount – pasting from the “registration” page:

Regular registration fee is $995 but the first 100 attendees can take advantage of Super Early Bird registration and pay only $695. After that, the Early Bird registration fee is $845 and is valid until the 1st of June. …

The announcement went live four hours ago, and the first registrations have already started to come in.

The full text of today’s announcement follows:

 

CppCon 2014 Registration Open

Opening Keynote by Bjarne Stroustrup
September 7–12, 2014
Bellevue, Washington, USA

Registration is now open for CppCon 2014 to be held September 7–12, 2014 at the Meydenbauer Center in Bellevue, Washington, USA. The conference will start with the keynote by Bjarne Stroustrup titled “Make Simple Tasks Simple!”

CppCon is the annual, week-long face-to-face gathering for the entire C++ community. The conference is organized by the C++ community for the community. You will enjoy inspirational talks and a friendly atmosphere designed to help attendees learn from each other, meet interesting people, and generally have a stimulating experience. Taking place this year in Bellevue, in the beautiful Seattle area, and including multiple diverse tracks, the conference will appeal to anyone from C++ novices to experts.


What you can expect at CppCon:

  • Invited talks and panels: the CppCon keynote by Bjarne Stroustrup will start off a week full of insight from some of the world’s leading experts in C++. Still have questions? Ask them at one of CppCon’s panels featuring those at the cutting edge of the language.
  • Presentations by the C++ community: What do embedded systems, game development, high frequency trading, and particle accelerators have in common? C++, of course! Expect talks from a broad range of domain experts focused on practical C++ techniques, libraries, and tools.
  • Lightning talks: Get informed at a fast pace during special sessions of short, less formal talks. Never presented at a conference before? This is your chance to share your thoughts on a C++-related topic in an informal setting.
  • Evening events and “unconference” time: Relax, socialize, or start an impromptu coding session.

CppCon’s goal is to encourage the best use of C++ while preserving the diversity of viewpoints and experiences, but other than that it is non-partisan and has no agenda. The conference is a project of the Standard C++ Foundation, a not-for-profit organization whose purpose is to support the C++ software developer community and promote the understanding and use of modern, standard C++ on all compilers and platforms.

Stroustrup & Sutter on C++: Mar 31 – Apr 1, San Jose, CA

It has occurred to me that I never announced this event here…

In two weeks, Bjarne and I will be doing a two-day Stroustrup & Sutter on C++ seminar in the San Francisco Bay area. It has been several years since the last S&S event, so Bjarne and I are really looking forward to this.

Super C++ Tutorial: Stroustrup & Sutter on C++

EE Live!
March 31 – April 1, 2014
McEnery Convention Center
San Jose, CA, USA

Registration

The organizers have kindly made sure they can expand the room, so seats are still available. Sorry for the short notice here on this blog.

Using C++ atomics: Lock-Free Algorithms and Data Structures in C++

Here’s one interesting content update: The sessions page lists our talks, including a talk by me with the title “Three Cool Things in C++ Concurrency,” which is pronounced “I hadn’t decided what exactly I wanted to talk about by the time I had to submit the session description.”

I have now decided, and the talk will be entirely on how to design and write lock-free algorithms and data structures using C++ atomic<> – something that can look deceptively simple, but contains very deep topics. (Important note: This is not the same as my “atomic<> Weapons” talk; that talk was about the “what they are and why” of the C++ memory model and atomics, and did not cover how to actually use atomics to implement highly concurrent algorithms and data structures.)

This talk is about the “how to use them successfully” part of atomics, including:

  • Best practices and style for using atomic<>s
  • Three examples, including lock-free mail slots and iterations of a lock-free linked list, all in portable Standard C++
  • Defining and applying the different levels of “lock-freedom” (wait-free, lock-free, obstruction-free) to develop highly concurrent algorithms and data structures
  • Explaining and applying key concepts including especially: linearizability; trading off concurrency vs. promptness; and trading off concurrency vs. throughput/utilization
  • Demonstrating and solving the ABA problem, with clean and simple code – that’s right, did you know that in C++11 you can solve this without relying on esoterica like double-wide CAS (which isn’t always available) or hazard pointers (which are deeply complex) or garbage collection (which isn’t in the C++ standard… yet)?

A few of you may have seen part of this material in the few times I’ve taught the extended four-day version of my Effective Concurrency course. This version of the material is significantly updated for C++11/14, and it also contains new material never before seen even if you did take the four-day EC course – including that, instead of leaving the slist example as a cliffhanger, I present an actual complete solution that is (a) correct and (b) written entirely in portable Standard C++11. It’s always nice to end with a solution you can actually use, instead of just an open problem cliffhanger…

I’m looking forward to seeing many of you in San Jose in two weeks!

Reader Q&A: Is std::atomic_compare_exchange_* implementable?

Updated 8/26: Duncan’s question is actually correct and compare_exchange should have the semantics he asks for. However, the answer to ‘is it implementable’ is I think still Yes.

Quick answer: Yes.

I see there was also a thread about this on StackOverflow, so I’ll echo this Q&A publicly for others’ benefit and hopefully to dispel confusion.

Duncan Forster asked:

I’m quite alarmed the C++ committee chose such a bad interface for std::atomic compare_exchange, i.e.:

    bool compare_exchange_???(T& expected, T desired, …);

I notice you have mentioned (here reader-qa-how-to-write-a-cas-loop-using-stdatomics) that the committee had doubts whether it was a good idea.
Your quote below:

  • Usage note: In the code at top we save an explicit reload from ‘a’ in the loop because compare_exchange helpfully (or “helpfully” – this took me a while to discover and remember) stores the actual value in the ‘expected’ value slot on failure. This actually makes loops simpler, though some of us still have different feelings on different days about whether this subtlety was a good idea… anyway, it’s in the standard.

The reason I think it’s not only bad but also dangerous is that we now have a race condition baked into the standard. Race condition, you say? All hardware CAS implementations that I know of only return one value (the old value). Yet the C++ version has two returns (success/failure as a boolean return, and the old value by reference). So how can an atomic class which is supposed to implement atomic methods do this? Answer is it can’t: the boolean result is calculated after the atomic exchange has occurred. That leaves us with a method which is only partially atomic, and with the bonus of a built-in race condition!

Perhaps I haven’t convinced you, so here’s some code to demonstrate the problem. I have simulated the hardware CAS with simple C++ code. The crux of the problem is this statement: while(!atomic_bool_compare_and_swap(&head, new_node->next, new_node))
By creating a 1-line while loop and passing new_node->next as the expected value, if someone is also consuming data the new_node will temporarily be visible to two threads. The other thread may process and delete the node before atomic_bool_compare_and_swap has calculated success/failure. This would result in a spurious failure and the new_node actually being pushed twice onto the queue. As you can imagine, this should lead to a double delete and possibly the process aborting.

    template<typename _T>
    bool atomic_bool_compare_and_swap(_T *value, _T& expected, _T new_value)
    {
        _T old_value;

        // Here be atomic
        {
            old_value = *value;
            if(old_value == expected)
                *value = new_value;
        }

        // Here be race conditions
        return (old_value == expected);
    }

[… more code that exercises this function …]

I don’t believe there is an implementability bug in the standard. Rather, your code is incorrect.

[Update: …off-point stuff omitted…]

Let’s say you have a CAS that returns only the old value, but doesn’t set “expected,” as you describe above. Then you should just be able to implement the standard one in terms of that – quick sketch (untested code):

    template<typename _T>
    bool atomic_compare_exchange(_T *value, _T& expected, _T new_value)
    {
        _T old_value;
        _T old_expected = expected;

        // If all you have is a CAS that returns the old value, use that:
        old_value = CAS(value, expected, new_value);

        bool result = old_value == old_expected;
        expected = old_value;
        return result;
    }

Note that there’s no use of “expected” after the CAS, and so there’s no timing window.
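
As an aside, this “store the actual value into expected on failure” behavior is exactly what makes typical CAS loops compact. Here is a minimal sketch (my own illustration, not part of the original exchange) of a push onto a lock-free stack that relies on the standard semantics:

    #include <atomic>

    struct node { int value; node* next; };

    std::atomic<node*> head{nullptr};

    void push(int v) {
        node* n = new node{v, head.load()};
        // On failure, compare_exchange_weak writes the current value of head
        // into n->next, so the loop body needs no explicit reload of head.
        while (!head.compare_exchange_weak(n->next, n)) { }
    }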

If I’m misunderstanding the question, or have a bug in my thinking, please let me know in the comments. [Update: Thanks to Duncan in particular for pointing out my original answer did have a bug in my thinking.]

Trip report: Winter ISO C++ meeting

I just posted my trip report from last week’s ISO C++ meeting over on isocpp.org. The meeting just wrapped up about 48 hours ago, on Saturday afternoon.

This is a real milestone for C++. Not only did we finish C++14 (we think, assuming this coming ballot comes back clean so that we can skip the final extra ballot step), but we made strong progress on all seven (7) of the Technical Specifications in flight… and approved starting an eighth (8th)!

Good times.

Thanks, everyone who worked so hard to make this happen.

GotW #96: Oversharing

Following on from #95, let’s consider reasons and methods to avoid mutable sharing in the first place…

 

Problem

Consider the following code from GotW #95’s solution, where some_obj is a shared variable visible to multiple threads which then synchronize access to it.

// thread 1
{
    lock_guard hold(mut_some_obj);      // acquire lock
    code_that_reads_from( some_obj );   // passes some_obj by const &
}

// thread 2
{
    lock_guard hold(mut_some_obj);      // acquire lock
    code_that_modifies( some_obj );     // passes some_obj by non-const &
}

 

JG Questions

1. Why do mutable shared variables like some_obj make your code:

(a) more complex?

(b) more brittle?

(c) less scalable?

 

Guru Questions

2. Give an example of how the code that uses a mutable shared variable like some_obj can be changed so that the variable is:

(a) not shared.

(b) not mutable.

3. Let’s say we’re in a situation where we can’t apply the techniques from the answers to #2, so that the variable itself must remain shared and apparently mutable. Is there any way that the internal implementation of the variable can make the variable be physically not shared and/or not mutable, so that the calling code can treat it as a logically shared-and-mutable object yet not need to perform external synchronization? If so, explain. If not, why not?

GotW #95 Solution: Thread Safety and Synchronization

This GotW was written to answer a set of related frequently asked questions. So here’s a mini-FAQ on “thread safety and synchronization in a nutshell,” and the points we’ll cover apply to thread safety and synchronization in pretty much any mainstream language.

 

Problem

JG Questions

1. What is a race condition, and how serious is it?

2. What is a correctly synchronized program? How do you achieve it? Be specific.

 

Guru Questions

3. Consider the following code, where some_obj is a shared variable visible to multiple threads.

// thread 1 (performs no additional synchronization)
code_that_reads_from( some_obj ); // passes some_obj by const &

// thread 2 (performs no additional synchronization)
code_that_modifies( some_obj ); // passes some_obj by non-const &

If threads 1 and 2 can run concurrently, is this code correctly synchronized if the type of some_obj is:

(a) int?

(b) string?

(c) vector<map<int,string>>?

(d) shared_ptr<widget>?

(e) mutex?

(f) condition_variable?

(g) atomic<unsigned>?

Hint: This is actually a two-part question, not a seven-part question. There are only two unique answers, each of which covers a subset of the cases.

4. External synchronization means that the code that uses/owns a given shared object is responsible for performing synchronization on that object. Answer the following questions related to external synchronization:

(a) What is the normal external synchronization responsibility of code that owns and uses a given shared variable?

(b) What is the “basic thread safety guarantee” that all types must obey to enable calling code to perform normal external synchronization?

(c) What partial internal synchronization can still be required within the shared variable’s implementation?

5. Full internal synchronization (a.k.a. “synchronized types” or “thread-safe types”) means that a shared object performs all necessary synchronization internally within that object, so that calling code does not need to perform any external synchronization. What types should be fully internally synchronized, and why?

 

Solution

Preface

The discussion in this GotW applies not only to C++ but also to any mainstream language, except mainly that certain races have defined behavior in C# and Java. But the definition of which variables need to be synchronized, the tools we use to synchronize them, and the distinction between external and internal synchronization and when you use each one, are the same in all mainstream languages. If you’re a C# or Java programmer, everything here applies equally to you, with some minor renaming such as renaming C++ atomic to C#/Java volatile, although some concepts are harder to express in C#/Java (such as identifying the read-only methods on an otherwise mutable shared object; there are readonly fields and “read-only” properties that have get but not set, but they express a subset of what you can express using C++ const on member functions).

Note: C++ volatile variables (which have no analog in languages like C# and Java) are always beyond the scope of this and any other article about the memory model and synchronization. That’s because C++ volatile variables aren’t about threads or communication at all, and don’t interact with those things. Rather, a C++ volatile variable should be viewed as a portal into a different universe beyond the language: a memory location that by definition does not obey the language’s memory model, because that memory location is accessed by hardware (e.g., written to by a daughter card), has more than one address, or is otherwise “strange” and beyond the language. So C++ volatile variables are universally an exception to every guideline about synchronization, because they are always inherently “racy” and unsynchronizable using the normal tools (mutexes, atomics, etc.), and more generally exist outside all the normal rules of the language and compiler, including that they generally cannot be optimized by the compiler (because the compiler isn’t allowed to know their semantics: a volatile int vi; may not behave anything like a normal int, so you can’t even assume that code like vi = 5; int read_back = vi; is guaranteed to result in read_back == 5, or that code like int i = vi; int j = vi; that reads vi twice will result in i == j, which will not be true if vi is a hardware counter, for example). For more discussion, see my article “volatile vs. volatile.”

 

1. What is a race condition, and how serious is it?

A race condition occurs when two threads access the same shared variable concurrently, and at least one is a non-const operation (writer). Concurrent const operations are valid, and do not race with each other.

Consecutive nonzero-length bitfields count as a single variable for the purpose of defining what a race condition is.
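
For example, here is a small sketch (my own illustration of the rule) of what that means in practice:

    struct S {
        unsigned a : 4;   // a and b are consecutive nonzero-length bitfields,
        unsigned b : 4;   //   so they count as a single variable for race purposes
        unsigned c;       // c is a separate variable
    };

Given a shared S s;, one thread writing s.a while another thread writes s.b without synchronization is a race condition, whereas s.c can be accessed independently of a and b, subject to the same rules as any other shared variable.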

Terminology note: Some people use “race” in a different sense, to mean that even in a program with no actual race conditions (as defined above), operations on different threads can still interleave in different orders in different executions of a correctly synchronized program, depending on how fast the threads happen to execute relative to each other. That’s not a race condition in the sense we mean here; a better term for that might be “timing-dependent code.”

If a race condition occurs, your program has undefined behavior. C++ does not recognize any so-called “benign races”—and in languages that have recognized some races as “benign” the community has gradually learned over time that many of them actually, well, aren’t.

Guideline: Reads (const operations) on a shared object are safe to run concurrently with each other without synchronization.

 

2. What is a correctly synchronized program? How do you achieve it? Be specific.

A correctly synchronized program is one that contains no race conditions. You achieve it by making sure that, for every shared variable, every thread that performs a write (non-const operation) on that variable is synchronized so that no other reads or writes of that variable on other threads can run concurrently with that write.

The shared variable is usually protected by:

  • (commonly) using a mutex or equivalent;
  • (very rarely) by making it atomic if that’s appropriate, such as in low-lock code (see the sketch after this list); or
  • (very rarely) for certain types by performing the synchronization internally, as we will see below.
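
For instance, here is a minimal sketch (my own illustration) of the first two options applied to a simple shared counter. The mutex version works for any type; the atomic version is appropriate only for simple types such as integers and pointers:

    #include <atomic>
    #include <mutex>

    // Option 1 (common): protect the shared variable with a mutex.
    std::mutex counter_mutex;
    int counter = 0;

    void increment_with_mutex() {
        std::lock_guard<std::mutex> hold(counter_mutex);   // acquire lock
        ++counter;                                         // write while holding the lock
    }

    // Option 2 (rare): make the shared variable atomic.
    std::atomic<int> atomic_counter{0};

    void increment_with_atomic() {
        ++atomic_counter;   // each individual operation is synchronized
    }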

 

3. Consider the following code… If threads 1 and 2 can run concurrently, is this code correctly synchronized if the type of some_obj is: (a) int? (b) string? (c) vector<map<int,string>>? (d) shared_ptr<widget>?

No. The code has one thread reading (via const operations) from some_obj, and a second thread writing to the same variable. If those threads can execute at the same time, that’s a race and a direct non-stop ticket to undefined behavior land.

The answer is to synchronize access to the variable, for example using a mutex:

// thread 1
{
    lock_guard hold(mut_some_obj);      // acquire lock
    code_that_reads_from( some_obj );   // passes some_obj by const &
}

// thread 2
{
    lock_guard hold(mut_some_obj);      // acquire lock
    code_that_modifies( some_obj );     // passes some_obj by non-const &
}

Virtually all types, including shared_ptr and vector and other types, are just as thread-safe as int; they’re not special for concurrency purposes. It doesn’t matter whether some_obj is an int, a string, a container, or a smart pointer… concurrent reads (const operations) are safe without synchronization, but if the shared object is writeable, then the code that owns the object has to synchronize access to it.

But when I said this is true for “virtually all types,” I meant all types except for types that are not fully internally synchronized, which brings us to the types that, by design, are special for concurrency purposes…

 

… If threads 1 and 2 can run concurrently, is this code correctly synchronized if the type of some_obj is: (e) mutex? (f) condition_variable? (g) atomic<unsigned>?

Yes. For these types, the code is okay, because these types already perform full internal synchronization and so they are safe to access without external synchronization.

In fact, these types had better be safe to use without external synchronization, because they’re the synchronization primitives you need to use as tools to synchronize other variables! And it turns out that that’s no accident…

Guideline: A type should be fully internally synchronized if and only if its purpose is to provide inter-thread communication (e.g., a message queue) or synchronization (e.g., a mutex).

 

4. External synchronization means that the code that uses/owns a given shared object is responsible for performing synchronization on that object. Answer the following questions related to external synchronization:

(a) What is the normal external synchronization responsibility of code that owns and uses a given shared variable?

The normal synchronization duty of care is simply this: The code that knows about and owns a writeable shared variable has to synchronize access to it. It will typically do that using a mutex or similar (~99.9% of the time), or by making it atomic if that’s possible and appropriate (~0.1% of the time).

Guideline: The code that knows about and owns a writeable shared variable is responsible for synchronizing access to it.

 

(b) What is the “basic thread safety guarantee” that all types must obey to enable calling code to perform normal external synchronization?

To make it possible for the code that uses a shared variable to do the above, two basic things must be true.

First, concurrent operations on different objects must be safe. For example, let’s say we have two X objects x1 and x2, each of which is only used by one thread. Then consider this situation:

// Case A: Using distinct objects

// thread 1 (performs no additional synchronization)
x1.something(); // do something with x1

// thread 2 (performs no additional synchronization)
x2 = something_else; // do something else with x2

This must always be considered correctly synchronized. Remember, we stated that x1 and x2 are distinct objects, and cannot be aliases for the same object or similar hijinks.

Second, concurrent const operations that are just reading from the same variable x must be safe:

// Case B: const access to the same object

// thread 1 (performs no additional synchronization)
x.something_const(); // read from x (const operation)

// thread 2 (performs no additional synchronization)
x.something_else_const(); // read from x (const operation)

This code too must be considered correctly synchronized, and had better work without external synchronization. It’s not a race, because the two threads are both performing const accesses and reading from the shared object.

This brings us to the case where there might be a combination of internal and external synchronization required…

 

(c) What partial internal synchronization can still be required within the shared variable’s implementation?

In some classes, objects that appear distinct from the outside may still share state under the covers, without the calling code being able to tell that two apparently distinct objects are connected. Note that this is not an exception to the previous guideline; it’s the same guideline!

Guideline: It is always true that the code that knows about and owns a writeable shared variable is responsible for synchronizing access to it. If the writeable shared state is hidden inside the implementation of some class, then it’s simply that class’ internals that are the “owning code” and that have to synchronize access to (just) the shared state that only they know about.

A classic case of “under-the-covers shared state” is reference counting, and the two poster-child examples are std::shared_ptr and copy-on-write. Let’s use shared_ptr as our main example.

A reference-counted smart pointer like shared_ptr keeps a reference count under the covers. Let’s say we have two distinct shared_ptr objects sp1 and sp2, each of which is used by only one thread. Then consider this situation:

// Case A: Using distinct objects

// thread 1 (performs no additional synchronization)
auto x = sp1; // read from sp1 (writes the count!)

// thread 2 (performs no additional synchronization)
sp2 = something_else; // write to sp2 (writes the count!)

This code must be considered correctly synchronized, and had better work as shown without any external synchronization. Okay, fine …

… but what if sp1 and sp2 are pointing to the same object and so share a reference count? If so, that reference count is a writeable shared object, and so it must be synchronized to avoid a race—but it is in general impossible for the calling code to do the right synchronization, because it is not even aware of the sharing! The code we just saw above doesn’t see the count, doesn’t know the count variable’s name, and doesn’t in general know which pointers share counts.

Similarly, consider two threads just reading from the same variable sp:

// Case B: const access to the same object

// thread 1 (performs no additional synchronization)
auto sp3 = sp; // read from sp (writes the count!)

// thread 2 (performs no additional synchronization)
auto sp4 = sp; // read from sp (writes the count!)

This code too must be considered correctly synchronized, and had better work without external synchronization. It’s not a race, because the two threads are both performing const accesses and reading from the shared object. But under the covers, reading from sp to copy it increments the reference count, and so again that reference count is a writeable shared object, and so it must be synchronized to avoid a race—and again it is in general impossible for the calling code to do the right synchronization, because it is not even aware of the sharing.

So to deal with these cases, the code that knows about the shared reference count, namely the shared_ptr implementation, has to synchronize access to the reference count. For reference counting, this is typically done by making the reference count a mutable atomic variable. (See also GotW #6a and #6b.)
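
To make that concrete, here is a highly simplified sketch (my own illustration, not the real std::shared_ptr) of a reference-counted handle that internally synchronizes just the count it owns by making it atomic. In this sketch the count lives in a heap-allocated control block reached through a pointer, so the mutable keyword isn’t needed; a class that stored such under-the-covers shared state directly as a data member would declare it mutable so that const operations could still update it (again, see GotW #6a and #6b):

    #include <atomic>

    template<typename T>
    class ref_counted_handle {
        struct control_block {
            T* object;
            std::atomic<long> count;   // the only invisibly shared state, so the
                                       //   only state synchronized internally
            control_block(T* p) : object(p), count(1) { }
        };
        control_block* cb;

    public:
        explicit ref_counted_handle(T* p) : cb(new control_block(p)) { }

        ref_counted_handle(const ref_counted_handle& other) : cb(other.cb) {
            cb->count.fetch_add(1, std::memory_order_relaxed);   // copying still writes the count
        }

        ~ref_counted_handle() {
            if (cb->count.fetch_sub(1, std::memory_order_acq_rel) == 1) {
                delete cb->object;   // last owner cleans up
                delete cb;
            }
        }

        ref_counted_handle& operator=(const ref_counted_handle&) = delete;  // omitted for brevity

        T* get() const { return cb->object; }
    };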

For completeness, yes, of course external synchronization is still required as usual if the calling code shares a given visible shared_ptr object and makes that same shared_ptr object writeable across threads:

// Case C: External synchronization still required as usual
// for non-const access to same visible shared object

// thread 1
{
    lock_guard hold(mut_sp);   // acquire lock
    auto sp3 = sp;             // read from sp
}

// thread 2
{
    lock_guard hold(mut_sp);   // acquire lock
    sp = something_else;       // modify sp
}

So it’s not like shared_ptr is a fully internally synchronized type; if the caller is sharing an object of that type, the caller must synchronize access to it like it would do for other types, as noted in Question 3(d).

So what’s the purpose of the internal synchronization? It’s only to do the necessary synchronization on the parts that the internals know are shared and that the internals own, but that the caller can’t synchronize because it doesn’t know about the sharing, and shouldn’t need to because it doesn’t own that state; the internals do. So in the internal implementation of the type, we do just enough internal synchronization to get back to the level where the caller can assume its usual duty of care and, in the usual ways, correctly synchronize any objects that might actually be shared.

The same applies to other uses of reference counting, such as copy-on-write strategies. It also applies generally to any other internal sharing going on under the covers between objects that appear distinct and independent to the calling code.

Guideline: If you design a class where two objects may invisibly share state under the covers, it is your class’ responsibility to internally synchronize access to that mutable shared state (only) that it owns and that only it can see, because the calling code can’t. If you opt for under-the-covers-sharing strategies like copy-on-write, be aware of the duty you’re taking on for yourself and code with care.

For why such internal shared state should be mutable, see GotW #6a and #6b.

 

5. What types should be fully internally synchronized, and why?

There is exactly one category of types which should be fully internally synchronized, so that any object of that type is always safe to use concurrently without external synchronization: Inter-thread synchronization and communication primitives themselves. This includes standard types like mutexes and atomics, but also inter-thread communication and synchronization types you might write yourself such as a message queue (communicating messages from one thread to another), Producer/Consumer active objects (again passing data from one concurrent entity to another), or a thread-safe counter (communicating counter increments and decrements among multiple threads).

If you’re wondering if there might be other kinds of types that should be internally synchronized, consider: The only type for which it would make sense to always internally synchronize every operation is a type where you know every object is going to be both (a) writeable and (b) shared across threads… and that means that the type is by definition designed to be used for inter-thread communication and/or synchronization.
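
To illustrate that one category, here is a minimal sketch (my own example) of a fully internally synchronized thread-safe counter. Its whole purpose is inter-thread communication, so every operation synchronizes internally and callers never need an external mutex:

    #include <atomic>

    // Fully internally synchronized by design: its purpose is to communicate
    // counter increments and decrements among threads, so callers need no
    // external synchronization.
    class thread_safe_counter {
        std::atomic<long> value{0};
    public:
        void increment()  { ++value; }
        void decrement()  { --value; }
        long get() const  { return value.load(); }
    };

A message queue or producer/consumer object would follow the same pattern, typically using a mutex and condition_variable internally rather than a single atomic.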

 

Acknowledgments

Thanks in particular to the following for their feedback to improve this article: Daniel Hardman, Casey, Alb, Marcel Wid, ixache.