The Future of C++: Live broadcast this Friday

In my talk on Friday, there will be announcements of broad interest to C++ developers on all compilers and platforms. Please help spread the word.

The Future of C++

Friday, November 2, 2012
12:45pm (U.S. Pacific Time)

This talk will give an update on recent progress and near-future directions for C++, both at Microsoft and across the industry, with some announcements of interest in both areas. The speaker is the lead language architect of Visual C++ and chair of the ISO C++ committee.

The talk will be webcast live on Channel 9, and available soon afterwards on demand.

If you know people who are interested in C++, on any platform, you’ll want to let them know to tune in.

Reader Q&A: volatile (again)

Sarmad Asgher asked a variant of a perennial question:

I am implementing multi producer single consumer problem. I have shared variables like m_currentRecordsetSize which tells the current size of the buffer. I am using m_currentRecordsetSize in a critical section do i need to declare it as volatile.

If you’re in C or C++, and the variable is not already being protected by a mutex or similar, then you need to declare it atomic (e.g., if it’s an int, then atomic_int in C or atomic<int> in C++. Not volatile.

Also is there any article by you on this topic. Please do reply.

There is! See my article “volatile vs. volatile” for the difference and why C/C++ volatile has nothing to do with inter-thread communication.

CTP of Windows XP Targeting with C++ in Visual Studio 2012

The three by-far-most-requested “missing features” from Visual C++ 2012 were:

Conformance: Keep adding more C++11 language conformance features.
XP Targeting: Deliver the ability to build applications that could run on Windows XP, as well as Windows Vista, 7, and 8.
Desktop Express: Deliver a free VC++ Express compiler that can be used to create traditional Windows desktop apps, not just Windows Store apps.

Over the spring and summer, we promised to address these “soon after VC++ 2012 ships.”

Well, VC++ 2012 shipped four weeks ago.

What’s happened since then?

3. On the same day VS2012 shipped, four weeks ago, we also announced and released Visual Studio 2012 Desktop Express – a free Visual Studio version for writing traditional Windows applications.

2. Today, we are pleased to share a “community tech preview” (CTP) of Windows XP targeting in Visual C++ 2012, being delivered as part of Visual Studio 2012 Update 1 CTP 3. You can download the preview here. See the announcement page for details, known issues, and full release information.

1. Stay tuned…

Poll: What features would you like to see added soonest in your favorite C++ compiler?

I just got back from teaching a class, and I’m always amazed at the breadth and diversity of C++ developers. As Bjarne Stroustrup famously says: “No one knows ‘what most C++ developers do.'”

In particular, I’m surprised at how strongly some people feel about certain features, such as refactoring or safety or raw performance or exhaustive conformance, that don’t matter much at all to other people, and it made me wonder how a cross-section of the C++ developer community might prioritize them if I ran a poll.

So let me pose a question: Whether you use gcc or Clang or VC++ or Intel C++ or any other compiler, if you could only pick five of the following for your compiler to implement in its very next release, which would they be? Note that this is intended to be purely an “urgency” or ordering exercise, not an “importance” or eventually-will/won’t-do exercise — assume that your compiler vendor would eventually add everything possible from the list in their next few releases, and ask yourself only what would you just love to have in your hands the very soonest, right away?

(This is not intended to be a complete list, by the way, just picking some things I’ve again recently heard people mention, to validate how broadly interesting they really are.)

Thanks in advance for your participation. I’m really curious to see the results.

[Updated to add: Several people have asked to change the poll to add an option for “faster build times”. I won’t change the poll while in progress which would skew results, but for now if you want to vote for “faster builds” please post it as a comment and I’ll manually add those comments up since that seems to be a popular request.]

Casablanca: C++ on Azure

I’ve blogged about Casablanca before. Here’s a related talk from TechEd Australia:

Casablanca is a Microsoft incubation effort to support cloud-based client-server communication in native code using a modern asynchronous C++ API design. Think of it as Node.js, but using C++ – from simple services, to JSON and REST, to Azure storage and deployment, and more.

Casablanca gives you the power to use existing native C++ libraries and code to do awesome things on the cloud server. In this talk from TechEd Australia, John Azariah and Mahesh Krishnan show how it’s done.

C&B 2012 panel posted: Ask Us Anything!

The second panel from C++ and Beyond 2012 is now available on Channel 9:

Alexandrescu, Meyers and Sutter – Ask Us Anything

Here is the “Ask Us Anything” panel from C++ and Beyond 2012.

Andrei Alexandrescu, Scott Meyers and Herb Sutter take questions from attendees. As expected, great questions and answers…

Table of contents (click the time codes ([xx:xx]) to hear the answers…):

message passing primitives in future versions of the standard… [00:00]

standardized unit testing framework… [02:55]

std::async… [04:30]

standard modules proposal… [08:14]

keyword additions and the standard library… [09:35]

problems (and solutions) with exceptions… [12:50]

future of concepts… [22:34]

std::thread and thread interruption… [23:03]

when to use the auto keyword (and when not to…)… [25:03]

more on auto (benefits of reduncancy, type conversion issues with bool to int?)… [29:31]

const and multithreaded programming, in C++11 const means thread safe, too… [35:00]

yet more on auto (impact on rampant use and code readability/comprehension)… [42:42]

compiler type deduction information (compiler switch that prints out auto deduced type information)… [50:18]

printing out code for review that replaces auto with the actual type… [53:30]

auto and dynamic memory allocation… [54:59]

useful, broadly-used concurrency libraries… [57:00]

VC++ 2012 Desktop Express (Free)

Today Microsoft released another free Express version of Visual C++ 2012. In addition to the free Express Visual C++ compiler for building tablet applications, Visual Studio Express 2012 for Windows Desktop directly supports traditional Windows and command-line applications in C++.

This a great free C++ compiler on Windows for everything from hobby development to using and contributing to open source projects. Besides additional C++11 standards conformance with range-for, override and final on the language side (with more to come in the coming months; watch this space) and a complete C++11 standard library implementation, the free compiler also includes unit testing framework for C++, code analysis for C++ (try /analyze today if you haven’t already, as John Carmack says so well), C++ AMP for GPGPU programming, and much more.

Reader Q&A: How to write a CAS loop using std::atomics

The following is not intended to be a complete treatise on atomics, but just an answer to a specific question.

A colleague asked:

How should one write the following “conditional interlocked” function in the new C++ atomic<> style?

// if (*plValue >= 0) *plValue += lAdd  ; return the original value

LONG MpInterlockedAddNonNegative(__inout LONG volatile* plValue,  __in  LONG const  lAdd) 
{ 
    LONG lValue = 0; 
    for (;;)  {

        lValue = *plValue; // volatile plValue suppress compile optimizations in which

                           // lValue is optimized out hence MT correctness is broken

        if (lValue < 0)   break;

        if (lValue == InterlockedCompareExchange(plValue, lValue + lAdd, lValue)) { 
            break; 
        } 
    }

    return lValue; 
}

Note: ISO C/C++ volatile is not for inter-thread communication,[*] but this is legacy code that predates std::atomics and was using a combination of platform-specific volatile semantics and Windows InterlockedXxx APIs.

The answer is to use a CAS loop (see code at top), which for std::atomics is spelled compare_exchange:

Use compare_exchange_weak by default when looping on this which generally naturally tolerates spurious failures.
Use compare_exchange_strong for single tests when you generally don’t want spurious failures.
Usage note: In the code at top we save an explicit reload from ‘a’ in the loop because compare_exchange helpfully (or “helpfully” – this took me a while to discover and remember) stores the actual value in the ‘expected’ value slot on failure. This actually makes loops simpler, though some of us are still have different feelings on different days about whether this subtlety was a good idea… anyway, it’s in the standard.

For the std::atomic version, roughly (compiling in my head), and generalizing to any numeric type just because I’m in the habit, and renaming for symmetry with atomic<T>::fetch_add(), I think this is what you want:

template<typename T>
T fetch_add_if_nonnegative( std::atomic<T>& a,  T val ) {
    T old = a;
    while( old >= 0 && !a.compare_exchange_weak( old, old+val ) )
        { }
    return old;
}

Because the only test in your loop was to break on negative values, it naturally migrated into the loop condition. If you want to do more work, then follow the general pattern which is the following (pasting from the standard, 29.6.5/23 – and note that the explicit “.load()” is unnecessary but some people including the author of this clause of the standard prefer to be pedantically explicit :) ):

[ Example: the expected use of the compare-and-exchange operations is as follows.

The compare-and-exchange operations will update expected when another iteration of the loop is needed.

expected = current.load();

do {

desired = function(expected);

} while (!current.compare_exchange_weak(expected, desired));

—end example ]

So the direct implementation of your function in the general pattern would be:

T old = a; 
do { 
    if( old < 0 ) break; 
} while(!a.compare_exchange_weak( old, old+val ) )

but since that easily moves into the loop test I just did this instead in the code at top:

T old = a; 
while( old >= 0 && !a.compare_exchange_weak( old, old+val ) ) 
    { }

and hoping that no one will discover and point out that I’ve somehow written a subtle bug by trying to make the code cuter just before leaving for a holiday weekend.

[*] Here’s the difference between ISO C/C++ volatile vs. std::atomic<T>/atomic_T: ISO C/C++ volatile is intended to be used only for things like hardware access and setjmp/longjmp safety, to express that the variable is in storage that is not guaranteed to follow the C++11 memory model (e.g., the compiler can’t make any assumptions about it). It has nothing to do with inter-thread communication – the proper tool for that is std::atomic<T> which for C compatibility can also be spelled atomic_T (note that in Java and C# this is called volatile which adds to the confusion). For more, see my article “volatile vs. volatile” and Hans Boehm’s ISO C++ paper “Should volatile Acquire Atomicity and Thread Visibility Semantics?”.

C&B Panel: Alexandrescu, Meyers, Sutter on Static If, C++11, and Metaprogramming

The first panel from C++ and Beyond 2012 is now available on Channel 9:

On Static If, C++11 in 2012, Modern Libraries, and Metaprogramming

Andrei Alexandrescu, Scott Meyers, Herb Sutter

Channel 9 was invited to this year’s C++ and Beyond to film some sessions (that will appear on C9 over the coming months!)…

At the end of day 2, Andrei, Herb and Scott graciously agreed to spend some time discussing various modern C++ topics and, even better, answering questions from the community. In fact, the questions from Niners (and a conversation on reddit/r/cpp) drove the conversation.

Here’s what happened…

[more]

“Strong” and “weak” hardware memory models

In Welcome to the Jungle, I predicted that “weak” hardware memory models will disappear. This is true, and it’s happening before our eyes:

x86 has always been considered a “strong” hardware memory model that supports sequentially consistent atomics efficiently.
The other major architecture, ARM, recently announced that they are now adding strong memory ordering in ARMv8 with the new sequentially consistent ldra and strl instructions, as I predicted they would. (Actually, Hans Boehm and I influenced ARM in this direction, so it was an ever-so-slightly disingenuous prediction…)

However, at least two people have been confused by what I meant by “weak” hardware memory models, so let me clarify what “weak” means – it means something different for hardware memory models and software memory models, so perhaps those aren’t the clearest terms to use.

By “weak (hardware) memory model” CPUs I mean specifically ones that do not natively support efficient sequentially consistent (SC) atomics, because on the software side programming languages have converged on “sequential consistency for data-race-free programs” (SC-DRF, roughly aka DRF0 or RCsc) as the default (C11, C++11) or only (Java 5+) supported software memory model for software. POWER and ARMv7 notoriously do not support SC atomics efficiently.

Hardware that supports only hardware memory models weaker than SC-DRF, meaning that they do not support SC-DRF efficiently, are permanently disadvantaged and will either become stronger or atrophy. As I mentioned specifically in the article, the two main current hardware architectures with what I called “weak” memory models were current ARM (ARMv7) and POWER:

ARM recently announced ARMv8 which, as I predicted, is upgrading to SC acquire/release by adding new SC acquire/release instructions ldra and strl that are mandatory in both 32-bit and 64-bit mode. In fact, this is something of an industry first — ARMv8 is the first major CPU architecture to support SC acquire/release instructions directly like this. (Note: That’s for CPUs, but the roadmap for ARM GPUs is similar. ARM GPUs currently have a stronger memory model, namely fully SC; ARM has announced their GPU future roadmap has the GPUs fully coherent with the CPUs, and will likely add “SC load acquire” and “SC store release” to GPUs as well.)
It remains to be seen whether POWER will adapt similarly, or die out.

Note that I’ve seen some people call x86 “weak”, but x86 has always been the poster child for a strong (hardware) memory model in all of our software memory model discussions for Java, C, and C++ during the 2000s. Therefore perhaps “weak” and “strong” are not useful terms if they mean different things to some people, and I’ve updated the WttJ text to make this clearer.

I will be discussing this in detail in my atomic<> Weapons talk at C&B next week, which I hope to make freely available online in the near future (as I do most of my talks). I’ll post a link on this blog when I can make it available online.