• Home
  • Welcome to the Jungle
  • Elements of Modern C++ Style
  • About

Sutter’s Mill

Herb Sutter on software, hardware, and concurrency

Feeds:
Posts
Comments
« C&B Panel: Alexandrescu, Meyers, Sutter on Static If, C++11, and Metaprogramming
VC++ 2012 Desktop Express (Free) »

Reader Q&A: How to write a CAS loop using std::atomics

2012-08-31 by Herb Sutter

The following is not intended to be a complete treatise on atomics, but just an answer to a specific question.

A colleague asked:

How should one write the following “conditional interlocked” function in the new C++ atomic<> style?

// if (*plValue >= 0) *plValue += lAdd  ; return the original value

LONG MpInterlockedAddNonNegative(__inout LONG volatile* plValue,  __in  LONG const  lAdd) 
{ 
    LONG lValue = 0; 
    for (;;)  {

        lValue = *plValue; // volatile plValue suppress compile optimizations in which

 

                           // lValue is optimized out hence MT correctness is broken

        if (lValue < 0)   break;

        if (lValue == InterlockedCompareExchange(plValue, lValue + lAdd, lValue)) { 
            break; 
        } 
    }

    return lValue; 
}

Note: ISO C/C++ volatile is not for inter-thread communication,[*] but this is legacy code that predates std::atomics and was using a combination of platform-specific volatile semantics and Windows InterlockedXxx APIs.

The answer is to use a CAS loop (see code at top), which for std::atomics is spelled compare_exchange:

  • Use compare_exchange_weak by default when looping on this which generally naturally tolerates spurious failures.
  • Use compare_exchange_strong for single tests when you generally don’t want spurious failures.
  • Usage note: In the code at top we save an explicit reload from ‘a’ in the loop because compare_exchange helpfully (or “helpfully” – this took me a while to discover and remember) stores the actual value in the ‘expected’ value slot on failure. This actually makes loops simpler, though some of us are still have different feelings on different days about whether this subtlety was a good idea… anyway, it’s in the standard.

For the std::atomic version, roughly (compiling in my head), and generalizing to any numeric type just because I’m in the habit, and renaming for symmetry with atomic<T>::fetch_add(), I think this is what you want:

template<typename T>
T fetch_add_if_nonnegative( std::atomic<T>& a,  T val ) {
    T old = a;
    while( old >= 0 && !a.compare_exchange_weak( old, old+val ) )
        { }
    return old;
}

Because the only test in your loop was to break on negative values, it naturally migrated into the loop condition. If you want to do more work, then follow the general pattern which is the following (pasting from the standard, 29.6.5/23 – and note that the explicit “.load()” is unnecessary but some people including the author of this clause of the standard prefer to be pedantically explicit :) ):

[ Example: the expected use of the compare-and-exchange operations is as follows.

The compare-and-exchange operations will update expected when another iteration of the loop is needed.

expected = current.load();

do {

desired = function(expected);

} while (!current.compare_exchange_weak(expected, desired));

—end example ]

So the direct implementation of your function in the general pattern would be:

T old = a; 
do { 
    if( old < 0 ) break; 
} while(!a.compare_exchange_weak( old, old+val ) )


but since that easily moves into the loop test I just did this instead in the code at top:

T old = a; 
while( old >= 0 && !a.compare_exchange_weak( old, old+val ) ) 
    { }

and hoping that no one will discover and point out that I’ve somehow written a subtle bug by trying to make the code cuter just before leaving for a holiday weekend.

 

[*] Here’s the difference between ISO C/C++ volatile vs. std::atomic<T>/atomic_T: ISO C/C++ volatile is intended to be used only for things like hardware access and setjmp/longjmp safety, to express that the variable is in storage that is not guaranteed to follow the C++11 memory model (e.g., the compiler can’t make any assumptions about it). It has nothing to do with inter-thread communication – the proper tool for that is std::atomic<T> which for C compatibility can also be spelled atomic_T (note that in Java and C# this is called volatile which adds to the confusion). For more, see my article “volatile vs. volatile” and Hans Boehm’s ISO C++ paper “Should volatile Acquire Atomicity and Thread Visibility Semantics?”.

Posted in C++, Concurrency | 8 Comments

8 Responses

  1. on 2012-09-02 at 12:30 am MF

    I hate to disappoint you on your holiday, but I think you have reversed the ‘old >= 0′ test between the second to last and the last version.


  2. on 2012-09-02 at 6:35 am Herb Sutter

    @MF: You’re right. Fixed, thanks.


  3. on 2012-09-02 at 11:38 pm pizer

    I don’t understand why you put the old=a assignment out of the loop. Could you shed some more light on this, please? To me, it seems that it breaks the “atomicity” of the operation.


  4. on 2012-09-03 at 12:34 am Amir

    In the version that appears on the front page, you still have the “if( old >= 0 ) break” condition.
    Sigh…I thought I was be the first one who noticed the bug :-)


  5. on 2012-09-03 at 8:29 am Herb Sutter

    @Amir: It must have taken time to refresh the front page — it looks okay to me now.

    @Pizer: The operation is a RMW (read-modify-write). So you have to do an initial load (which is atomic in itself), then the work (in this case a test since we need do nothing if the value is less than 0), then the update write (which is also atomic) — and to make it all atomic, the update write has to test that the value hasn’t changed, and if it has then do nothing (we’re not atomic) and retry (until the whole transaction is atomic, not interleaved with something else). Note that, as the commentary says, the compare_exchange will again reload the current value on failure so you don’t have to write an explicit equivalent “old = a;” inside the loop — this is kind of subtle, but it’s the way it was specified because it gives a slight performance advantage.


  6. on 2012-09-04 at 6:16 am Javier Estrada

    Herb, where’s the book ;-)

    With the amount of work in my plate I rather read this emerging good advice and good practices in book form.

    Cheers,
    –Javier


  7. on 2012-09-05 at 3:39 pm Joe

    Would it be even better to do the initial load using a.load(std::memory_order_relaxed) ? That would probably be most equivalent to the original implementation, wouldn’t it? It would also save you from the unnecessary synchronization on the load, and just rely on the CAS to keeps things atomic.


  8. on 2012-09-05 at 4:20 pm Paul Coccoli

    Am I right in that the behavior of the ‘expected’ argument to compare_exchange is analogous to the behavior of the GCC intrinsic __sync_val_compare_and_swap()? That intrinsic returns the value it found at the memory location, instead of a boolean, and therefore makes an explicit reload unnecessary, IIUC.



Comments are closed.

  • Tweets

    • Lost two comments: As mentioned in my GotW kickoff post, I’m experimenting with software and a workflow that l... bit.ly/14IUuDA 11 hours ago
    • GotW #6b: Const-Correctness, Part 2: const and mutable are powerful tools for writing safer code. Use them con... bit.ly/11ivVOR 15 hours ago
    • GotW #6a: Const-Correctness, Part 1: const and mutable have been in C++ for many years. How well do you know w... bit.ly/14I2apD 15 hours ago
    Follow @herbsutter
  • Popular

    • GotW #5 Solution: Overriding Virtual Functions
    • GotW #6a: Const-Correctness, Part 1
    • GotW #6a Solution: Const-Correctness, Part 1
  • Categories

    • Apple
    • C# / .NET
    • C++
    • Cloud
    • Concurrency
    • Effective Concurrency
    • Friday Thoughts
    • GotW
    • Hardware
    • Java
    • Microsoft
    • Opinion & Editorial
    • Reader Q&A
    • Software Development
    • Talks & Events
    • Uncategorized
    • Web

Blog at WordPress.com.

Theme: Customized MistyLook by WPThemes.


Follow

Get every new post delivered to your Inbox.

Join 1,423 other followers

Powered by WordPress.com