This month’s Effective Concurrency column, “Eliminate False Sharing”, is now live on DDJ’s website.
People keep writing to ask me about my previous mentions of false sharing, and some even debate whether it’s really a problem. So this month I decided to treat it in depth, including:
- A compelling and realistic example where just changing a couple of lines to remove false sharing takes an algorithm from zero scaling to perfect scaling – even when many threads are merely doing reads. Hopefully after this nobody will argue that false sharing isn’t a problem. :-)
- How your performance monitoring and analysis tools do and don’t help you uncover the problem, and how to use them effectively to identify the culprit. Short answer: CPU activity monitors aren’t very helpful, but cycles-per-instruction (CPI) and cache miss rate measurements attributed to specific lines of source code are your friends.
- The two ways to correct the code: Reduce the frequency of writes to the too-popular cache line, or add padding to move other data off the line.
- Reusable code in C++ and C#, and a note about Java, that you can use to apply padding (and alignment, where available) to put frequently-updated objects on their own cache lines; see the sketch after this list.
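To give a flavor of the padding technique, here’s a minimal sketch of my own, not the article’s actual code: it assumes a 64-byte cache line and a C++11-or-later compiler with alignas. The wrapper name `CacheLinePadded` is hypothetical.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Assumed line size: 64 bytes is typical for x86; check your target hardware.
constexpr std::size_t kCacheLineSize = 64;

// Pads and aligns T so each instance occupies its own cache line.
// alignas rounds sizeof(CacheLinePadded<T>) up to a multiple of the
// alignment, so adjacent array elements cannot share a line.
template <typename T>
struct alignas(kCacheLineSize) CacheLinePadded {
    T value;
};

int main() {
    // One counter per worker, each on its own cache line: the threads
    // write to different lines, so their updates no longer ping-pong
    // a single line between cores.
    CacheLinePadded<std::atomic<long>> counters[4];
    for (auto& c : counters) c.value.store(0);

    std::vector<std::thread> workers;
    for (auto& c : counters) {
        workers.emplace_back([&c] {
            for (int i = 0; i < 1000000; ++i)
                c.value.fetch_add(1, std::memory_order_relaxed);
        });
    }
    for (auto& t : workers) t.join();
}
```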
From the article:
In two previous articles I pointed out the performance issue of false sharing (aka cache line ping-ponging), where threads use different objects but those objects happen to be close enough in memory that they fall on the same cache line, and the cache system treats them as a single lump that is effectively protected by a hardware write lock that only one core can hold at a time. … It’s easy to see why the problem arises when multiple cores are writing to different parts of the same cache line… In practice, however, it can be even more common to encounter a reader thread using what it thinks is read-only data still getting throttled by a writer thread updating a different but nearby memory location…
A number of readers have asked for more information and examples on where false sharing arises and how to deal with it. … This month, let’s consider a concrete example that shows an algorithm in extremis due to false sharing distress, how to use tools to analyze the problem, and the two coding techniques we can use to eliminate false sharing trouble. …
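To make the quoted failure mode concrete, here is a small illustration of my own (not from the article) of the first fix, reducing write frequency: each thread accumulates into a function-local variable, which lives in its own stack frame and typically in a register on the hot path, and writes the shared result slot only once at the end. The `count_matches` functions are hypothetical workers invented for this sketch.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// BAD: every increment writes result[id], and adjacent result slots
// typically share a cache line, so the writes ping-pong the line
// between cores even though each thread touches only its own slot.
void count_matches_bad(const std::vector<int>& data,
                       std::size_t begin, std::size_t end,
                       int* result) {
    for (std::size_t i = begin; i < end; ++i)
        if (data[i] % 2 == 0) ++*result;   // hot write to a shared line
}

// GOOD: accumulate in a local, write the shared location once.
void count_matches_good(const std::vector<int>& data,
                        std::size_t begin, std::size_t end,
                        int* result) {
    int local = 0;                          // register/stack, thread-private
    for (std::size_t i = begin; i < end; ++i)
        if (data[i] % 2 == 0) ++local;
    *result = local;                        // single write at the end
}

int main() {
    const std::size_t n = 1 << 24;
    std::vector<int> data(n, 1);
    const int nthreads = 4;
    std::vector<int> results(nthreads, 0);  // slots are adjacent in memory

    std::vector<std::thread> workers;
    for (int t = 0; t < nthreads; ++t) {
        std::size_t chunk = n / nthreads;
        workers.emplace_back(count_matches_good, std::cref(data),
                             t * chunk, (t + 1) * chunk, &results[t]);
    }
    for (auto& w : workers) w.join();
}
```

Note that the good version still leaves the result slots on a shared line; it just writes to them rarely enough that the contention disappears. Padding, as in the earlier sketch, is the complementary fix when the writes themselves can’t be batched.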
I hope you enjoy it. Finally, here are links to previous Effective Concurrency columns:
The Pillars of Concurrency (Aug 2007)
How Much Scalability Do You Have or Need? (Sep 2007)
Use Critical Sections (Preferably Locks) to Eliminate Races (Oct 2007)
Apply Critical Sections Consistently (Nov 2007)
Avoid Calling Unknown Code While Inside a Critical Section (Dec 2007)
Use Lock Hierarchies to Avoid Deadlock (Jan 2008)
Break Amdahl’s Law! (Feb 2008)
Going Superlinear (Mar 2008)
Super Linearity and the Bigger Machine (Apr 2008)
Interrupt Politely (May 2008)
Maximize Locality, Minimize Contention (Jun 2008)
Choose Concurrency-Friendly Data Structures (Jul 2008)
The Many Faces of Deadlock (Aug 2008)
Lock-Free Code: A False Sense of Security (Sep 2008)
Writing Lock-Free Code: A Corrected Queue (Oct 2008)
Writing a Generalized Concurrent Queue (Nov 2008)
Understanding Parallel Performance (Dec 2008)
Measuring Parallel Performance: Optimizing a Concurrent Queue (Jan 2009)
volatile vs. volatile (Feb 2009)
Sharing Is the Root of All Contention (Mar 2009)
Use Threads Correctly = Isolation + Asynchronous Messages (Apr 2009)
Use Thread Pools Correctly: Keep Tasks Short and Nonblocking (Apr 2009)
Eliminate False Sharing (May 2009)