The adage, quoted again this week by Bruce Schneier:
The plural of “anecdote” is not “data.”
But lest we enshrine raw data as holy in itself, another perspective:
And the plural of “datum” is not “proof.”
Last month, I was privileged to be part of a panel in a webinar on Embedded Multicore Development moderated by Richard Nass, Editor-in-Chief of Embedded Systems Design, Embedded.com, and the Embedded Systems Conferences. It’s online and available on demand.
I say “privileged” especially because of the stature of the other panelists. These distinguished gentlemen were:
It was a fun discussion, and I hope informative.
[Lock-free code is] hard even for experts. It’s easy to write lock-free code that appears to work, but it’s very difficult to write lock-free code that is correct and performs well. Even good magazines and refereed journals have published a substantial amount of lock-free code that was actually broken in subtle ways and needed correction.
To illustrate, let’s dissect some peer-reviewed lock-free code that was published here in DDJ just two months ago …
|August 2007||The Pillars of Concurrency|
|September 2007||How Much Scalability Do You Have or Need?|
|October 2007||Use Critical Sections (Preferably Locks) to Eliminate Races|
|November 2007||Apply Critical Sections Consistently|
|December 2007||Avoid Calling Unknown Code While Inside a Critical Section|
|January 2008||Use Lock Hierarchies to Avoid Deadlock|
|February 2008||Break Amdahl’s Law!|
|March 2008||Going Superlinear|
|April 2008||Super Linearity and the Bigger Machine|
|May 2008||Interrupt Politely|
|June 2008||Maximize Locality, Minimize Contention|
|July 2008||Choose Concurrency-Friendly Data Structures|
|August 2008||The Many Faces of Deadlock|
|September 2008||Lock-Free Code: A False Sense of Security|
Today I received an email that asked:
I have recently come across your excellent articles on concurrency and the changes in software writing paradigm. They make a lot of sense, but I am having trouble translating them to my world of Telecom oriented web services, where practically everything is run through a DBMS. It seems to me we get everything “free”, simply by using an inherently concurrent multi-everything beast such as that :-) .
Could you please share your thoughts on the issue in one of your coming blog entries? It seems to me nowadays most complex systems would take advantage of a DBMS, certainly any application that is internet based, telecom oriented, or enterprise level. Be it in C++, Java, or PHP and its ilk, using a DBMS – often as a sort of message queue – is one of the best practices that ensures parallelism.
Sure. At right is a slide I give in talks that summarizes the answer to this question, and I’ve addressed this and other similar issues in an ACM Queue article.
The problem with taking advantage of multicore/manycore hardware isn’t (as much) on the server, it’s on the client. When experienced people say things like, “but the concurrency problem is already solved, we’ve been building scalable software for years,” that’s server or niche-client application people talking. That kind of laid-back sound bite sure isn’t coming from mainstream client application developers.
On the server:
We already know how to build somewhat scalable server apps. Sure, it’s still rocket science and takes expert knowledge to do well. But we generally know the rocket science, have experts who can implement it with repeatable success, and have regularly scheduled missions to the “scalable servers” space station. With some care, we can say that the “concurrency problem is already solved” here.
The world is very different for typical mainstream client applications (i.e., I’m not talking about Photoshop and a handful of others), where:
We’re still discovering and productizing the rocket science here. You could say that the tools like OpenMP that we do have now are still at the V-2 stage — they have limited applicability, are somewhat fussy, and don’t always land where you aim them.
But we’re working on it. Up-and-coming tools like Threading Building Blocks are like the Mercury and Venera missions, setting out to reach successively higher goals and repeatability… and we’re starting to see what are perhaps Apollo- and ISS-class missions in the form of PLINQ, the Task Parallel Library, and one for native C++ we’ll be announcing in October at PDC. In part, these tools are trying to see how much we can make client workloads look more like server workloads, notably in providing a transaction-oriented programming model. For example, transactional memory is an area of active research that would let us write “begin transaction; /* … read/write whatever memory variables you feel the need to, then … */ end transaction;”, and if successful it could eventually replace many or even most existing uses of locks.
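To make the contrast concrete, here is a minimal sketch of the kind of lock-based code a transaction might someday replace. Transaction blocks aren't part of standard C++, so the code below shows only the lock-based side, using `std::scoped_lock` from C++17. The `Account` type and `transfer` function are hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical account type, used only for illustration.
struct Account {
    std::mutex m;      // guards balance
    long balance = 0;
};

// Moves `amount` from one account to the other as one indivisible step.
// In a transactional model the two updates would simply sit inside
// "begin transaction; ... end transaction;". With locks, we have to
// know which mutexes protect which data and take all of them ourselves.
void transfer(Account& from, Account& to, long amount) {
    assert(amount >= 0);
    if (&from == &to) return;  // locking the same mutex twice would be undefined behavior

    // std::scoped_lock acquires both mutexes using a deadlock-avoidance
    // algorithm, so two threads transferring in opposite directions
    // cannot deadlock on lock ordering.
    std::scoped_lock guard(from.m, to.m);
    from.balance -= amount;
    to.balance += amount;
}
```

Even in this tiny example, the caller has to get three things right: which locks to take, taking all of them, and avoiding the self-transfer case. Those are the details a transaction-oriented model aims to hide.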
We have rightly celebrated some successful ‘manned’ flights with client products like Photoshop (parallel rendering) and Excel (parallel recalc) that scale to a number of cores. We’re on the road to, but still working toward, establishing the infrastructure and technology base to enable regularly scheduled commercial flights/shipments of scalable client applications that “light up” on multicore/manycore machines.
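As a rough illustration of what “scales to a number of cores” means in code, here is a minimal sketch that splits a summation into one chunk per hardware thread. It is not how Photoshop or Excel do it; `parallel_sum` is a hypothetical helper, and the chunking strategy is an assumption chosen for brevity.

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper: sums `data` using roughly one task per hardware thread.
long long parallel_sum(const std::vector<int>& data) {
    // hardware_concurrency() may return 0 when the count is unknown,
    // so fall back to a single chunk in that case.
    const std::size_t cores = std::max(1u, std::thread::hardware_concurrency());
    const std::size_t chunk = (data.size() + cores - 1) / cores;  // ceiling division

    std::vector<std::future<long long>> parts;
    for (std::size_t first = 0; first < data.size(); first += chunk) {
        const std::size_t last = std::min(first + chunk, data.size());
        // Each task reads a disjoint slice, so no locking is needed.
        parts.push_back(std::async(std::launch::async, [&data, first, last] {
            const auto base = data.begin();
            return std::accumulate(base + static_cast<std::ptrdiff_t>(first),
                                   base + static_cast<std::ptrdiff_t>(last), 0LL);
        }));
    }

    long long total = 0;
    for (auto& p : parts) total += p.get();  // wait for and combine the partial sums
    return total;
}
```

The same binary uses one task on a single-core machine and more tasks on a machine with more cores, which is the “lights up” behavior in miniature. Real client workloads are rarely this regular, which is a large part of why the tooling is still maturing.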