Effective Concurrency: Going Superlinear

The latest Effective Concurrency column, "Going Superlinear", just went live on DDJ’s site, and will also appear in the print magazine. From the article:

We spend most of our scalability lives inside a triangular box, shown in Figure 1. It reminds me of the early days of flight: We try to lift ourselves away from the rough ground of zero scalability and fly as close as possible to the cloud ceiling of linear speedup. Normally, the Holy Grail of parallel scalability is to linearly use P processors or cores to complete some work almost P times faster, up to some hopefully high number of cores before our code becomes bound on memory or I/O or something else that means diminishing returns for adding more cores. As Figure 1 illustrates, the traditional shape of our "success" curve lies inside the triangle.

Sometimes, however, we can equip our performance plane with extra tools and safely break through the linear ceiling into the superlinear stratosphere. So the question is: "Under what circumstances can we use P cores to do work more than P times faster?" There are two main ways to enter that rarefied realm:

  • Do disproportionately less work.
  • Harness disproportionately more resources.

This month and next, we’ll consider situations and techniques that fall into one or both of these categories. …

I hope you enjoy it.
Finally, here are links to previous Effective Concurrency columns (based on the dates they hit the web, not the magazine print issue dates):
July 2007 The Pillars of Concurrency
August 2007 How Much Scalability Do You Have or Need?
September 2007 Use Critical Sections (Preferably Locks) to Eliminate Races
October 2007 Apply Critical Sections Consistently
November 2007 Avoid Calling Unknown Code While Inside a Critical Section
December 2007 Use Lock Hierarchies to Avoid Deadlock
January 2008 Break Amdahl’s Law!
February 2008 Going Superlinear

2 thoughts on “Effective Concurrency: Going Superlinear”

  1. Thanks for the article. The one thing I don’t understand is this: If you are trying to find the first match in parallel when the matches are clustered, how do you terminate when you’ve found the first match? Won’t there need to be some IPC for the thread that finds the hit to let the others know?

    I do most of my parallel work in MPI where I can’t just check a mutex every iteration. Is this superlinear algorithm limited to environments in which communication is inexpensive?

  2. (no name) asked: "If you are trying to find the first match in parallel when the matches are clustered, how do you terminate when you’ve found the first match? Won’t there need to be some IPC for the thread that finds the hit to let the others know?"
    Yes, and you’ve anticipated the direction we’re going in the next two articles. The next article is mostly still about superlinear scaling and has a couple of paragraphs about why interruption is important, and the article after that will be devoted to the topic of interruption/cancellation.
