Other Concurrency Sessions at PDC’09

I mentioned yesterday that I’ll be involved in two sessions at PDC09, including a parallel patterns tutorial. I know many of you are interested in concurrency in general and on Microsoft platforms in particular, so I thought I’d share this more complete list of concurrency-related sessions at PDC, put together by my colleague Stephen Toub.

Overview:

Native code in Visual Studio 2010:

Managed code in Visual Studio 2010:

HPC Server:

Accelerating Applications Using Windows HPC Server 2008

Research and Incubation:

PDC’09: Tutorial & Panel

For those of you coming to PDC’09 in Los Angeles a couple of weeks from now, I’ll be there for a few hours on Monday and Wednesday participating in two events:

Patterns of Parallel Programming: A Tutorial on Fundamental Patterns and Practices for Parallelism. The full-day tutorial is full of useful information. I’ll be giving the first hour or so as an intro/overview; if you’ve seen my high-level concurrency talks you’ll recognize much of it, but it’ll also have a slant toward patterns of course to set up the day. The rest of the tutorial will be presented by my colleagues Richard Ciapata, Ade Miller, and Stephen Toub.
Panel: Microsoft Perspectives on the Future of Programming. This one’s going to be a blast. You can judge just from the names of my heavyweight fellow panelists: Butler Lampson, Burton Smith, Don Box, Erik Meijer, and Jeffrey Snover. Anytime you get a chance to watch or attend a talk by any of these, do. If you get a chance to come to this panel when they’re all on one stage, definitely do. As Bill Shatner might put it: “Do. Not Miss!”

See you at PDC!

Hoare on Testing

On the flight to the ISO C++ standards meeting this morning, I was reading this month’s issue of CACM, and found that Sir C.A.R. (Tony) Hoare wrote a nice piece called Retrospective: An Axiomatic Basis for Computer Programming.

Hoare has long been a noted proponent of axioms and formal proofs of program correctness. In that light, the following passage on testing and axioms struck me as well put and I thought I’d share it (emphasis added):

One thing I got spectacularly wrong. I could see that programs were getting larger, and I thought that testing would be an increasingly ineffective way of removing errors from them. I did not realize that the success of tests is that they test the programmer, not the program. Rigorous testing regimes rapidly persuade error-prone programmers (like me) to remove themselves from the profession. Failure in test immediately punishes any lapse in programming concentration, and (just as important) the failure count enables implementers to resist management pressure for premature delivery of unreliable code [or forces management to be explicitly unreasonable in the face of bug bar data and specific failure cases having an objective severity –hps]. The experience, judgment, and intuition of programmers who have survived the rigors of testing are what make programs of the present day useful, efficient, and (nearly) correct. Formal methods for achieving correctness must support the intuitive judgment of programmers, not replace it.

My basic mistake was to set up proof in opposition to testing, where in fact both of them are valuable and mutually supportive ways of accumulating evidence of the correctness and serviceability of programs. …

He also offers many other useful observations and reminders, including the value of assertions for finding not run-time errors, but programming bugs. (See also C++ Coding Standards Item 68: Assert liberally to document internal assumptions and invariants.)

The whole article is good reading, and not long. Recommended.

Deprecating export considered for ISO C++0x

How interesting.

I’m at the ISO C++ meeting in Santa Cruz, CA, USA this week. Ten minutes ago we had a committee straw poll about whether we should remove, deprecate, or leave as-is the export template feature for C++0x. The general sentiment was to remove or deprecate it, with deprecation getting the strongest support because it’s safer per ISO procedure. Deprecation would not remove it from C++0x, but would put the world on notice that the committee is considering removing it from a future standard.

Note: Nothing is happening to export at this meeting. The effect of this straw poll of the committee was to give guidance to the core language working group (CWG) on the committee’s general sentiment on this issue. The CWG is taking this as guidance to produce a detailed proposal to take some action on export at our next meeting in Pittsburgh in March.

I’ll write a longer trip report in the next week, but this was worth noting in real time because it’s the first time export template has been discussed in committee since the highly controversial discussion of my and Tom Plum’s proposal to remove export at the Oxford meeting in 2003.

A Concurrency Poll

I’ve opened up a short concurrency poll to get a sense of what concurrency issues are top-of-mind for programmers, and I’d appreciate it if you could take a few minutes to participate. Some questions are about what you want to learn more about, others about your tools of choice in specific areas, and a few are slightly whimsical. I plan to use the results as input to topics to cover in future Effective Concurrency articles and talks, so by participating you’ll help influence the direction of future EC topics.

There are about 28 questions, each asking for just a word or a phrase in answer. Here again is the link: http://www.surveygizmo.com/s/193214/apw36

Thank you in advance for taking a few minutes to participate. If you’re interested in receiving a summary of the survey results, please leave your email address at the end of the survey and I’ll send you a copy (your email will not be used for any other purpose).

Mailbag: Shutting up compiler warnings

I recently received the following reader question (slightly edited):

About the (Stroustrup) approach of implementing IsDerivedFrom at page 27 in your book More Exceptional C++: […] why the second pointer assignment in:

  static void Constraints(D* p)
  {
    B* pb=p; // okay, D better inherit from B…
    pb=p; // huh? why this again?
  }

Isn’t the initialization “B* pb=p” enough? What does the second assignment bring to the picture?

Excellent question. Interestingly, here’s the exact text from the book, which contains an additional comment that was intended to answer this specific question (emphasis added):

  static void Constraints(D* p)
  {
    B* pb = p;
    pb = p; // suppress warnings about unused variables
  }

The reason for the redundant assignment is to try to shut up some compilers that issue warnings about unused variables. In a normal function, this is a helpful warning because, dollars to donuts, it’s a mistake and the programmer meant to use the variable but forgot or accidentally used some other variable instead. In this case, without the second line, pb is declared but never mentioned again, and by adding the extra line that mentions it one more time after its declaration, some compilers stop complaining.

Sometimes, however, we’re intentionally not using the variable, such as in this case where the only purpose of the function is to ensure that code that tries to convert a D* to a B* is compilable. (Note: There are other/better ways to achieve this. See the rest of Item 27 for superior approaches.)

However, in practice even the line “pb = p;” doesn’t shut compilers up very much, because many will still notice that now the second assigned value isn’t used, and will helpfully remind you that you probably didn’t mean to write an apparently useless “dead write” that is never read again. Most compilers offer a compiler-specific #pragma or other nonportable extension to silence this warning, but wouldn’t it be nice to have a portable way to suppress the warning that will work on all compilers?

Here’s a simple one-liner that works on all compilers I tried, and should not incur any overhead: Define an empty function template that takes any object by reference, but doesn’t actually do anything with it and so the empty function body should compile down to nothing. Let’s call it “ignore” – you can put it into your project namespace to avoid collisions on that common name.

  template<class T> void ignore( const T& ) { }

Note that ignore<T>’s parameter does not need a name; in fact, it had better not have a name or else you’ll get the same compiler warning as before about an unused variable, only in a new place.

Using ignore is easy. When you want to suppress the compiler warning, just write:

  static void Constraints(D* p)
  {
    B* pb = p;
    ignore(pb); // portably suppresses the warning
  }

Now all compilers I tried agree that the variable is “used” and stop complaining about pb.
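For anyone who wants to try this in isolation, here is a minimal sketch that puts the pieces together. The namespace util, the Base and Derived types, the shape of the IsDerivedFrom wrapper, and the ignore_demo function are all hypothetical names chosen for illustration; this is not the exact code from the book.

```cpp
#include <cassert>

namespace util {   // hypothetical project namespace, to avoid name collisions
    // Takes any object by reference-to-const and does nothing with it.
    template<class T> void ignore( const T& ) { }
}

struct Base { };
struct Derived : Base { };

// A constraints-function wrapper in the style the reader asked about.
template<class D, class B>
struct IsDerivedFrom {
    static void Constraints(D* p) {
        B* pb = p;          // compiles only if D* converts to B*
        util::ignore(pb);   // "uses" pb without a redundant dead write
    }
    IsDerivedFrom() {
        void (*fp)(D*) = Constraints;   // forces Constraints to be instantiated
        util::ignore(fp);
    }
};

// Returns true if everything above compiled and ran.
bool ignore_demo() {
    IsDerivedFrom<Derived, Base> check;   // compiles because Derived inherits from Base
    util::ignore(check);
    int answer = 42;
    util::ignore(answer);                 // works with an lvalue
    util::ignore(42);                     // and with an rvalue, thanks to const T&
    assert(answer == 42);                 // ignore leaves its argument untouched
    return true;
}
```

Whether a particular compiler stays quiet still depends on its warning settings, so treat this as something to check on your own toolchain rather than a guarantee.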

[Oct 20: Updated to add “const” on ignore<T>’s parameter to also handle rvalues.]

whois terry.crowley

Astute readers may have noticed that Terry Crowley’s name frequently crops up in the Acknowledgments section of my Effective Concurrency columns. Who is Terry? To answer, Mary-Jo Foley profiles him this week.

Effective Concurrency: Avoid Exposing Concurrency – Hide It Inside Synchronous Methods

This month’s Effective Concurrency column, Avoid Exposing Concurrency – Hide It Inside Synchronous Methods, is now live on DDJ’s website.

From the article:

You have a mass of existing code and want to add concurrency. Where do you start?

Let’s say you need to migrate existing code to take advantage of concurrent execution or scale on parallel hardware. In that case, you’ll probably find yourself in one of these two common situations, which are actually more similar than different:

Migrating an application: You’re an application developer, and you want to migrate your existing synchronous application to be able to benefit from concurrency.

Migrating a library or framework: You’re a developer on a team that produces a library or framework used by other teams or external customers, and you want to let the library take advantage of concurrency on behalf of the application without requiring application code rewrites.

You have a mountain of opportunities and obstacles before you. Where do you start?
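To give a flavor of what the title means, here is a minimal sketch of a synchronous function that uses concurrency internally. It is not taken from the article; the function name sum_all and the choice of std::async (from the C++0x library) are assumptions made for illustration.

```cpp
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical library function: callers see an ordinary synchronous call
// and never learn that a second thread was involved.
long long sum_all(const std::vector<int>& v) {
    const std::size_t mid = v.size() / 2;

    // Internally, the first half is summed on another thread...
    std::future<long long> first = std::async(std::launch::async, [&v, mid] {
        return std::accumulate(v.begin(), v.begin() + mid, 0LL);
    });

    // ...while the calling thread sums the second half.
    const long long second = std::accumulate(v.begin() + mid, v.end(), 0LL);

    // Join before returning, so no concurrency escapes the call.
    return first.get() + second;
}
```

Because the function waits for its helper before it returns, existing callers need no changes and cannot observe the work in progress.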

I hope you enjoy it. Finally, here are links to previous Effective Concurrency columns:

The Pillars of Concurrency (Aug 2007)

How Much Scalability Do You Have or Need? (Sep 2007)

Use Critical Sections (Preferably Locks) to Eliminate Races (Oct 2007)

Apply Critical Sections Consistently (Nov 2007)

Avoid Calling Unknown Code While Inside a Critical Section (Dec 2007)

Use Lock Hierarchies to Avoid Deadlock (Jan 2008)

Break Amdahl’s Law! (Feb 2008)

Going Superlinear (Mar 2008)

Super Linearity and the Bigger Machine (Apr 2008)

Interrupt Politely (May 2008)

Maximize Locality, Minimize Contention (Jun 2008)

Choose Concurrency-Friendly Data Structures (Jul 2008)

The Many Faces of Deadlock (Aug 2008)

Lock-Free Code: A False Sense of Security (Sep 2008)

Writing Lock-Free Code: A Corrected Queue (Oct 2008)

Writing a Generalized Concurrent Queue (Nov 2008)

Understanding Parallel Performance (Dec 2008)

Measuring Parallel Performance: Optimizing a Concurrent Queue (Jan 2009)

volatile vs. volatile (Feb 2009)

Sharing Is the Root of All Contention (Mar 2009)

Use Threads Correctly = Isolation + Asynchronous Messages (Apr 2009)

Use Thread Pools Correctly: Keep Tasks Short and Nonblocking (Apr 2009)

Eliminate False Sharing (May 2009)

Break Up and Interleave Work to Keep Threads Responsive (Jun 2009)

The Power of “In Progress” (Jul 2009)

Design for Manycore Systems (Aug 2009)

Avoid Exposing Concurrency – Hide It Inside Synchronous Methods (Oct 2009)

“What’s the Best Way To Process a Pool of Work?”

“What’s the best way to process a pool of work?” is a recurring question. As usual, the answer is “it depends” because the optimal answer often depends on both the characteristics of the work itself and the constraints imposed by run-time system resources.

For example, I recently received the following email from reader Sören Meyer-Eppler, where the key was to avoid oversubscribing system resources (in this case, memory):

I have an application that has multiple threads processing work from a todo queue. I have no influence over what gets into the queue and in what order (it is fed externally by the user). A single work item from the queue may take anywhere between a couple of seconds to several hours of runtime and should not be interrupted while processing. Also, a single work item may consume between a couple of megabytes to around 2GBs of memory. The memory consumption is my problem. I’m running as a 64bit process on a 8GB machine with 8 parallel threads. If each of them hits a worst case work item at the same time I run out of memory. I’m wondering about the best way to work around this.

1. plan conservatively and run 4 threads only. The worst case shouldn’t be a problem anymore, but we waste a lot of parallelism, making the average case a lot slower.

2. make each thread check available memory (or rather total allocated memory by all threads) before starting with a new item. Only start when more than 2GB memory are left. Recheck periodically, hoping that other threads will finish their memory hogs and we may start eventually. Still dangerous if the check happens when all threads are just starting out with their allocations.

3. try to predict how much memory items from the queue will need (hard) and plan accordingly. We could reorder the queue (overriding user choice) or simply adjust the number of running worker threads.

4. more ideas?

I’m currently tending towards number 2 because it seems simple to implement and solve most cases. However, I’m still wondering what standard ways of handling situations like this exist? The operating system must do something very similar on a process level after all…

I replied:

I don’t have time to write a detailed answer right now, but also consider two queues (one for big tasks and one for small tasks), or having work items give a rough size estimate (possibly by doing an extra lightweight pass over the data up front).

May I post an extract of your mail on my blog? Then others may comment and provide useful hints.

He said yes, and so here it is for your consideration.

Note also this similar question that came up a few days ago on comp.programming.threads, but with different constraints: in that case, it was about avoiding idleness rather than avoiding oversubscription.
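As a starting point for discussion, here is a rough sketch that combines Sören’s option 2 with the size-estimate idea from my reply: each worker reserves its item’s estimated memory from a shared budget before it starts, and blocks until the reservation fits. The names MemoryBudget, WorkItem, estimated_bytes, and run_pool are hypothetical, the sketch assumes a usable estimate exists (which he notes is hard), and it does nothing about fairness, so a large item can wait a long time behind a stream of small ones.

```cpp
#include <algorithm>
#include <cstddef>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: a shared budget that workers reserve from
// before they start a work item.
class MemoryBudget {
public:
    explicit MemoryBudget(std::size_t total) : total_(total), available_(total) { }

    // Blocks until the request fits. A request larger than the whole budget
    // is clamped, so such an item still runs, but only by itself.
    std::size_t reserve(std::size_t bytes) {
        bytes = std::min(bytes, total_);
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [&] { return available_ >= bytes; });
        available_ -= bytes;
        return bytes;
    }

    void release(std::size_t bytes) {
        {
            std::lock_guard<std::mutex> lock(m_);
            available_ += bytes;
        }
        cv_.notify_all();   // wake waiters so they can recheck the budget
    }

private:
    const std::size_t total_;
    std::size_t available_;
    std::mutex m_;
    std::condition_variable cv_;
};

struct WorkItem { std::size_t estimated_bytes; };   // the estimate is an assumption

// Processes all items on `workers` threads and returns the peak number of
// bytes reserved at any one time, which never exceeds `budget_bytes`.
std::size_t run_pool(const std::vector<WorkItem>& items,
                     std::size_t budget_bytes, unsigned workers) {
    MemoryBudget budget(budget_bytes);
    std::mutex m;   // protects next, in_use, and peak
    std::size_t next = 0, in_use = 0, peak = 0;

    auto worker = [&] {
        for (;;) {
            std::size_t i;
            {
                std::lock_guard<std::mutex> lock(m);
                if (next == items.size()) return;   // queue drained
                i = next++;
            }
            const std::size_t got = budget.reserve(items[i].estimated_bytes);
            {
                std::lock_guard<std::mutex> lock(m);
                in_use += got;
                peak = std::max(peak, in_use);
            }
            // ... process the item here, uninterrupted ...
            {
                std::lock_guard<std::mutex> lock(m);
                in_use -= got;
            }
            budget.release(got);
        }
    };

    std::vector<std::thread> threads;
    for (unsigned t = 0; t < workers; ++t) threads.emplace_back(worker);
    for (auto& t : threads) t.join();
    return peak;
}
```

Unlike checking free memory directly, a reservation is taken atomically before any allocation happens, which avoids the race he worries about when all threads are just starting out with their allocations.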

When is a zero-length array okay?

I just received a reader email that asked about GotW #42:

You write "Non-Problem: Zero-Length Arrays Are Okay", but both 14882:2003 and N2914 "[dcl.array]" say "If the constant-expression (5.19) is present, it shall be an integral constant expression and its value shall be greater than zero.". Shall we assume that you overrule the standard? :-) Or am I missing something, like the meaning of "derived-declarator-type-list" (I can’t find it anywhere in 14882:2003)?

I thought the answer might be of interest to other people, so I’m posting it here.

First, this reader gets kudos for consulting not only the ISO C++ standard, but also the current C++0x working draft from June (paper N2914), to do research before asking the question. Good stuff.

He also gets a small penalty point, though, for reading only the subhead text that he quoted from GotW #42, and not the actual short passage the heading introduces. The article already contains the answer to his question, and that answer is still the same in the current C++0x draft all these years later:

From 5.3.4 [expr.new], paragraph 7:

When the value of the expression in a direct-new-declarator is zero, the allocation function is called to allocate an array with no elements. The pointer returned by the new-expression is non-null. [Note: If the library allocation function is called, the pointer returned is distinct from the pointer to any other object.]

So the reader is quoting from 8.3.4 [dcl.array], which governs non-heap-allocated arrays (e.g., T array[N];). In that case, a zero length is not allowed by the standard. You can’t rely on it in portable code because although it is allowed as an extension in some popular compilers (e.g., GNU gcc), it is treated as an error in others (e.g., Visual C++).

But the case being discussed in GotW #42 is dynamically allocating an array using the array form of new (e.g., new T[n] where n is zero), which is governed by 5.3.4 [expr.new]. Here, zero length is okay for the reasons given in GotW #42:

The result of "new T[0]" is just a pointer to an array with zero elements, and that pointer behaves just like any other result of "new T[n]" including the fact that you may not attempt to access more than n elements of the array… in this case, you may not attempt to access any elements at all, because there aren’t any.

"Well, if you can’t do anything with zero-length arrays (other than remember their address)," you may wonder, "why should they be allowed?" One important reason is that it makes it easier to write code that does dynamic array allocation. For example, the function f above would be needlessly more complex if it was forced to check the value of its n parameter before performing the "new T[n]" call.

To make GotW #42 completely unambiguous, it could more specifically say that zero-length heap-allocated arrays are okay, which was the case being discussed in the article. But it always helps to read more than the subhead text.
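Here is a small sketch of the difference between the two cases. The function sum_first is a hypothetical stand-in for the kind of function the GotW passage has in mind, not the f from the article.

```cpp
#include <cassert>
#include <cstddef>

// Allocates n ints, fills them with 1..n, and returns their sum.
// No special case is needed for n == 0.
int sum_first(std::size_t n) {
    int* a = new int[n];    // governed by 5.3.4 [expr.new]; fine when n is zero
    assert(a != nullptr);   // a zero-length new[] still returns a non-null pointer
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) {   // the loop body never runs when n is zero
        a[i] = static_cast<int>(i) + 1;
        sum += a[i];
    }
    delete[] a;             // the zero-length array must still be deleted
    return sum;
}

// By contrast, this declaration is governed by 8.3.4 [dcl.array] and is
// ill-formed in standard C++, although some compilers accept it as an extension:
//
//   int bad[0];
```

Calling sum_first(0) allocates, skips the loop, deletes, and returns zero, with no check on n anywhere in the function.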