September 2009 – Sutter’s Mill

“What’s the best way to process a pool of work?” is a recurring question. As usual, the answer is “it depends” because the optimal answer often depends on both the characteristics of the work itself and the constraints imposed by run-time system resources.

For example, I recently received the following email from reader Sören Meyer-Eppler, where the key was to avoid oversubscribing system resources (in this case, memory):

I have an application that has multiple threads processing work from a todo queue. I have no influence over what gets into the queue and in what order (it is fed externally by the user). A single work item from the queue may take anywhere between a couple of seconds to several hours of runtime and should not be interrupted while processing. Also, a single work item may consume between a couple of megabytes to around 2GBs of memory. The memory consumption is my problem. I’m running as a 64bit process on a 8GB machine with 8 parallel threads. If each of them hits a worst case work item at the same time I run out of memory. I’m wondering about the best way to work around this.

1. plan conservatively and run 4 threads only. The worst case shouldn’t be a problem anymore, but we waste a lot of parallelism, making the average case a lot slower.

2. make each thread check available memory (or rather total allocated memory by all threads) before starting with a new item. Only start when more than 2GB memory are left. Recheck periodically, hoping that other threads will finish their memory hogs and we may start eventually. Still dangerous if the check happens when all threads are just starting out with their allocations.

3. try to predict how much memory items from the queue will need (hard) and plan accordingly. We could reorder the queue (overriding user choice) or simply adjust the number of running worker threads.

4. more ideas?

I’m currently tending towards number 2 because it seems simple to implement and solve most cases. However, I’m still wondering what standard ways of handling situations like this exist? The operating system must do something very similar on a process level after all…

I replied:

I don’t have time to write a detailed answer right now, but also consider two queues (one for big tasks and one for small tasks), or having work items give a rough size estimate (possibly by doing an extra lightweight pass over the data up front).

May I post an extract of your mail on my blog? Then others may comment and provide useful hints.

He said yes, and so here it is for your consideration.

Note also this similar question that came up a few days ago on comp.programming threads, but with different constraints — in that case, it was about avoiding idleness rather than avoiding oversubscription.

I just received a reader email that asked about GotW #42:

You write "Non-Problem: Zero-Length Arrays Are Okay", but both 14882:2003 and N2914 "[dcl.array]" say "If the constant-expression (5.19) is present, it shall be an integral constant expression and its value shall be greater than zero.". Shall we assume that you overrule the standard? :-) Or am I missing something, like the meaning of "derived-declarator-type-list" (I can’t find it anywhere in 14882:2003)?

I thought the answer might be of interest to other people, so I’m posting it here.

First, this reader gets kudos for consulting not only the ISO C++ standard, but also the current C++0x working draft from June (paper N2914), to do research before asking the question. Good stuff.

He also gets a small penalty point, though, for reading only the subhead text that he quoted from GotW #42, and not the actual short passage the heading introduces. The article already contains the answer to his qustion, and that answer is still the same in the current C++0x draft all these years later:

From 5.3.4 [expr.new], paragraph 7:

When the value of the expression in a direct-new-declarator is zero, the allocation function is called to allocate an array with no elements. The pointer returned by the new-expression is non-null. [Note: If the library allocation function is called, the pointer returned is distinct from the pointer to any other object.]

So the reader is quoting from 8.3.4 [dcl.array], which governs non-heap allocated arrays (e.g., T array[N];). In that case, a zero length is not allowed by the standard. You can’t rely on it in portable code because although it is allowed as an extension in some popular compilers (e.g., Gnu gcc) it is treated as an error in others (e.g., Visual C++).

But the case being discussed in GotW #42 is dynamically allocating an array using the array form of new (e.g., new T[n] where n is zero), which is governed by 5.3.4 [expr.new]. Here, zero length is okay for the reasons given in GotW #42:

The result of "new T[0]" is just a pointer to an array with zero elements, and that pointer behaves just like any other result of "new T[n]" including the fact that you may not attempt to access more than n elements of the array… in this case, you may not attempt to access any elements at all, because there aren’t any.

"Well, if you can’t do anything with zero-length arrays (other than remember their address)," you may wonder, "why should they be allowed?" One important reason is that it makes it easier to write code that does dynamic array allocation. For example, the function f above would be needlessly more complex if it was forced to check the value of its n parameter before performing the "new T[n]" call.

To make GotW #42 completely unambiguous, it could more specifically say that zero-length heap-allocated arrays are okay, which was the case being discussed in the article. But it always helps to read more than the subhead text.

Month: September 2009

“What’s the Best Way To Process a Pool of Work?”

When is a zero-length array okay?