Someone emailed me today asking:
I’m writing because I’m somewhat conscious of what I would consider a rather large hole in the parallel programming literature.
… What if one or more of your tasks throws an exception? Should the thread that runs the task swallow it? Should the caught exceptions get stashed somewhere so that the "parent" thread can deal with them once the tasks are complete? (This is somewhat tricky currently in a language such as C++(98) where one cannot store an exception caught with the "catch(…)" construct). Perhaps all tasks should have a no-throw guarantee? Perhaps some kind of asynchronous error handlers might be installed, somewhat like POSIX signals? The options are many, but choosing a strategy is hard for those of us with little parallel programming experience.
I thought I’d share my response here:
That’s an excellent question. Someone asked that very question in Stockholm last month at my Effective Concurrency course, and my answer started out somewhat dismissive: "Well, it’s about the same as you do in sequential code, and all the same guarantees apply; nothrow/nofail is only for a few key functions used for commit/rollback operations, and you’d usually target the basic guarantee unless adding the strong guarantee comes along naturally for near-free. So it’s pretty much the same as always. Although, well, of course futures may transport exceptions across threads, but that’s still the same because they manifest on .get(). And of course for parallel loops you may get multiple concurrent exceptions from multiple concurrent loop bodies that get aggregated into a single exception; and then there’s the question of whether you start new loop bodies that haven’t started yet (usually no) but do you interrupt loop bodies that are in progress (probably not), and… oh, hmm, yeah, I guess it would be good to write an article about that."
So the above is now adding to my notes of things to write about. :-) Maybe some of that stream-of-consciousness may be helpful until I can get to writing it up in more detail.
I pointed him to Doug Lea’s Concurrent Programming in Java pages 161-176, "Dealing with Failure", adding that I haven’t read it in detail but the subtopics look right. Also Joe Duffy’s Concurrent Programming on Windows, pages 721-733.
If you know of a good standalone treatise focused on error handling in concurrent code, please mention it in the comments.