Server Concurrency != Client Concurrency

Today I received an email that asked:

I have recently come across your excellent articles on concurrency and the changes in software writing paradigm. They make a lot of sense, but I am having trouble translating them to my world of Telecom oriented web services, where practically everything is run through a DBMS. It seems to me we get everything “free”, simply by using an inherently concurrent multi-everything beast such as that :-) .

Could you please share your thoughts on the issue in one of your coming blog entries? It seems to me nowadays most complex systems would take advantage of a DBMS, certainly any application that is internet based, telecom oriented, or enterprise level. Be it in C++, Java, or PHP and its ilk, using a DBMS – often as a sort of message queue – is one of the best practices that ensures parallelism.

imageSure. At right is a slide I give in talks that summarizes the answer to this question, and I’ve addressed this and other similar issues in an ACM Queue article.

The problem with taking advantage of multicore/manycore hardware isn’t (as much) on the server, it’s on the client. When experienced people say things like, “but the concurrency problem is already solved, we’ve been building scalable software for years,” that’s server or niche-client application people talking. That kind of laid-back sound bite sure isn’t coming from mainstream client application developers.

Typical Server Workloads

On the server:

  • Workloads typically already have lots of inherent concurrency (000s or 000,000s of incoming requests for web/dbms/etc. operations), and it’s easy to launch independent requests concurrently.
  • Shared data is typically inside highly structured relational databases where we have decades of experience with automatic concurrency control. The DBMS itself knows how to optimistically run transactions in parallel and control conflicts by escalating row locks to page locks, index locks to table locks, and so on.
  • The programming model is typically transactions, and all the programmer has to know about is to write “begin transaction; /* … read/write whatever stuff you feel the need to, then … */ end transaction;” which is about as sweet as it gets.

We already know how to build somewhat scalable server apps. Sure, it’s still rocket science and takes expert knowledge to do well. But we generally know the rocket science, have experts who can implement it with repeatable success, and have regularly scheduled missions to the “scalable servers” space station. With some care, we can say that the “concurrency problem is already solved” here.

Typical Client Workloads

The world is very different for typical mainstream client applications (i.e., I’m not talking about Photoshop and a handful of others), where:

  • Workloads don’t have lots of inherent concurrency. The user clicks one button, and we have to figure out how to divide the work and recombine it in order to get the answer faster on many cores.
  • Shared data is typically in unstructured pointer-chasing graphs of objects in shared memory that require explicit concurrency control. Note that “unstructured” doesn’t mean there’s no structure — of course there’s some — but it’s a gloriously diverse pile of objects and containers, and more like an organically growing shantytown with unplanned twisty little alleys and passages, than the nice rectangular downtown city street plan of a nice rectangular database table.
  • The programming model to protect shared data is to use error-prone explicit locks. You have to remember which locks protect which data, and not to acquire them in inconsistent and deadlock-prone orders.

We’re still discovering and productizing the rocket science here. You could say that the tools like OpenMP that we do have now are still at the V-2 stage — they have limited applicability, are somewhat fussy, and don’t always land where you aim them.

But we’re working on it. Up-and-coming tools like Threading Building Blocks are like the Mercury and Venera missions, setting out to reach successively higher goals and repeatability… and we’re starting to see what are perhaps Apollo– and ISS-class missions in the form of PLINQ, the Task Parallel Library, and one for native C++ we’ll be announcing in October at PDC. In part, these tools are trying to see how much we can make client workloads look more like server workloads, notably in providing a transaction-oriented programming model. For example, transactional memory is an area of active research that would let us write “begin transaction; /* … read/write whatever memory variables you feel the need to, then … */ end transaction;”, and if successful it could eventually replace many or even most existing uses of locks.

We have rightly celebrated some successful ‘manned’ flights with client products like Photoshop (parallel rendering) and Excel (parallel recalc) that scale to a number of cores. We’re on the road to, but still working toward, establishing the infrastructure and technology base to enable regularly scheduled commercial flights/shipments of scalable client applications that “light up” on multicore/manycore machines.

8 thoughts on “Server Concurrency != Client Concurrency

  1. Dear Herb,

    you are right regarding the much greater difficulties in making client code concurrent. You cite “library” or “language” approaches to the problem.

    I think that a “pattern” approach would also be very much desirable. There are a few common ways of parallelizing sequential code. Maybe starting to collect these approaches from practical experiences, abstracting patterns, and publishing them is the most correct approach at this stage.

    Can you cite anybody that is following this approach?

  2. I think Herb’s working on a new book “Effective Concurrency”. I hope he’ll cover the topics you mentioned.

  3. I’m not convinced this is only a client side problem. Although its easy to put many users through a server side app that means there is a limited amount of processing that can be done for anyone user. We likely have the same single threaded performance for the next 5 years.

    On that basis the most complex an app can be is set to about 0.5 seconds of processing (web app). What a frightening thought.

    Scalability in data is still a big problem, the relational database is in need of a lot more parrallel processing. We’re not done yet needing faster algorithms, we could certainly use them.

  4. I agree with the above comments, but client side concurrency more important than server-side. I can’t see this, with a large amount of client-side code being developed for either the browser or flex that are single-threaded languages, I’ve never heard anyone complaining!

    On the server-side for transaction true the server does automatically use the processing power when multiple users are accessing it.

    The reality how many people write applications that care about client-side performance like GIS, Game Development, Graphics packages? The sad fact is most of us write bespoke business applications.

    In finance the biggest problem is long batch runs that can be solved by using concurrency. Nobody really cares if it takes a few seconds for a transaction to go through, but they do start caring when the nightly process runs contract by contract and starts to creap into the start of the new working day.

  5. i’ve not much experience in client app concurrency and admittedly I’m not very interested in it either. But I love your writing and the anology you’ve given and read the post the whole way through..

    great work.

  6. Nice article! A couple of comments:

    There’s another way to think about the issue you call “inherent concurrency”. Typical server jobs (such as database systems) are often measured by total throughput (eg transactions per second) whereas many client applications are sensitive to response time in order to support interactive behavior. This difference can lead to very different strategies when considering how to extract parallel performance.

    Also, you point out that locks are error-prone, but there is another serious problem with locks – programming with explicit locks can often serialize a program, sometimes in subtle ways. This results in losing the benefits of parallelism as threads line up to acquire the lock one at a time.

    Another alternative to explicit-lock programming, in addition to the transactional model you describe, is the “hyperobject” concept that our company (Cilk Arts) describes at http://www.cilk.com/multicore-products/cilk-hyperobjects.

Comments are closed.