Bill Schere, High-performance Synchronization for User-level Software Systems

Concurrency in user applications is on the rise. Modern computers have multiple hardware threads per processor and multiple processors per chip, each of which may switch to a different software thread many times per second. Applications of the future will be havily multithreaded.

Traditional lock-based synchronization is prone to deadlock, non-composability, priority inversion, convoying, and intolerance of thread failure, preemption, and eveven page faults. Nonblocking algorithms avoid these limitations by ensuring that the delay or failure of a thread never prevents the system as a whole from making forward progress. For user-level software systems in particular, obtaining high performance mandates avoiding delays due to attempted synchronization with a preempted thread.

We broaden the range of known nonblocking algorithms by introducing a design methodology, dual data structures, that supports partial operations in concurrent objects with standard linearizability theory. In this talk we will introduce our improved SynchronousQueue class that uses this technology to improve upon its predecessor's performance by up to a factor of 14. This object appears in Java 6, and forms a core part of the ThreadPoolExecutor mechanism upon which many multi-threaded applications rely.