Sat 26 - Thu 31 October 2013 Indianapolis, United States

Efficient multicore programming demands fundamental data structures that support a high degree of concurrency. Existing research on non-blocking data structures promises to satisfy such demands by providing progress guarantees that allow a significant increase in parallelism while avoiding the safety hazards of lock-based synchronization. It is well acknowledged that non-blocking containers can bring significant performance benefits to applications whose shared data experience heavy contention. However, the practical implications of integrating these data structures into real-world applications are not well understood. In this paper, we study the effective use of non-blocking data structures in a data deduplication application that performs a large number of concurrent compression operations on a data stream using the pipeline-parallel processing model. We present our experience of manually refactoring the application from conventional lock-based synchronization mechanisms to a wait-free hash map and a set of lock-free queues in order to boost its degree of concurrency. Our experimental study explores the performance trade-offs of parallelization mechanisms that rely on (a) traditional blocking techniques, (b) fine-grained mutual exclusion, and (c) lock-free and wait-free synchronization.