Yuhan Peng, Accelerating Cloud Based Genomics Pipeline Using Graphics Processors

Slides

Since its introduction in 2009, the Burrows-Wheeler Aligner (BWA) has become one of the most commonly used sequence alignment tools in modern genomics pipelines. The aligner uses the Burrows-Wheeler transform (BWT) to perform exact matching quickly and with a small memory footprint. However, because the extension step runs in quadratic time, BWA takes a long time on large inputs such as the human genome, and the need to accelerate it is becoming increasingly critical. The existing cloud-based implementation adopts a map-reduce technique to distribute the aligner across nodes. However, there are still opportunities to further parallelize the computation on each node. In this project, we propose a parallelization algorithm for the Smith-Waterman kernel and use GPGPU to accelerate the computation on each node. The performance results show that our OpenCL kernel achieves a 20X speedup over a single-threaded CPU and up to a 5X speedup over a 12-threaded CPU. While optimizing the OpenCL kernel, we also identified certain transformations that current automatic code generation tools miss. These transformations may be supported by future code generation tools.
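To make the quadratic cost and the opportunity for parallelism concrete, here is a minimal sketch of Smith-Waterman scoring in Python. It is not the algorithm proposed in the talk, which the abstract does not describe. It shows the textbook anti-diagonal (wavefront) ordering, in which every cell on the same anti-diagonal depends only on earlier anti-diagonals and could therefore be assigned to a separate GPU work-item. The function names, the linear gap penalty, and the scoring values are assumptions chosen for illustration. BWA itself uses a banded, affine-gap variant.

```python
def sw_score_rowwise(a, b, match=2, mismatch=-1, gap=-1):
    """Reference version: fill the (len(a)+1) x (len(b)+1) table row by row."""
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,     # gap in b
                          H[i][j - 1] + gap)     # gap in a
            best = max(best, H[i][j])
    return best


def sw_score_wavefront(a, b, match=2, mismatch=-1, gap=-1):
    """Same recurrence, visited one anti-diagonal (i + j = d) at a time.

    Cells on one anti-diagonal read only anti-diagonals d-1 and d-2, so the
    inner loop carries no dependencies between its iterations. On a GPU each
    iteration could be one work-item; here it is run sequentially.
    """
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    for d in range(2, n + m + 1):
        lo = max(1, d - m)          # keeps j = d - i <= m
        hi = min(n, d - 1)          # keeps j = d - i >= 1
        for i in range(lo, hi + 1):
            j = d - i
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best


if __name__ == "__main__":
    x, y = "ACACACTA", "AGCACACA"
    # Both orderings fill the same table, so the scores agree.
    print(sw_score_rowwise(x, y) == sw_score_wavefront(x, y))
```

Both versions do the same O(nm) work. The wavefront ordering only exposes that up to min(n, m) cells can be computed at once, which is one common starting point for GPU implementations of this kernel.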