Description |
Precision medicine can provide breakthroughs in current medical technology and treatment. The ability to prescribe a specific cure for a specific disease in a specific person can greatly reduce the cost of medical treatments, reduce mistakes in the medical field, and increase the effectiveness of medical treatments. Genetic information is one of the biggest sources of information for precision medicine, and extracting and understanding this DNA can be a lengthy and difficult task. DNA sequence alignment is one of the most time-consuming and computationally intensive parts of precision medicine. Hash-based DNA aligners are currently the fastest and most efficient aligners in their field. This work explores the common pipeline structure of hash-based aligners (seed selection, filtering, and edit distance) to construct a new sequence alignment pipeline. The new pipeline takes advantage of large modern memory systems and the built-in vector instructions of modern CPU cores, and explores the possibility of using a heterogeneous memory system and near-data processing to accelerate the pipeline further. With these software and hardware innovations, a speedup of 8× and reduced false negative rates are achieved over the traditional sequence alignment pipeline for a range of datasets, without the need to create a specialized accelerator chip. |