Performance optimization in DNA short-read alignment

Richard Wilton, Alexander S. Szalay

Research output: Contribution to journal › Article › peer-review

Abstract

Summary: Over the past decade, short-read sequence alignment has become a mature technology. Optimized algorithms, careful software engineering and high-speed hardware have contributed to greatly increased throughput and accuracy. With these improvements, many opportunities for performance optimization have emerged. In this review, we examine three general-purpose short-read alignment tools - BWA-MEM, Bowtie 2 and Arioc - with a focus on performance optimization. We analyze the performance-related behavior of the algorithms and heuristics each tool implements, with the goal of arriving at practical methods of improving processing speed and accuracy. We indicate where an aligner's default behavior may result in suboptimal performance, explore the effects of computational constraints such as end-to-end mapping and alignment scoring threshold, and discuss sources of imprecision in the computation of alignment scores and mapping quality. With this perspective, we describe an approach to tuning short-read aligner performance to meet specific data-analysis and throughput requirements while avoiding potential inaccuracies in subsequent analysis of alignment results. Finally, we illustrate how this approach avoids easily overlooked pitfalls and leads to verifiable improvements in alignment speed and accuracy.

Original language: English (US)
Pages (from-to): 2081-2087
Number of pages: 7
Journal: Bioinformatics
Volume: 38
Issue number: 8
State: Published - Apr 15 2022

ASJC Scopus subject areas

  • Computational Mathematics
  • Molecular Biology
  • Biochemistry
  • Statistics and Probability
  • Computer Science Applications
  • Computational Theory and Mathematics
