REFRESH Bioinformatics Group

Whisper

Whisper—What is it?

Whisper is a new algorithm for mapping whole genome sequencing (WGS) data. For Illumina WGS data it is several times faster than BWA-MEM while preserving similar variant calling quality.

How good is Whisper?

Whisper maps reads from whole genome sequencing a few times faster than BWA-MEM, with comparable mapping quality.

Terms of use of Whisper

Whisper is a free program distributed as source code. More details can be found on the download page.

Publications

+ Deorowicz, S., Debudaj-Grabysz, A., Gudys, A., Grabowski, Sz., Whisper: read sorting allows robust mapping of DNA sequencing data, Bioinformatics, 2018; Abstract.

Motivation: Mapping reads to a reference genome is often the first step in a sequencing data analysis pipeline. The reduction of sequencing costs implies a need for algorithms able to process increasing amounts of generated data in reasonable time.
Results: We present Whisper, an accurate and high-performant mapping tool, based on the idea of sorting reads and then mapping them against suffix arrays for the reference genome and its reverse complement. Employing task and data parallelism as well as storing temporary data on disk result in superior time efficiency at reasonable memory requirements. Whisper excels at large NGS read collections, in particular Illumina reads with typical WGS coverage. The experiments with real data indicate that our solution works in about 15% of the time needed by the well-known BWA-MEM and Bowtie2 tools at a comparable accuracy, validated in a variant calling pipeline.
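The core idea in the Results paragraph (sort the reads, then look them up in suffix arrays of the reference and of its reverse complement) can be illustrated with a toy sketch. This is not Whisper's implementation: it handles exact matches only, builds the suffix arrays naively in memory, and every function name below is hypothetical. It shows just one possible benefit of sorting, under the assumption that reads are compared lexicographically: the lower bound of each search can only move forward, so each binary search can start where the previous one ended.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def build_suffix_array(text):
    """Naive suffix array construction; adequate only for toy inputs."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def lower_bound(text, sa, pattern, lo=0):
    """Index of the first suffix whose prefix is >= pattern, searching from lo."""
    hi = len(sa)
    m = len(pattern)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo


def map_reads(reference, reads):
    """Exact-match mapping of non-empty reads to both strands (illustration only).

    Returns {read: [(strand, 0-based position on the forward reference), ...]}.
    """
    strands = {"+": reference, "-": reverse_complement(reference)}
    hits = {read: [] for read in reads}
    sorted_reads = sorted(set(reads))
    for strand, text in strands.items():
        sa = build_suffix_array(text)
        lo = 0  # reads are sorted, so the lower bound never moves backwards
        for read in sorted_reads:
            lo = lower_bound(text, sa, read, lo)
            i = lo
            while i < len(sa) and text[sa[i]:sa[i] + len(read)] == read:
                pos = sa[i]
                if strand == "-":
                    # convert to a coordinate on the forward reference
                    pos = len(reference) - pos - len(read)
                hits[read].append((strand, pos))
                i += 1
    for positions in hits.values():
        positions.sort()
    return hits
```

For example, `map_reads("GATTACAGG", ["TTAC", "TGTA", "AAAA"])` reports `TTAC` on the forward strand at position 2, `TGTA` on the reverse strand at forward position 3, and no hit for `AAAA`. A real mapper must also tolerate mismatches and indels and handle paired reads, which this sketch does not attempt.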

+ Deorowicz, S., Debudaj-Grabysz, A., Gudys, A., Grabowski, Sz., Robust mapping of whole genome sequencing data, Poster at The Biology of Genomes Conference, 2017; Abstract.

Mapping reads generated by sequencing platforms to reference genomes is a crucial step affecting all downstream procedures like variant calling or expression analysis. We present a new mapping algorithm Whisper, especially suited for whole genome sequencing. It is characterized by:

  • superior quality results (comparable to BWA-MEM, a state-of-the-art mapping software),
  • excellent speed (it maps several times faster than BWA-MEM),
  • reasonable memory requirements (a few GB of RAM),
  • support for variable-length reads (from tens to hundreds of bases),
  • useful features like direct support of gzipped input and output.
All these make Whisper an interesting alternative to existing mapping packages.
The software is available at http://sun.aei.polsl.pl/REFRESH/whisper under a free license.