REFRESH Bioinformatics Group

Kmer-db

Kmer-db—What is it?

Kmer-db is a tool for representing the k-mers found in the genomes of many species in a single compact database. Its compressed structure supports fast queries of various types.
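To make the idea of a shared k-mer collection concrete, the sketch below builds a toy index that maps each k-mer to the samples containing it, then counts the k-mers shared by each pair of samples. It is only an illustration under simplifying assumptions: it uses plain Python dictionaries and sets, ignores reverse complements, and does not reflect the compressed structure Kmer-db actually uses. The function names, sample names, and sequences are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def kmers(sequence, k):
    """Return the set of all length-k substrings of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def build_index(samples, k):
    """Map each k-mer to the names of the samples that contain it."""
    index = defaultdict(set)
    for name, sequence in samples.items():
        for kmer in kmers(sequence, k):
            index[kmer].add(name)
    return dict(index)


def shared_counts(index):
    """Count the k-mers shared by each pair of samples."""
    counts = defaultdict(int)
    for names in index.values():
        # Every pair of samples listed under this k-mer shares it.
        for a, b in combinations(sorted(names), 2):
            counts[(a, b)] += 1
    return dict(counts)


def jaccard(shared, size_a, size_b):
    """Jaccard index of two k-mer sets, given their sizes and overlap."""
    union = size_a + size_b - shared
    return shared / union if union else 0.0


# Hypothetical toy data, chosen only to keep the example small.
samples = {"s1": "ACGTACGT", "s2": "ACGTTCGT", "s3": "GGGGCCCC"}
K = 3

index = build_index(samples, K)
shared = shared_counts(index)
sizes = {name: len(kmers(seq, K)) for name, seq in samples.items()}

for (a, b), n in shared.items():
    print(a, b, n, jaccard(n, sizes[a], sizes[b]))
```

Pairs with no k-mers in common (here, s3 with either of the others) simply never appear in the counts. A similarity such as the Jaccard index is one common starting point for k-mer based distance estimates.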

How good is Kmer-db?

We were able to compress the collection of k-mers from 40,715 pathogen genomes to tens of GBs. We were then able to estimate the evolutionary distances between them in a few hours on a modern workstation. More details can be found in our paper listed below.

Terms of use of Kmer-db

Kmer-db is, in general, a free program released as source code. More details can be found on the download page.

Publications

+ Deorowicz, S., Gudys, A., Dlugosz, M., Kokot, M., Danek, A., Kmer-db: instant evolutionary distance estimation, Bioinformatics, 2018.

Summary: Kmer-db is a new tool for estimating evolutionary relationship on the basis of k-mers extracted from genomes or sequencing reads. Thanks to an efficient data structure and parallel implementation, our software estimates distances between 40,715 pathogens in <7 min (on a modern workstation), 26 times faster than Mash, its main competitor.
Availability and implementation: https://github.com/refresh-bio/kmer-db.