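An alignment tool for third-generation sequencing data, with some parallelization optimizations added on top of the original program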
 
 
 
 
 
 
Heng Li 9823317e8f r158: optionally ignore base quality 2017-07-05 18:23:50 -04:00
test added test data 2017-07-04 10:56:48 -04:00
.gitignore make sdust working with kalloc 2017-04-06 15:51:36 -04:00
LICENSE.txt added license 2017-07-01 11:39:19 -04:00
Makefile r148: revamped regs handling after cigar 2017-07-03 10:44:26 -04:00
README.md for clarity 2017-07-04 11:56:50 -04:00
align.c r148: revamped regs handling after cigar 2017-07-03 10:44:26 -04:00
bseq.c r131: wrong EOF test; make mb_size <= batch_size 2017-07-01 09:26:09 -04:00
bseq.h r129: fixed memory leak caused by qualities 2017-06-30 23:48:00 -04:00
chain.c finished the first draft of manpage 2017-07-01 11:25:54 -04:00
format.c r121: output QUAL and unmapped to SAM 2017-06-30 14:40:54 -04:00
hit.c r153: sam primary record not set sometimes 2017-07-03 13:18:57 -04:00
index.c r131: wrong EOF test; make mb_size <= batch_size 2017-07-01 09:26:09 -04:00
kalloc.c number of allocated units must be an even number 2017-06-26 23:15:30 -04:00
kalloc.h Homopolymer-compressed k-mer sketch 2017-04-06 15:37:34 -04:00
kdq.h make sdust working with kalloc 2017-04-06 15:51:36 -04:00
khash.h index can be compiled; not tested yet 2017-04-07 15:30:30 -04:00
kseq.h Homopolymer-compressed k-mer sketch 2017-04-06 15:37:34 -04:00
ksort.h chaining 2017-05-03 20:47:29 +08:00
ksw2.h backup 2017-06-29 12:58:52 -04:00
ksw2_extz2_sse.c backup 2017-06-29 12:58:52 -04:00
kthread.c index can be compiled; not tested yet 2017-04-07 15:30:30 -04:00
kthread.h index can be compiled; not tested yet 2017-04-07 15:30:30 -04:00
kvec.h Homopolymer-compressed k-mer sketch 2017-04-06 15:37:34 -04:00
main.c r158: optionally ignore base quality 2017-07-05 18:23:50 -04:00
map.c r158: optionally ignore base quality 2017-07-05 18:23:50 -04:00
minimap.h r158: optionally ignore base quality 2017-07-05 18:23:50 -04:00
minimap2.1 r158: optionally ignore base quality 2017-07-05 18:23:50 -04:00
misc.c r149: introduced debugging flags on CLI 2017-07-03 11:02:32 -04:00
mmpriv.h r153: sam primary record not set sometimes 2017-07-03 13:18:57 -04:00
sdust.c fixed memory leaks 2017-06-06 21:45:55 -04:00
sdust.h fixed a few memory leaks 2017-04-13 23:05:19 -04:00
sketch.c wrong HPC 2017-05-04 17:45:19 +08:00

README.md

Getting Started

git clone https://github.com/lh3/minimap2
cd minimap2 && make
# long reads against a reference genome
./minimap2 -ax map10k test/MT-human.fa test/MT-orang.fa > test.sam
# create an index first and then map
./minimap2 -x map10k -d MT-human.mmi test/MT-human.fa
./minimap2 -ax map10k MT-human.mmi test/MT-orang.fa > test.sam
# long-read overlap (no test data)
./minimap2 -x ava10k your-reads.fa your-reads.fa > overlaps.paf
# man page
man ./minimap2.1

Introduction

Minimap2 is a fast sequence mapping and alignment program that can find overlaps between long noisy reads, or map long reads or their assemblies to a reference genome, optionally with detailed alignment (i.e. CIGAR). At present, it works efficiently with query sequences from a few kilobases to ~100 megabases in length at an error rate of ~15%. Minimap2 outputs in the PAF or the SAM format. On limited test data sets, minimap2 is over 20 times faster than most other long-read aligners.
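For readers who want to post-process the PAF output, here is a minimal Python sketch of a record parser. It is not part of minimap2. It assumes the 12 mandatory tab-separated PAF columns and ignores any optional tags that follow; the names `PafRecord` and `parse_paf_line` are hypothetical and chosen for illustration only.

```python
from typing import NamedTuple


class PafRecord(NamedTuple):
    """The 12 mandatory PAF columns (field names are illustrative)."""
    query_name: str
    query_len: int
    query_start: int   # 0-based, inclusive
    query_end: int     # 0-based, exclusive
    strand: str        # '+' or '-'
    target_name: str
    target_len: int
    target_start: int  # 0-based, inclusive
    target_end: int    # 0-based, exclusive
    n_match: int       # number of matching bases
    block_len: int     # alignment block length
    mapq: int          # mapping quality, 255 if missing


def parse_paf_line(line: str) -> PafRecord:
    """Parse one PAF line; optional tags after column 12 are ignored."""
    f = line.rstrip("\n").split("\t")
    if len(f) < 12:
        raise ValueError("expected at least 12 tab-separated columns, got %d" % len(f))
    return PafRecord(
        f[0], int(f[1]), int(f[2]), int(f[3]), f[4],
        f[5], int(f[6]), int(f[7]), int(f[8]),
        int(f[9]), int(f[10]), int(f[11]),
    )
```

A typical use would be to loop over the lines of `overlaps.paf` and filter on fields such as `mapq` or `block_len`.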

Minimap2 is the successor of minimap. It uses a similar minimizer-based indexing and seeding algorithm, and improves on the original minimap with homopolymer-compressed k-mers (see also SMARTdenovo and longISLND), better chaining and the ability to produce CIGAR with fast extension alignment (see also libgaba and ksw2).
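To make the two seeding ideas concrete, here is a small Python sketch of homopolymer compression followed by window minimizers. It is a simplification for illustration, not minimap2's implementation: it orders k-mers lexicographically and looks at one strand only, whereas a real sketcher would normally hash k-mers and handle both strands. The function names `hpc` and `minimizers` are hypothetical.

```python
from typing import List, Tuple


def hpc(seq: str) -> str:
    """Homopolymer-compress: collapse each run of identical bases to one base."""
    out: List[str] = []
    for base in seq:
        if not out or out[-1] != base:
            out.append(base)
    return "".join(out)


def minimizers(seq: str, k: int, w: int) -> List[Tuple[str, int]]:
    """Return (k-mer, position) pairs: the smallest k-mer in every window
    of w consecutive k-mers, with ties broken by the leftmost position."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked: List[Tuple[str, int]] = []
    for start in range(len(kmers) - w + 1):
        window = kmers[start:start + w]
        j = min(range(w), key=lambda t: (window[t], t))
        hit = (window[j], start + j)
        # Adjacent windows often share a minimizer; record it only once.
        if not picked or picked[-1] != hit:
            picked.append(hit)
    return picked


compressed = hpc("GATTTACCA")                 # "GATACA"
sketch = minimizers(compressed, k=3, w=2)     # [("ATA", 1), ("ACA", 3)]
```

Note that the positions returned here are coordinates in the compressed sequence; mapping them back to the original sequence would need extra bookkeeping that this sketch omits.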

Limitations

At the alignment phase, minimap2 performs global alignments between minimizer hits. If the positions of these minimizer hits are incorrect, the final alignment may be suboptimal or broken due to the Z-drop heuristics. In addition, in the event of a long insertion/deletion, minimap2 may split the long event into a few smaller events. We will address these issues in the future.
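The effect of a Z-drop style cutoff can be illustrated with a one-dimensional Python sketch. This is an assumption-laden simplification, not the check minimap2 performs inside its dynamic-programming code: it walks along a list of per-position score changes and stops once the running score falls more than `zdrop` below the best score seen so far. The name `extend_with_zdrop` and the example scores are hypothetical.

```python
from typing import Sequence, Tuple


def extend_with_zdrop(deltas: Sequence[int], zdrop: int) -> Tuple[int, int, bool]:
    """Accumulate score changes left to right.

    Returns (best_score, best_end, dropped), where best_end is the number
    of positions included when the best score was reached and dropped says
    whether the extension was cut short by the Z-drop test.
    """
    score = 0
    best_score = 0
    best_end = 0
    for i, delta in enumerate(deltas):
        score += delta
        if score > best_score:
            best_score, best_end = score, i + 1
        elif best_score - score > zdrop:
            return best_score, best_end, True
    return best_score, best_end, False


# Three matches, a poorly scoring stretch, then ten more matches.
deltas = [1, 1, 1, -4, -4] + [1] * 10
strict = extend_with_zdrop(deltas, zdrop=5)    # stops inside the bad stretch
lenient = extend_with_zdrop(deltas, zdrop=10)  # extends through it
```

With the smaller threshold the extension ends before the later matches are reached, which is the kind of broken alignment described above; with the larger threshold the same input is aligned end to end.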

Minimap2 does not work well with Illumina short reads as of now.