two discussion paragraphs; need one more

Heng Li 2017-11-05 12:27:52 -05:00
parent fa5a645ca5
commit 2191ac58ad
2 changed files with 36 additions and 37 deletions

@@ -558,16 +558,33 @@ of small variant calling.
 \subsection{Other applications}
 Minimap2 retains minimap's functionality to find overlaps between long reads
-and to search against huge multi-species databases such as \emph{nt} from NCBI.
-Minimap2 can also align similar genomes or different assemblies of the same
-species. It took 7 wall-clock minutes over 8 CPU cores to align a human SMRT
-assembly (AC:GCA\_001297185.1) to GRCh38, over 20 times as fast as
+and to search against large multi-species databases such as \emph{nt} from
+NCBI. Minimap2 can also align similar genomes or different assemblies of the
+same species. It took 7 wall-clock minutes over 8 CPU cores to align a human
+SMRT assembly (AC:GCA\_001297185.1) to GRCh38, over 20 times as fast as
 MUMmer4~\citep{Kurtz:2004zr}.
-\section{Conclusion}
+\section{Discussions}
-Minimap2 is a fast, accurate and versatile aligner for long nucleotide
-sequences.
+Minimap2 is a versatile mapper and pairwise aligner for nucleotide sequences.
+It works with short reads, assembly contigs and long noisy genomic and RNA-seq
+reads. It can be used as a read mapper, long-read overlapper or a full-genome
+aligner. Minimap2 is also accurate and efficient, often outperforming other
+domain-specific alignment tools in terms of both speed and accuracy.
+
+The capability of minimap2 comes from a fast base-level alignment algorithm and
+an accurate chaining algorithm. When aligning long query sequences, base-level
+alignment is often the performance bottleneck. The Suzuki-Kasahara algorithm
+greatly alleviates the bottleneck and enables DP-based splice alignment
+involving $>$100kb introns, which was impractically slow ten years ago. The
+minimap2 chaining algorithm is fast and highly accurate by itself. In fact,
+chaining alone is more accurate than all the other long-read mappers in
+Fig.~\ref{fig:eval}a (data not shown). This accuracy helps to reduce downstream
+base-level alignment of candidate chains, which is still several times slower
+than chaining even with the Suzuki-Kasahara improvement. In addition, taking a
+general form, minimap2 chaining can be adapted to non-typical data types such
+as spliced reads and multiple reads per fragment. This gives us the opportunity
+to extend the same base algorithm to a variety of use cases.
 \section*{Acknowledgements}
 We owe a debt of gratitude to H. Suzuki and M. Kasahara for releasing their

@@ -1,30 +1,12 @@
-Q 60 32066 0 0.000000000
-Q 40 32 1 0.000031155
-Q 38 19 1 0.000062272
-Q 36 11 1 0.000093376
-Q 35 32 1 0.000124378
-Q 33 15 1 0.000155400
-Q 32 58 1 0.000186145
-Q 27 11 1 0.000217095
-Q 26 80 1 0.000247494
-Q 21 19 2 0.000309186
-Q 20 16 1 0.000339936
-Q 19 19 1 0.000370622
-Q 18 22 2 0.000432099
-Q 17 37 5 0.000585751
-Q 15 24 2 0.000646930
-Q 14 18 3 0.000738939
-Q 13 30 6 0.000922821
-Q 12 18 1 0.000953054
-Q 11 29 2 0.001013638
-Q 10 30 1 0.001043393
-Q 9 20 5 0.001196099
-Q 8 25 8 0.001440348
-Q 7 28 6 0.001622830
-Q 6 35 12 0.001988132
-Q 5 34 12 0.002352725
-Q 4 29 8 0.002594865
-Q 3 36 14 0.003018937
-Q 2 46 15 0.003471482
-Q 1 69 36 0.004558162
-Q 0 167 94 0.007377173
+Q 60 32084 0 0.000000000 32084
+Q 24 318 2 0.000061725 32402
+Q 11 98 2 0.000123077 32500
+Q 8 37 2 0.000184405 32537
+Q 7 37 3 0.000276294 32574
+Q 6 40 3 0.000367940 32614
+Q 5 34 2 0.000428816 32648
+Q 4 37 5 0.000581306 32685
+Q 3 28 6 0.000764222 32713
+Q 2 38 6 0.000946536 32751
+Q 1 50 21 0.001585318 32801
+Q 0 286 150 0.006105117 33087
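
Note on the data-file columns: the file itself carries no header, so the reading below is inferred from the numbers rather than taken from any documentation. The five-column rows appear to be: a `Q` tag, a mapping-quality value, the number of reads in that bin, the number of those reads that are wrong, the cumulative error rate (cumulative wrong reads divided by cumulative reads, from the highest quality downward), and the cumulative read count. For example, 2/32402 rounds to 0.000061725 and 4/32500 to 0.000123077. A minimal Python sketch under that assumption follows; the function names `cumulate` and `format_row` are hypothetical and are not part of minimap2 or its evaluation scripts.

```python
def cumulate(rows):
    """Accumulate per-bin counts into cumulative statistics.

    rows: iterable of (mapq, n_reads, n_wrong) tuples, assumed to be
    sorted by descending mapping quality, as in the data file.
    Returns a list of (mapq, n_reads, n_wrong, cum_error_rate, cum_reads).
    """
    out = []
    total_reads = 0
    total_wrong = 0
    for mapq, n_reads, n_wrong in rows:
        total_reads += n_reads
        total_wrong += n_wrong
        out.append((mapq, n_reads, n_wrong, total_wrong / total_reads, total_reads))
    return out


def format_row(row):
    """Render one row in the layout shown in the data file.

    The space separator is an assumption based on how the rows are
    displayed; the underlying file may use tabs.
    """
    mapq, n_reads, n_wrong, rate, total = row
    return f"Q {mapq} {n_reads} {n_wrong} {rate:.9f} {total}"


if __name__ == "__main__":
    # First three bins of the five-column rows.
    bins = [(60, 32084, 0), (24, 318, 2), (11, 98, 2)]
    for row in cumulate(bins):
        print(format_row(row))
```

If this reading is right, the last row of each table gives the overall totals: every read counted and the error rate with no mapping-quality filter applied.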