Revision 1
parent a58b05a61b
commit 6e65c5e631
@@ -480,7 +480,7 @@ applications, but it has to apply different sets of parameters depending on
 input data types. Similar to BWA-MEM, minimap2 introduces `presets' that
 modify multiple parameters with a simple invocation. Detailed settings
 and command-line options can be found in the minimap2 manpage. In addition to
-the applications described in the following sections, minimap2 also retains
+the applications evaluated in the following sections, minimap2 also retains
 minimap's functionality to find overlaps between long reads and to search
 against large multi-species databases such as \emph{nt} from NCBI.
 
@@ -654,22 +654,22 @@ context of small variant calling.
 
 \subsection{Aligning long-read assemblies}
 
-Minimap2 can align a human SMRT assembly (AC:GCA\_001297185.1) against
-GRCh38 in 7 minutes using 8 CPU cores, over 20 times faster than
+Minimap2 can align a SMRT assembly (AC:GCA\_001297185.1) against GRCh38 in 7
+minutes using 8 CPU cores, over 20 times faster than nucmer from
 MUMmer4~\citep{Marcais:2018aa}. With the paftools.js script from the minimap2
 package, we called 2.67 million single-base substitutions out of 2.78Gbp
 genomic regions. The transition-to-transversion ratio (ts/tv) is 2.01. In
 comparison, using MUMmer4's dnadiff pipeline, we called 2.86 million
 substitutions in 2.83Gbp at ts/tv=1.87. Given that ts/tv averaged across the
 human genome is about 2 but ts/tv averaged over random errors is 0.5, the
-minimap2 callset arguably has higher accuracy.
+minimap2 callset arguably has higher precision at lower sensitivity.
 
 The sample being assembled is a female. Minimap2 still called 201 substitutions
 on the Y chromosome. These substitutions all come from one contig aligned at
-96.8\% sequence identity. The contig could be a diverged segmental duplication
-absent from GRCh38. In contrast, on the Y chromosome, MUMmer4 called 9070
-substitutions across 73 SMRT contigs. The accuracy of the MUMmer4 pipeline is
-probably lower than our minimap2-based pipeline.
+96.8\% sequence identity. The contig could be a segmental duplication
+absent from GRCh38. In contrast, dnadiff called 9070 substitutions on the Y
+chromosome across 73 SMRT contigs. This again implies our minimap2-based
+pipeline has higher precision.
 
 \section{Discussions}
 
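Note on the ts/tv baseline cited in the second hunk: the 0.5 figure for random errors follows from simple counting, since each base has three possible substitutions, one transition (A to G, C to T, or the reverse) and two transversions. The LaTeX sketch below spells this out and adds a back-of-envelope mixture estimate that is not part of the manuscript. It assumes that true variants have ts/tv of exactly 2 (the text only says "about 2") and that errors are uniformly random; the symbols $r$ (observed ratio) and $f$ (fraction of erroneous calls) are introduced here purely for illustration.

```latex
% Uniformly random substitution errors: 4 bases, each with
% 1 possible transition and 2 possible transversions.
\[
  \left.\frac{\mathrm{ts}}{\mathrm{tv}}\right|_{\mathrm{random}}
  = \frac{4 \times 1}{4 \times 2} = 0.5
\]
% Illustrative assumption: a callset mixes true variants (ts/tv = 2) with a
% fraction f of random errors (ts/tv = 0.5). A true call is then a transition
% with probability 2/3 and an erroneous call with probability 1/3, so the
% observed ratio r satisfies
\[
  \frac{r}{1+r} = \frac{2}{3}(1-f) + \frac{1}{3}f
  \quad\Longrightarrow\quad
  f = \frac{2-r}{1+r}.
\]
% Plugging in the two ratios reported in the text:
\[
  f_{\mathrm{dnadiff}} = \frac{2-1.87}{1+1.87} \approx 0.045,
  \qquad
  f_{\mathrm{minimap2}} = \frac{2-2.01}{1+2.01} \approx -0.003
\]
```

Under these assumptions the dnadiff ratio would be consistent with a few percent of random false calls, while the minimap2 ratio is indistinguishable from zero. The estimate is very sensitive to the assumed true ratio, so it should be read as a rough illustration of why a lower ts/tv suggests lower precision, not as a measured error rate.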