From de0480ac5b4603f03deb3c7072d4b72238e75f94 Mon Sep 17 00:00:00 2001
From: Heng Li
Date: Mon, 12 Mar 2018 13:05:51 -0400
Subject: [PATCH] finished section "Read Overlap"

---
 cookbook.md | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/cookbook.md b/cookbook.md
index 143a5f6..8b4d99f 100644
--- a/cookbook.md
+++ b/cookbook.md
@@ -5,6 +5,9 @@
   * [Mapping long reads](#map-pb)
   * [Mapping Illumina paired-end reads](#map-sr)
   * [Evaluating mapping accuracy with simulated reads (for developers)](#mapeval)
+- [Read Overlap](#read-overlap)
+  * [Long-read overlap](#long-read-overlap)
+  * [Evaluating overlap sensitivity](#ov-eval)
 
 ## Installation
 
@@ -73,5 +76,31 @@ paftools.js mason2fq tmp.sam | seqtk seq -1 > ecoli_mason_1.fq
 paftools.js mason2fq tmp.sam | seqtk seq -1 > ecoli_mason_2.fq
 ```
 
+## Read Overlap
+
+### Long-read overlap
+```sh
+# For PacBio reads:
+minimap2 -x ava-pb ecoli_p6_25x_canu.fa ecoli_p6_25x_canu.fa > overlap.paf
+# For Nanopore reads (ava-ont also works with PacBio, but not as well):
+minimap2 -x ava-ont -r 10000 ecoli_p6_25x_canu.fa ecoli_p6_25x_canu.fa > overlap.paf
+```
+Here we explicitly applied `-r 10000`. We are considering making this the
+default for the `ava-ont` mode, as it seems to improve contiguity in
+Nanopore read assembly (Loman, personal communication).
+
+**Minimap2 doesn't work well with short-read overlap.**
+
+### Evaluating overlap sensitivity
+
+```sh
+# read-to-reference mapping
+minimap2 -cx map-pb ecoli_ref.fa ecoli_p6_25x_canu.fa > to-ref.paf
+# evaluate overlap sensitivity
+sort -k6,6 -k8,8n to-ref.paf | paftools.js ov-eval - overlap.paf
+```
+You can see that for PacBio reads, minimap2 achieves higher overlap sensitivity
+with `-x ava-pb` (99% vs 93% with `-x ava-ont`).
+
 [pbsim]: https://github.com/pfaucon/PBSIM-PacBio-Simulator
 [mason2]: https://github.com/seqan/seqan/tree/master/apps/mason2
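
The `sort -k6,6 -k8,8n` step in the evaluation command orders the read-to-reference hits by target name (PAF column 6) and then numerically by target start (PAF column 8) before they are piped to `paftools.js ov-eval`. As a minimal sketch of what that ordering does, the snippet below sorts a tiny hand-made PAF file. The file name `sample.paf`, the read names, the contig names and all coordinates are hypothetical and are not taken from the E. coli data set.

```sh
# Hypothetical three-record PAF sample; only the 12 mandatory columns are filled in.
# Columns: qname qlen qstart qend strand tname tlen tstart tend nmatch alnlen mapq
printf 'r1\t1000\t0\t1000\t+\tchrB\t5000\t2000\t3000\t900\t1000\t60\n'  > sample.paf
printf 'r2\t1000\t0\t1000\t-\tchrA\t5000\t900\t1900\t900\t1000\t60\n' >> sample.paf
printf 'r3\t1000\t0\t1000\t+\tchrA\t5000\t100\t1100\t900\t1000\t60\n' >> sample.paf

# Same sort keys as in the cookbook command: target name first,
# then target start compared as a number rather than as text.
sorted_reads=$(sort -k6,6 -k8,8n sample.paf | cut -f1 | paste -sd' ' -)
echo "$sorted_reads"
```

The hits on `chrA` come out before the hit on `chrB`, and within `chrA` the hit starting at 100 precedes the one starting at 900, so reads that map next to each other on the reference end up on adjacent lines.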