211 Commits (ee9b2773a8a70f3c30a4f05e00b6c873eb7d1927)

Author SHA1 Message Date
Heng Li ee9b2773a8 r456: min chain score should be > k-mer length
Otherwise chain_dp() wastes time unnecessarily sorting chains with one k-mer.
2017-09-29 22:33:55 -04:00
Heng Li 340483821e r455: set max_occ on command line 2017-09-29 22:18:43 -04:00
Heng Li 04fb2c2ec0 r454: rechain with higher max_occ if no good chain 2017-09-29 19:24:32 -04:00
Heng Li 0d4ecd19ee r453: avoid duplicated strcmp() for ava 2017-09-28 15:52:05 -04:00
Heng Li 0c63325985 r452: fixed: -G not working with -x sr 2017-09-28 14:28:12 -04:00
Heng Li 2a554a92e9 r451: changed rep_len mapq heuristic 2017-09-28 14:23:14 -04:00
Heng Li 935a6e6064 r450: differentiate exact repeats via mapq 2017-09-27 23:51:05 -04:00
Heng Li 8301222174 r448: fixed a bug when computing PE quality 2017-09-27 21:54:07 -04:00
Heng Li 7e0d70bfd3 r445: pair coordinate adjustment working
Next: mapq adjustment, which will be tricky...
2017-09-27 15:38:18 -04:00
Heng Li a349d85280 r444: changed the way orientation is specified
The old model doesn't work with RF or RR orientation. The new model only works
with paired-end reads. For >2 segments, only FF is supported.
2017-09-27 12:33:10 -04:00
Heng Li f611edf6f2 r443: don't filter small cm for split seg 2017-09-26 16:17:58 -04:00
Heng Li 1b1dd0cd57 r442: default max_gap to 200 in the sr mode 2017-09-26 13:31:01 -04:00
Heng Li 55d1e4f638 r440: better chain filtering for PE reads 2017-09-26 11:03:36 -04:00
Heng Li 64c0ad6b35 r439: use splice-like chain gap cost between segs
This improves accuracy
2017-09-25 16:04:38 -04:00
Heng Li 9538c985aa r438: fixed a rare case that leads to missing hits
It is a bug in chaining.
2017-09-25 14:59:34 -04:00
Heng Li 8f25cfa36e r437: fixed uninitialized memory on rep_len 2017-09-25 14:22:45 -04:00
Heng Li 81008dd371 r436: working on short reads
The result is mixed - lots of room for tuning
2017-09-25 14:06:29 -04:00
Heng Li 3bb66e1ed3 multi-seg working on toy examples 2017-09-25 13:42:04 -04:00
Heng Li 5b39a1b34b Merge branch 'master' into sr 2017-09-20 12:24:08 -04:00
Heng Li e3b5802b2e r424: reduce memory for long query seqs 2017-09-20 12:22:13 -04:00
Heng Li 645db3350e Merge branch 'master' into sr 2017-09-20 11:15:14 -04:00
Heng Li 75e6bbc9f6 r421: removed the MM_F_SPLICE_BOTH mode
In the default splice mode, minimap2 applies two rounds of spliced alignment:
first assuming GT-AG to be the splice signal across all splicing sites and then
assuming CT-AC to be the signal. This is the ideal strategy.

In the MM_F_SPLICE_BOTH mode, minimap2 applies one round of spliced alignment,
assuming GT-AG and CT-AC to be the splice signals AT THE SAME TIME. This will
be faster but less accurate. I don't think anyone would like to run minimap2 in
this mode, so I am removing it for clarity.
2017-09-20 11:11:53 -04:00
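
The difference between the two strategies in r421 can be illustrated with a toy model. This is only a sketch under loud assumptions: it is not minimap2's alignment code, the function names (has_signal, score_two_rounds, score_both) are hypothetical, and "score" here is simply the number of introns whose first and last two bases match a signal pair. It shows one plausible reason the single-round mode is less accurate: accepting either signal at each intron lets a mix of GT-AG and CT-AC introns all count, even though introns of a single transcript would normally share one strand.

```c
#include <string.h>

/* Returns 1 if the intron starts with `donor` and ends with `acceptor`
 * (both two bases long), 0 otherwise. Hypothetical helper. */
static int has_signal(const char *intron, const char *donor, const char *acceptor)
{
	size_t n = strlen(intron);
	return n >= 4 && strncmp(intron, donor, 2) == 0
	              && strncmp(intron + n - 2, acceptor, 2) == 0;
}

/* One "round": count introns consistent with a single signal pair. */
static int count_round(const char **introns, int n, const char *donor, const char *acceptor)
{
	int i, c = 0;
	for (i = 0; i < n; ++i)
		c += has_signal(introns[i], donor, acceptor);
	return c;
}

/* Two rounds: assume GT-AG everywhere, then CT-AC everywhere, and keep
 * the better round. Every counted intron agrees on the strand. */
static int score_two_rounds(const char **introns, int n)
{
	int fwd = count_round(introns, n, "GT", "AG");
	int rev = count_round(introns, n, "CT", "AC");
	return fwd > rev ? fwd : rev;
}

/* One round accepting either signal at each intron independently, so
 * introns implying different strands can all be counted together. */
static int score_both(const char **introns, int n)
{
	int i, c = 0;
	for (i = 0; i < n; ++i)
		c += has_signal(introns[i], "GT", "AG") || has_signal(introns[i], "CT", "AC");
	return c;
}
```

For three made-up introns where two carry GT-AG and one carries CT-AC, score_two_rounds() counts only the two consistent introns, while score_both() counts all three.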
Heng Li 7a9b4db874 replaced --approx-ext with --sr
--sr disables Z-drop and may come with other heurstics
2017-09-20 10:51:18 -04:00
Heng Li b99c22840f r414: avoid assertion failure for 0-length reads 2017-09-19 22:21:27 -04:00
Heng Li 11081c6c27 r411: refactored kalloc for clarity
The new version is closer to K&R's original implementation.
2017-09-18 19:49:15 -04:00
Heng Li ea5a0cd17d Release minimap2-2.2 (r409) 2017-09-17 20:08:47 -04:00
Heng Li e9c57f6d8b r402: exposed kseq (for API in mappy later) 2017-09-17 13:09:16 -04:00
Heng Li c07f9f9a49 r372: default mm_verbose to 1, and change in main 2017-09-16 09:14:34 -04:00
Heng Li 14b853499f r369: updated example with the latest API 2017-09-14 22:44:10 -04:00
Heng Li 75ff7ceec5 r368: API documentation 2017-09-14 22:23:04 -04:00
Heng Li e2823d4aee r367: index reader optionally writes index 2017-09-14 21:18:13 -04:00
Heng Li eb00521d9b redesigned indexing and option APIs 2017-09-14 17:02:01 -04:00
Heng Li 0f7455cefa r365: documented the "sr" preset 2017-09-14 12:57:21 -04:00
Heng Li 4d3768bf26 r364: improved the mapq heuristics
* use repetitive seed lengths, not counts
* compute n_sub to higher accuracy
* use bwa-mem mapq heuristic as a backup

For short single-end reads, minimap2's ROC is not as good as bwa-mem's, but is
close.
2017-09-14 12:37:03 -04:00
Heng Li 47e9d76ca1 further mapq tuning 2017-09-14 10:46:14 -04:00
Heng Li f4a8766283 r362: fixed overestimated chaining score
Caused by ilog2_32(0)=-1. This bug was fixed once and reoccurred as I was
tuning the score function but forgot to apply the fix.
2017-09-14 10:15:22 -04:00
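
The r362 bug can be reproduced in miniature. The following is a sketch, not the project's code: ilog2_32() here is a simple loop-based stand-in that, like the function named in the commit message, returns -1 for an input of 0, and the penalty and scoring functions (gap_penalty_unguarded, gap_penalty_guarded, chain_step_score) are hypothetical. The assumption is that a log-scale gap penalty is subtracted from the score, so a penalty of -1 inflates the score by one.

```c
#include <stdint.h>

/* Floor of log2(v) for v > 0; returns -1 when v == 0.
 * Loop-based stand-in written for this example. */
static int ilog2_32(uint32_t v)
{
	int l = -1;
	while (v) { v >>= 1; ++l; }
	return l;
}

/* Hypothetical log-scale gap penalty with no zero check:
 * a gap of 0 yields a penalty of -1. */
static int gap_penalty_unguarded(uint32_t gap)
{
	return ilog2_32(gap);
}

/* Same penalty with the zero case handled explicitly. */
static int gap_penalty_guarded(uint32_t gap)
{
	return gap ? ilog2_32(gap) : 0;
}

/* Hypothetical chaining step: seed score minus gap penalty. */
static int chain_step_score(int seed_score, uint32_t gap, int guarded)
{
	int penalty = guarded ? gap_penalty_guarded(gap) : gap_penalty_unguarded(gap);
	return seed_score - penalty;
}
```

With a seed score of 15 and a gap of 0, the unguarded version returns 16 (overestimated by one) and the guarded version returns 15.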
Heng Li 6a82a21dee r361: improved mapq for short reads 2017-09-13 15:32:39 -04:00
Heng Li 3c91d652dd r360: allow to set integer max occ 2017-09-13 11:37:00 -04:00
Heng Li d7f2ac1d4f better parameters for short reads
It turns out the key problem is not the minimizer density. It is the max
occurrence that tends to affect results more, especially sensitivity. There is
still lots of work to do, but for now, it seems a good start.
2017-09-12 16:11:23 -04:00
Heng Li eea9e851d8 Merge branch 'dev' into short 2017-09-11 09:32:28 -04:00
Heng Li c7c3585531 r347: merged mm_map_frag() into mm_map()
mm_map_frag() was separated due to an earlier design that has been rejected.
2017-09-10 15:02:55 -04:00
Heng Li 87a278d06a Merge branch 'dev' into short 2017-09-09 08:49:58 -04:00
Heng Li f422175e4e r344: avoid unnecessary refName retrieval 2017-09-08 22:44:14 -04:00
Heng Li 709b6ec1f1 increase seed occurrences 2017-09-08 22:42:39 -04:00
Heng Li 0031158936 Merge branch 'master' into short 2017-09-07 11:41:32 -04:00
Heng Li ef3f7ea2f2 Release minimap2-2.1.1 (r341) 2017-09-06 13:46:51 -04:00
Heng Li 8b9f2aaf04 r339: improved SIMD detection
old code does not check AVX2
2017-09-05 13:10:30 -04:00
Heng Li 46e8b6a4f9 r338: portable CPU dispatch, which is the default
working with gcc, icc, clang and msvc.
2017-09-03 20:29:24 -04:00
Heng Li 3c997ca016 r337: support CPU dispatch for gcc-4.8+
using __builtin_cpu_supports()
2017-09-03 14:29:49 -04:00
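
A minimal sketch of runtime CPU dispatch with __builtin_cpu_supports(), the builtin named in r337. The function pick_simd_level() and the order of checks are assumptions made for illustration and are not taken from the minimap2 source. The builtin is only used on x86 targets with gcc or clang; other targets fall back to "scalar".

```c
/* Returns the name of the widest SIMD level this example would select.
 * Hypothetical helper; a real dispatcher would return a function pointer
 * to the matching kernel instead of a string. */
static const char *pick_simd_level(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
	__builtin_cpu_init(); /* populate the CPU feature data before querying */
	if (__builtin_cpu_supports("avx2")) return "avx2";
	if (__builtin_cpu_supports("sse4.1")) return "sse4.1";
	if (__builtin_cpu_supports("sse2")) return "sse2";
	return "scalar";
#else
	return "scalar"; /* non-x86 target or compiler without the builtin */
#endif
}
```

The result depends on the machine running the code, so no fixed output is shown.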
Heng Li f9ccc522cd Merge branch 'master' into short 2017-09-03 11:58:15 -04:00