Commit Graph

293 Commits (d7a31e40e66772e2c20bbb52ea2e462d9e92c2a2)

Author SHA1 Message Date
Heng Li d7a31e40e6 r569: last commit is buggy 2017-11-09 23:20:41 -05:00
Heng Li dd18cd75de r568: revert - don't take max(dp_max, dp_score) 2017-11-09 23:12:48 -05:00
Heng Li 99a2709913 r567: minor change to #56 2017-11-09 19:17:45 -05:00
mvdbeek 1cb0bf4bef Implement -Y for soft clipping of supp. alignments
I tried to base this on bwa-mem and it seems to work for sam alignments.
2017-11-09 19:22:36 +01:00
Heng Li a7b38f6900 r562: fixed a severe bug: wrong query start 2017-11-08 22:31:05 -05:00
Heng Li e896c9ec05 r559: prefer a chain involving more segments 2017-11-08 13:22:16 -05:00
Heng Li 98ba8928c6 r558: dp_max no less than dp_score 2017-11-08 10:06:10 -05:00
Heng Li b24d68ae9f r557: fixed another mapq underestimate
When a chain is split during base-level alignment, its chaining score is
reduced. However, the chaining score of its suboptimal chain remains the same.
This leads to underestimated mapping quality.
2017-11-07 23:20:49 -05:00
Heng Li 65deedfa96 r556: bugfix - underestimate mapq for split aln 2017-11-07 22:37:12 -05:00
Heng Li 21a46ba652 Release minimap2-2.4 (r555) 2017-11-06 12:54:02 -05:00
Heng Li fa5a645ca5 r552: fixed a tiny typo on struct packing
The old packing wastes memory, thought very small.
2017-11-05 08:27:26 -05:00
Heng Li a3f0aa1d5b r550: fixed -L issues with secondary and supp aln 2017-11-04 12:13:38 -04:00
Heng Li 22290db3e4 r546: minor mapQ tuning 2017-11-01 13:20:39 -04:00
Heng Li cd24dc8834 r545: removed option -i, not working well 2017-10-31 22:23:27 -04:00
Heng Li b8e758df0f r544: increased PE mapQ 2017-10-31 16:55:02 -04:00
Heng Li 311fa90030 r543: applied some sr mapq changes to long reads 2017-10-31 15:24:05 -04:00
Heng Li fb8a1b5536 r542: tuning mapQ calculation 2017-10-31 14:25:09 -04:00
Heng Li 285eb0da05 r540: removed a buggy debugging line 2017-10-29 00:02:41 -04:00
Heng Li 192217a10c r539: use --splice-flank=yes by default
In human/mouse, the GTr..yAG pattern occurs to 91/92% of all GT-AG introns.
Modeling r..y clearly leads to higher accuracy. However, in SIRV, this
percentage is reduced to ~60%. The default "--splice --splice-flank=yes"
leads to lower accuracy. If someone benchmark minimap2 on SIRV, this would be
bad, but minimap2 is developed for practical applications, not for benchmarks.
I will live with that.
2017-10-28 22:29:55 -04:00
Heng Li f22a94e868 r538: fixed a long existing bug in HPC k-mer (#47)
This bug may lead to a wrong minimizer when a HPC k-mer is longer than 256bp.
When there is a seed match involving this wrong HPC k-mer, the correct seed
sequences do not match in fact. This violates the assumption in align.c and
subsequently causes a segfault, which is what #47 has caught. This bug lurked
in the earliest piece of code and affected all released minimap2 versions so
far. It is extremely rare and does not affect the prebuilt GRCh37/38 indices.
2017-10-28 19:21:10 -04:00
Heng Li 79b0caca95 r537: model the next base to GT/AG
[PMID:18688272] shows that the base following GT tends to be A or G (i.e. R) in
both human and yeast, and that the base preceeding AG tends to be C or T (i.e.
Y). In the new model, we pay no cost to GTr..yAG, but we pay half of the cost
if there is no r or y. This improves the junction accuracy when mapping to
human and mouse and decreases the accuacy when mapping to SIRV. My guess is
that SIRV does not honor this trend. Need to investigate in future.

Also in this commit, --cost-non-gt-ag is aliased to -C. The default is changed
to 9 instead of 5. I also added --splice-flank to enable the above model. This
may become the default once I confirm my hypothesis on SIRV.
2017-10-28 00:25:01 -04:00
Heng Li afc2f2e84b r536: removed an unnecessary assert() 2017-10-24 21:08:54 -04:00
Heng Li d4b5dfc297 r533: added --no-pairing
to prevent the use of any pairing information for paired-end reads.
2017-10-23 14:09:32 -04:00
Heng Li 306e4541f8 Released minimap2-2.3 (r531) 2017-10-22 23:13:35 -04:00
Heng Li beeb806829 r526: fixed a bug when HPC is in use
It happened when the query HPC minimizer is longer than the reference HPC
minimizer close to the beginning of a contig. We may get a negative coordinate,
which causes an assertion failure.
2017-10-21 19:54:04 -04:00
Heng Li be7f3c4ffe r525: fixed a bug in chaining; handle ovlp ends 2017-10-20 21:34:52 -04:00
Heng Li bd04372873 r524: reverted to bwa-mem end bonus
and reduced the cost of clipping when filtering by identity
2017-10-20 16:57:31 -04:00
Heng Li 15ed0712c2 r523: fixed a performance bug in ksw2_ll
Wont' affect accuracy.
2017-10-20 13:00:10 -04:00
Heng Li 4683da2455 r520: added option -L to write long cigar to CG 2017-10-17 17:32:44 -04:00
Heng Li ffd953029f r519: fixed a severe bug that misses long alns 2017-10-17 15:52:36 -04:00
Heng Li 04cf4ebf5e r518: increased the default -K to 500M
This helps multi-thread performance for ultra-long reads.
2017-10-17 13:21:29 -04:00
Heng Li 25ffd72690 r517: replaced --print-2nd with --secondary 2017-10-17 11:41:56 -04:00
Heng Li aa2d9d4e1b r516: throw a warning if -N0 is used 2017-10-16 14:55:35 -04:00
Heng Li addb61bcb2 r515: more conservative hit exclusion
When a hit covers a long query subsequence that has not been covered by better
primary hits, this hit is more likely to become a new primary hit.
2017-10-16 13:58:01 -04:00
Heng Li adf6cd7f52 r513: merged pre- and post-cigar blen and mlen
This saves a bit memory and is cleaner.
2017-10-16 10:55:18 -04:00
Heng Li e6f525edaf r512: option to filter poorly aligned reads 2017-10-16 10:38:22 -04:00
Heng Li 858213d513 r511: fixed wrong primary sam record 2017-10-12 23:02:18 -04:00
Heng Li dea3b60918 r510: fixed an off-by-1 bug for unmapped mate 2017-10-12 17:31:13 -04:00
Heng Li 7c555f9b7e r508: use two I/O threads for mapping
-x sr applies this option by default
2017-10-12 14:56:01 -04:00
Heng Li 2801ed9b4b r507: -K not working as is intended (#36) 2017-10-12 14:16:05 -04:00
Heng Li ce06188203 r506: fixed a memory leak 2017-10-12 10:12:22 -04:00
Heng Li 9862a75cd3 r505: a bit code simplification 2017-10-11 21:54:32 -04:00
Heng Li 3073f4a758 r504: better heuristics to reduce excessive ext 2017-10-11 21:42:11 -04:00
Heng Li 9364bc64d7 r501: added end_bonus to extz2 2017-10-11 09:39:41 -04:00
Heng Li 65abdb8f3c r500: temporarily disabled region trunc
because it is causing other problems.
2017-10-11 00:16:04 -04:00
Heng Li 7345621759 r499: end bonus working; DP region needs improve! 2017-10-11 00:14:25 -04:00
Heng Li ca632f907b r498: fixed a bug when merging like "4I5I" 2017-10-10 21:22:37 -04:00
Heng Li 6c78a980b6 r497: the previous change not working at the ends 2017-10-10 17:32:28 -04:00
Heng Li c217eecdb7 r496: avoid DP extending into another chain
When deciding the region for DP, exclude regions in the adjacent chain
2017-10-10 17:25:12 -04:00
Heng Li 13b66aad4d r495: fix impropriate CIGAR
1. Not left aligned
2. In one case, 50M24D50M becomes 24D100M. The leading D needs to be removed.
3. Avoid identical hits after DP
2017-10-10 11:59:44 -04:00