72 Commits (843729df1e94e7e75cad3ee5a45f9e6e9695caae)

Author SHA1 Message Date
Heng Li a8f1fa8ea3 r1114: retain more candidate inversion alignments 2021-11-18 21:37:10 -05:00
Heng Li 39bdd45875 r1108: fixed missing inversions for #816 and #806 2021-10-04 16:34:30 -04:00
Heng Li b046052d82 Merge branch 'master' into utec 2021-07-16 13:32:47 -04:00
Heng Li 379728726a r1049: removed the long-join heuristics 2021-05-24 16:21:40 -04:00
Heng Li f995f55610 added --mask-len for #659 2020-08-21 11:12:50 -04:00
Heng Li da7109fd29 r985: optionally report cs/cg on the query strand
PAF only; not well tested
2020-04-21 12:37:35 -04:00
Heng Li eb3ed6993d support ALT mapping 2020-01-21 09:17:50 -05:00
Heng Li 32ab6ce15b r914: fixed two harmless divisions by 0
Resolves #326
2019-02-12 19:30:49 -05:00
Heng Li 1077b7ddc8 r846: added --hard-mask-level for #244 2018-09-27 14:46:26 -04:00
Heng Li 4b707aac92 working with toy examples 2018-07-15 10:55:00 -04:00
Heng Li 7e6e8ca73f r792: fixed -Wextra warnings and resolved #184 2018-06-19 15:26:58 -04:00
Heng Li a3afeec0b2 r783: reverted to r781 (#155) 2018-05-30 15:25:34 -04:00
Heng Li 3573784b4d r782: no mask a chain having long ref ovlp (#155) 2018-05-30 13:53:45 -04:00
Heng Li 372c90ceb5 r764: fixed incorrect inversion mapq (#148) 2018-04-10 09:11:49 -04:00
Heng Li ee4cd089f7 r763: fine control long join flank len (#128) 2018-03-29 14:16:58 -04:00
Heng Li 8fc5f8dc90 r711: assign proper mapq to primary inversions 2018-02-15 14:34:59 -05:00
Heng Li 1372977a37 r708: implemented double Z-drop thresholds (#112)
When aligning long reads, we would prefer to align through low-quality
regions. This requires a large Z-drop threshold. However, to find small
inversions, we need to use a small Z-drop. This commit addresses this
conflict with two Z-drop thresholds. When Z-drop exceeds the smaller
threshold, we perform a local alignment to check if there is a potential
inversion. If there is one, we break the alignment; otherwise we break
the alignment only if Z-drop exceeds the larger threshold.

This commit also fixes a bug that reported wrong coordinates when the
inversion is on the forward strand (#112).
2018-02-15 10:50:49 -05:00
Heng Li 7ef5490884 r703: added --max-clip-ratio
still testing the option
2018-02-12 13:29:18 -05:00
Heng Li 46d6349af4 r670: added PE support to mappy
and minor code cleanup
2018-01-31 11:33:08 -05:00
Heng Li 98a999fe44 r611: added pseudocount when est divergence 2017-12-08 12:57:57 -05:00
Heng Li 984f7846c0 r601: bugfix - a similar issue to r600
This bug unsets the alignment score of suboptimal alignments.
2017-11-30 11:51:34 -05:00
Heng Li af1d6afba9 r600: bugfix - missing secondary alignments (#71)
This should very rarely happen to typical data, but has a higher chance in
artifactual data.
2017-11-30 11:34:10 -05:00
Heng Li b24d68ae9f r557: fixed another mapq underestimate
When a chain is split during base-level alignment, its chaining score is
reduced. However, the chaining score of its suboptimal chain remains the same.
This leads to underestimated mapping quality.
2017-11-07 23:20:49 -05:00
Heng Li 65deedfa96 r556: bugfix - underestimate mapq for split aln 2017-11-07 22:37:12 -05:00
Heng Li cd24dc8834 r545: removed option -i, not working well 2017-10-31 22:23:27 -04:00
Heng Li 311fa90030 r543: applied some sr mapq changes to long reads 2017-10-31 15:24:05 -04:00
Heng Li fb8a1b5536 r542: tuning mapQ calculation 2017-10-31 14:25:09 -04:00
Heng Li bd04372873 r524: reverted to bwa-mem end bonus
and reduced the cost of clipping when filtering by identity
2017-10-20 16:57:31 -04:00
Heng Li addb61bcb2 r515: more conservative hit exclusion
When a hit covers a long query subsequence that has not been covered by better
primary hits, this hit is more likely to become a new primary hit.
2017-10-16 13:58:01 -04:00
Heng Li adf6cd7f52 r513: merged pre- and post-cigar blen and mlen
This saves a bit of memory and is cleaner.
2017-10-16 10:55:18 -04:00
Heng Li e6f525edaf r512: option to filter poorly aligned reads 2017-10-16 10:38:22 -04:00
Heng Li ce06188203 r506: fixed a memory leak 2017-10-12 10:12:22 -04:00
Heng Li 13b66aad4d r495: fix improper CIGAR
1. Not left aligned
2. In one case, 50M24D50M becomes 24D100M. The leading D needs to be removed.
3. Avoid identical hits after DP
2017-10-10 11:59:44 -04:00
Heng Li 2a1e738a94 r461: randomize repetitive hits 2017-10-04 13:05:18 -04:00
Heng Li 2a554a92e9 r451: changed rep_len mapq heuristic 2017-09-28 14:23:14 -04:00
Heng Li 935a6e6064 r450: differentiate exact repeats via mapq 2017-09-27 23:51:05 -04:00
Heng Li f611edf6f2 r443: don't filter small cm for split seg 2017-09-26 16:17:58 -04:00
Heng Li 55d1e4f638 r440: better chain filtering for PE reads 2017-09-26 11:03:36 -04:00
Heng Li 9943e5fdd0 backup 2017-09-20 14:35:46 -04:00
Heng Li 03d6894517 backup 2017-09-20 11:47:46 -04:00
Heng Li 11081c6c27 r411: refactored kalloc for clarity
The new version is closer to K&R's original implementation.
2017-09-18 19:49:15 -04:00
Heng Li 4d3768bf26 r364: improved the mapq heuristics
* use repetitive seed lengths, not counts
* compute n_sub to higher accuracy
* use bwa-mem mapq heuristic as a backup

For short single-end reads, minimap2's ROC is not as good as bwa-mem's, but is
close.
2017-09-14 12:37:03 -04:00
Heng Li 47e9d76ca1 further mapq tuning 2017-09-14 10:46:14 -04:00
Heng Li 6a82a21dee r361: improved mapq for short reads 2017-09-13 15:32:39 -04:00
Heng Li 19d6ec885e r224: inversion alignment around Z-drop break 2017-07-29 13:09:10 -04:00
Heng Li 254280b8af r216: a bit cleanup; identical output to r215 2017-07-28 11:54:18 -04:00
Heng Li a01d758af6 r206: mapq penalize short chains further
The old code penalized at the log() scale. Now added a linear-scaled factor. If
the chain consists of few minimizers, its quality is really not good.
2017-07-26 11:50:04 -04:00
Heng Li e9dc1ce2b6 r205: when computing mapq, consider min_chain_sc
Not doing this was a mistake.
2017-07-26 11:34:14 -04:00
Heng Li 00c6db5073 r203: check more subopt aln if score small 2017-07-25 20:02:44 -04:00
Heng Li 38aa66fa30 r178: fixed integer overflow in mapq calculation 2017-07-16 21:45:39 -04:00
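
Note on r708 (1372977a37): the double Z-drop decision described in that commit message can be sketched roughly as follows. This is an illustration of the logic as the message states it, not minimap2's actual code. The function name, the `looks_like_inversion` callback (a stand-in for the local alignment check), and the threshold values in the usage lines are all hypothetical.

```python
from typing import Callable


def should_break_alignment(
    zdrop: int,
    small_threshold: int,
    large_threshold: int,
    looks_like_inversion: Callable[[], bool],
) -> bool:
    """Decide whether to break an alignment under two Z-drop thresholds.

    Assumes small_threshold <= large_threshold. `looks_like_inversion` is a
    hypothetical callback standing in for the local alignment that checks
    for a potential inversion; it is only called when it is needed.
    """
    # At or below the smaller threshold: keep extending the alignment.
    if zdrop <= small_threshold:
        return False
    # Above the smaller threshold: check for a potential inversion first.
    if looks_like_inversion():
        return True
    # No inversion found: break only if the larger threshold is also exceeded.
    return zdrop > large_threshold


# Illustrative thresholds only (hypothetical values).
SMALL, LARGE = 200, 400
print(should_break_alignment(300, SMALL, LARGE, lambda: True))
print(should_break_alignment(300, SMALL, LARGE, lambda: False))
```

With these illustrative values, a drop of 300 breaks the alignment only when the inversion check succeeds, so long reads can still be aligned through low-quality regions while small inversions are not missed.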