Heng Li
ee4cd089f7
r763: fine control long join flank len ( #128 )
2018-03-29 14:16:58 -04:00
Heng Li
8fc5f8dc90
r711: assign proper mapq to primary inversions
2018-02-15 14:34:59 -05:00
Heng Li
1372977a37
r708: implemented double Z-drop thresholds ( #112 )
...
When aligning long reads, we would prefer to align through low-quality
regions. This requires a large Z-drop threshold. However, to find small
inversions, we need to use a small Z-drop. This commit addresses this
conflict with two Z-drop thresholds. When Z-drop exceeds the smaller
threshold, we perform a local alignment to check if there is a potential
inversion. If there is one, we break the alignment; otherwise we break
the alignment only if Z-drop exceeds the larger threshold.
This commit also fixes a bug that reported wrong coordinates when the
inversion is on the forward strand (#112).
2018-02-15 10:50:49 -05:00
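The two-threshold decision described in r708 can be sketched roughly as follows. This is a minimal illustration, not minimap2's actual C code: the function name, parameter names, and the `has_inversion` callback (standing in for the local alignment check) are all hypothetical.

```python
from typing import Callable


def should_break_alignment(drop: int,
                           zdrop_small: int,
                           zdrop_large: int,
                           has_inversion: Callable[[], bool]) -> bool:
    """Decide whether to break an alignment at the current position.

    drop          -- current Z-drop value (hypothetical name)
    zdrop_small   -- the smaller threshold, used to look for inversions
    zdrop_large   -- the larger threshold, used to pass low-quality regions
    has_inversion -- callback standing in for the local alignment check;
                     it is only invoked once the small threshold is exceeded
    """
    if drop <= zdrop_small:
        # Below both thresholds: keep extending the alignment.
        return False
    if has_inversion():
        # A potential inversion was found: break here.
        return True
    # No inversion: break only if the larger threshold is also exceeded.
    return drop > zdrop_large
```

Passing the inversion check as a callback keeps the comparatively expensive local alignment from running unless the smaller threshold has already been exceeded.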
Heng Li
7ef5490884
r703: added --max-clip-ratio
...
still testing the option
2018-02-12 13:29:18 -05:00
Heng Li
46d6349af4
r670: added PE support to mappy
...
and minor code cleanup
2018-01-31 11:33:08 -05:00
Heng Li
98a999fe44
r611: added pseudocount when est divergence
2017-12-08 12:57:57 -05:00
Heng Li
984f7846c0
r601: bugfix - a similar issue to r600
...
This bug unsets the alignment score of suboptimal alignments.
2017-11-30 11:51:34 -05:00
Heng Li
af1d6afba9
r600: bugfix - missing secondary alignments ( #71 )
...
This should happen very rarely with typical data, but is more likely with
artifactual data.
2017-11-30 11:34:10 -05:00
Heng Li
b24d68ae9f
r557: fixed another mapq underestimate
...
When a chain is split during base-level alignment, its chaining score is
reduced. However, the chaining score of its suboptimal chain remains the same.
This leads to underestimated mapping quality.
2017-11-07 23:20:49 -05:00
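The effect described in r557 can be illustrated with a toy calculation. The formula below is a hypothetical stand-in, not minimap2's mapq formula; it only assumes that mapping quality falls as the suboptimal score approaches the best score.

```python
def toy_mapq(best: float, subopt: float, max_mapq: int = 60) -> int:
    """Toy mapping quality (hypothetical): scales with 1 - subopt/best."""
    if best <= 0:
        return 0
    # Clamp the ratio to [0, 1] so the result stays within [0, max_mapq].
    ratio = min(1.0, max(0.0, subopt / best))
    return int(round(max_mapq * (1.0 - ratio)))


# Before the split: chain score 200, suboptimal chain score 100.
before = toy_mapq(200, 100)

# After the split the chain score drops (say to 120), but the
# suboptimal score is left at 100, so the toy mapq falls sharply.
after = toy_mapq(120, 100)
```

With these made-up numbers, the stale suboptimal score makes the split chain look far more ambiguous than it is, which is the kind of underestimate the commit message describes.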
Heng Li
65deedfa96
r556: bugfix - underestimate mapq for split aln
2017-11-07 22:37:12 -05:00
Heng Li
cd24dc8834
r545: removed option -i, not working well
2017-10-31 22:23:27 -04:00
Heng Li
311fa90030
r543: applied some sr mapq changes to long reads
2017-10-31 15:24:05 -04:00
Heng Li
fb8a1b5536
r542: tuning mapQ calculation
2017-10-31 14:25:09 -04:00
Heng Li
bd04372873
r524: reverted to bwa-mem end bonus
...
and reduced the cost of clipping when filtering by identity
2017-10-20 16:57:31 -04:00
Heng Li
addb61bcb2
r515: more conservative hit exclusion
...
When a hit covers a long query subsequence that has not been covered by better
primary hits, this hit is more likely to become a new primary hit.
2017-10-16 13:58:01 -04:00
Heng Li
adf6cd7f52
r513: merged pre- and post-cigar blen and mlen
...
This saves a bit of memory and is cleaner.
2017-10-16 10:55:18 -04:00
Heng Li
e6f525edaf
r512: option to filter poorly aligned reads
2017-10-16 10:38:22 -04:00
Heng Li
ce06188203
r506: fixed a memory leak
2017-10-12 10:12:22 -04:00
Heng Li
13b66aad4d
r495: fix improper CIGAR
...
1. Not left aligned
2. In one case, 50M24D50M becomes 24D100M. The leading D needs to be removed.
3. Avoid identical hits after DP
2017-10-10 11:59:44 -04:00
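Item 2 of r495 (removing a leading deletion) can be sketched as below. This is an illustration under stated assumptions, not the actual fix: the helper names are hypothetical, and it assumes the deletion is the very first CIGAR operation and that removing it means advancing the reference start by the deleted length.

```python
import re
from typing import List, Tuple

CigarOp = Tuple[int, str]


def parse_cigar(cigar: str) -> List[CigarOp]:
    """Split a CIGAR string such as '24D100M' into (length, op) pairs."""
    return [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]


def format_cigar(ops: List[CigarOp]) -> str:
    """Join (length, op) pairs back into a CIGAR string."""
    return "".join(f"{n}{op}" for n, op in ops)


def strip_leading_deletion(cigar: str, ref_start: int) -> Tuple[str, int]:
    """Drop leading 'D' operations and shift the reference start instead."""
    ops = parse_cigar(cigar)
    while ops and ops[0][1] == "D":
        # A leading deletion consumes reference bases only, so skipping it
        # is equivalent to starting the alignment further along the reference.
        ref_start += ops[0][0]
        ops.pop(0)
    return format_cigar(ops), ref_start
```

For the case mentioned in the commit, `strip_leading_deletion("24D100M", 1000)` gives `("100M", 1024)`; a CIGAR without a leading deletion is returned unchanged.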
Heng Li
2a1e738a94
r461: randomize repetitive hits
2017-10-04 13:05:18 -04:00
Heng Li
2a554a92e9
r451: changed rep_len mapq heuristic
2017-09-28 14:23:14 -04:00
Heng Li
935a6e6064
r450: differentiate exact repeats via mapq
2017-09-27 23:51:05 -04:00
Heng Li
f611edf6f2
r443: don't filter small cm for split seg
2017-09-26 16:17:58 -04:00
Heng Li
55d1e4f638
r440: better chain filtering for PE reads
2017-09-26 11:03:36 -04:00
Heng Li
9943e5fdd0
backup
2017-09-20 14:35:46 -04:00
Heng Li
03d6894517
backup
2017-09-20 11:47:46 -04:00
Heng Li
11081c6c27
r411: refactored kalloc for clarity
...
The new version is closer to K&R's original implementation.
2017-09-18 19:49:15 -04:00
Heng Li
4d3768bf26
r364: improved the mapq heuristics
...
* use repetitive seed lengths, not counts
* compute n_sub to higher accuracy
* use bwa-mem mapq heuristic as a backup
For short single-end reads, minimap2's ROC is not as good as bwa-mem's, but is
close.
2017-09-14 12:37:03 -04:00
Heng Li
47e9d76ca1
further mapq tuning
2017-09-14 10:46:14 -04:00
Heng Li
6a82a21dee
r361: improved mapq for short reads
2017-09-13 15:32:39 -04:00
Heng Li
19d6ec885e
r224: inversion alignment around Z-drop break
2017-07-29 13:09:10 -04:00
Heng Li
254280b8af
r216: a bit cleanup; identical output to r215
2017-07-28 11:54:18 -04:00
Heng Li
a01d758af6
r206: mapq penalize short chains further
...
The old code penalized on the log() scale. This commit adds a linear-scale
factor: if a chain consists of few minimizers, its quality is really not good.
2017-07-26 11:50:04 -04:00
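The difference between a log-scale and a linear-scale penalty for short chains can be sketched as follows. Both functions and the cutoff of 10 minimizers are hypothetical choices for illustration; the commit message does not give the actual formula or constants.

```python
import math


def log_factor(n_minimizers: int, full_credit_at: int = 10) -> float:
    """Hypothetical log-scale factor in [0, 1]; gentle on short chains."""
    if n_minimizers <= 0:
        return 0.0
    return min(1.0, math.log(n_minimizers + 1) / math.log(full_credit_at + 1))


def linear_factor(n_minimizers: int, full_credit_at: int = 10) -> float:
    """Hypothetical linear-scale factor in [0, 1]; harsher on short chains."""
    if n_minimizers <= 0:
        return 0.0
    return min(1.0, n_minimizers / full_credit_at)


def combined_factor(n_minimizers: int) -> float:
    """Apply both penalties together, as a multiplier on a mapq estimate."""
    return log_factor(n_minimizers) * linear_factor(n_minimizers)
```

For a chain of only two minimizers, the linear factor (0.2) is well below the log factor (about 0.46), so multiplying it in pushes the quality of very short chains down further, while chains at or above the cutoff are unaffected.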
Heng Li
e9dc1ce2b6
r205: when computing mapq, consider min_chain_sc
...
Not doing this was a mistake.
2017-07-26 11:34:14 -04:00
Heng Li
00c6db5073
r203: check more subopt aln if score small
2017-07-25 20:02:44 -04:00
Heng Li
38aa66fa30
r178: fixed integer overflow in mapq calculation
2017-07-16 21:45:39 -04:00
Heng Li
b4280d186f
r176: removed seedcov_ratio; changed default opt
...
min_seedcov_ratio is not used
2017-07-12 12:47:46 -04:00
Heng Li
801bc84b01
r169: output more accurate col. 10&11 to PAF
...
In r168, col. 10 was smaller than it should be. This confused miniasm.
2017-07-11 14:09:51 -04:00
Heng Li
782449975d
r168: fixed a bug in long join: a[] not sorted
...
Also added length requirement for long join and changed -g in the ava mode
2017-07-09 12:14:20 -04:00
Heng Li
1ac48556ae
r167: long join threshold depends on gap
...
also caught a bug for reverse strand join
2017-07-09 10:38:51 -04:00
Heng Li
38b2830e18
r161: filter bad seeds; changed default -g/-r
2017-07-08 13:31:27 -04:00
Heng Li
e07daad7ad
r153: sam primary record not set sometimes
2017-07-03 13:18:57 -04:00
Heng Li
b625247300
r150: mm_sync_regs() doesn't work with negative id
2017-07-03 11:36:34 -04:00
Heng Li
2e4fd9f1d0
r148: revamped regs handling after cigar
2017-07-03 10:44:26 -04:00
Heng Li
696ebce66e
backup; still buggy
2017-07-03 00:52:00 -04:00
Heng Li
e06c342659
r146: in filtering, drop children if parent out
...
This has been causing several segfaults.
2017-07-03 00:28:12 -04:00
Heng Li
632b8638d2
r144: adjust primary aln after cigar
2017-07-02 22:43:02 -04:00
Heng Li
2b45ba7a0b
r143: fixed a segfault and incorrect .parent
2017-07-02 19:56:21 -04:00
Heng Li
74d306a596
fixed bug when retaining 2ndary aln; still buggy
2017-07-02 19:08:30 -04:00
Heng Li
426c2975f6
r126: filter by fraction of seed coverage
...
otherwise we may get too many poor overlap mappings.
2017-06-30 22:15:45 -04:00