Heng Li
22290db3e4
r546: minor mapQ tuning
2017-11-01 13:20:39 -04:00
Heng Li
cd24dc8834
r545: removed option -i, not working well
2017-10-31 22:23:27 -04:00
Heng Li
b8e758df0f
r544: increased PE mapQ
2017-10-31 16:55:02 -04:00
Heng Li
311fa90030
r543: applied some sr mapq changes to long reads
2017-10-31 15:24:05 -04:00
Heng Li
fb8a1b5536
r542: tuning mapQ calculation
2017-10-31 14:25:09 -04:00
Heng Li
285eb0da05
r540: removed a buggy debugging line
2017-10-29 00:02:41 -04:00
Heng Li
192217a10c
r539: use --splice-flank=yes by default
...
In human/mouse, the GTr..yAG pattern occurs in 91/92% of all GT-AG introns.
Modeling r..y clearly leads to higher accuracy. However, in SIRV, this
percentage is reduced to ~60%. The default "--splice --splice-flank=yes"
leads to lower accuracy. If someone benchmarks minimap2 on SIRV, this would be
bad, but minimap2 is developed for practical applications, not for benchmarks.
I will live with that.
2017-10-28 22:29:55 -04:00
Heng Li
f22a94e868
r538: fixed a long-standing bug in HPC k-mer ( #47 )
...
This bug may lead to a wrong minimizer when a HPC k-mer is longer than 256bp.
When there is a seed match involving this wrong HPC k-mer, the actual seed
sequences do not in fact match. This violates the assumption in align.c and
subsequently causes a segfault, which is what #47 has caught. This bug lurked
in the earliest piece of code and affected all released minimap2 versions so
far. It is extremely rare and does not affect the prebuilt GRCh37/38 indices.
2017-10-28 19:21:10 -04:00
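To illustrate why an HPC (homopolymer-compressed) k-mer in the r538 entry above can be much longer than k, here is a minimal sketch in Python. It is not minimap2's code; `hpc` and `hpc_kmer_span` are hypothetical helpers that only show how a compressed k-mer maps back to a span of original bases.

```python
from itertools import groupby


def hpc(seq):
    """Collapse every run of identical bases into a single base."""
    return "".join(base for base, _ in groupby(seq))


def hpc_kmer_span(seq, k):
    """Return (compressed k-mer, original bases spanned) for the k-mer
    starting at seq[0], or None if seq has fewer than k runs.
    Hypothetical helper for illustration only."""
    runs = [(base, len(list(group))) for base, group in groupby(seq)]
    if len(runs) < k:
        return None
    first = runs[:k]
    kmer = "".join(base for base, _ in first)
    span = sum(length for _, length in first)
    return kmer, span
```

With a long homopolymer run, a 3-mer in compressed space can cover hundreds of original bases: `hpc_kmer_span("A" * 300 + "CG", 3)` spans 302 bases.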
Heng Li
79b0caca95
r537: model the next base to GT/AG
...
[PMID:18688272] shows that the base following GT tends to be A or G (i.e. R) in
both human and yeast, and that the base preceding AG tends to be C or T (i.e.
Y). In the new model, we pay no cost to GTr..yAG, but we pay half of the cost
if there is no r or y. This improves the junction accuracy when mapping to
human and mouse and decreases the accuracy when mapping to SIRV. My guess is
that SIRV does not honor this trend. Need to investigate in future.
Also in this commit, --cost-non-gt-ag is aliased to -C. The default is changed
to 9 instead of 5. I also added --splice-flank to enable the above model. This
may become the default once I confirm my hypothesis on SIRV.
2017-10-28 00:25:01 -04:00
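The splice-signal model described in the r537 entry above can be sketched as follows. This is a rough illustration, not minimap2's implementation: the function names are hypothetical, `noncan` stands for the -C (--cost-non-gt-ag) value with the default of 9 from the message, and rounding the half cost down with integer division is an assumption.

```python
def donor_cost(intron_start, noncan=9):
    """Cost for the first three intron bases (hypothetical helper).
    GT followed by A/G (GTr) is free; GT without r costs half;
    anything else pays the full non-GT-AG cost."""
    s = intron_start.upper()
    if s[:2] != "GT":
        return noncan
    # Integer halving is an assumption made for this sketch.
    return 0 if s[2:3] in ("A", "G") else noncan // 2


def acceptor_cost(intron_end, noncan=9):
    """Cost for the last three intron bases (hypothetical helper).
    AG preceded by C/T (yAG) is free; AG without y costs half;
    anything else pays the full non-GT-AG cost."""
    s = intron_end.upper()
    if s[-2:] != "AG":
        return noncan
    return 0 if s[-3:-2] in ("C", "T") else noncan // 2
```

For example, `donor_cost("GTA")` is free, `donor_cost("GTC")` pays the half cost, and `donor_cost("GCA")` pays the full cost.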
Heng Li
afc2f2e84b
r536: removed an unnecessary assert()
2017-10-24 21:08:54 -04:00
Heng Li
d4b5dfc297
r533: added --no-pairing
...
to prevent the use of any pairing information for paired-end reads.
2017-10-23 14:09:32 -04:00
Heng Li
306e4541f8
Released minimap2-2.3 (r531)
2017-10-22 23:13:35 -04:00
Heng Li
beeb806829
r526: fixed a bug when HPC is in use
...
It happened when the query HPC minimizer is longer than the reference HPC
minimizer close to the beginning of a contig. We may get a negative coordinate,
which causes an assertion failure.
2017-10-21 19:54:04 -04:00
Heng Li
be7f3c4ffe
r525: fixed a bug in chaining; handle ovlp ends
2017-10-20 21:34:52 -04:00
Heng Li
bd04372873
r524: reverted to bwa-mem end bonus
...
and reduced the cost of clipping when filtering by identity
2017-10-20 16:57:31 -04:00
Heng Li
15ed0712c2
r523: fixed a performance bug in ksw2_ll
...
Won't affect accuracy.
2017-10-20 13:00:10 -04:00
Heng Li
4683da2455
r520: added option -L to write long cigar to CG
2017-10-17 17:32:44 -04:00
Heng Li
ffd953029f
r519: fixed a severe bug that misses long alns
2017-10-17 15:52:36 -04:00
Heng Li
04cf4ebf5e
r518: increased the default -K to 500M
...
This helps multi-thread performance for ultra-long reads.
2017-10-17 13:21:29 -04:00
Heng Li
25ffd72690
r517: replaced --print-2nd with --secondary
2017-10-17 11:41:56 -04:00
Heng Li
aa2d9d4e1b
r516: throw a warning if -N0 is used
2017-10-16 14:55:35 -04:00
Heng Li
addb61bcb2
r515: more conservative hit exclusion
...
When a hit covers a long query subsequence that has not been covered by better
primary hits, this hit is more likely to become a new primary hit.
2017-10-16 13:58:01 -04:00
Heng Li
adf6cd7f52
r513: merged pre- and post-cigar blen and mlen
...
This saves a bit of memory and is cleaner.
2017-10-16 10:55:18 -04:00
Heng Li
e6f525edaf
r512: option to filter poorly aligned reads
2017-10-16 10:38:22 -04:00
Heng Li
858213d513
r511: fixed wrong primary sam record
2017-10-12 23:02:18 -04:00
Heng Li
dea3b60918
r510: fixed an off-by-1 bug for unmapped mate
2017-10-12 17:31:13 -04:00
Heng Li
7c555f9b7e
r508: use two I/O threads for mapping
...
-x sr applies this option by default
2017-10-12 14:56:01 -04:00
Heng Li
2801ed9b4b
r507: -K not working as intended ( #36 )
2017-10-12 14:16:05 -04:00
Heng Li
ce06188203
r506: fixed a memory leak
2017-10-12 10:12:22 -04:00
Heng Li
9862a75cd3
r505: a bit of code simplification
2017-10-11 21:54:32 -04:00
Heng Li
3073f4a758
r504: better heuristics to reduce excessive ext
2017-10-11 21:42:11 -04:00
Heng Li
9364bc64d7
r501: added end_bonus to extz2
2017-10-11 09:39:41 -04:00
Heng Li
65abdb8f3c
r500: temporarily disabled region trunc
...
because it is causing other problems.
2017-10-11 00:16:04 -04:00
Heng Li
7345621759
r499: end bonus working; DP region needs improvement!
2017-10-11 00:14:25 -04:00
Heng Li
ca632f907b
r498: fixed a bug when merging like "4I5I"
2017-10-10 21:22:37 -04:00
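The r498 entry above concerns merging adjacent CIGAR operations of the same type, such as "4I5I". A minimal sketch of that kind of merge, assuming a textual CIGAR string and a hypothetical helper name (the actual fix operates on minimap2's internal representation, which is not shown here):

```python
import re


def merge_adjacent_ops(cigar):
    """Merge neighbouring CIGAR operations that share the same op code,
    e.g. "4I5I" becomes "9I". Hypothetical helper for illustration."""
    merged = []
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        if merged and merged[-1][1] == op:
            merged[-1][0] += int(length)  # extend the previous run
        else:
            merged.append([int(length), op])
    return "".join(f"{n}{op}" for n, op in merged)
```

For instance, `merge_adjacent_ops("10M4I5I3M")` yields `"10M9I3M"`.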
Heng Li
6c78a980b6
r497: the previous change not working at the ends
2017-10-10 17:32:28 -04:00
Heng Li
c217eecdb7
r496: avoid DP extending into another chain
...
When deciding the region for DP, exclude regions in the adjacent chain
2017-10-10 17:25:12 -04:00
Heng Li
13b66aad4d
r495: fix improper CIGAR
...
1. Not left aligned
2. In one case, 50M24D50M becomes 24D100M. The leading D needs to be removed.
3. Avoid identical hits after DP
2017-10-10 11:59:44 -04:00
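Item 2 of the r495 entry above says a leading D must be removed from a CIGAR such as 24D100M. A minimal sketch of that clean-up, assuming a textual CIGAR and a 0-based reference start; the helper name and the explicit position shift are assumptions for illustration, not the committed code:

```python
import re


def drop_leading_deletion(pos, cigar):
    """Remove a leading deletion and advance the reference start by its
    length, so the alignment begins at the first aligned base.
    `pos` is assumed to be a 0-based reference coordinate."""
    m = re.match(r"(\d+)D", cigar)
    if m is None:
        return pos, cigar  # nothing to strip
    return pos + int(m.group(1)), cigar[m.end():]
```

Under these assumptions, `drop_leading_deletion(1000, "24D100M")` gives `(1024, "100M")`, while a CIGAR with an internal deletion such as `"50M24D50M"` is left unchanged.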
Heng Li
46fa520db9
r494: simpler and better SR gap filling
...
Still one thing to do: left alignment
2017-10-09 22:02:30 -04:00
Heng Li
1e53610fb4
r493: reduced calling extd2 for ungapped aln
...
Still need to improve in case of 3I5M3D
2017-10-09 21:13:34 -04:00
Heng Li
9396d9e11b
r452: typo in the last commit
2017-10-09 10:05:32 -04:00
Heng Li
198849a716
r491: an ambiguous base costs the same as gap ext
2017-10-09 09:59:42 -04:00
Heng Li
9fea4d16b3
r490: improved short-read extension heuristic
...
Now we find the best scoring ungapped seeded segment and then extend from it.
There is no gap filling for short reads.
2017-10-08 21:36:34 -04:00
Heng Li
f9415628a8
r489: don't use approximate zdrop
...
it doesn't work well
2017-10-08 19:29:09 -04:00
Heng Li
61e56c941d
r488: parameter to control max fragment length
2017-10-07 23:54:32 -04:00
Heng Li
f150257a0d
r487: demote "map10k"; improved README
2017-10-07 19:19:40 -04:00
Heng Li
bf2d4f7aec
r486: treat "U" as "T" for RNA reads ( #33 )
2017-10-07 18:53:25 -04:00
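The r486 entry above treats "U" as "T" so that RNA reads can be mapped against a DNA reference. A minimal sketch of the idea in Python, with a hypothetical helper name; minimap2 itself presumably handles this in its base-encoding step rather than by rewriting strings:

```python
# Translation table mapping uracil to thymine in both cases.
_U_TO_T = str.maketrans("Uu", "Tt")


def rna_to_dna(seq):
    """Return seq with every U/u replaced by T/t (hypothetical helper)."""
    return seq.translate(_U_TO_T)
```

All other characters, including ambiguous bases, pass through unchanged.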
Heng Li
c6384ed2c8
r482: increased short-read bandwidth to 100
...
This has a very minor effect on speed.
2017-10-06 10:20:32 -04:00
Heng Li
e0baf1ad54
r479: a bit of code cleanup
2017-10-05 16:15:14 -04:00
Heng Li
f266092699
r478: simplified useless code, a tiny bit
2017-10-05 15:56:00 -04:00