Heng Li
7f11f4c4d4
Instructions on different long RNA-seq techs
2017-10-29 13:58:25 -04:00
Heng Li
285eb0da05
r540: removed a buggy debugging line
2017-10-29 00:02:41 -04:00
Heng Li
192217a10c
r539: use --splice-flank=yes by default
...
In human/mouse, the GTr..yAG pattern occurs in 91/92% of all GT-AG introns.
Modeling r..y clearly leads to higher accuracy. However, in SIRV, this
percentage is reduced to ~60%. The default "--splice --splice-flank=yes"
leads to lower accuracy on SIRV. If someone benchmarks minimap2 on SIRV, this
would be bad, but minimap2 is developed for practical applications, not for
benchmarks. I will live with that.
2017-10-28 22:29:55 -04:00
Heng Li
f22a94e868
r538: fixed a long existing bug in HPC k-mer ( #47 )
...
This bug may lead to a wrong minimizer when an HPC k-mer is longer than 256bp.
When a seed match involves this wrong HPC k-mer, the actual seed sequences do
not in fact match. This violates the assumption in align.c and subsequently
causes a segfault, which is what #47 has caught. This bug lurked in the
earliest piece of code and affected all released minimap2 versions so far. It
is extremely rare and does not affect the prebuilt GRCh37/38 indices.
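
To illustrate how an HPC (homopolymer-compressed) k-mer can span more than
256bp of the original sequence, here is a minimal sketch. It is not
minimap2's code: the function names are hypothetical, and counting the whole
of every run towards the span is an assumption made for simplicity.

```python
def hpc_compress(seq):
    """Collapse each homopolymer run to a single base.

    Returns the compressed string and the length of each original run.
    """
    bases, runs = [], []
    for ch in seq:
        if bases and bases[-1] == ch:
            runs[-1] += 1          # still inside the same run
        else:
            bases.append(ch)       # a new run starts here
            runs.append(1)
    return "".join(bases), runs


def hpc_kmers(seq, k):
    """List (kmer, span) pairs, where span is the number of original
    bases covered by the k compressed bases (illustrative definition)."""
    bases, runs = hpc_compress(seq)
    return [(bases[i:i + k], sum(runs[i:i + k]))
            for i in range(len(bases) - k + 1)]
```

For example, hpc_kmers("A" * 300 + "CGT", 3) gives a first k-mer "ACG" with a
span of 302, although the k-mer itself has only three bases.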
2017-10-28 19:21:10 -04:00
Heng Li
79b0caca95
r537: model the next base to GT/AG
...
[PMID:18688272] shows that the base following GT tends to be A or G (i.e. R) in
both human and yeast, and that the base preceding AG tends to be C or T (i.e.
Y). In the new model, we pay no cost to GTr..yAG, but we pay half of the cost
if there is no r or y. This improves the junction accuracy when mapping to
human and mouse and decreases the accuracy when mapping to SIRV. My guess is
that SIRV does not honor this trend. Need to investigate in the future.
Also in this commit, --cost-non-gt-ag is aliased to -C. The default is changed
to 9 instead of 5. I also added --splice-flank to enable the above model. This
may become the default once I confirm my hypothesis on SIRV.
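
The scoring rule described above can be sketched as follows. This is an
illustration only, not the implementation in minimap2: the function name is
hypothetical, the default of 9 is the --cost-non-gt-ag value mentioned in this
message, and charging half the cost when either r or y is missing (rather than
only when both are) is an assumption.

```python
PURINES = set("AG")      # R: expected right after the donor GT
PYRIMIDINES = set("CT")  # Y: expected right before the acceptor AG


def junction_cost(intron, full_cost=9):
    """Illustrative splice-signal cost for an intron sequence (5'->3').

    GTr..yAG          -> 0
    GT..AG, no r or y -> full_cost / 2
    anything else     -> full_cost
    """
    s = intron.upper()
    if len(s) < 6 or not (s.startswith("GT") and s.endswith("AG")):
        return full_cost
    has_r = s[2] in PURINES        # base following GT
    has_y = s[-3] in PYRIMIDINES   # base preceding AG
    if has_r and has_y:
        return 0
    return full_cost / 2
```

With these assumptions, junction_cost("GTACCCCAG") is 0,
junction_cost("GTCCCCCAG") is 4.5 and junction_cost("GCACCCCAG") is 9.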
2017-10-28 00:25:01 -04:00
Heng Li
afc2f2e84b
r536: removed an unnecessary assert()
2017-10-24 21:08:54 -04:00
Heng Li
e6f66f2f3b
disabled download counts
...
seems not working any more
2017-10-24 14:40:08 -04:00
Heng Li
70735098e2
fixed a typo in README
2017-10-24 14:39:33 -04:00
Heng Li
d4b5dfc297
r533: added --no-pairing
...
to prevent the use of any pairing information for paired-end reads.
2017-10-23 14:09:32 -04:00
Heng Li
5acd709524
updated the download link to v2.3
2017-10-23 13:43:31 -04:00
Heng Li
306e4541f8
Released minimap2-2.3 (r531)
2017-10-22 23:13:35 -04:00
Heng Li
1dd221ad82
a bit more on short read mapping
...
The tech note still needs improvement. Will do that after the release of v2.3.
2017-10-22 18:38:35 -04:00
Heng Li
c6b6392b70
minor wording changes
2017-10-21 23:46:36 -04:00
Heng Li
dc37aee881
minor wording changes
2017-10-21 23:38:05 -04:00
Heng Li
37e627aa98
note on long cigar in README
2017-10-21 22:28:06 -04:00
Heng Li
beeb806829
r526: fixed a bug when HPC is in use
...
It happened when the query HPC minimizer is longer than the reference HPC
minimizer close to the beginning of a contig. We may get a negative coordinate,
which causes an assertion failure.
2017-10-21 19:54:04 -04:00
Heng Li
be7f3c4ffe
r525: fixed a bug in chaining; handle ovlp ends
2017-10-20 21:34:52 -04:00
Heng Li
bd04372873
r524: reverted to bwa-mem end bonus
...
and reduced the cost of clipping when filtering by identity
2017-10-20 16:57:31 -04:00
Heng Li
15ed0712c2
r523: fixed a performance bug in ksw2_ll
...
Won't affect accuracy.
2017-10-20 13:00:10 -04:00
Heng Li
55dcbefe87
updated text (unfinished)
2017-10-20 12:44:54 -04:00
Heng Li
8abba332ad
replaced mapQ plot with sr roc
...
figure legend and text to be updated later
2017-10-19 23:43:17 -04:00
Heng Li
4683da2455
r520: added option -L to write long cigar to CG
2017-10-17 17:32:44 -04:00
Heng Li
ffd953029f
r519: fixed a severe bug that misses long alns
2017-10-17 15:52:36 -04:00
Heng Li
04cf4ebf5e
r518: increased the default -K to 500M
...
This helps multi-thread performance for ultra-long reads.
2017-10-17 13:21:29 -04:00
Heng Li
25ffd72690
r517: replaced --print-2nd with --secondary
2017-10-17 11:41:56 -04:00
Heng Li
aa2d9d4e1b
r516: throw a warning if -N0 is used
2017-10-16 14:55:35 -04:00
Heng Li
addb61bcb2
r515: more conservative hit exclusion
...
When a hit covers a long query subsequence that has not been covered by better
primary hits, this hit is more likely to become a new primary hit.
2017-10-16 13:58:01 -04:00
Heng Li
b24c9c90c7
updated mappy and example.c
2017-10-16 11:15:07 -04:00
Heng Li
adf6cd7f52
r513: merged pre- and post-cigar blen and mlen
...
This saves a bit of memory and is cleaner.
2017-10-16 10:55:18 -04:00
Heng Li
e6f525edaf
r512: option to filter poorly aligned reads
2017-10-16 10:38:22 -04:00
Heng Li
858213d513
r511: fixed wrong primary sam record
2017-10-12 23:02:18 -04:00
Heng Li
dea3b60918
r510: fixed an off-by-1 bug for unmapped mate
2017-10-12 17:31:13 -04:00
Heng Li
7c555f9b7e
r508: use two I/O threads for mapping
...
-x sr applies this option by default
2017-10-12 14:56:01 -04:00
Heng Li
2801ed9b4b
r507: -K not working as is intended ( #36 )
2017-10-12 14:16:05 -04:00
Heng Li
27025c70a7
added evaluation scripts to README
2017-10-12 12:55:10 -04:00
Heng Li
9bafbe4e70
added Getting help and cs
2017-10-12 12:40:52 -04:00
Heng Li
ce06188203
r506: fixed a memory leak
2017-10-12 10:12:22 -04:00
Heng Li
9862a75cd3
r505: a bit code simplification
2017-10-11 21:54:32 -04:00
Heng Li
3073f4a758
r504: better heuristics to reduce excessive ext
2017-10-11 21:42:11 -04:00
Heng Li
ba6ddda6b0
Merge pull request #35 from mcshane/sam_fix
...
fix sam output for some unmapped queries
2017-10-11 20:09:16 -04:00
Heng Li
9364bc64d7
r501: added end_bonus to extz2
2017-10-11 09:39:41 -04:00
Shane McCarthy
5498565157
fix sam output for some unmapped queries
2017-10-11 08:46:24 +01:00
Heng Li
65abdb8f3c
r500: temporarily disabled region trunc
...
because it is causing other problems.
2017-10-11 00:16:04 -04:00
Heng Li
7345621759
r499: end bonus working; DP region needs improve!
2017-10-11 00:14:25 -04:00
Heng Li
ca632f907b
r498: fixed a bug when merging like "4I5I"
2017-10-10 21:22:37 -04:00
Heng Li
6c78a980b6
r497: the previous change not working at the ends
2017-10-10 17:32:28 -04:00
Heng Li
c217eecdb7
r496: avoid DP extending into another chain
...
When deciding the region for DP, exclude regions in the adjacent chain
2017-10-10 17:25:12 -04:00
Heng Li
13b66aad4d
r495: fix improper CIGAR
...
1. Not left aligned
2. In one case, 50M24D50M becomes 24D100M. The leading D needs to be removed.
3. Avoid identical hits after DP
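
A minimal sketch of the kind of CIGAR clean-up described in point 2, assuming
the CIGAR is available as a string. The function names are hypothetical and
this is not how minimap2 does it internally. The sketch merges adjacent
operations of the same type (e.g. "4I5I") and drops a leading D, returning how
far the reference start would have to move to compensate.

```python
import re


def parse_cigar(cigar):
    """Split a CIGAR string into (length, op) pairs."""
    return [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]


def tidy_cigar(cigar):
    """Merge adjacent identical ops and strip leading deletions.

    Returns (new_cigar, ref_shift), where ref_shift is the number of
    reference bases removed from the start of the alignment.
    """
    merged = []
    for n, op in parse_cigar(cigar):
        if merged and merged[-1][1] == op:
            merged[-1] = (merged[-1][0] + n, op)   # e.g. 4I5I -> 9I
        else:
            merged.append((n, op))
    ref_shift = 0
    while merged and merged[0][1] == "D":          # leading D carries no read bases
        ref_shift += merged.pop(0)[0]
    return "".join(f"{n}{op}" for n, op in merged), ref_shift
```

Under these assumptions, tidy_cigar("24D100M") returns ("100M", 24) and
tidy_cigar("4I5I10M") returns ("9I10M", 0).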
2017-10-10 11:59:44 -04:00
Heng Li
46fa520db9
r494: simpler and better SR gap filling
...
Still one thing to do: left alignment
2017-10-09 22:02:30 -04:00
Heng Li
1e53610fb4
r493: reduced calling extd2 for ungapped aln
...
Still need to improve in case of 3I5M3D
2017-10-09 21:13:34 -04:00