Heng Li
fc965805f7
r215: bring back a log gap component
...
Otherwise chaining may more often break a long gap into several gaps.
2017-07-28 00:17:19 -04:00
Heng Li
667b32a516
added algorithm overview
2017-07-27 18:50:39 -04:00
Heng Li
2c79580649
r213: more careful solution to wrong seeds
...
a little better, but not good enough!
2017-07-27 13:19:11 -04:00
Heng Li
b927838495
r212: better heuristic to fix wrong seeding
...
but not good enough. Will explore more.
2017-07-27 11:24:51 -04:00
Heng Li
371e20cc7c
r211: a better heuristic to reduce false seeds
2017-07-26 23:56:38 -04:00
Heng Li
8b08a2ec41
added first-only and all-primary modes to eval
2017-07-26 19:32:26 -04:00
Heng Li
9f728dd96a
tunable overlap ratio
2017-07-26 18:30:29 -04:00
Heng Li
323293fbda
changed the way mapq is counted
2017-07-26 14:20:31 -04:00
Heng Li
bd33bed455
make output better
2017-07-26 12:03:42 -04:00
Heng Li
a01d758af6
r206: mapq penalize short chains further
...
The old code penalized at the log() scale. Now added a linear-scaled factor. If
the chain consists of few minimizers, its quality is really not good.
2017-07-26 11:50:04 -04:00
Heng Li
e9dc1ce2b6
r205: when computing mapq, consider min_chain_sc
...
Not doing this was a mistake.
2017-07-26 11:34:14 -04:00
Heng Li
a8ad53ee81
evaluating read mapping for pbsim2fa reads
...
More functionality to be added later
2017-07-26 11:17:34 -04:00
Heng Li
00c6db5073
r203: check more subopt aln if score small
2017-07-25 20:02:44 -04:00
Heng Li
f2ef48878a
r202: trim bad chain ends before extension
...
This fixes a few more FP long INDELs towards the end of alignments.
2017-07-25 19:53:19 -04:00
Heng Li
21ca564112
r201: fixed a minor chaining issue
...
Chaining looked at the end of a chain, but the end may not be the best. We now
go back to find the max.
2017-07-25 18:26:51 -04:00
Heng Li
215e92ed7b
r200: reduce long gaps in chaining
...
Every seed can initiate a chain.
2017-07-25 17:32:54 -04:00
Heng Li
b530ade333
r199: changed to linear gap cost for chaining
...
The old cost doesn't penalize long gaps enough. Will also drop seeds close to
the edge in the next commit.
2017-07-25 15:35:10 -04:00
Heng Li
4bd7ebc39c
convert pbsim .maf to fasta
2017-07-25 13:54:58 -04:00
Heng Li
f81f37fef1
r197: allocate index seq names from kalloc
...
to reduce malloc() overhead.
2017-07-24 19:36:05 -04:00
Heng Li
5c4d040b13
r191: warning if CLI index opt diff from prebuilt
...
Also added index testing API (moved from main.c to index.c)
2017-07-19 10:25:11 -04:00
Heng Li
4aff301ef4
r190: default -k to 15; added -x map-ont
2017-07-19 10:11:14 -04:00
Heng Li
470021fd27
r189: sync with ksw2 (no effective changes)
2017-07-19 09:28:25 -04:00
Heng Li
71c988f6ab
r188: renamed bseq* to mm_bseq*
...
to avoid naming collisions between minimap2 and bwa/fermi-lite/etc
2017-07-19 09:26:46 -04:00
Heng Li
b9b0b6f49c
r187: fixed non-terminated sam output (#3)
...
Only happens for unmapped reads with quality, and only in the SAM output
2017-07-18 15:20:29 -04:00
Heng Li
947cf162be
update dependency; delete minimap2-lite on clean
2017-07-18 14:48:46 -04:00
Heng Li
293a7049f3
migrated minimap/example.c here.
2017-07-18 14:43:43 -04:00
Heng Li
a1addb2949
use -msse4 by default, not -march
2017-07-18 13:56:51 -04:00
Heng Li
b4edb0cf55
fixed URL
2017-07-18 13:26:51 -04:00
Heng Li
9a935dcef5
Fixed grammar errors
2017-07-18 11:37:58 -04:00
Heng Li
495a78e40a
Get documentation ready for release
2017-07-18 11:04:09 -04:00
Heng Li
71e2a97a4c
r180: changed -x asm5 settings
2017-07-18 00:00:36 -04:00
Heng Li
941059292e
r179: changed the preset for assembly alignment
2017-07-17 22:41:46 -04:00
Heng Li
38aa66fa30
r178: fixed integer overflow in mapq calculation
2017-07-16 21:45:39 -04:00
Heng Li
f42790398d
fixed a band boundary bug in ksw2
2017-07-16 17:52:57 -04:00
Heng Li
b4280d186f
r176: removed seedcov_ratio; changed default opt
...
min_seedcov_ratio is not used
2017-07-12 12:47:46 -04:00
Heng Li
52caf79395
r175: halved max-chain-skip in the ava mode
2017-07-12 10:42:19 -04:00
Heng Li
eeeb2ffb68
r174: make max-chain-skip work
...
The max-chain-skip heuristic did not work due to a bug. Without this
heuristic, chaining is too slow for long-read overlap.
2017-07-12 10:08:06 -04:00
Heng Li
33451aba45
r173: changed the debugging output format
2017-07-11 15:23:28 -04:00
Heng Li
cfa083a98b
r172: separated PacBio and ONT read overlapping
...
HPC k-mer works better for PacBio, but worse for ONT. Interesting...
2017-07-11 15:12:35 -04:00
Heng Li
7598809577
r171: reduced log gap cost at chaining
...
The cost is so large that it discards too many valid seeds without HPC k-mers.
This change may introduce false long gaps to reference mapping. We have another
mechanism, mm_filter_bad_seeds(), to protect against this. In addition, minimap2
is not that bad with respect to long gaps. Some other aligners are worse.
Still needs tuning in the future.
2017-07-11 14:57:49 -04:00
Heng Li
826c8ba892
r170: added a debugging flag
...
something wrong with chaining
2017-07-11 14:47:35 -04:00
Heng Li
801bc84b01
r169: output more accurate col. 10&11 to PAF
...
In r168, col.10 is smaller than what it should be. This confuses miniasm.
2017-07-11 14:09:51 -04:00
Heng Li
782449975d
r168: fixed a bug in long join: a[] not sorted
...
Also added length requirement for long join and changed -g in the ava mode
2017-07-09 12:14:20 -04:00
Heng Li
1ac48556ae
r167: long join threshold depends on gap
...
also caught a bug for reverse strand join
2017-07-09 10:38:51 -04:00
Heng Li
34ed85d46a
optionally print the positions of long indels
2017-07-08 20:16:25 -04:00
Heng Li
2f11a63b22
script to collect alignment stats
2017-07-08 20:01:36 -04:00
Heng Li
4ee3202539
r164: unmapped read not properly flagged
2017-07-08 18:16:18 -04:00
Heng Li
42846ce65d
r163: reduced long join score requirement
...
because the chaining score is generally smaller with the last few commits.
2017-07-08 15:51:52 -04:00
Heng Li
3f6a0b0b5c
r162: improved chaining accuracy
2017-07-08 14:29:36 -04:00
Heng Li
38b2830e18
r161: filter bad seeds; changed default -g/-r
2017-07-08 13:31:27 -04:00