59 Commits (e9607fcd9b34a1bfbf705bf4e15c4d10d82d05de)

Author SHA1 Message Date
Heng Li eb819c29e8 Release minimap2-2.6 (r623) 2017-12-12 11:09:59 -05:00
Heng Li 98a6e52c06 r618: heuristics to avoid tiny terminal exons 2017-12-11 00:57:55 -05:00
Heng Li 0e42628ef6 r611: document --idx-no-seq; better inv aln 2017-12-08 13:16:18 -05:00
Heng Li 3b518271ee Release minimap2-2.5 (r572) 2017-11-11 11:29:28 -05:00
Heng Li 99a2709913 r567: minor change to #56 2017-11-09 19:17:45 -05:00
Heng Li 21a46ba652 Release minimap2-2.4 (r555) 2017-11-06 12:54:02 -05:00
Heng Li 192217a10c r539: use --splice-flank=yes by default
In human/mouse, the GTr..yAG pattern occurs in 91/92% of all GT-AG introns.
Modeling r..y clearly leads to higher accuracy. However, in SIRV, this
percentage is reduced to ~60%. The default "--splice --splice-flank=yes"
leads to lower accuracy. If someone benchmarks minimap2 on SIRV, this would be
bad, but minimap2 is developed for practical applications, not for benchmarks.
I will live with that.
2017-10-28 22:29:55 -04:00
Heng Li 79b0caca95 r537: model the next base to GT/AG
[PMID:18688272] shows that the base following GT tends to be A or G (i.e. R) in
both human and yeast, and that the base preceding AG tends to be C or T (i.e.
Y). In the new model, we pay no cost to GTr..yAG, but we pay half of the cost
if there is no r or y. This improves the junction accuracy when mapping to
human and mouse and decreases the accuracy when mapping to SIRV. My guess is
that SIRV does not honor this trend. Need to investigate in future.

Also in this commit, --cost-non-gt-ag is aliased to -C. The default is changed
to 9 instead of 5. I also added --splice-flank to enable the above model. This
may become the default once I confirm my hypothesis on SIRV.
2017-10-28 00:25:01 -04:00
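
The splice-site model described in r537 can be sketched as below. This is only an illustration of the rule as stated in the commit message, not minimap2's actual code: the function names are hypothetical, the penalty is assumed to be charged per splice site (donor and acceptor separately), and the half cost is left unrounded because the message does not say how it is rounded.

```python
# Illustrative sketch of the GTr..yAG rule from r537 (not minimap2 source).
# Assumption: the penalty is charged once per splice site.
NON_GT_AG_COST = 9  # default of -C / --cost-non-gt-ag stated in r537


def donor_cost(site, full_cost=NON_GT_AG_COST):
    """Penalty for the first three intron bases, e.g. 'GTA'."""
    site = site.upper()
    if len(site) != 3:
        raise ValueError("expected exactly three bases")
    if site[:2] != "GT":
        return full_cost            # non-GT donor: full cost
    # GT followed by R (A or G) is free; otherwise half of the cost
    return 0 if site[2] in "AG" else full_cost / 2


def acceptor_cost(site, full_cost=NON_GT_AG_COST):
    """Penalty for the last three intron bases, e.g. 'CAG'."""
    site = site.upper()
    if len(site) != 3:
        raise ValueError("expected exactly three bases")
    if site[1:] != "AG":
        return full_cost            # non-AG acceptor: full cost
    # AG preceded by Y (C or T) is free; otherwise half of the cost
    return 0 if site[0] in "CT" else full_cost / 2
```

Under these assumptions, an intron starting with GTA and ending with CAG is free, while one starting with GTC pays half of the cost at the donor site.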
Heng Li 306e4541f8 Released minimap2-2.3 (r531) 2017-10-22 23:13:35 -04:00
Heng Li 04cf4ebf5e r518: increased the default -K to 500M
This helps multi-thread performance for ultra-long reads.
2017-10-17 13:21:29 -04:00
Heng Li 7c555f9b7e r508: use two I/O threads for mapping
-x sr applies this option by default
2017-10-12 14:56:01 -04:00
Heng Li f150257a0d r487: demote "map10k"; improved README 2017-10-07 19:19:40 -04:00
Heng Li ae2adf04d4 r476: multi-file fragment mode working 2017-10-05 15:39:26 -04:00
Heng Li b839758335 r475: added --cs=none; updated manpage 2017-10-05 15:27:37 -04:00
Heng Li 7d50e646dd r466: detect multi-part index more smartly
though it might not work in an extremely rare case: the end of a sequence ends
at X*16384 and it is the last sequence in a batch. This can be resolved by
never letting the kstream_t buffer empty.
2017-10-04 17:32:58 -04:00
Heng Li 7a9b4db874 replaced --approx-ext with --sr
--sr disables Z-drop and may come with other heuristics
2017-09-20 10:51:18 -04:00
Heng Li ea5a0cd17d Release minimap2-2.2 (r409) 2017-09-17 20:08:47 -04:00
Heng Li cf93e5c0a1 more functional minimap2.py; added categories 2017-09-17 17:06:39 -04:00
Heng Li 0f7455cefa r365: documented the "sr" preset 2017-09-14 12:57:21 -04:00
Heng Li ef3f7ea2f2 Release minimap2-2.1.1 (r341) 2017-09-06 13:46:51 -04:00
Heng Li bf8246f872 Release minimap2-2.1-r311 2017-08-25 13:35:55 +08:00
Heng Li 0fe1a224ab r309: improved SAM header output 2017-08-25 10:35:58 +08:00
Heng Li bbb37d95f2 support inserting RG lines 2017-08-17 23:34:09 +08:00
Heng Li b5f5929bf9 r296: expose splicing related options to CLI 2017-08-13 21:37:51 -04:00
Heng Li 53b3265d84 r290: in techrep, explain spliced alignment 2017-08-12 15:40:49 -04:00
Heng Li 5a74088b74 r288: changed max intron length to 200k 2017-08-12 12:39:21 -04:00
Heng Li d240318741 r287: refined CLI options and manpage 2017-08-12 12:26:04 -04:00
Heng Li 6840370f3c Release minimap2-2.0 (r275) 2017-08-08 21:16:25 -04:00
Heng Li d8d4d29b68 Release minimap2-2.0rc1-r232 2017-07-30 14:32:40 -04:00
Heng Li 1f78e1ee53 r230: code formatting changes only 2017-07-30 12:31:40 -04:00
Heng Li 19d6ec885e r224: inversion alignment around Z-drop break 2017-07-29 13:09:10 -04:00
Heng Li 84922cfe41 clarify that wrong seeding mainly occurs in LCRs 2017-07-28 14:33:15 -04:00
Heng Li 667b32a516 added algorithm overview 2017-07-27 18:50:39 -04:00
Heng Li 4aff301ef4 r190: default -k to 15; added -x map-ont 2017-07-19 10:11:14 -04:00
Heng Li 9a935dcef5 Fixed grammar errors 2017-07-18 11:37:58 -04:00
Heng Li 495a78e40a Get documentation ready for release 2017-07-18 11:04:09 -04:00
Heng Li 71e2a97a4c r180: changed -x asm5 settings 2017-07-18 00:00:36 -04:00
Heng Li 941059292e r179: changed the preset for assembly alignment 2017-07-17 22:41:46 -04:00
Heng Li b4280d186f r176: removed seedcov_ratio; changed default opt
min_seedcov_ratio is not used
2017-07-12 12:47:46 -04:00
Heng Li eeeb2ffb68 r174: make max-chain-skip work
The max-chain-skip heuristic did not work due to a bug. Without this
heuristic, chaining is too slow for long-read overlap.
2017-07-12 10:08:06 -04:00
Heng Li cfa083a98b r172: separated PacBio and ONT read overlapping
HPC k-mer works better for PacBio, but worse for ONT. Interesting...
2017-07-11 15:12:35 -04:00
Heng Li 1fee5f8edc r160: -O and -E accept two numbers 2017-07-08 11:34:52 -04:00
Heng Li cc554aee43 r159: use two-piece gap penalty 2017-07-08 10:26:00 -04:00
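
For readers unfamiliar with the two entries above: the minimap2 manual describes the two-piece gap penalty as charging a gap of length k the smaller of two affine costs, min(O1 + k*E1, O2 + k*E2), which is why -O and -E each take two numbers. A minimal sketch follows; the function name is hypothetical and the parameter values are illustrative assumptions, not necessarily the defaults at r159.

```python
# Illustrative sketch of a two-piece affine gap cost (not minimap2 source).
# The values o1=4, e1=2, o2=24, e2=1 are assumptions chosen for illustration.
def gap_cost(k, o1=4, e1=2, o2=24, e2=1):
    """Cost of a gap of length k: the cheaper of two affine pieces."""
    if k <= 0:
        raise ValueError("gap length must be positive")
    return min(o1 + k * e1, o2 + k * e2)
```

With these values the first piece is cheaper for short gaps and the second for long ones, so long insertions or deletions are penalized less steeply than under a single affine cost.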
Heng Li 9823317e8f r158: optionally ignore base quality 2017-07-05 18:23:50 -04:00
Heng Li a94bc31311 r151: documentations 2017-07-03 12:11:07 -04:00
Heng Li 51cfb60520 r145: changed default -p from 2 to 0.8
For long reads, secondary alignments can be very informative.
2017-07-02 22:51:45 -04:00
Heng Li da90b614db r141: replaced -b with -a (for SAM output)
-b sounds like BAM. I like -a better.
2017-07-01 16:54:59 -04:00
Heng Li 67cad9dff3 added limitations 2017-07-01 12:11:56 -04:00
Heng Li a92bd75b7b make the table fit 80 columns 2017-07-01 11:54:07 -04:00
Heng Li 02145b2166 explain option -d 2017-07-01 11:48:37 -04:00