145 Commits (767556b6f0cf7f8cc8096e29a88b0ab520651ff2)

Author SHA1 Message Date
Heng Li c8f0a35c40 r1117: added --no-hash-name for deterministic 2021-11-24 16:49:48 -05:00
Heng Li 39bdd45875 r1108: fixed missing inversions for #816 and #806 2021-10-04 16:34:30 -04:00
Heng Li aefa2c0d86 added --chain-skip-scale 2021-10-01 16:58:03 -04:00
Heng Li 05a8a45d44 r1105: avoid long running time occasionally (#771)
Caused by highly repetitive minimizers on a query sequence. The solution is to
filter out these query minimizers.
2021-08-15 19:43:01 -04:00
Heng Li 7e33fde82b dev-r1087: added --cap-kalloc 2021-07-19 21:20:04 -04:00
Heng Li 161ae7ff73 dev-r1079: per-read error rate
more tuning needed
2021-07-18 20:38:53 -04:00
Heng Li 8a6edab847 dev-r1078: decoupling ranking penalty 2021-07-18 16:22:48 -04:00
Heng Li 2546999639 dev-r1076: log gap penalty 2021-07-17 18:23:59 -04:00
Heng Li b046052d82 Merge branch 'master' into utec 2021-07-16 13:32:47 -04:00
John Marshall 260a68d232 Use #defines for CIGAR operators in C code
Give the CIGAR constants names to clarify the code. So that ksw2.h
remains self-contained, define KSW_* versions of the CIGAR operators
it needs for use within ksw2.h. Other code should in general use the
full set of MM_CIGAR_* constants in minimap.h.
2021-07-02 13:03:03 -04:00
John Marshall 177eef259d Use the full MIDNSHP=X string whenever printing CIGAR strings
Define MM_CIGAR_STR to the full string of CIGAR operators (including
the 'B' operator as well) and use it throughout the C code.

It would be possible to use it from the Cython code too, but it's easier
to keep that as a Cython string literal to avoid adding extra runtime
code to handle locale conversion.
2021-07-02 13:03:03 -04:00
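
To illustrate what a single operator string buys, here is a minimal Python sketch of turning packed CIGAR values into text. It assumes the BAM-style packing (length in the upper 28 bits, operator code in the low 4 bits) and assumes the full string is "MIDNSHP=XB", as the message above implies. The names CIGAR_STR and cigar_to_string are hypothetical and are not part of minimap2 or mappy.

```python
# Hypothetical stand-in for the operator string described in the commit message.
CIGAR_STR = "MIDNSHP=XB"

def cigar_to_string(packed):
    """Render BAM-style packed CIGAR integers (len << 4 | op) as text."""
    parts = []
    for value in packed:
        length = value >> 4   # upper bits hold the run length
        op = value & 0xF      # low 4 bits hold the operator code
        parts.append(f"{length}{CIGAR_STR[op]}")
    return "".join(parts)

# Operator codes: 0 is 'M', 1 is 'I', 7 is '='.
example = [(10 << 4) | 0, (2 << 4) | 1, (5 << 4) | 7]
print(cigar_to_string(example))
```

Indexing one shared string keeps every printing site consistent when an operator such as '=' or 'X' is added.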
Heng Li 34a41197d7 r1051: added two internal parameters
rmq_rescue_size and rmq_rescue_ratio
2021-05-24 16:38:45 -04:00
Heng Li 379728726a r1049: removed the long-join heuristics 2021-05-24 16:21:40 -04:00
Heng Li 4f91558160 r1048: rescue long gaps 2021-05-24 16:09:09 -04:00
Heng Li bbb4f97e52 support RMQ 2021-05-03 09:27:04 -04:00
Heng Li 0f5608c4a4 r1028: backport minigraph -U 2021-05-01 15:41:39 -04:00
Heng Li feb92d32ea r1025: seed rescuing 2021-04-30 17:33:16 -04:00
Heng Li f995f55610 added --mask-len for #659 2020-08-21 11:12:50 -04:00
Heng Li da7109fd29 r985: optionally report cs/cg on the query strand
PAF only; not well tested
2020-04-21 12:37:35 -04:00
Heng Li 9dceae59a0 r972: renamed --alt-diff to --alt-drop 2020-01-21 10:33:39 -05:00
Heng Li eb3ed6993d support ALT mapping 2020-01-21 09:17:50 -05:00
Heng Li d2e14705e7 r968: allow large mini_batch; resolves #491 2020-01-18 12:24:44 -05:00
Heng Li 040f74102c r965: added --chain-gap-scale for #540 2020-01-18 10:29:33 -05:00
Heng Li c2aec88b84 r938: added --sam-hit-only; resolved #377 2019-04-30 22:40:36 -04:00
Heng Li 97f67a2a0a r937: enlarge mm_mapopt_t::flag to 64 bits 2019-04-30 22:30:32 -04:00
Heng Li 49c6d83a8e r934: --junc-bed to read BED12 2019-04-28 20:12:28 -04:00
Heng Li be171aa2dc implemented in exts; testing is next 2019-04-28 16:47:12 -04:00
Heng Li 6420acca6d BED I/O 2019-04-28 16:47:12 -04:00
Heng Li d431dc0181 r917: added --max-chain-iter to avoid worst case
Resolves #324
2019-02-27 14:41:01 -05:00
Heng Li ea2b1c5b2a r894: added --max-qlen to filter out long queries 2018-12-12 12:27:32 -05:00
Heng Li 13981404e2 r876: skip DP if taking too much RAM (#259) 2018-11-05 11:43:10 -05:00
Heng Li 1077b7ddc8 r846: added --hard-mask-level for #244 2018-09-27 14:46:26 -04:00
Heng Li 5ab6538757 r822: added option --no-end-flt 2018-08-05 19:42:12 -04:00
Heng Li ff9917a1c4 r819: mappy to support cs/MD 2018-07-24 23:29:55 -04:00
Heng Li 3545e35a42 pairing in the split-idx mode 2018-07-14 23:43:34 -04:00
Heng Li 1a55227d5a write hits to tmp files (unfinished) 2018-07-14 12:15:10 -04:00
Heng Li a609a07f8c optionally output unmapped query in PAF 2018-07-07 10:26:08 -05:00
Heng Li 0517972d02 Release minimap2-2.11 (r797) 2018-06-21 00:04:08 -04:00
Ilya Kolpakov 57f37551f8 expose mm_idx_is_idx, mm_idx_load and mm_idx_dump 2018-06-19 14:46:05 -04:00
Heng Li 154d2caf5b r784: support the =/X CIGAR operators (#156) 2018-05-30 16:11:22 -04:00
Heng Li 734ac379bb r770: matching N bases not working properly (#155) 2018-04-30 19:55:23 -04:00
Heng Li ee4cd089f7 r763: fine control long join flank len (#128) 2018-03-29 14:16:58 -04:00
Heng Li 08bd2123b6 r752: option to copy comments to output (#136) 2018-03-23 10:04:33 -04:00
Heng Li 8766d286df r751: optionally output MD (#118) 2018-03-22 14:15:33 -04:00
Heng Li bdc615c1d4 r741: added --min-occ-floor to improve #107 2018-03-12 14:32:27 -04:00
Heng Li 24a4808826 r718: retrieve sequence from the index 2018-02-23 10:18:26 -05:00
Heng Li 1372977a37 r708: implemented double Z-drop thresholds (#112)
When aligning long reads, we would prefer to align through low-quality
regions. This requires a large Z-drop threshold. However, to find small
inversions, we need to use a small Z-drop. This commit addresses this
conflict with two Z-drop thresholds. When Z-drop exceeds the smaller
threshold, we perform a local alignment to check if there is a potential
inversion. If there is one, we break the alignment; otherwise we break
the alignment only if Z-drop exceeds the larger threshold.

This commit also fixes a bug that reported wrong coordinates when the
inversion is on the forward strand (#112).
2018-02-15 10:50:49 -05:00
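
The two-threshold rule described in the message above can be sketched as a small decision function. This is only an illustration of the stated logic in Python, not minimap2's C implementation. The names should_break, zdrop_small, zdrop_large and has_inversion are hypothetical, and the inversion check is passed in as a callback standing in for the local alignment step.

```python
from typing import Callable

def should_break(drop: int, zdrop_small: int, zdrop_large: int,
                 has_inversion: Callable[[], bool]) -> bool:
    """Decide whether to break an alignment given the observed score drop.

    has_inversion is a hypothetical callback standing in for the local
    alignment that checks for a potential inversion.
    """
    if drop <= zdrop_small:
        # Below the smaller threshold: keep extending.
        return False
    if has_inversion():
        # Potential inversion found: break here.
        return True
    # No inversion: break only past the larger threshold.
    return drop > zdrop_large

# Threshold values here are arbitrary illustration values.
print(should_break(300, 200, 400, lambda: False))  # between thresholds, no inversion
print(should_break(300, 200, 400, lambda: True))   # between thresholds, inversion
```

Passing the check as a callback means the comparatively costly local alignment runs only after the smaller threshold has been exceeded.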
Heng Li 7ef5490884 r703: added --max-clip-ratio
still testing the option
2018-02-12 13:29:18 -05:00
Heng Li 29b4a1786c r685: tune end seed filter again 2018-02-05 11:48:22 -05:00
Heng Li dbf284b2d9 r684: separate end score from min_chain_score 2018-02-05 11:40:38 -05:00