149 Commits (41d7ccb191a3c1734fb3ad715a97224ecbd37d84)

Author SHA1 Message Date
Heng Li 4f91558160 r1048: rescue long gaps 2021-05-24 16:09:09 -04:00
Heng Li 827ca4b461 r1012: fixed an off-by-one bug; resolves #489 2021-04-07 23:31:31 -04:00
Armin Töpfer c9874e2dc5 Initialize r->p if ez->zdropped 2020-06-12 09:22:18 -04:00
Heng Li a7a01fe5bd r973: fixed compiling errors caused 2020-01-21 10:43:31 -05:00
Heng Li eb3ed6993d support ALT mapping 2020-01-21 09:17:50 -05:00
Heng Li 69af86657e r935: fixed a cigar like 5I6D7I; resolved #392 2019-04-30 21:35:24 -04:00
Heng Li be171aa2dc implemented in exts; testing is the next 2019-04-28 16:47:12 -04:00
Heng Li cf2bae6e9b r904: fixed a corner-case segfault. Resolves #307. 2019-01-10 09:57:05 -05:00
Heng Li 83a8ee7038 r888: fixed incorrect CIGAR when --eqx in use
This was caused by mm_fix_cigar() which may change query/target offset in very
rare cases. Generating EQX has to beware of this change.

Resolves #266
2018-11-18 14:22:29 -05:00
Heng Li 88c421e8de r881: a recent change reduces sr accuracy 2018-11-05 22:03:59 -05:00
Heng Li 13981404e2 r876: skip DP if taking too much RAM (#259) 2018-11-05 11:43:10 -05:00
Heng Li 377c7099a8 r858: fixed a bug; resolves #254 2018-10-22 22:47:11 -04:00
Heng Li d04ac068fd r852: a minor when large --end-bonus is in use
We may use a large --end-bonus to mimic end-to-end alignment. In the short-read
mode, the candidate alignment region may be out of the band, which leads to
truncated alignment.
2018-10-15 21:28:27 -04:00
Heng Li 5ab6538757 r822: added option --no-end-flt 2018-08-05 19:42:12 -04:00
Heng Li 4b707aac92 working with toy examples 2018-07-15 10:55:00 -04:00
Heng Li 951c0d1d35 apparently mm_append_cigar() wastes some memory 2018-07-14 23:47:44 -04:00
Heng Li 66674afd09 r794: fixed a bug in seed filtering 2018-06-20 10:26:29 -04:00
Heng Li 7e6e8ca73f r792: fixed -Wextra warnings and resolved #184 2018-06-19 15:26:58 -04:00
Aaron Wenger 3d3bcc29a8 Fix CIGAR reallocation with --eqx
Fix the logic that calculates the number of CIGAR entries when
match "M" entries are expanded into "=" and "X".  The number
of entries depends not on the number of mismatches but rather
on the number of transitions between "=" and "X".
2018-06-19 14:37:41 -04:00
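
The point made in the commit above can be illustrated with a minimal sketch. This is not minimap2's C code; `expand_match` is a hypothetical helper that splits a single "M" run into "=" and "X" runs by exact, case-sensitive base comparison (ambiguous bases such as N are not handled). It shows that the number of resulting entries is the number of "="/"X" transitions plus one, regardless of how many mismatches there are.

```python
def expand_match(query: str, target: str) -> list[tuple[int, str]]:
    """Split one 'M' run into run-length encoded '=' / 'X' operations.

    Hypothetical illustration only; bases are compared verbatim.
    """
    if len(query) != len(target):
        raise ValueError("an M run covers equally long query and target segments")
    ops: list[tuple[int, str]] = []
    for q, t in zip(query, target):
        op = "=" if q == t else "X"
        if ops and ops[-1][1] == op:
            # Same operator as the previous base: extend the current run.
            ops[-1] = (ops[-1][0] + 1, op)
        else:
            # Operator changed: this transition opens a new CIGAR entry.
            ops.append((1, op))
    return ops


# One mismatch and two adjacent mismatches both yield three entries,
# because each case has exactly two transitions.
one_mismatch = expand_match("ACGTAC", "ACTTAC")
two_mismatches = expand_match("AAGGAA", "AATTAA")
```

Sizing the output buffer from the mismatch count alone would over-allocate for adjacent mismatches and, depending on how the count is used, could mis-size the array in other cases; counting transitions gives the exact figure.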
Heng Li 154d2caf5b r784: support the =/X CIGAR operators (#156) 2018-05-30 16:11:22 -04:00
Heng Li a3afeec0b2 r783: reverted to r781 (#155) 2018-05-30 15:25:34 -04:00
Heng Li 9f4309c376 r777: avoid skipping too many seeds 2018-05-11 10:25:18 -04:00
Heng Li 881b4ca3a2 r774: Merge branch 'hot-fix' into fix-long-gap 2018-05-11 10:02:17 -04:00
Heng Li e61812ee55 reduced gap len to trigger bad seed filtering 2018-05-01 16:17:21 -04:00
Heng Li 734ac379bb r770: matching N bases not working properly (#155) 2018-04-30 19:55:23 -04:00
Heng Li 759f8e4ac9 r769: filter out seeds breaking long gaps 2018-04-24 15:37:37 -04:00
Heng Li 83c57a9d98 r719: fixed bad memory access 2018-02-23 17:27:41 -05:00
Heng Li a0d62519c1 r710: fixed incorrect inversion coordinate (#112) 2018-02-15 14:23:42 -05:00
Heng Li 1372977a37 r708: implemented double Z-drop thresholds (#112)
When aligning long reads, we would prefer to align through low-quality
regions. This requires a large Z-drop threshold. However, to find small
inversions, we need to use a small Z-drop. This commit addresses this
conflict with two Z-drop thresholds. When Z-drop exceeds the smaller
threshold, we perform a local alignment to check if there is a potential
inversion. If there is one, we break the alignment; otherwise we break
the alignment only if Z-drop exceeds the larger threshold.

This commit also fixes a bug that reported wrong coordinates when the
inversion is on the forward strand (#112).
2018-02-15 10:50:49 -05:00
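
The decision rule described in the commit above can be sketched as follows. This is an assumption-laden illustration rather than the actual implementation: the function name `zdrop_decision`, the parameter names, the returned labels, and the `looks_like_inversion` callback (standing in for the local-alignment check) are all hypothetical.

```python
from typing import Callable


def zdrop_decision(
    drop: int,
    zdrop_small: int,
    zdrop_large: int,
    looks_like_inversion: Callable[[], bool],
) -> str:
    """Decide what to do with an alignment given its current Z-drop.

    Hypothetical sketch of the two-threshold rule:
      - at or below the small threshold: keep extending;
      - above the small threshold: run the inversion check and break if it fires;
      - otherwise break only when the large threshold is exceeded too.
    """
    if drop <= zdrop_small:
        return "continue"
    if looks_like_inversion():
        return "break_for_inversion"
    return "break" if drop > zdrop_large else "continue"
```

With this ordering, the more expensive inversion check is run only once the smaller threshold has been exceeded, which is how the commit message describes it.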
Heng Li c0e0d5d84b r707: bugfix for inversions on rev strand (#112) 2018-02-14 14:09:03 -05:00
Heng Li 7ef5490884 r703: added --max-clip-ratio
still testing the option
2018-02-12 13:29:18 -05:00
Heng Li a8d476c6ad r686: end seed trimming don't go over long join 2018-02-06 11:31:32 -05:00
Heng Li 29b4a1786c r685: tune end seed filter again 2018-02-05 11:48:22 -05:00
Heng Li dbf284b2d9 r684: separate end score from min_chain_score 2018-02-05 11:40:38 -05:00
Heng Li 35d3e064bf r677: reduce the chance of missing hits
that are close to end of alignments. It is still possible to create examples
that fail the heuristic.
2018-02-02 10:35:33 -05:00
Heng Li 12a5a5fa3c r669: improved self chain extension (#10)
This has not fully resolved #10, only alleviated the issue.
2018-01-30 20:05:02 -05:00
Heng Li 33f8157961 r655: options to map to one strand of the ref #91 2018-01-16 10:34:30 -05:00
Heng Li f5cfd439ee r651: incorrectly treat introns as deletions
This happened when the last operation during backtracking is an intron.
2018-01-07 19:42:50 -05:00
Heng Li 98a6e52c06 r618: heuristics to avoid tiny terminal exons 2017-12-11 00:57:55 -05:00
Heng Li 824712a4ee r617: removed some unused code 2017-12-10 17:54:50 -05:00
Heng Li 0e42628ef6 r611: document --idx-no-seq; better inv aln 2017-12-08 13:16:18 -05:00
Heng Li 2f463b1db0 r573: prepare to generalize index 2017-11-11 19:54:06 -05:00
Heng Li d7a31e40e6 r569: last commit is buggy 2017-11-09 23:20:41 -05:00
Heng Li dd18cd75de r568: revert - don't take max(dp_max, dp_score) 2017-11-09 23:12:48 -05:00
Heng Li a7b38f6900 r562: fixed a severe bug: wrong query start 2017-11-08 22:31:05 -05:00
Heng Li 98ba8928c6 r558: dp_max no less than dp_score 2017-11-08 10:06:10 -05:00
Heng Li cd24dc8834 r545: removed option -i, not working well 2017-10-31 22:23:27 -04:00
Heng Li 79b0caca95 r537: model the next base to GT/AG
[PMID:18688272] shows that the base following GT tends to be A or G (i.e. R) in
both human and yeast, and that the base preceding AG tends to be C or T (i.e.
Y). In the new model, we pay no cost to GTr..yAG, but we pay half of the cost
if there is no r or y. This improves the junction accuracy when mapping to
human and mouse and decreases the accuracy when mapping to SIRV. My guess is
that SIRV does not honor this trend. Need to investigate in future.

Also in this commit, --cost-non-gt-ag is aliased to -C. The default is changed
to 9 instead of 5. I also added --splice-flank to enable the above model. This
may become the default once I confirm my hypothesis on SIRV.
2017-10-28 00:25:01 -04:00
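
One possible reading of the splice model in the commit above, as a minimal sketch: each intron end is scored separately, a canonical signal with the preferred flanking base costs nothing, a canonical signal without it costs half of the non-GT-AG cost, and a non-canonical signal costs the full amount. That per-end split, the function names `donor_penalty` and `acceptor_penalty`, and the use of a floating-point half are all assumptions; the commit message does not say how the cost is divided or rounded.

```python
def donor_penalty(first3: str, cost: float) -> float:
    """Penalty for the first three intron bases (hypothetical helper).

    'GT' followed by A or G (R) is free, 'GT' followed by anything else
    costs half, and a non-GT start costs the full amount.
    """
    first3 = first3.upper()
    if len(first3) != 3:
        raise ValueError("expected exactly three bases")
    if first3[:2] != "GT":
        return float(cost)
    return 0.0 if first3[2] in "AG" else cost / 2


def acceptor_penalty(last3: str, cost: float) -> float:
    """Penalty for the last three intron bases (hypothetical helper).

    C or T (Y) followed by 'AG' is free, 'AG' preceded by anything else
    costs half, and a non-AG end costs the full amount.
    """
    last3 = last3.upper()
    if len(last3) != 3:
        raise ValueError("expected exactly three bases")
    if last3[1:] != "AG":
        return float(cost)
    return 0.0 if last3[0] in "CT" else cost / 2


# Using the new default of 9 mentioned in the commit message:
free = donor_penalty("GTA", 9) + acceptor_penalty("CAG", 9)
partial = donor_penalty("GTC", 9) + acceptor_penalty("CAG", 9)
```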
Heng Li beeb806829 r526: fixed a bug when HPC is in use
It happened when the query HPC minimizer is longer than the reference HPC
minimizer close to the beginning of a contig. We may get a negative coordinate,
which causes an assertion failure.
2017-10-21 19:54:04 -04:00
Heng Li ffd953029f r519: fixed a severe bug that misses long alns 2017-10-17 15:52:36 -04:00