Heng Li
e6f525edaf
r512: option to filter poorly aligned reads
2017-10-16 10:38:22 -04:00
Heng Li
9862a75cd3
r505: a bit of code simplification
2017-10-11 21:54:32 -04:00
Heng Li
3073f4a758
r504: better heuristics to reduce excessive ext
2017-10-11 21:42:11 -04:00
Heng Li
9364bc64d7
r501: added end_bonus to extz2
2017-10-11 09:39:41 -04:00
Heng Li
65abdb8f3c
r500: temporarily disabled region trunc
...
because it is causing other problems.
2017-10-11 00:16:04 -04:00
Heng Li
7345621759
r499: end bonus working; DP region needs improvement!
2017-10-11 00:14:25 -04:00
Heng Li
ca632f907b
r498: fixed a bug when merging like "4I5I"
2017-10-10 21:22:37 -04:00
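The log does not show the r498 code itself. As a minimal sketch of the kind of merge its subject line describes, the snippet below collapses adjacent CIGAR operations of the same type, so that "4I5I" becomes "9I". The helper names (`parse_cigar`, `merge_adjacent`, `to_string`) are hypothetical and are not taken from minimap2.

```python
import re

def parse_cigar(cigar):
    """Split a CIGAR string such as '10M4I5I3M' into (length, op) pairs."""
    return [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]

def merge_adjacent(ops):
    """Collapse runs of the same operation into a single entry."""
    merged = []
    for length, op in ops:
        if merged and merged[-1][1] == op:
            # Same operation as the previous entry: add the lengths together.
            merged[-1] = (merged[-1][0] + length, op)
        else:
            merged.append((length, op))
    return merged

def to_string(ops):
    """Render (length, op) pairs back into a CIGAR string."""
    return "".join(f"{n}{op}" for n, op in ops)

print(to_string(merge_adjacent(parse_cigar("10M4I5I3M"))))  # prints 10M9I3M
```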
Heng Li
6c78a980b6
r497: the previous change was not working at the ends
2017-10-10 17:32:28 -04:00
Heng Li
c217eecdb7
r496: avoid DP extending into another chain
...
When deciding the region for DP, exclude regions in the adjacent chain
2017-10-10 17:25:12 -04:00
Heng Li
13b66aad4d
r495: fix inappropriate CIGAR
...
1. Not left aligned
2. In one case, 50M24D50M becomes 24D100M. The leading D needs to be removed.
3. Avoid identical hits after DP
2017-10-10 11:59:44 -04:00
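Item 2 of r495 can be illustrated with a small sketch: an alignment that starts with a deletion, such as 24D100M, is equivalent to 100M starting 24 bases further along the reference. The function below is a hypothetical helper, not minimap2's code, and it assumes a CIGAR is held as a list of (length, op) pairs.

```python
def trim_leading_deletion(ref_start, ops):
    """Drop deletions at the start of an alignment and advance the reference start.

    ref_start: 0-based reference coordinate where the alignment begins.
    ops: list of (length, op) pairs, e.g. [(24, "D"), (100, "M")].
    """
    ops = list(ops)  # work on a copy so the caller's list is untouched
    while ops and ops[0][1] == "D":
        # A leading deletion consumes reference bases only, so skipping it
        # just moves the alignment start to the right.
        ref_start += ops[0][0]
        ops.pop(0)
    return ref_start, ops

print(trim_leading_deletion(1000, [(24, "D"), (100, "M")]))  # prints (1024, [(100, 'M')])
```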
Heng Li
46fa520db9
r494: simpler and better SR gap filling
...
Still one thing to do: left alignment
2017-10-09 22:02:30 -04:00
Heng Li
1e53610fb4
r493: reduced calling extd2 for ungapped aln
...
Still need to improve in case of 3I5M3D
2017-10-09 21:13:34 -04:00
Heng Li
9fea4d16b3
r490: improved short-read extension heuristic
...
Now we find the best scoring ungapped seeded segment and then extend from it.
There is no gap filling for short reads.
2017-10-08 21:36:34 -04:00
Heng Li
f9415628a8
r489: don't use approximate zdrop
...
it doesn't work well
2017-10-08 19:29:09 -04:00
Heng Li
e0baf1ad54
r479: a bit of code cleanup
2017-10-05 16:15:14 -04:00
Heng Li
3ff6eda3a4
r473: don't count introns into blen
2017-10-05 14:37:21 -04:00
Heng Li
841763ec24
Merge branch 'master' into sr
2017-10-04 11:42:44 -04:00
Heng Li
95eb1dec36
r458: fixed wrong chr for inversion aln (#30)
2017-10-04 11:32:06 -04:00
Heng Li
645db3350e
Merge branch 'master' into sr
2017-09-20 11:15:14 -04:00
Heng Li
75e6bbc9f6
r421: removed the MM_F_SPLICE_BOTH mode
...
In the default splice mode, minimap2 applies two rounds of spliced alignment:
first assuming GT-AG to be the splice signal across all splicing sites and then
assuming CT-AC to be the signal. This is the ideal strategy.
In the MM_F_SPLICE_BOTH mode, minimap2 applies one round of spliced alignment,
assuming GT-AG and CT-AC to be the splice signals AT THE SAME TIME. This will
be faster but less accurate. I don't think anyone would like to run minimap2 in
this mode, so I am removing it for clarity.
2017-09-20 11:11:53 -04:00
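The difference between the two strategies described in r421 can be sketched with a toy scoring model. Everything here is an assumption made for illustration: the 0/1 `site_score`, the function names, and the representation of an intron as a (donor, acceptor) dinucleotide pair. The sketch shows one plausible reason the single-round mode is less accurate: it lets each intron pick its own signal, so one alignment can mix GT-AG and CT-AC sites.

```python
GT_AG = ("GT", "AG")  # splice signal as seen on one strand
CT_AC = ("CT", "AC")  # the same signal as seen on the opposite strand

def site_score(donor, acceptor, signal):
    """Hypothetical score: 1 if the intron ends match the assumed signal, else 0."""
    return 1 if (donor, acceptor) == signal else 0

def two_round(introns):
    """Score every intron under one signal, then under the other; keep the better round."""
    return max(
        sum(site_score(d, a, signal) for d, a in introns)
        for signal in (GT_AG, CT_AC)
    )

def one_round(introns):
    """Let each intron independently match either signal (the removed mode)."""
    return sum(
        max(site_score(d, a, GT_AG), site_score(d, a, CT_AC))
        for d, a in introns
    )

introns = [("GT", "AG"), ("CT", "AC"), ("GT", "AG")]
print(two_round(introns), one_round(introns))  # prints 2 3
```

In this toy case the single-round score counts the middle intron as a match even though it disagrees in strand with its neighbours, while the two-round score does not.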
Heng Li
7a9b4db874
replaced --approx-ext with --sr
...
--sr disables Z-drop and may come with other heuristics
2017-09-20 10:51:18 -04:00
Heng Li
11081c6c27
r411: refactored kalloc for clarity
...
The new version is closer to K&R's original implementation.
2017-09-18 19:49:15 -04:00
Heng Li
0f7455cefa
r365: documented the "sr" preset
2017-09-14 12:57:21 -04:00
Heng Li
d7f2ac1d4f
better parameters for short reads
...
It turns out the key problem is not the minimizer density. It is the max
occurrence that tends to affect results more, especially sensitivity. There is
still lots of work to do, but for now, it seems a good start.
2017-09-12 16:11:23 -04:00
Heng Li
eccdb3a1ca
r315: added getopt from musl
2017-09-01 20:20:34 +08:00
Heng Li
0fe1a224ab
r309: improved SAM header output
2017-08-25 10:35:58 +08:00
Heng Li
993a2bb521
r301: separate introns from deletions
...
When an intron is adjacent to a deletion, the old code counted both as introns,
which led to an inaccurate exon boundary.
2017-08-18 15:31:15 +08:00
Heng Li
2cde8d257c
r297: bidirectional RNA alignment
2017-08-17 06:02:44 -04:00
Heng Li
b5f5929bf9
r296: expose splicing related options to CLI
2017-08-13 21:37:51 -04:00
Heng Li
28f86688ab
r295: gap closure from the middle of non-HPC k
...
This WILL slightly affect the result of genomic mapping, but hopefully
in the good direction.
2017-08-12 23:48:43 -04:00
Heng Li
43506edbc5
backup: preliminary boundary alignment
2017-08-12 23:10:14 -04:00
Heng Li
61eef0575c
separate out spliced alignment; not right yet
2017-08-12 18:54:32 -04:00
Heng Li
d240318741
r287: refined CLI options and manpage
2017-08-12 12:26:04 -04:00
Heng Li
0f4c823b0c
r286: ignore introns when computing max seg score
2017-08-12 10:58:16 -04:00
Heng Li
163fa36ee6
r281: don't open long gaps on query
2017-08-10 15:04:59 -04:00
Heng Li
1a7d782131
r273: cdna mapping mode for testing
...
Differences from the typical mapping mode:
* banded alignment disabled
* log gap cost during chaining
* zero long-gap extension during alignment
* up to 100kb (by default) reference gap
* bad seeding not filtered (to tune later)
2017-08-08 11:31:49 -04:00
Heng Li
5934d68772
r229: a new way to prevent out-of-band backtrack
2017-07-29 23:52:30 -04:00
Heng Li
fa99d28d34
r228: reduced unnecessary INV alignment
2017-07-29 20:21:53 -04:00
Heng Li
d08b7a0c51
r227: use local alignment for INV alignment
2017-07-29 17:40:53 -04:00
Heng Li
da3db3c095
r226: only try inv alignment for primary
2017-07-29 14:09:35 -04:00
Heng Li
783ead6f47
r225: removed a debugging line
2017-07-29 13:21:38 -04:00
Heng Li
19d6ec885e
r224: inversion alignment around Z-drop break
2017-07-29 13:09:10 -04:00
Heng Li
120bebc290
ake
2017-07-29 11:01:49 -04:00
Heng Li
5e3eecd6d4
r222: no effective changes
2017-07-29 10:31:46 -04:00
Heng Li
ebbe9c1eb8
r219: fixed a bug caused by skipping tandem seeds
2017-07-28 14:06:56 -04:00
Heng Li
c672690564
r218: increase the frequency of SW slightly
2017-07-28 13:30:42 -04:00
Heng Li
f4fee60188
r217: ignore tandem seeds during alignment
...
This helps a tiny bit.
2017-07-28 12:26:56 -04:00
Heng Li
254280b8af
r216: a bit of cleanup; identical output to r215
2017-07-28 11:54:18 -04:00
Heng Li
2c79580649
r213: more careful solution to wrong seeds
...
a little better, but not good enough!
2017-07-27 13:19:11 -04:00
Heng Li
b927838495
r212: better heuristic to fix wrong seeding
...
but not good enough. Will explore more.
2017-07-27 11:24:51 -04:00
Heng Li
371e20cc7c
r211: a better heuristic to reduce false seeds
2017-07-26 23:56:38 -04:00
Heng Li
f2ef48878a
r202: trim bad chain ends before extension
...
This fixes a few more FP long INDELs towards the end of alignments.
2017-07-25 19:53:19 -04:00
Heng Li
38b2830e18
r161: filter bad seeds; changed default -g/-r
2017-07-08 13:31:27 -04:00
Heng Li
cc554aee43
r159: use two-piece gap penalty
2017-07-08 10:26:00 -04:00
Heng Li
2e4fd9f1d0
r148: revamped regs handling after cigar
2017-07-03 10:44:26 -04:00
Heng Li
696ebce66e
backup; still buggy
2017-07-03 00:52:00 -04:00
Heng Li
426c2975f6
r126: filter by fraction of seed coverage
...
otherwise we may get too many poor overlap mappings.
2017-06-30 22:15:45 -04:00
Heng Li
d11049eb32
r120: use max-scoring seg to control output
...
much better now
2017-06-30 14:21:44 -04:00
Heng Li
5dcd8f8965
r117: fixed a bug in logic
2017-06-30 11:52:42 -04:00
Heng Li
91e1c4d6db
r116: fixed another bug caused by refactoring
2017-06-30 00:03:45 -04:00
Heng Li
e2b86d0332
r110: fixed a bug caused by refactoring
2017-06-29 21:12:31 -04:00
Heng Li
4cd456b9ba
r108: refactoring, move reg1 routines to hit.c
2017-06-29 19:44:11 -04:00
Heng Li
cc67f1b781
compute mapq; not working for z-split yet
2017-06-29 17:52:48 -04:00
Heng Li
337c2a21cd
r105: fixed a bug in repeated right ext when zdrop
2017-06-29 15:45:07 -04:00
Heng Li
b9075d39a8
r104: long gap patching
2017-06-29 14:54:54 -04:00
Heng Li
d274e1b743
backup
2017-06-29 12:58:52 -04:00
Heng Li
38070e8a05
r98: fixed segfault for certain scoring
...
due to unsigned comparisons between -1 and chromosome length
2017-06-28 22:18:51 -04:00
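The log does not show the code fixed in r98, so the following only illustrates the general class of bug it names. In C, comparing a signed -1 with an unsigned length converts -1 to the largest unsigned value, so a check such as "position < length" fails unexpectedly. The sketch emulates that conversion with `ctypes`; `chrom_len`, `pos`, and `c_unsigned_less_than` are hypothetical names, and the 32-bit width is an assumption.

```python
import ctypes

def c_unsigned_less_than(a, b):
    """Emulate C's `a < b` after both operands are converted to 32-bit unsigned."""
    return ctypes.c_uint32(a).value < ctypes.c_uint32(b).value

chrom_len = 1000  # hypothetical chromosome length
pos = -1          # hypothetical sentinel meaning "no position"

print(pos < chrom_len)                       # signed comparison: True
print(c_unsigned_less_than(pos, chrom_len))  # unsigned comparison: False
```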
Heng Li
bcd9b1c621
r93: fixed various small issues
2017-06-28 10:35:21 -04:00
Heng Li
42283ef10c
r87: fixed a bug in ksw2
2017-06-27 13:29:48 -04:00
Heng Li
734028f92b
validate z-drop
2017-06-27 11:25:39 -04:00
Heng Li
c02ff4662c
r85: two-round z-drop
2017-06-27 10:36:24 -04:00
Heng Li
5b614ae828
r78: fixed a split bug
2017-06-26 14:45:23 -04:00
Heng Li
de54c9dac2
r77: fixed an index loading bug (offset not set)
2017-06-26 13:56:25 -04:00
Heng Li
24d7f7f8b1
disabled dynamic banding; buggy
2017-06-25 22:43:15 -04:00
Heng Li
39083be9ab
separated formatting from printing
...
for SAM in future; and for performance
2017-06-25 16:13:54 -04:00
Heng Li
7780ab792a
removed some comment lines
2017-06-25 11:53:35 -04:00
Heng Li
56364200c8
avoid contained hits due to split
2017-06-25 11:43:50 -04:00
Heng Li
f20d550a59
fixed the NM bug
...
due to reversed CIGAR
2017-06-25 11:24:39 -04:00
Heng Li
b261a4a74b
zdrop sort of works, but has issues
...
left and right extensions are different; NM is also wrong
2017-06-25 10:52:14 -04:00
Heng Li
72dfb0c99e
fixed a bug in ksw2
2017-06-25 10:22:13 -04:00
Heng Li
ef5dd318ca
implemented chain splitting; NOT tested!!!
2017-06-24 22:57:43 -04:00
Heng Li
aa5881e7bb
backup
2017-06-24 22:51:31 -04:00
Heng Li
d08b58d21f
minor
2017-06-24 19:33:59 -04:00
Heng Li
672782f03a
simplified hp_len a little
2017-06-24 19:27:58 -04:00
Heng Li
2d8fda9586
an alternative strategy to fix for HPC
...
it makes the result better, but still not quite right.
2017-06-24 09:26:24 -04:00
Heng Li
6088dc11ff
wrong SW score
2017-06-23 23:24:49 -04:00
Heng Li
2987be288c
output cigar
2017-06-23 22:53:47 -04:00
Heng Li
35b84f88c6
backup
2017-06-23 22:42:15 -04:00
Heng Li
523a8832ad
backup
2017-06-23 20:57:17 -04:00
Heng Li
7ed26a4857
backup
2017-06-23 19:31:23 -04:00
Heng Li
d75b5b4e8a
backup; NOT working yet
2017-06-23 19:21:47 -04:00
Heng Li
4fea3d778a
backup
2017-06-23 18:57:00 -04:00
Heng Li
6c8368c24c
get the left-extension sequence correctly
2017-06-23 18:25:47 -04:00
Heng Li
990f7b0b71
backup
2017-06-23 15:13:53 -04:00
Heng Li
4ae0b46972
min_ksw_len
2017-06-23 14:38:28 -04:00
Heng Li
9cd313eae1
sequence retrieval working
2017-06-23 14:11:56 -04:00
Heng Li
326d91deb0
backup
2017-06-23 14:06:00 -04:00
Heng Li
44cdd18de0
start to work on alignment
2017-06-23 13:44:45 -04:00