Heng Li
4683da2455
r520: added option -L to write long cigar to CG
2017-10-17 17:32:44 -04:00
Heng Li
ffd953029f
r519: fixed a severe bug that misses long alns
2017-10-17 15:52:36 -04:00
Heng Li
04cf4ebf5e
r518: increased the default -K to 500M
...
This helps multi-thread performance for ultra-long reads.
2017-10-17 13:21:29 -04:00
Heng Li
25ffd72690
r517: replaced --print-2nd with --secondary
2017-10-17 11:41:56 -04:00
Heng Li
aa2d9d4e1b
r516: throw a warning if -N0 is used
2017-10-16 14:55:35 -04:00
Heng Li
addb61bcb2
r515: more conservative hit exclusion
...
When a hit covers a long query subsequence that has not been covered by better
primary hits, this hit is more likely to become a new primary hit.
2017-10-16 13:58:01 -04:00
Heng Li
adf6cd7f52
r513: merged pre- and post-cigar blen and mlen
...
This saves a bit of memory and is cleaner.
2017-10-16 10:55:18 -04:00
Heng Li
e6f525edaf
r512: option to filter poorly aligned reads
2017-10-16 10:38:22 -04:00
Heng Li
858213d513
r511: fixed wrong primary sam record
2017-10-12 23:02:18 -04:00
Heng Li
dea3b60918
r510: fixed an off-by-1 bug for unmapped mate
2017-10-12 17:31:13 -04:00
Heng Li
7c555f9b7e
r508: use two I/O threads for mapping
...
-x sr applies this option by default
2017-10-12 14:56:01 -04:00
Heng Li
2801ed9b4b
r507: -K not working as intended (#36)
2017-10-12 14:16:05 -04:00
Heng Li
ce06188203
r506: fixed a memory leak
2017-10-12 10:12:22 -04:00
Heng Li
9862a75cd3
r505: a bit of code simplification
2017-10-11 21:54:32 -04:00
Heng Li
3073f4a758
r504: better heuristics to reduce excessive ext
2017-10-11 21:42:11 -04:00
Heng Li
9364bc64d7
r501: added end_bonus to extz2
2017-10-11 09:39:41 -04:00
Heng Li
65abdb8f3c
r500: temporarily disabled region trunc
...
because it is causing other problems.
2017-10-11 00:16:04 -04:00
Heng Li
7345621759
r499: end bonus working; DP region needs improvement!
2017-10-11 00:14:25 -04:00
Heng Li
ca632f907b
r498: fixed a bug when merging like "4I5I"
2017-10-10 21:22:37 -04:00
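The "4I5I" case in r498 is a CIGAR with two adjacent operations of the same type, which is normally collapsed into a single operation ("9I"). The following is a minimal Python sketch of that normalization, not minimap2's C code; the function name `merge_adjacent_ops` is hypothetical.

```python
import re

def merge_adjacent_ops(cigar: str) -> str:
    """Collapse runs of the same CIGAR operation, e.g. 4I5I -> 9I."""
    merged = []  # list of [length, op] pairs
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        if merged and merged[-1][1] == op:
            # Same operation as the previous one: add the lengths together.
            merged[-1][0] += int(length)
        else:
            merged.append([int(length), op])
    return "".join(f"{n}{op}" for n, op in merged)
```

For example, `merge_adjacent_ops("10M4I5I10M")` gives `"10M9I10M"`.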
Heng Li
6c78a980b6
r497: the previous change not working at the ends
2017-10-10 17:32:28 -04:00
Heng Li
c217eecdb7
r496: avoid DP extending into another chain
...
When deciding the region for DP, exclude regions in the adjacent chain
2017-10-10 17:25:12 -04:00
Heng Li
13b66aad4d
r495: fix improper CIGAR
...
1. Not left aligned
2. In one case, 50M24D50M becomes 24D100M. The leading D needs to be removed.
3. Avoid identical hits after DP
2017-10-10 11:59:44 -04:00
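Point 2 of r495 concerns a CIGAR that begins with a deletion. Dropping a leading D is equivalent to starting the alignment further along the reference. The sketch below illustrates this under the assumption of a 0-based reference start; `strip_leading_deletion` is a hypothetical name and this is not minimap2's implementation.

```python
import re

def strip_leading_deletion(ref_start: int, cigar: str):
    """Remove a leading deletion and shift the reference start past it."""
    m = re.match(r"(\d+)D", cigar)
    if m is None:
        return ref_start, cigar  # no leading deletion: nothing to do
    # The deleted reference bases are skipped, so the alignment starts later.
    return ref_start + int(m.group(1)), cigar[m.end():]
```

With a start of 1000, `"24D100M"` becomes start 1024 with CIGAR `"100M"`; `"50M24D50M"` is returned unchanged.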
Heng Li
46fa520db9
r494: simpler and better SR gap filling
...
Still one thing to do: left alignment
2017-10-09 22:02:30 -04:00
Heng Li
1e53610fb4
r493: reduced calling extd2 for ungapped aln
...
Still need to improve in case of 3I5M3D
2017-10-09 21:13:34 -04:00
Heng Li
9396d9e11b
r452: typo in the last commit
2017-10-09 10:05:32 -04:00
Heng Li
198849a716
r491: an ambiguous base costs the same as gap ext
2017-10-09 09:59:42 -04:00
Heng Li
9fea4d16b3
r490: improved short-read extension heuristic
...
Now we find the best scoring ungapped seeded segment and then extend from it.
There is no gap filling for short reads.
2017-10-08 21:36:34 -04:00
Heng Li
f9415628a8
r489: don't use approximate zdrop
...
it doesn't work well
2017-10-08 19:29:09 -04:00
Heng Li
61e56c941d
r488: parameter to control max fragment length
2017-10-07 23:54:32 -04:00
Heng Li
f150257a0d
r487: demote "map10k"; improved README
2017-10-07 19:19:40 -04:00
Heng Li
bf2d4f7aec
r486: treat "U" as "T" for RNA reads (#33)
2017-10-07 18:53:25 -04:00
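Treating "U" as "T" (r486) can be pictured as one extra entry in the base-encoding table. The sketch below assumes the common convention of codes 0-3 for A/C/G/T and 4 for any other character; the names `encode_base` and `encode_seq` are hypothetical.

```python
def encode_base(base: str) -> int:
    """Map a nucleotide to a small integer code; U shares the code of T."""
    codes = {"A": 0, "C": 1, "G": 2, "T": 3, "U": 3}
    return codes.get(base.upper(), 4)  # 4 = ambiguous or unknown base

def encode_seq(seq: str) -> list:
    """Encode a whole sequence, one code per base."""
    return [encode_base(b) for b in seq]
```

Under this encoding an RNA read such as "ACGU" and its DNA form "ACGT" produce the same codes.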
Heng Li
c6384ed2c8
r482: increased short-read bandwidth to 100
...
This has very minor effect on speed.
2017-10-06 10:20:32 -04:00
Heng Li
e0baf1ad54
r479: a bit code cleanup
2017-10-05 16:15:14 -04:00
Heng Li
f266092699
r478: simplified useless code, a tiny bit
2017-10-05 15:56:00 -04:00
Heng Li
9c5767f9ed
r477: renamed multi_seg to frag_mode
2017-10-05 15:48:17 -04:00
Heng Li
ae2adf04d4
r476: multi-file fragment mode working
2017-10-05 15:39:26 -04:00
Heng Li
b839758335
r475: added --cs=none; updated manpage
2017-10-05 15:27:37 -04:00
Heng Li
f4a5d3a692
r474: replaced -S and --cs-no-equal with --cs
2017-10-05 15:03:03 -04:00
Heng Li
3ff6eda3a4
r473: don't count introns into blen
2017-10-05 14:37:21 -04:00
Heng Li
1a90bc8603
r472: fixed a bug when printing MAPQ/CIGAR
2017-10-05 12:46:11 -04:00
Heng Li
abf2a90363
r471: all SAM features implemented; more tests!
2017-10-05 12:37:30 -04:00
Heng Li
7cc4f6f965
r469: first step towards PE SAM
2017-10-05 10:38:09 -04:00
Heng Li
16e6e589a8
r468: replaced ^ with ~ in cs
2017-10-04 22:17:12 -04:00
Heng Li
9aba11769c
r467: added : (equal length) and ^ (intron) ops
2017-10-04 21:55:37 -04:00
Heng Li
7d50e646dd
r466: detect multi-part index more smartly
...
though it might not work in an extremely rare case: the end of a sequence ends
at X*16384 and it is the last sequence in a batch. This can be resolved by
never letting the kstream_t buffer empty.
2017-10-04 17:32:58 -04:00
Heng Li
1554149158
r465: apply option -x before other options
2017-10-04 13:52:28 -04:00
Heng Li
19c39e704f
r464: fixed a bug in pairing, due to randomization
2017-10-04 13:37:40 -04:00
Heng Li
2581c44a21
r463: optionally disable secondary hits
2017-10-04 13:24:41 -04:00
Heng Li
5babf41a38
r462: SAM primary flag not properly set
2017-10-04 13:11:29 -04:00
Heng Li
2a1e738a94
r461: randomize repetitive hits
2017-10-04 13:05:18 -04:00
Heng Li
cf55c84056
r460: added option --no-long-join
2017-10-04 12:08:44 -04:00
Heng Li
841763ec24
Merge branch 'master' into sr
2017-10-04 11:42:44 -04:00
Heng Li
95eb1dec36
r458: fixed wrong chr for inversion aln (#30)
2017-10-04 11:32:06 -04:00
Heng Li
0fd0f2aed1
r457: fixed a bug on parsing -f
2017-09-30 00:00:44 -04:00
Heng Li
ee9b2773a8
r456: min chain score should be > k-mer length
...
or chain_dp() wastes time on unnecessarily sorting chains with one k-mer.
2017-09-29 22:33:55 -04:00
Heng Li
340483821e
r455: set max_occ on command line
2017-09-29 22:18:43 -04:00
Heng Li
04fb2c2ec0
r454: rechain with higher max_occ if no good chain
2017-09-29 19:24:32 -04:00
Heng Li
0d4ecd19ee
r453: avoid duplicated strcmp() for ava
2017-09-28 15:52:05 -04:00
Heng Li
0c63325985
r452: fixed - -G not working with -x sr
2017-09-28 14:28:12 -04:00
Heng Li
2a554a92e9
r451: changed rep_len mapq heuristic
2017-09-28 14:23:14 -04:00
Heng Li
935a6e6064
r450: differentiate exact repeats via mapq
2017-09-27 23:51:05 -04:00
Heng Li
8301222174
r448: fixed a bug when computing PE quality
2017-09-27 21:54:07 -04:00
Heng Li
7e0d70bfd3
r445: pair coordinate adjustment working
...
Next: mapq adjustment, which will be tricky...
2017-09-27 15:38:18 -04:00
Heng Li
a349d85280
r444: changed the way orientation is specified
...
The old model doesn't work with RF or RR orientation. The new model only works
with paired-end reads. For >2 segments, only FF is supported.
2017-09-27 12:33:10 -04:00
Heng Li
f611edf6f2
r443: don't filter small cm for split seg
2017-09-26 16:17:58 -04:00
Heng Li
1b1dd0cd57
r442: default max_gap to 200 in the sr mode
2017-09-26 13:31:01 -04:00
Heng Li
55d1e4f638
r440: better chain filtering for PE reads
2017-09-26 11:03:36 -04:00
Heng Li
64c0ad6b35
r439: use splice-like chain gap cost between segs
...
This improves accuracy
2017-09-25 16:04:38 -04:00
Heng Li
9538c985aa
r438: fixed a rare case that leads to missing hits
...
It is a bug in chaining.
2017-09-25 14:59:34 -04:00
Heng Li
8f25cfa36e
r437: fixed uninitialized memory on rep_len
2017-09-25 14:22:45 -04:00
Heng Li
81008dd371
r436: working on short reads
...
The result is mixed - lots of room for tuning
2017-09-25 14:06:29 -04:00
Heng Li
3bb66e1ed3
multi-seg working on toy examples
2017-09-25 13:42:04 -04:00
Heng Li
5b39a1b34b
Merge branch 'master' into sr
2017-09-20 12:24:08 -04:00
Heng Li
e3b5802b2e
r424: reduce memory for long query seqs
2017-09-20 12:22:13 -04:00
Heng Li
645db3350e
Merge branch 'master' into sr
2017-09-20 11:15:14 -04:00
Heng Li
75e6bbc9f6
r421: removed the MM_F_SPLICE_BOTH mode
...
In the default splice mode, minimap2 applies two rounds of spliced alignment:
first assuming GT-AG to be the splice signal across all splicing sites and then
assuming CT-AC to be the signal. This is the ideal strategy.
In the MM_F_SPLICE_BOTH mode, minimap2 applies one round of spliced alignment,
assuming GT-AG and CT-AC to be the splice signals AT THE SAME TIME. This will
be faster but less accurate. I don't think anyone would like to run minimap2 in
this mode, so I am removing it for clarity.
2017-09-20 11:11:53 -04:00
Heng Li
7a9b4db874
replaced --approx-ext with --sr
...
--sr disables Z-drop and may come with other heuristics
2017-09-20 10:51:18 -04:00
Heng Li
b99c22840f
r414: avoid assertion failure for 0-length reads
2017-09-19 22:21:27 -04:00
Heng Li
11081c6c27
r411: refactored kalloc for clarity
...
The new version is closer to K&R's original implementation.
2017-09-18 19:49:15 -04:00
Heng Li
ea5a0cd17d
Release minimap2-2.2 (r409)
2017-09-17 20:08:47 -04:00
Heng Li
e9c57f6d8b
r402: exposed kseq (for API in mappy later)
2017-09-17 13:09:16 -04:00
Heng Li
c07f9f9a49
r372: default mm_verbose to 1, and change in main
2017-09-16 09:14:34 -04:00
Heng Li
14b853499f
r369: updated example with the latest API
2017-09-14 22:44:10 -04:00
Heng Li
75ff7ceec5
r368: API documentation
2017-09-14 22:23:04 -04:00
Heng Li
e2823d4aee
r367: index reader optionally writes index
2017-09-14 21:18:13 -04:00
Heng Li
eb00521d9b
redesigned indexing and option APIs
2017-09-14 17:02:01 -04:00
Heng Li
0f7455cefa
r365: documented the "sr" preset
2017-09-14 12:57:21 -04:00
Heng Li
4d3768bf26
r364: improved the mapq heuristics
...
* use repetitive seed lengths, not counts
* compute n_sub to higher accuracy
* use bwa-mem mapq heuristic as a backup
For short single-end reads, minimap2's ROC is not as good as bwa-mem's, but is
close.
2017-09-14 12:37:03 -04:00
Heng Li
47e9d76ca1
further mapq tuning
2017-09-14 10:46:14 -04:00
Heng Li
f4a8766283
r362: fixed overestimated chaining score
...
Caused by ilog2_32(0)=-1. This bug was fixed once and reoccurred as I was
tuning the score function but forgot to apply the fix.
2017-09-14 10:15:22 -04:00
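r362 attributes the overestimated score to ilog2_32(0) returning -1. The sketch below shows how an integer log2 defined as "index of the highest set bit" yields -1 for zero, and one possible guard on the caller side. The Python `ilog2_32` only mirrors the behavior described in the commit message; `log_gap_penalty` and its guard are hypothetical and not taken from minimap2's source.

```python
def ilog2_32(v: int) -> int:
    """floor(log2(v)) for a 32-bit unsigned value; returns -1 when v == 0."""
    return (v & 0xFFFFFFFF).bit_length() - 1

def log_gap_penalty(gap: int) -> int:
    """Log-scaled penalty term that never goes negative (hypothetical guard)."""
    # Without the check, gap == 0 would contribute -1, and subtracting a
    # negative penalty from a score would inflate that score.
    return ilog2_32(gap) if gap > 0 else 0
```

Here `ilog2_32(0)` is -1 while `log_gap_penalty(0)` is 0.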
Heng Li
6a82a21dee
r361: improved mapq for short reads
2017-09-13 15:32:39 -04:00
Heng Li
3c91d652dd
r360: allow to set integer max occ
2017-09-13 11:37:00 -04:00
Heng Li
d7f2ac1d4f
better parameters for short reads
...
It turns out the key problem is not the minimizer density. It is the max
occurrence that tends to affect results more, especially sensitivity. There is
still lots of work to do, but for now, it seems a good start.
2017-09-12 16:11:23 -04:00
Heng Li
eea9e851d8
Merge branch 'dev' into short
2017-09-11 09:32:28 -04:00
Heng Li
c7c3585531
r347: merged mm_map_frag() into mm_map()
...
mm_map_frag() was separated due to an earlier design that has been rejected.
2017-09-10 15:02:55 -04:00
Heng Li
87a278d06a
Merge branch 'dev' into short
2017-09-09 08:49:58 -04:00
Heng Li
f422175e4e
r344: avoid unnecessary refName retrieval
2017-09-08 22:44:14 -04:00
Heng Li
709b6ec1f1
increase seed occurrences
2017-09-08 22:42:39 -04:00
Heng Li
0031158936
Merge branch 'master' into short
2017-09-07 11:41:32 -04:00
Heng Li
ef3f7ea2f2
Release minimap2-2.1.1 (r341)
2017-09-06 13:46:51 -04:00
Heng Li
8b9f2aaf04
r339: improved SIMD detection
...
old code does not check AVX2
2017-09-05 13:10:30 -04:00
Heng Li
46e8b6a4f9
r338: portable CPU dispatch, which is the default
...
working with gcc, icc, clang and msvc.
2017-09-03 20:29:24 -04:00
Heng Li
3c997ca016
r337: support CPU dispatch for gcc-4.8+
...
using __builtin_cpu_supports()
2017-09-03 14:29:49 -04:00
Heng Li
f9ccc522cd
Merge branch 'master' into short
2017-09-03 11:58:15 -04:00
Heng Li
101b8bb97d
r335: report an error if query can't be opened
2017-09-03 11:54:38 -04:00
Heng Li
0a3ebdc916
for better windows compatibility
2017-09-02 17:52:33 -04:00
Heng Li
743d26eab0
Merge pull request #20 from nanoporetech/msvc14
...
ONT source code changes to compile with MSVC 14
2017-09-02 14:35:02 -07:00
Heng Li
62535ecd7f
Merge branch 'dev'
2017-09-01 10:06:21 -07:00
Simon Harris
4db1c0295c
Fixed segfault caused when reading from index file
2017-09-01 15:48:59 +01:00
Heng Li
d4074874ee
r316: get rid of a harmless gcc warning
2017-09-01 20:25:27 +08:00
Heng Li
eccdb3a1ca
r315: added getopt from musl
2017-09-01 20:20:34 +08:00
Stefan von Deylen
a3c3db6b9b
ONT source code changes to compile with MSVC 14
2017-08-30 16:25:20 +01:00
Heng Li
c4080aaf7e
Merge branch 'master' into short
2017-08-28 07:02:22 +08:00
Heng Li
bf8246f872
Release minimap2-2.1-r311
2017-08-25 13:35:55 +08:00
Heng Li
0fe1a224ab
r309: improved SAM header output
2017-08-25 10:35:58 +08:00
Heng Li
993a2bb521
r301: separate introns from deletions
...
When an intron is adjacent to a deletion, the old code counted both as introns,
which led to an inaccurate exon boundary.
2017-08-18 15:31:15 +08:00
Heng Li
64c1389e1a
Merge branch 'master' into splice
2017-08-17 23:39:27 +08:00
Heng Li
81cff97208
r299: support -h: output to stdout; return 0
2017-08-17 23:38:31 +08:00
Heng Li
bbb37d95f2
support inserting RG lines
2017-08-17 23:34:09 +08:00
Heng Li
2cde8d257c
r297: bidirectional RNA alignment
2017-08-17 06:02:44 -04:00
Heng Li
b5f5929bf9
r296: expose splicing related options to CLI
2017-08-13 21:37:51 -04:00
Heng Li
28f86688ab
r295: gap closure from the middle of non-HPC k
...
This WILL slightly affect the result of genomic mapping, but hopefully
in the good direction.
2017-08-12 23:48:43 -04:00
Heng Li
43506edbc5
backup: preliminary boundary alignment
2017-08-12 23:10:14 -04:00
Heng Li
53b3265d84
r290: in techrep, explain spliced alignment
2017-08-12 15:40:49 -04:00
Heng Li
a23df2dc91
r289: changed CLI help only
2017-08-12 12:40:07 -04:00
Heng Li
5a74088b74
r288: changed max intron length to 200k
2017-08-12 12:39:21 -04:00
Heng Li
d240318741
r287: refined CLI options and manpage
2017-08-12 12:26:04 -04:00
Heng Li
0f4c823b0c
r286: ignore introns when computing max seg score
2017-08-12 10:58:16 -04:00
Heng Li
a99358bc3d
r282: reduced intron cost; added eval script
2017-08-11 00:06:01 -04:00
Heng Li
163fa36ee6
r281: don't open long gaps on query
2017-08-10 15:04:59 -04:00
Heng Li
c59b0781bc
r280: output introns as "N" in the cdna mode
2017-08-09 11:45:02 -04:00
Heng Li
7429b12164
Merge branch 'master' into cdna
2017-08-08 22:00:24 -04:00
Heng Li
9e1125edda
r277: abort if query/-d missing (#11)
2017-08-08 21:46:15 -04:00
Heng Li
3dbe23b34e
Merge branch 'dev'
2017-08-08 21:30:32 -04:00
Heng Li
6840370f3c
Release minimap2-2.0 (r275)
2017-08-08 21:16:25 -04:00
Heng Li
7f9f659b6a
r274: CLI option to change max ref gap
2017-08-08 11:39:23 -04:00
Heng Li
1a7d782131
r273: cdna mapping mode for testing
...
Differences from the typical mapping mode:
* banded alignment disabled
* log gap cost during chaining
* zero long-gap extension during alignment
* up to 100kb (by default) reference gap
* bad seeding not filtered (to tune later)
2017-08-08 11:31:49 -04:00
Heng Li
079ec0d283
r271: added "short" preset; for testing only
2017-08-07 15:30:05 -04:00
Heng Li
12cea727b8
r238: bugfix to cs - rev sequence not complemented
2017-08-01 10:33:21 -04:00
Heng Li
cd105b47f2
r237: fixed a bug in outputting cs:Z
2017-07-31 14:49:39 -04:00
Heng Li
35f232c3fa
r236: in cs tag, output differences in lowercase
...
for easy eyeballing
2017-07-31 12:17:48 -04:00
Heng Li
4c0713ee14
r235: optionally output tag cs in PAF
...
cs encodes the query, the reference sequence and CIGAR.
2017-07-31 12:06:49 -04:00
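As an illustration of how a single string can carry the reference bases, the query bases, and the alignment operations together, the sketch below builds a cs-like string from a gapped pairwise alignment. It assumes the later short notation (`:n` for a run of n identical bases, `*` followed by the reference and query base for a substitution, `+` for an insertion, `-` for a deletion, differences in lowercase), which may differ from the format at r235. The function name `make_cs` is hypothetical, and the input is assumed to have no column that is a gap on both rows.

```python
def make_cs(ref_aln: str, qry_aln: str) -> str:
    """Build a cs-like string from two gapped rows of equal length ('-' = gap)."""
    if len(ref_aln) != len(qry_aln):
        raise ValueError("alignment rows must have the same length")
    out, i, n = [], 0, len(ref_aln)
    while i < n:
        r, q = ref_aln[i], qry_aln[i]
        j = i + 1
        if q == "-":                      # deletion: bases only on the reference
            while j < n and qry_aln[j] == "-":
                j += 1
            out.append("-" + ref_aln[i:j].lower())
        elif r == "-":                    # insertion: bases only on the query
            while j < n and ref_aln[j] == "-":
                j += 1
            out.append("+" + qry_aln[i:j].lower())
        elif r.upper() == q.upper():      # run of identical bases
            while (j < n and ref_aln[j] != "-" and qry_aln[j] != "-"
                   and ref_aln[j].upper() == qry_aln[j].upper()):
                j += 1
            out.append(f":{j - i}")
        else:                             # single-base substitution
            out.append("*" + r.lower() + q.lower())
        i = j
    return "".join(out)
```

For the rows `ACGTACGT--ACGT` (reference) and `ACGAAC--TTACGT` (query), this produces `:3*ta:2-gt+tt:4`.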
Riku Walve
9e09c1ae72
fix self-comparison in index parameter override check
2017-07-30 21:46:25 +03:00
Heng Li
d8d4d29b68
Release minimap2-2.0rc1-r232
2017-07-30 14:32:40 -04:00
Heng Li
1f78e1ee53
r230: code formatting changes only
2017-07-30 12:31:40 -04:00
Heng Li
5934d68772
r229: a new way to prevent out-of-band backtrack
2017-07-29 23:52:30 -04:00
Heng Li
fa99d28d34
r228: reduced unnecessary INV alignment
2017-07-29 20:21:53 -04:00
Heng Li
d08b7a0c51
r227: use local alignment for INV alignment
2017-07-29 17:40:53 -04:00
Heng Li
da3db3c095
r226: only try inv alignment for primary
2017-07-29 14:09:35 -04:00
Heng Li
783ead6f47
r225: removed a debugging line
2017-07-29 13:21:38 -04:00
Heng Li
19d6ec885e
r224: inversion alignment around Z-drop break
2017-07-29 13:09:10 -04:00
Heng Li
5e3eecd6d4
r222: no effective changes
2017-07-29 10:31:46 -04:00
Heng Li
2179e9e24b
r221: output SA in the SAM output
2017-07-28 23:08:39 -04:00
Heng Li
ebbe9c1eb8
r219: fixed a bug caused by skipping tandem seeds
2017-07-28 14:06:56 -04:00
Heng Li
c672690564
r218: increase the frequency of SW slightly
2017-07-28 13:30:42 -04:00
Heng Li
f4fee60188
r217: ignore tandem seeds during alignment
...
This helps a tiny bit.
2017-07-28 12:26:56 -04:00
Heng Li
254280b8af
r216: a bit of cleanup; identical output to r215
2017-07-28 11:54:18 -04:00
Heng Li
fc965805f7
r215: bring back a log gap component
...
Otherwise chaining may more often break a long gap into several gaps.
2017-07-28 00:17:19 -04:00
Heng Li
2c79580649
r213: more careful solution to wrong seeds
...
a little better, but not good enough!
2017-07-27 13:19:11 -04:00
Heng Li
b927838495
r212: better heuristic to fix wrong seeding
...
but not good enough. Will explore more.
2017-07-27 11:24:51 -04:00
Heng Li
371e20cc7c
r211: a better heuristic to reduce false seeds
2017-07-26 23:56:38 -04:00
Heng Li
a01d758af6
r206: mapq penalize short chains further
...
The old code penalized at the log() scale. Now added a linear-scaled factor. If
the chain consists of few minimizers, its quality is really not good.
2017-07-26 11:50:04 -04:00
Heng Li
e9dc1ce2b6
r205: when computing mapq, consider min_chain_sc
...
Not doing this was a mistake.
2017-07-26 11:34:14 -04:00
Heng Li
00c6db5073
r203: check more subopt aln if score small
2017-07-25 20:02:44 -04:00
Heng Li
f2ef48878a
r202: trim bad chain ends before extension
...
This fixes a few more FP long INDELs towards the end of alignments.
2017-07-25 19:53:19 -04:00
Heng Li
21ca564112
r201: fixed a minor chaining issue
...
Chaining looked at the end of a chain, but the end may not be the best. We now
go back to find the max.
2017-07-25 18:26:51 -04:00
Heng Li
215e92ed7b
r200: reduce long gaps in chaining
...
Every seed can initiate a chain.
2017-07-25 17:32:54 -04:00
Heng Li
b530ade333
r199: changed to linear gap cost for chaining
...
The old cost doesn't penalize long gaps enough. Will also drop seeds close to
the edge in the next commit.
2017-07-25 15:35:10 -04:00
Heng Li
f81f37fef1
r197: allocate index seq names from kalloc
...
to reduce malloc() overhead.
2017-07-24 19:36:05 -04:00
Heng Li
5c4d040b13
r191: warning if CLI index opt diff from prebuilt
...
Also added index testing API (moved from main.c to index.c)
2017-07-19 10:25:11 -04:00
Heng Li
4aff301ef4
r190: default -k to 15; added -x map-ont
2017-07-19 10:11:14 -04:00
Heng Li
470021fd27
r189: sync with ksw2 (no effective changes)
2017-07-19 09:28:25 -04:00
Heng Li
71c988f6ab
r188: renamed bseq* to mm_bseq*
...
to avoid naming collisions between minimap2 and bwa/fermi-lite/etc
2017-07-19 09:26:46 -04:00
Heng Li
b9b0b6f49c
r187: fixed non-terminated sam output (#3)
...
Only happens for unmapped reads with quality, in the SAM output
2017-07-18 15:20:29 -04:00
Heng Li
495a78e40a
Get documentation ready for release
2017-07-18 11:04:09 -04:00
Heng Li
71e2a97a4c
r180: changed -x asm5 settings
2017-07-18 00:00:36 -04:00
Heng Li
941059292e
r179: changed the preset for assembly alignment
2017-07-17 22:41:46 -04:00
Heng Li
38aa66fa30
r178: fixed integer overflow in mapq calculation
2017-07-16 21:45:39 -04:00
Heng Li
b4280d186f
r176: removed seedcov_ratio; changed default opt
...
min_seedcov_ratio is not used
2017-07-12 12:47:46 -04:00
Heng Li
52caf79395
r175: halved max-chain-skip in the ava mode
2017-07-12 10:42:19 -04:00
Heng Li
eeeb2ffb68
r174: make max-chain-skip work
...
The max-chain-skip heuristic did not work due to a bug. Without this
heuristic, chaining is too slow for long-read overlap.
2017-07-12 10:08:06 -04:00
Heng Li
33451aba45
r173: changed the debugging output format
2017-07-11 15:23:28 -04:00
Heng Li
cfa083a98b
r172: separated PacBio and ONT read overlapping
...
HPC k-mer works better for PacBio, but worse for ONT. Interesting...
2017-07-11 15:12:35 -04:00
Heng Li
7598809577
r171: reduced log gap cost at chaining
...
The cost is so large that it discards too many valid seeds without HPC k-mers.
This change may introduce false long gaps to reference mapping. We have another
mechanism mm_filter_bad_seeds() to protect against this. In addition, minimap2
is not that bad to have long gaps. Some other aligners are worse.
Still need tuning in future.
2017-07-11 14:57:49 -04:00
Heng Li
826c8ba892
r170: added a debugging flag
...
something wrong with chaining
2017-07-11 14:47:35 -04:00
Heng Li
801bc84b01
r169: output more accurate col. 10&11 to PAF
...
In r168, col.10 is smaller than what it should be. This confuses miniasm.
2017-07-11 14:09:51 -04:00
Heng Li
782449975d
r168: fixed a bug in long join: a[] not sorted
...
Also added length requirement for long join and changed -g in the ava mode
2017-07-09 12:14:20 -04:00
Heng Li
1ac48556ae
r167: long join threshold depends on gap
...
also caught a bug for reverse strand join
2017-07-09 10:38:51 -04:00
Heng Li
4ee3202539
r164: unmapped read not properly flagged
2017-07-08 18:16:18 -04:00
Heng Li
42846ce65d
r163: reduced long join score requirement
...
because the chaining score is generally smaller with the last few commits.
2017-07-08 15:51:52 -04:00
Heng Li
3f6a0b0b5c
r162: improved chaining accuracy
2017-07-08 14:29:36 -04:00
Heng Li
38b2830e18
r161: filter bad seeds; changed default -g/-r
2017-07-08 13:31:27 -04:00
Heng Li
1fee5f8edc
r160: -O and -E accept two numbers
2017-07-08 11:34:52 -04:00
Heng Li
cc554aee43
r159: use two-piece gap penalty
2017-07-08 10:26:00 -04:00
Heng Li
9823317e8f
r158: optionally ignore base quality
2017-07-05 18:23:50 -04:00
Heng Li
e07daad7ad
r153: sam primary record not set sometimes
2017-07-03 13:18:57 -04:00
Heng Li
a94bc31311
r151: documentation
2017-07-03 12:11:07 -04:00
Heng Li
b625247300
r150: mm_sync_regs() doesn't work with negative id
2017-07-03 11:36:34 -04:00
Heng Li
53c4bf5e4f
r149: introduced debugging flags on CLI
2017-07-03 11:02:32 -04:00
Heng Li
2e4fd9f1d0
r148: revamped regs handling after cigar
2017-07-03 10:44:26 -04:00
Heng Li
e06c342659
r146: in filtering, drop children if parent out
...
This has been causing several segfaults.
2017-07-03 00:28:12 -04:00
Heng Li
51cfb60520
r145: changed default -p from 2 to 0.8
...
For long reads, secondary alignments can be very informative.
2017-07-02 22:51:45 -04:00
Heng Li
632b8638d2
r144: adjust primary aln after cigar
2017-07-02 22:43:02 -04:00
Heng Li
2b45ba7a0b
r143: fixed a segfault and incorrect .parent
2017-07-02 19:56:21 -04:00
Heng Li
74d306a596
fixed bug when retaining 2ndary aln; still buggy
2017-07-02 19:08:30 -04:00
Heng Li
da90b614db
r141: replaced -b with -a (for SAM output)
...
-b sounds like BAM. I like -a better.
2017-07-01 16:54:59 -04:00
Heng Li
2338e887d9
finished the first draft of manpage
2017-07-01 11:25:54 -04:00
Heng Li
a9f089f0aa
r131: wrong EOF test; make mb_size <= batch_size
2017-07-01 09:26:09 -04:00
Heng Li
41efd03d7a
r129: fixed memory leak caused by qualities
2017-06-30 23:48:00 -04:00
Heng Li
426c2975f6
r126: filter by fraction of seed coverage
...
otherwise we may get too many poor overlap mappings.
2017-06-30 22:15:45 -04:00
Heng Li
d73bb28097
r125: changed CLI options
2017-06-30 19:08:47 -04:00
Heng Li
b08591c7a0
r124: a bit better CLI prompt
2017-06-30 15:46:52 -04:00
Heng Li
3a5486325a
r123: fixed a mem leak; more presets
2017-06-30 15:39:05 -04:00
Heng Li
646a746cdc
r122: filter contained aln after DP extension
2017-06-30 15:23:30 -04:00
Heng Li
fce87ce7bd
r121: output QUAL and unmapped to SAM
2017-06-30 14:40:54 -04:00
Heng Li
d11049eb32
r120: use max-scoring seg to control output
...
much better now
2017-06-30 14:21:44 -04:00
Heng Li
08a61c3cfc
r119: fixed a bug hidden by a previous bug
2017-06-30 13:27:47 -04:00
Heng Li
1a903486b9
r118: bugfix - regs unsorted before filtering
2017-06-30 12:52:28 -04:00
Heng Li
5dcd8f8965
r117: fixed a bug in logic
2017-06-30 11:52:42 -04:00
Heng Li
91e1c4d6db
r116: fixed another bug caused by refactoring
2017-06-30 00:03:45 -04:00
Heng Li
52b4d8e2c9
r115: set primary tag; still buggy
2017-06-29 23:48:35 -04:00
Heng Li
c4871f380c
r114: make SAM output better
2017-06-29 23:08:41 -04:00
Heng Li
03267e8fa7
r113: fixed a sam header bug
2017-06-29 22:43:06 -04:00
Heng Li
11167f511b
r112: output z-drop
2017-06-29 22:08:46 -04:00
Heng Li
3825feeeac
r111: changed the default z-drop to 200
2017-06-29 21:37:56 -04:00
Heng Li
e2b86d0332
r110: fixed a bug caused by refactoring
2017-06-29 21:12:31 -04:00
Heng Li
08cbb09fcc
r109: changed the default scoring
2017-06-29 20:21:57 -04:00
Heng Li
4cd456b9ba
r108: refactoring, move reg1 routines to hit.c
2017-06-29 19:44:11 -04:00
Heng Li
337c2a21cd
r105: fixed a bug in repeated right ext when zdrop
2017-06-29 15:45:07 -04:00
Heng Li
b9075d39a8
r104: long gap patching
2017-06-29 14:54:54 -04:00
Heng Li
9fbf7e41e1
r99: report progress
2017-06-28 23:56:33 -04:00
Heng Li
38070e8a05
r98: fixed segfault for certain scoring
...
due to unsigned comparisons between -1 and chromosome length
2017-06-28 22:18:51 -04:00
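The r98 note refers to a classic C pitfall: when a signed -1 is compared with an unsigned value, it is first converted to the largest unsigned value, so the comparison flips. The sketch below emulates 32-bit unsigned conversion in Python to show the effect; the function names are hypothetical, and the exact expression that failed in minimap2 is not shown in the log.

```python
def as_uint32(x: int) -> int:
    """Reinterpret an integer as a 32-bit unsigned value, as C conversion would."""
    return x & 0xFFFFFFFF

def less_than_signed(pos: int, chrom_len: int) -> bool:
    """Comparison as intended: -1 is smaller than any chromosome length."""
    return pos < chrom_len

def less_than_unsigned(pos: int, chrom_len: int) -> bool:
    """Comparison after unsigned conversion: -1 becomes 4294967295."""
    return as_uint32(pos) < as_uint32(chrom_len)
```

With `pos = -1` and `chrom_len = 1000`, the signed comparison is true while the unsigned one is false.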
Heng Li
a25866c25c
r96: min_cnt still wrong in chaining
2017-06-28 11:03:03 -04:00
Heng Li
bf0e8199e2
r94: min_cnt is tested in a wrong way in chain
2017-06-28 10:39:27 -04:00
Heng Li
bcd9b1c621
r93: fixed various small issues
2017-06-28 10:35:21 -04:00
Heng Li
cdc2a1e29f
r92: fixed a bug for overlapping alignment
...
On the PBcR example E. coli reads, miniasm gives one circular unitig.
2017-06-27 22:03:31 -04:00
Heng Li
51057ab673
expose scoring
2017-06-27 21:37:25 -04:00
Heng Li
533150d49d
r90: revert default band width to 1000
...
10000 is excessively tolerant with bad hits.
2017-06-27 20:29:39 -04:00
Heng Li
fa80177e58
r89: added minimal number of minimizer counts
2017-06-27 18:43:15 -04:00
Heng Li
8977f07269
r88: fixed an out-of-boundary bug in ksw2
2017-06-27 14:50:31 -04:00
Heng Li
42283ef10c
r87: fixed a bug in ksw2
2017-06-27 13:29:48 -04:00
Heng Li
c02ff4662c
r85: two-round z-drop
2017-06-27 10:36:24 -04:00
Heng Li
99c57b86c5
r79: drop bad hits
2017-06-26 15:28:04 -04:00
Heng Li
5b614ae828
r78: fixed a split bug
2017-06-26 14:45:23 -04:00
Heng Li
de54c9dac2
r77: fixed an index loading bug (offset not set)
2017-06-26 13:56:25 -04:00
Heng Li
10644f2165
r76: missing header file
2017-06-26 12:36:37 -04:00
Heng Li
4b8e88a5f4
use long options
2017-06-26 12:31:36 -04:00
Heng Li
640b1a1727
command-line option to control CIGAR output
2017-06-26 11:41:09 -04:00
Heng Li
b1077ff14c
sam output
2017-06-25 22:05:20 -04:00
Heng Li
72dfb0c99e
fixed a bug in ksw2
2017-06-25 10:22:13 -04:00
Heng Li
b04e4b9215
r36: bring back primary; don't output all mappings
2017-06-08 15:28:19 -04:00
Heng Li
19e43571c1
r34: removed a bit of unused code
2017-06-07 14:35:57 -04:00
Heng Li
d816e48fce
fixed a bug in chaining
2017-06-06 14:33:43 -04:00
Heng Li
6d4348db44
dp chaining mostly works, but fails sometimes
...
which means there are bugs that need to be fixed
2017-06-06 14:19:50 -04:00
Heng Li
1a9fc04cf0
backup
2017-06-06 10:16:33 -04:00
Heng Li
acc7382a30
backup
2017-06-04 16:09:45 -04:00
Heng Li
06adabd0dc
clean bill from valgrind
2017-05-04 12:44:49 +08:00
Heng Li
f2ae8eb670
mostly debugging code
2017-05-01 16:50:09 +08:00
Heng Li
7b7fabef4d
added idx_stat
2017-04-26 22:52:28 +08:00
Heng Li
de367a340c
compilable again
2017-04-26 19:36:46 +08:00
Heng Li
56723ad580
moved `sum_len` out of the index
...
as it can be inferred.
2017-04-19 11:06:24 -04:00
Heng Li
f35e152e99
fixed a few memory leaks
2017-04-13 23:05:19 -04:00
Heng Li
79c9478f46
backup
2017-04-09 14:59:39 -04:00
Heng Li
8c230563cc
can be compiled
2017-04-07 15:56:10 -04:00
Heng Li
f5cdd3f72f
is_hpc is a property of the index
2017-04-07 15:42:33 -04:00