Simon Harris
4db1c0295c
Fixed segfault caused when reading from index file
2017-09-01 15:48:59 +01:00
Heng Li
d4074874ee
r316: get rid of a harmless gcc warning
2017-09-01 20:25:27 +08:00
Heng Li
eccdb3a1ca
r315: added getopt from musl
2017-09-01 20:20:34 +08:00
Stefan von Deylen
a3c3db6b9b
ONT source code changes to compile with MSVC 14
2017-08-30 16:25:20 +01:00
Heng Li
c4080aaf7e
Merge branch 'master' into short
2017-08-28 07:02:22 +08:00
Heng Li
bf8246f872
Release minimap2-2.1-r311
2017-08-25 13:35:55 +08:00
Heng Li
0fe1a224ab
r309: improved SAM header output
2017-08-25 10:35:58 +08:00
Heng Li
993a2bb521
r301: separate introns from deletions
...
When an intron is adjacent to a deletion, the old code counted both as introns,
which led to an inaccurate exon boundary.
2017-08-18 15:31:15 +08:00
Heng Li
64c1389e1a
Merge branch 'master' into splice
2017-08-17 23:39:27 +08:00
Heng Li
81cff97208
r299: support -h: output to stdout; return 0
2017-08-17 23:38:31 +08:00
Heng Li
bbb37d95f2
support inserting RG lines
2017-08-17 23:34:09 +08:00
Heng Li
2cde8d257c
r297: bidirectional RNA alignment
2017-08-17 06:02:44 -04:00
Heng Li
b5f5929bf9
r296: expose splicing related options to CLI
2017-08-13 21:37:51 -04:00
Heng Li
28f86688ab
r295: gap closure from the middle of non-HPC k
...
This WILL slightly affect the result of genomic mapping, but hopefully
in the good direction.
2017-08-12 23:48:43 -04:00
Heng Li
43506edbc5
backup: preliminary boundary alignment
2017-08-12 23:10:14 -04:00
Heng Li
53b3265d84
r290: in techrep, explain spliced alignment
2017-08-12 15:40:49 -04:00
Heng Li
a23df2dc91
r289: changed CLI help only
2017-08-12 12:40:07 -04:00
Heng Li
5a74088b74
r288: changed max intron length to 200k
2017-08-12 12:39:21 -04:00
Heng Li
d240318741
r287: refined CLI options and manpage
2017-08-12 12:26:04 -04:00
Heng Li
0f4c823b0c
r286: ignore introns when computing max seg score
2017-08-12 10:58:16 -04:00
Heng Li
a99358bc3d
r282: reduced intron cost; added eval script
2017-08-11 00:06:01 -04:00
Heng Li
163fa36ee6
r281: don't open long gaps on query
2017-08-10 15:04:59 -04:00
Heng Li
c59b0781bc
r280: output introns as "N" in the cdna mode
2017-08-09 11:45:02 -04:00
Heng Li
7429b12164
Merge branch 'master' into cdna
2017-08-08 22:00:24 -04:00
Heng Li
9e1125edda
r277: abort if query/-d missing (#11)
2017-08-08 21:46:15 -04:00
Heng Li
3dbe23b34e
Merge branch 'dev'
2017-08-08 21:30:32 -04:00
Heng Li
6840370f3c
Release minimap2-2.0 (r275)
2017-08-08 21:16:25 -04:00
Heng Li
7f9f659b6a
r274: CLI option to change max ref gap
2017-08-08 11:39:23 -04:00
Heng Li
1a7d782131
r273: cdna mapping mode for testing
...
Differences from the typical mapping mode:
* banded alignment disabled
* log gap cost during chaining
* zero long-gap extension during alignment
* up to 100kb (by default) reference gap
* bad seeding not filtered (to tune later)
2017-08-08 11:31:49 -04:00
Heng Li
079ec0d283
r271: added "short" preset; for testing only
2017-08-07 15:30:05 -04:00
Heng Li
12cea727b8
r238: bugfix to cs - rev sequence not complemented
2017-08-01 10:33:21 -04:00
Heng Li
cd105b47f2
r237: fixed a bug in outputting cs:Z
2017-07-31 14:49:39 -04:00
Heng Li
35f232c3fa
r236: in cs tag, output differences in lowercase
...
for easy eyeballing
2017-07-31 12:17:48 -04:00
Heng Li
4c0713ee14
r235: optionally output tag cs in PAF
...
cs encodes the query, the reference sequence and CIGAR.
2017-07-31 12:06:49 -04:00
Riku Walve
9e09c1ae72
fix self-comparison in index parameter override check
2017-07-30 21:46:25 +03:00
Heng Li
d8d4d29b68
Release minimap2-2.0rc1-r232
2017-07-30 14:32:40 -04:00
Heng Li
1f78e1ee53
r230: code formatting changes only
2017-07-30 12:31:40 -04:00
Heng Li
5934d68772
r229: a new way to prevent out-of-band backtrack
2017-07-29 23:52:30 -04:00
Heng Li
fa99d28d34
r228: reduced unnecessary INV alignment
2017-07-29 20:21:53 -04:00
Heng Li
d08b7a0c51
r227: use local alignment for INV alignment
2017-07-29 17:40:53 -04:00
Heng Li
da3db3c095
r226: only try inv alignment for primary
2017-07-29 14:09:35 -04:00
Heng Li
783ead6f47
r225: removed a debugging line
2017-07-29 13:21:38 -04:00
Heng Li
19d6ec885e
r224: inversion alignment around Z-drop break
2017-07-29 13:09:10 -04:00
Heng Li
5e3eecd6d4
r222: no effective changes
2017-07-29 10:31:46 -04:00
Heng Li
2179e9e24b
r221: output SA in the SAM output
2017-07-28 23:08:39 -04:00
Heng Li
ebbe9c1eb8
r219: fixed a bug caused by skipping tandem seeds
2017-07-28 14:06:56 -04:00
Heng Li
c672690564
r218: increase the frequency of SW slightly
2017-07-28 13:30:42 -04:00
Heng Li
f4fee60188
r217: ignore tandem seeds during alignment
...
This helps a tiny bit.
2017-07-28 12:26:56 -04:00
Heng Li
254280b8af
r216: a bit of cleanup; identical output to r215
2017-07-28 11:54:18 -04:00
Heng Li
fc965805f7
r215: bring back a log gap component
...
Otherwise chaining may more often break a long gap into several gaps.
2017-07-28 00:17:19 -04:00
Heng Li
2c79580649
r213: more careful solution to wrong seeds
...
a little better, but not good enough!
2017-07-27 13:19:11 -04:00
Heng Li
b927838495
r212: better heuristic to fix wrong seeding
...
but not good enough. Will explore more.
2017-07-27 11:24:51 -04:00
Heng Li
371e20cc7c
r211: a better heuristic to reduce false seeds
2017-07-26 23:56:38 -04:00
Heng Li
a01d758af6
r206: mapq penalize short chains further
...
The old code penalized at the log() scale. Now added a linear-scaled factor. If
the chain consists of few minimizers, its quality is really not good.
2017-07-26 11:50:04 -04:00
Heng Li
e9dc1ce2b6
r205: when computing mapq, consider min_chain_sc
...
Not doing this was a mistake.
2017-07-26 11:34:14 -04:00
Heng Li
00c6db5073
r203: check more subopt aln if score small
2017-07-25 20:02:44 -04:00
Heng Li
f2ef48878a
r202: trim bad chain ends before extension
...
This fixes a few more FP long INDELs towards the end of alignments.
2017-07-25 19:53:19 -04:00
Heng Li
21ca564112
r201: fixed a minor chaining issue
...
Chaining looked at the end of a chain, but the end may not be the best. We now
go back to find the max.
2017-07-25 18:26:51 -04:00
Heng Li
215e92ed7b
r200: reduce long gaps in chaining
...
Every seed can initiate a chain.
2017-07-25 17:32:54 -04:00
Heng Li
b530ade333
r199: changed to linear gap cost for chaining
...
The old cost doesn't penalize long gaps enough. Will also drop seeds close to
the edge in the next commit.
2017-07-25 15:35:10 -04:00
Heng Li
f81f37fef1
r197: allocate index seq names from kalloc
...
to reduce malloc() overhead.
2017-07-24 19:36:05 -04:00
Heng Li
5c4d040b13
r191: warning if CLI index opt diff from prebuilt
...
Also added index testing API (moved from main.c to index.c)
2017-07-19 10:25:11 -04:00
Heng Li
4aff301ef4
r190: default -k to 15; added -x map-ont
2017-07-19 10:11:14 -04:00
Heng Li
470021fd27
r189: sync with ksw2 (no effective changes)
2017-07-19 09:28:25 -04:00
Heng Li
71c988f6ab
r188: renamed bseq* to mm_bseq*
...
to avoid naming collisions between minimap2 and bwa/fermi-lite/etc
2017-07-19 09:26:46 -04:00
Heng Li
b9b0b6f49c
r187: fixed non-terminated sam output (#3)
...
Only happens for unmapped reads with quality, in the SAM output
2017-07-18 15:20:29 -04:00
Heng Li
495a78e40a
Get documentation ready for release
2017-07-18 11:04:09 -04:00
Heng Li
71e2a97a4c
r180: changed -x asm5 settings
2017-07-18 00:00:36 -04:00
Heng Li
941059292e
r179: changed the preset for assembly alignment
2017-07-17 22:41:46 -04:00
Heng Li
38aa66fa30
r178: fixed integer overflow in mapq calculation
2017-07-16 21:45:39 -04:00
Heng Li
b4280d186f
r176: removed seedcov_ratio; changed default opt
...
min_seedcov_ratio is not used
2017-07-12 12:47:46 -04:00
Heng Li
52caf79395
r175: halved max-chain-skip in the ava mode
2017-07-12 10:42:19 -04:00
Heng Li
eeeb2ffb68
r174: make max-chain-skip work
...
The max-chain-skip heuristic did not work due to a bug. Without this
heuristic, chaining is too slow for long-read overlap.
2017-07-12 10:08:06 -04:00
Heng Li
33451aba45
r173: changed the debugging output format
2017-07-11 15:23:28 -04:00
Heng Li
cfa083a98b
r172: separated PacBio and ONT read overlapping
...
HPC k-mer works better for PacBio, but worse for ONT. Interesting...
2017-07-11 15:12:35 -04:00
Heng Li
7598809577
r171: reduced log gap cost at chaining
...
The cost is so large that it discards too many valid seeds without HPC k-mers.
This change may introduce false long gaps to reference mapping. We have another
mechanism mm_filter_bad_seeds() to protect against this. In addition, minimap2
is not that bad to have long gaps. Some other aligners are worse.
Still need tuning in future.
2017-07-11 14:57:49 -04:00
Heng Li
826c8ba892
r170: added a debugging flag
...
something wrong with chaining
2017-07-11 14:47:35 -04:00
Heng Li
801bc84b01
r169: output more accurate col. 10&11 to PAF
...
In r168, col.10 is smaller than what it should be. This confuses miniasm.
2017-07-11 14:09:51 -04:00
Heng Li
782449975d
r168: fixed a bug in long join: a[] not sorted
...
Also added length requirement for long join and changed -g in the ava mode
2017-07-09 12:14:20 -04:00
Heng Li
1ac48556ae
r167: long join threshold depends on gap
...
also caught a bug for reverse strand join
2017-07-09 10:38:51 -04:00
Heng Li
4ee3202539
r164: unmapped read not properly flagged
2017-07-08 18:16:18 -04:00
Heng Li
42846ce65d
r163: reduced long join score requirement
...
because the chaining score is generally smaller with the last few commits.
2017-07-08 15:51:52 -04:00
Heng Li
3f6a0b0b5c
r162: improved chaining accuracy
2017-07-08 14:29:36 -04:00
Heng Li
38b2830e18
r161: filter bad seeds; changed default -g/-r
2017-07-08 13:31:27 -04:00
Heng Li
1fee5f8edc
r160: -O and -E accept two numbers
2017-07-08 11:34:52 -04:00
Heng Li
cc554aee43
r159: use two-piece gap penalty
2017-07-08 10:26:00 -04:00
Heng Li
9823317e8f
r158: optionally ignore base quality
2017-07-05 18:23:50 -04:00
Heng Li
e07daad7ad
r153: sam primary record not set sometimes
2017-07-03 13:18:57 -04:00
Heng Li
a94bc31311
r151: documentation
2017-07-03 12:11:07 -04:00
Heng Li
b625247300
r150: mm_sync_regs() doesn't work with negative id
2017-07-03 11:36:34 -04:00
Heng Li
53c4bf5e4f
r149: introduced debugging flags on CLI
2017-07-03 11:02:32 -04:00
Heng Li
2e4fd9f1d0
r148: revamped regs handling after cigar
2017-07-03 10:44:26 -04:00
Heng Li
e06c342659
r146: in filtering, drop children if parent out
...
This has been causing several segfaults.
2017-07-03 00:28:12 -04:00
Heng Li
51cfb60520
r145: changed default -p from 2 to 0.8
...
For long reads, secondary alignments can be very informative.
2017-07-02 22:51:45 -04:00
Heng Li
632b8638d2
r144: adjust primary aln after cigar
2017-07-02 22:43:02 -04:00
Heng Li
2b45ba7a0b
r143: fixed a segfault and incorrect .parent
2017-07-02 19:56:21 -04:00
Heng Li
74d306a596
fixed bug when retaining 2ndary aln; still buggy
2017-07-02 19:08:30 -04:00
Heng Li
da90b614db
r141: replaced -b with -a (for SAM output)
...
-b sounds like BAM. I like -a better.
2017-07-01 16:54:59 -04:00
Heng Li
2338e887d9
finished the first draft of manpage
2017-07-01 11:25:54 -04:00
Heng Li
a9f089f0aa
r131: wrong EOF test; make mb_size <= batch_size
2017-07-01 09:26:09 -04:00
Heng Li
41efd03d7a
r129: fixed memory leak caused by qualities
2017-06-30 23:48:00 -04:00
Heng Li
426c2975f6
r126: filter by fraction of seed coverage
...
otherwise we may get too many poor overlap mappings.
2017-06-30 22:15:45 -04:00
Heng Li
d73bb28097
r125: changed CLI options
2017-06-30 19:08:47 -04:00
Heng Li
b08591c7a0
r124: a bit better CLI prompt
2017-06-30 15:46:52 -04:00
Heng Li
3a5486325a
r123: fixed a mem leak; more presets
2017-06-30 15:39:05 -04:00
Heng Li
646a746cdc
r122: filter contained aln after DP extension
2017-06-30 15:23:30 -04:00
Heng Li
fce87ce7bd
r121: output QUAL and unmapped to SAM
2017-06-30 14:40:54 -04:00
Heng Li
d11049eb32
r120: use max-scoring seg to control output
...
much better now
2017-06-30 14:21:44 -04:00
Heng Li
08a61c3cfc
r119: fixed a bug hidden by a previous bug
2017-06-30 13:27:47 -04:00
Heng Li
1a903486b9
r118: bugfix - regs unsorted before filtering
2017-06-30 12:52:28 -04:00
Heng Li
5dcd8f8965
r117: fixed a bug in logic
2017-06-30 11:52:42 -04:00
Heng Li
91e1c4d6db
r116: fixed another bug caused by refactoring
2017-06-30 00:03:45 -04:00
Heng Li
52b4d8e2c9
r115: set primary tag; still buggy
2017-06-29 23:48:35 -04:00
Heng Li
c4871f380c
r114: make SAM output better
2017-06-29 23:08:41 -04:00
Heng Li
03267e8fa7
r113: fixed a sam header bug
2017-06-29 22:43:06 -04:00
Heng Li
11167f511b
r112: output z-drop
2017-06-29 22:08:46 -04:00
Heng Li
3825feeeac
r111: changed the default z-drop to 200
2017-06-29 21:37:56 -04:00
Heng Li
e2b86d0332
r110: fixed a bug caused by refactoring
2017-06-29 21:12:31 -04:00
Heng Li
08cbb09fcc
r109: changed the default scoring
2017-06-29 20:21:57 -04:00
Heng Li
4cd456b9ba
r108: refactoring, move reg1 routines to hit.c
2017-06-29 19:44:11 -04:00
Heng Li
337c2a21cd
r105: fixed a bug in repeated right ext when zdrop
2017-06-29 15:45:07 -04:00
Heng Li
b9075d39a8
r104: long gap patching
2017-06-29 14:54:54 -04:00
Heng Li
9fbf7e41e1
r99: report progress
2017-06-28 23:56:33 -04:00
Heng Li
38070e8a05
r98: fixed segfault for certain scoring
...
due to unsigned comparisons between -1 and chromosome length
2017-06-28 22:18:51 -04:00
Heng Li
a25866c25c
r96: min_cnt still wrong in chaining
2017-06-28 11:03:03 -04:00
Heng Li
bf0e8199e2
r94: min_cnt is tested in a wrong way in chain
2017-06-28 10:39:27 -04:00
Heng Li
bcd9b1c621
r93: fixed various small issues
2017-06-28 10:35:21 -04:00
Heng Li
cdc2a1e29f
r92: fixed a bug for overlapping alignment
...
On the PBcR example E. coli reads, miniasm gives one circular unitig.
2017-06-27 22:03:31 -04:00
Heng Li
51057ab673
expose scoring
2017-06-27 21:37:25 -04:00
Heng Li
533150d49d
r90: revert default band width to 1000
...
10000 is excessively tolerant with bad hits.
2017-06-27 20:29:39 -04:00
Heng Li
fa80177e58
r89: added minimal number of minimizer counts
2017-06-27 18:43:15 -04:00
Heng Li
8977f07269
r88: fixed an out-of-boundary bug in ksw2
2017-06-27 14:50:31 -04:00
Heng Li
42283ef10c
r87: fixed a bug in ksw2
2017-06-27 13:29:48 -04:00
Heng Li
c02ff4662c
r85: two-round z-drop
2017-06-27 10:36:24 -04:00
Heng Li
99c57b86c5
r79: drop bad hits
2017-06-26 15:28:04 -04:00
Heng Li
5b614ae828
r78: fixed a split bug
2017-06-26 14:45:23 -04:00
Heng Li
de54c9dac2
r77: fixed an index loading bug (offset not set)
2017-06-26 13:56:25 -04:00
Heng Li
10644f2165
r76: missing header file
2017-06-26 12:36:37 -04:00
Heng Li
4b8e88a5f4
use long options
2017-06-26 12:31:36 -04:00
Heng Li
640b1a1727
command-line option to control CIGAR output
2017-06-26 11:41:09 -04:00
Heng Li
b1077ff14c
sam output
2017-06-25 22:05:20 -04:00
Heng Li
72dfb0c99e
fixed a bug in ksw2
2017-06-25 10:22:13 -04:00
Heng Li
b04e4b9215
r36: bring back primary; don't output all mappings
2017-06-08 15:28:19 -04:00
Heng Li
19e43571c1
r34: removed a bit of unused code
2017-06-07 14:35:57 -04:00
Heng Li
d816e48fce
fixed a bug in chaining
2017-06-06 14:33:43 -04:00
Heng Li
6d4348db44
dp chaining mostly works, but fails sometimes
...
which means there are bugs that need to be fixed
2017-06-06 14:19:50 -04:00
Heng Li
1a9fc04cf0
backup
2017-06-06 10:16:33 -04:00
Heng Li
acc7382a30
backup
2017-06-04 16:09:45 -04:00
Heng Li
06adabd0dc
clean bill from valgrind
2017-05-04 12:44:49 +08:00
Heng Li
f2ae8eb670
mostly debugging code
2017-05-01 16:50:09 +08:00
Heng Li
7b7fabef4d
added idx_stat
2017-04-26 22:52:28 +08:00
Heng Li
de367a340c
compilable again
2017-04-26 19:36:46 +08:00
Heng Li
56723ad580
moved `sum_len` out of the index
...
as it can be inferred.
2017-04-19 11:06:24 -04:00
Heng Li
f35e152e99
fixed a few memory leaks
2017-04-13 23:05:19 -04:00
Heng Li
79c9478f46
backup
2017-04-09 14:59:39 -04:00
Heng Li
8c230563cc
can be compiled
2017-04-07 15:56:10 -04:00
Heng Li
f5cdd3f72f
is_hpc is a property of the index
2017-04-07 15:42:33 -04:00