127 Commits (9ab95be1bb36d2c1925f3f31b92ea5fbf2bf9d25)

Author SHA1 Message Date
Heng Li e6f525edaf r512: option to filter poorly aligned reads 2017-10-16 10:38:22 -04:00
Heng Li 7c555f9b7e r508: use two I/O threads for mapping
-x sr applies this option by default
2017-10-12 14:56:01 -04:00
Heng Li 7345621759 r499: end bonus working; DP region needs improvement! 2017-10-11 00:14:25 -04:00
Heng Li 61e56c941d r488: parameter to control max fragment length 2017-10-07 23:54:32 -04:00
Heng Li 9c5767f9ed r477: renamed multi_seg to frag_mode 2017-10-05 15:48:17 -04:00
Heng Li ae2adf04d4 r476: multi-file fragment mode working 2017-10-05 15:39:26 -04:00
Heng Li f4a5d3a692 r474: replaced -S and --cs-no-equal with --cs 2017-10-05 15:03:03 -04:00
Heng Li 5ab99eb26e more accurate SAM flag 2017-10-05 10:59:38 -04:00
Heng Li 9aba11769c r467: added : (equal length) and ^ (intron) ops 2017-10-04 21:55:37 -04:00
Heng Li 7d50e646dd r466: detect multi-part index more smartly
though it might not work in an extremely rare case: the end of a sequence ends
at X*16384 and it is the last sequence in a batch. This can be resolved by
never letting the kstream_t buffer empty.
2017-10-04 17:32:58 -04:00
Heng Li 2581c44a21 r463: optionally disable secondary hits 2017-10-04 13:24:41 -04:00
Heng Li 2a1e738a94 r461: randomize repetitive hits 2017-10-04 13:05:18 -04:00
Heng Li cf55c84056 r460: added option --no-long-join 2017-10-04 12:08:44 -04:00
Heng Li 04fb2c2ec0 r454: rechain with higher max_occ if no good chain 2017-09-29 19:24:32 -04:00
Heng Li 7e0d70bfd3 r445: pair coordinate adjustment working
Next: mapq adjustment, which will be tricky...
2017-09-27 15:38:18 -04:00
Heng Li a349d85280 r444: changed the way orientation is specified
The old model doesn't work with RF or RR orientation. The new model only works
with paired-end reads. For >2 segments, only FF is supported.
2017-09-27 12:33:10 -04:00
Heng Li f611edf6f2 r443: don't filter small cm for split seg 2017-09-26 16:17:58 -04:00
Heng Li 3bb66e1ed3 multi-seg working on toy examples 2017-09-25 13:42:04 -04:00
Heng Li f0951141a1 allow to read multiple files interleaved 2017-09-24 14:33:05 -04:00
Heng Li 645db3350e Merge branch 'master' into sr 2017-09-20 11:15:14 -04:00
Heng Li 75e6bbc9f6 r421: removed the MM_F_SPLICE_BOTH mode
In the default splice mode, minimap2 applies two rounds of spliced alignment:
first assuming GT-AG to be the splice signal across all splicing sites and then
assuming CT-AC to be the signal. This is the ideal strategy.

In the MM_F_SPLICE_BOTH mode, minimap2 applies one round of spliced alignment,
assuming GT-AG and CT-AC to be the splice signals AT THE SAME TIME. This will
be faster but less accurate. I don't think anyone would like to run minimap2 in
this mode, so I am removing it for clarity.
2017-09-20 11:11:53 -04:00
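The two-round idea in r421 can be illustrated with a toy sketch. This is not minimap2 code: `intron_t`, `count_signal` and `pick_splice_round` are hypothetical names, and the sketch only counts how many candidate introns fit each signal instead of running a real spliced alignment. It also assumes the better-supported round is the one kept, which the commit message does not state.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical type: a candidate intron as a half-open interval
 * [start, end) on the reference sequence. */
typedef struct { int start, end; } intron_t;

/* Count introns whose first two bases equal `donor` and whose last two
 * bases equal `acceptor`. One call models one "round" in which a single
 * signal is assumed at every splice site. */
static int count_signal(const char *ref, const intron_t *in, int n,
                        const char *donor, const char *acceptor)
{
    int i, cnt = 0, len = (int)strlen(ref);
    for (i = 0; i < n; ++i) {
        assert(in[i].start >= 0 && in[i].end <= len);
        assert(in[i].end - in[i].start >= 4);
        if (strncmp(ref + in[i].start, donor, 2) == 0 &&
            strncmp(ref + in[i].end - 2, acceptor, 2) == 0)
            ++cnt;
    }
    return cnt;
}

/* Round 1 assumes GT-AG everywhere; round 2 assumes CT-AC everywhere.
 * Returns '+' if round 1 fits at least as many introns, else '-'.
 * (Assumption: the better round wins.) */
char pick_splice_round(const char *ref, const intron_t *in, int n)
{
    int fwd = count_signal(ref, in, n, "GT", "AG");
    int rev = count_signal(ref, in, n, "CT", "AC");
    return fwd >= rev ? '+' : '-';
}
```

By contrast, a single round that accepts either signal at each site, as the removed MM_F_SPLICE_BOTH mode is described as doing, could mix GT-AG and CT-AC introns within one alignment. That is one plausible reading of why the commit calls it less accurate.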
Heng Li 7a9b4db874 replaced --approx-ext with --sr
--sr disables Z-drop and may come with other heuristics
2017-09-20 10:51:18 -04:00
Heng Li fb1bcc0084 early exploration 2017-09-19 16:18:28 -04:00
Heng Li 75ff7ceec5 r368: API documentation 2017-09-14 22:23:04 -04:00
Heng Li e2823d4aee r367: index reader optionally writes index 2017-09-14 21:18:13 -04:00
Heng Li eb00521d9b redesigned indexing and option APIs 2017-09-14 17:02:01 -04:00
Heng Li 0f7455cefa r365: documented the "sr" preset 2017-09-14 12:57:21 -04:00
Heng Li 3c91d652dd r360: allow to set integer max occ 2017-09-13 11:37:00 -04:00
Heng Li d7f2ac1d4f better parameters for short reads
It turns out the key problem is not the minimizer density. It is the max
occurrence that tends to affect results more, especially sensitivity. There is
still lots of work to do, but for now, it seems a good start.
2017-09-12 16:11:23 -04:00
Heng Li 0fe1a224ab r309: improved SAM header output 2017-08-25 10:35:58 +08:00
Heng Li 2cde8d257c r297: bidirectional RNA alignment 2017-08-17 06:02:44 -04:00
Heng Li b5f5929bf9 r296: expose splicing related options to CLI 2017-08-13 21:37:51 -04:00
Heng Li 43506edbc5 backup: preliminary boundary alignment 2017-08-12 23:10:14 -04:00
Heng Li d240318741 r287: refined CLI options and manpage 2017-08-12 12:26:04 -04:00
Heng Li 1a7d782131 r273: cdna mapping mode for testing
Differences from the typical mapping mode:

* banded alignment disabled
* log gap cost during chaining
* zero long-gap extension during alignment
* up to 100kb (by default) reference gap
* bad seeding not filtered (to tune later)
2017-08-08 11:31:49 -04:00
Heng Li 4c0713ee14 r235: optionally output tag cs in PAF
cs encodes the query, the reference sequence and CIGAR.
2017-07-31 12:06:49 -04:00
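To make "cs encodes the query, the reference sequence and CIGAR" concrete, here is a minimal sketch that builds a cs-style string from two gapped alignment rows. It follows the short form described in later minimap2 documentation (`:N` for N identical bases, `*xy` for a substitution from reference base x to query base y, `+seq` for an insertion, `-seq` for a deletion). The encoding at r235 may have differed, since the log shows ops still being added at r467. `make_cs` is a hypothetical helper, not a minimap2 function.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

static char lc(char c) { return (char)tolower((unsigned char)c); }

/* Hypothetical helper: ref and qry are alignment rows of equal length,
 * with '-' marking a gap. Writes a NUL-terminated cs-style string to out.
 * Returns the string length, or -1 if out (capacity cap) is too small. */
int make_cs(const char *ref, const char *qry, char *out, size_t cap)
{
    size_t i = 0, o = 0, n = strlen(ref);
    assert(strlen(qry) == n);
    while (i < n) {
        if (ref[i] == '-' || qry[i] == '-') {          /* indel run */
            int is_ins = (ref[i] == '-');
            const char *src = is_ins ? qry : ref;      /* row with bases */
            const char *gap = is_ins ? ref : qry;      /* row with gaps  */
            if (o + 1 >= cap) return -1;
            out[o++] = is_ins ? '+' : '-';
            for (; i < n && gap[i] == '-'; ++i) {
                if (o + 1 >= cap) return -1;
                out[o++] = lc(src[i]);
            }
        } else if (lc(ref[i]) == lc(qry[i])) {         /* match run */
            size_t j = i;
            int w;
            while (j < n && ref[j] != '-' && qry[j] != '-' &&
                   lc(ref[j]) == lc(qry[j]))
                ++j;
            w = snprintf(out + o, cap - o, ":%lu", (unsigned long)(j - i));
            if (w < 0 || (size_t)w >= cap - o) return -1;
            o += (size_t)w;
            i = j;
        } else {                                       /* substitution */
            if (o + 3 >= cap) return -1;
            out[o++] = '*';
            out[o++] = lc(ref[i]);
            out[o++] = lc(qry[i]);
            ++i;
        }
    }
    if (o >= cap) return -1;
    out[o] = '\0';
    return (int)o;
}
```

For the rows `CGATCGATAAATAGAGTAG---GAATAGCA` (reference) and `CGATCG---AATAGAGTAGGTCGAATtGCA` (query), this sketch produces `:6-ata:10+gtc:4*at:3`. Both sequences and the CIGAR operations can be recovered from that string together with the reference.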
Heng Li 19d6ec885e r224: inversion alignment around Z-drop break 2017-07-29 13:09:10 -04:00
Heng Li f81f37fef1 r197: allocate index seq names from kalloc
to reduce malloc() overhead.
2017-07-24 19:36:05 -04:00
Heng Li 5c4d040b13 r191: warning if CLI index opt diff from prebuilt
Also added index testing API (moved from main.c to index.c)
2017-07-19 10:25:11 -04:00
Heng Li 71c988f6ab r188: renamed bseq* to mm_bseq*
to avoid naming collisions between minimap2 and bwa/fermi-lite/etc
2017-07-19 09:26:46 -04:00
Heng Li b4280d186f r176: removed seedcov_ratio; changed default opt
min_seedcov_ratio is not used
2017-07-12 12:47:46 -04:00
Heng Li 801bc84b01 r169: output more accurate col. 10&11 to PAF
In r168, col.10 is smaller than what it should be. This confuses miniasm.
2017-07-11 14:09:51 -04:00
Heng Li cc554aee43 r159: use two-piece gap penalty 2017-07-08 10:26:00 -04:00
Heng Li 9823317e8f r158: optionally ignore base quality 2017-07-05 18:23:50 -04:00
Heng Li 53c4bf5e4f r149: introduced debugging flags on CLI 2017-07-03 11:02:32 -04:00
Heng Li 632b8638d2 r144: adjust primary aln after cigar 2017-07-02 22:43:02 -04:00
Heng Li 74d306a596 fixed bug when retaining 2ndary aln; still buggy 2017-07-02 19:08:30 -04:00
Heng Li 426c2975f6 r126: filter by fraction of seed coverage
otherwise we may get too many poor overlap mappings.
2017-06-30 22:15:45 -04:00
Heng Li d11049eb32 r120: use max-scoring seg to control output
much better now
2017-06-30 14:21:44 -04:00
Heng Li 52b4d8e2c9 r115: set primary tag; still buggy 2017-06-29 23:48:35 -04:00
Heng Li 11167f511b r112: output z-drop 2017-06-29 22:08:46 -04:00
Heng Li c8d122bcdb backup 2017-06-29 11:11:15 -04:00
Heng Li bcd9b1c621 r93: fixed various small issues 2017-06-28 10:35:21 -04:00
Heng Li fa80177e58 r89: added minimal number of minimizer counts 2017-06-27 18:43:15 -04:00
Heng Li 640b1a1727 command-line option to control CIGAR output 2017-06-26 11:41:09 -04:00
Heng Li b1077ff14c sam output 2017-06-25 22:05:20 -04:00
Heng Li aa5881e7bb backup 2017-06-24 22:51:31 -04:00
Heng Li 35b84f88c6 backup 2017-06-23 22:42:15 -04:00
Heng Li 4fea3d778a backup 2017-06-23 18:57:00 -04:00
Heng Li 6c8368c24c get the left-extension sequence correctly 2017-06-23 18:25:47 -04:00
Heng Li 990f7b0b71 backup 2017-06-23 15:13:53 -04:00
Heng Li 4ae0b46972 min_ksw_len 2017-06-23 14:38:28 -04:00
Heng Li 9cd313eae1 sequence retrieval working 2017-06-23 14:11:56 -04:00
Heng Li 326d91deb0 backup 2017-06-23 14:06:00 -04:00
Heng Li 44cdd18de0 start to work on alignment 2017-06-23 13:44:45 -04:00
Heng Li b04e4b9215 r36: bring back primary; don't output all mappings 2017-06-08 15:28:19 -04:00
Heng Li 19e43571c1 r34: removed a bit of unused code 2017-06-07 14:35:57 -04:00
Heng Li 8ad5cfde42 output PAF 2017-06-07 14:18:32 -04:00
Heng Li 6d4348db44 dp chaining mostly works, but fails sometimes
which means there are bugs that need to be fixed
2017-06-06 14:19:50 -04:00
Heng Li 1a9fc04cf0 backup 2017-06-06 10:16:33 -04:00
Heng Li acc7382a30 backup 2017-06-04 16:09:45 -04:00
Heng Li 7b7fabef4d added idx_stat 2017-04-26 22:52:28 +08:00
Heng Li de367a340c compilable again 2017-04-26 19:36:46 +08:00
Heng Li 56723ad580 moved `sum_len` out of the index
as it can be inferred.
2017-04-19 11:06:24 -04:00
Heng Li f5cdd3f72f is_hpc is a property of the index 2017-04-07 15:42:33 -04:00
Heng Li b3bc4911ba index can be compiled; not tested yet 2017-04-07 15:30:30 -04:00
Heng Li 01baa847a1 Homopolymer-compressed k-mer sketch 2017-04-06 15:37:34 -04:00