minimap2

Commit Graph

Author	SHA1	Message	Date
Heng Li	123bc1d91d	put option operations in another file	2018-01-26 08:38:37 -05:00
Heng Li	33f8157961	r655: options to map to one strand of the ref #91	2018-01-16 10:34:30 -05:00
Heng Li	e420b17496	r629: API to construct index from strings	2017-12-18 22:29:46 -05:00
Heng Li	ab345e600b	r626: function to check incorrect scoring system	2017-12-13 12:23:43 -05:00
Heng Li	98a6e52c06	r618: heuristics to avoid tiny terminal exons	2017-12-11 00:57:55 -05:00
Heng Li	704ff9f4c6	r607: estimate sequence divergence. Currently using the simplest method. There may be a more accurate estimate.	2017-12-06 16:14:39 -05:00
Heng Li	2f463b1db0	r573: prepare to generalize index	2017-11-11 19:54:06 -05:00
mvdbeek	1cb0bf4bef	Implement -Y for soft clipping of supp. alignments. I tried to base this on bwa-mem and it seems to work for SAM alignments.	2017-11-09 19:22:36 +01:00
Heng Li	b24d68ae9f	r557: fixed another mapq underestimate. When a chain is split during base-level alignment, its chaining score is reduced. However, the chaining score of its suboptimal chain remains the same. This leads to underestimated mapping quality.	2017-11-07 23:20:49 -05:00
Heng Li	fa5a645ca5	r552: fixed a tiny typo on struct packing. The old packing wastes memory, though very little.	2017-11-05 08:27:26 -05:00
Heng Li	cd24dc8834	r545: removed option -i, not working well	2017-10-31 22:23:27 -04:00
Heng Li	79b0caca95	r537: model the next base to GT/AG. [PMID:18688272] shows that the base following GT tends to be A or G (i.e. R) in both human and yeast, and that the base preceding AG tends to be C or T (i.e. Y). In the new model, we pay no cost to GTr..yAG, but we pay half of the cost if there is no r or y. This improves the junction accuracy when mapping to human and mouse and decreases the accuracy when mapping to SIRV. My guess is that SIRV does not honor this trend. Need to investigate in the future. Also in this commit, --cost-non-gt-ag is aliased to -C. The default is changed to 9 instead of 5. I also added --splice-flank to enable the above model. This may become the default once I confirm my hypothesis on SIRV.	2017-10-28 00:25:01 -04:00
Heng Li	d4b5dfc297	r533: added --no-pairing to prevent the use of any pairing information for paired-end reads.	2017-10-23 14:09:32 -04:00
Heng Li	306e4541f8	Released minimap2-2.3 (r531)	2017-10-22 23:13:35 -04:00
Heng Li	4683da2455	r520: added option -L to write long cigar to CG	2017-10-17 17:32:44 -04:00
Heng Li	adf6cd7f52	r513: merged pre- and post-cigar blen and mlen. This saves a bit of memory and is cleaner.	2017-10-16 10:55:18 -04:00
Heng Li	e6f525edaf	r512: option to filter poorly aligned reads	2017-10-16 10:38:22 -04:00
Heng Li	7c555f9b7e	r508: use two I/O threads for mapping. -x sr applies this option by default	2017-10-12 14:56:01 -04:00
Heng Li	7345621759	r499: end bonus working; DP region needs improvement!	2017-10-11 00:14:25 -04:00
Heng Li	61e56c941d	r488: parameter to control max fragment length	2017-10-07 23:54:32 -04:00
Heng Li	9c5767f9ed	r477: renamed multi_seg to frag_mode	2017-10-05 15:48:17 -04:00
Heng Li	ae2adf04d4	r476: multi-file fragment mode working	2017-10-05 15:39:26 -04:00
Heng Li	f4a5d3a692	r474: replaced -S and --cs-no-equal with --cs	2017-10-05 15:03:03 -04:00
Heng Li	5ab99eb26e	more accurate SAM flag	2017-10-05 10:59:38 -04:00
Heng Li	9aba11769c	r467: added : (equal length) and ^ (intron) ops	2017-10-04 21:55:37 -04:00
Heng Li	7d50e646dd	r466: detect multi-part index more smartly, though it might not work in an extremely rare case: the end of a sequence ends at X*16384 and it is the last sequence in a batch. This can be resolved by never letting the kstream_t buffer empty.	2017-10-04 17:32:58 -04:00
Heng Li	2581c44a21	r463: optionally disable secondary hits	2017-10-04 13:24:41 -04:00
Heng Li	2a1e738a94	r461: randomize repetitive hits	2017-10-04 13:05:18 -04:00
Heng Li	cf55c84056	r460: added option --no-long-join	2017-10-04 12:08:44 -04:00
Heng Li	04fb2c2ec0	r454: rechain with higher max_occ if no good chain	2017-09-29 19:24:32 -04:00
Heng Li	7e0d70bfd3	r445: pair coordinate adjustment working. Next: mapq adjustment, which will be tricky...	2017-09-27 15:38:18 -04:00
Heng Li	a349d85280	r444: changed the way orientation is specified. The old model doesn't work with RF or RR orientation. The new model only works with paired-end reads. For >2 segments, only FF is supported.	2017-09-27 12:33:10 -04:00
Heng Li	f611edf6f2	r443: don't filter small cm for split seg	2017-09-26 16:17:58 -04:00
Heng Li	3bb66e1ed3	multi-seg working on toy examples	2017-09-25 13:42:04 -04:00
Heng Li	f0951141a1	allow to read multiple files interleaved	2017-09-24 14:33:05 -04:00
Heng Li	645db3350e	Merge branch 'master' into sr	2017-09-20 11:15:14 -04:00
Heng Li	75e6bbc9f6	r421: removed the MM_F_SPLICE_BOTH mode. In the default splice mode, minimap2 applies two rounds of spliced alignment: first assuming GT-AG to be the splice signal across all splicing sites and then assuming CT-AC to be the signal. This is the ideal strategy. In the MM_F_SPLICE_BOTH mode, minimap2 applies one round of spliced alignment, assuming GT-AG and CT-AC to be the splice signals AT THE SAME TIME. This will be faster but less accurate. I don't think anyone would like to run minimap2 in this mode, so I am removing it for clarity.	2017-09-20 11:11:53 -04:00
Heng Li	7a9b4db874	replaced --approx-ext with --sr. --sr disables Z-drop and may come with other heuristics	2017-09-20 10:51:18 -04:00
Heng Li	fb1bcc0084	early exploration	2017-09-19 16:18:28 -04:00
Heng Li	75ff7ceec5	r368: API documentation	2017-09-14 22:23:04 -04:00
Heng Li	e2823d4aee	r367: index reader optionally writes index	2017-09-14 21:18:13 -04:00
Heng Li	eb00521d9b	redesigned indexing and option APIs	2017-09-14 17:02:01 -04:00
Heng Li	0f7455cefa	r365: documented the "sr" preset	2017-09-14 12:57:21 -04:00
Heng Li	3c91d652dd	r360: allow to set integer max occ	2017-09-13 11:37:00 -04:00
Heng Li	d7f2ac1d4f	better parameters for short reads. It turns out the key problem is not the minimizer density. It is the max occurrence that tends to affect results more, especially sensitivity. There is still lots of work to do, but for now, it seems a good start.	2017-09-12 16:11:23 -04:00
Heng Li	0fe1a224ab	r309: improved SAM header output	2017-08-25 10:35:58 +08:00
Heng Li	2cde8d257c	r297: bidirectional RNA alignment	2017-08-17 06:02:44 -04:00
Heng Li	b5f5929bf9	r296: expose splicing related options to CLI	2017-08-13 21:37:51 -04:00
Heng Li	43506edbc5	backup: preliminary boundary alignment	2017-08-12 23:10:14 -04:00
Heng Li	d240318741	r287: refined CLI options and manpage	2017-08-12 12:26:04 -04:00

93 Commits (12a5a5fa3cb1d92e42e739ed807acdeeb9723656)
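
The splice-signal model described in the r537 message above can be sketched roughly as follows. This is only an illustration of the rule as worded in that commit message, not minimap2's actual code: the function name splice_signal_cost, its arguments, and the helpers is_R and is_Y are all hypothetical. It assumes the forward-strand GT..AG signal only, reads "no r or y" as "either flanking base is missing", and rounds the half cost down with integer division.

```c
#include <assert.h>

/* Hypothetical helpers: purine (R = A/G) and pyrimidine (Y = C/T) tests. */
static int is_R(char c) { return c == 'A' || c == 'G'; }
static int is_Y(char c) { return c == 'C' || c == 'T'; }

/*
 * Hypothetical sketch of the cost charged to one candidate intron.
 *   donor3: the first three intron bases (e.g. "GTA")
 *   acc3:   the last three intron bases  (e.g. "CAG")
 *   cost_non_gt_ag: full penalty for a non-GT-AG intron (9 by default
 *                   according to the r537 message)
 * Assumptions: forward strand only; the half cost applies when either
 * the R after GT or the Y before AG is absent.
 */
static int splice_signal_cost(const char *donor3, const char *acc3,
                              int cost_non_gt_ag)
{
    assert(cost_non_gt_ag >= 0);
    int has_gt = donor3[0] == 'G' && donor3[1] == 'T';
    int has_ag = acc3[1] == 'A' && acc3[2] == 'G';
    if (!has_gt || !has_ag)
        return cost_non_gt_ag;          /* not a GT..AG intron: full cost */
    if (is_R(donor3[2]) && is_Y(acc3[0]))
        return 0;                       /* GTr..yAG: no cost */
    return cost_non_gt_ag / 2;          /* GT..AG without r or y: half cost */
}
```

Under these assumptions, an intron starting with GTA and ending with CAG costs nothing, while one starting with GTC pays half of the non-GT-AG cost. A real aligner would also have to score the reverse-strand CT..AC signal, which this sketch leaves out.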