fast-bwa

Commit Graph

Author	SHA1	Message	Date
Heng Li	3d129be642	r943: change the default -y to 20, but ... for GRCh38 ALT, this is not enough. We need -y at least 40 to get high accuracy because a locus at chr19 has 35 copies.	2014-10-22 12:42:58 -04:00
Heng Li	4177d6c2c7	r942: ignore ALT hits when counting n_sub for ... non-ALT hits. Counting leads to underestimated mapQ.	2014-10-22 10:24:16 -04:00
Heng Li	60b728487a	r941: set a min length for 3rd-round seeding	2014-10-21 13:15:42 -04:00
Heng Li	282130a64e	r940: fixed a bug - missing primary hit	2014-10-21 12:57:49 -04:00
Heng Li	76a15ea91b	r933: with bwa-postalt ready, drop option -g	2014-10-21 00:23:14 -04:00
Heng Li	a6b5a30dab	r930: use 3rd round seeding by default This strategy is similar to the seeding heuristic used by LAST. When it is used alone, it is not as accurate as the current seeding strategy at least for short reads. However, it may do a better job for a long contig mapped to multiple ALT contigs. This seeding strategy is also relatively cheap to perform.	2014-10-20 17:34:15 -04:00
Heng Li	038af2a551	r929: added simplified LAST-like seeding	2014-10-20 17:00:31 -04:00
Heng Li	3370ae9e35	r926: prepare to move -g to bwa-postalt.js	2014-10-19 20:43:53 -04:00
Heng Li	76a365a95f	r907: revert to -g.8 by default	2014-10-16 15:56:33 -04:00
Heng Li	d8d8b230d1	r906: don't reduce non-ALT mapQ by default	2014-10-16 15:15:23 -04:00
Heng Li	2a18fa114f	r895: increase the default max_XA_hits_alt to 200 Because there are >100 HLA haplotypes	2014-10-14 16:58:42 -04:00
Heng Li	a03d01f944	r878: XA is given to the best alignment Non-ALT hits may get ALT hits in the XA tag. This will simplify haplotype assignment.	2014-09-30 13:50:51 -04:00
Heng Li	dae4ca3ced	r875: invalid SAM output for ALT hits	2014-09-26 15:29:08 -04:00
Heng Li	7426a750ec	r868: use soft clip for ALT hits	2014-09-19 16:58:18 -04:00
Heng Li	9af36064e8	r867: fixed a few bugs; added ALT hits to XA	2014-09-19 16:50:21 -04:00
Heng Li	a41afe4c97	These files were committed on a wrong branch	2014-09-18 10:49:35 -04:00
Heng Li	c982443210	r854: improved the calculation of pa and build pa filtering into BWA-MEM	2014-09-17 16:26:28 -04:00
Heng Li	825ae92e58	r849: the pa tag now gives a number ... which is the ratio of this hit to the best ALT hit.	2014-09-17 13:05:35 -04:00
Heng Li	6f37c14f26	r848: tag alignments with primary ALT	2014-09-16 18:52:49 -04:00
Heng Li	4b6eeb34c8	r830: optionally fixed chunk size	2014-09-15 23:42:24 -04:00
Heng Li	624687b072	r829: killed a harmless gcc warning	2014-09-15 23:33:22 -04:00
Heng Li	b07587f806	r827: an alt hit as good as a pri hit as supp	2014-09-15 16:07:51 -04:00
Heng Li	aee53f1334	r824: ALT mapping seems working	2014-09-15 00:29:05 -04:00
Heng Li	015ab3f6c3	r823: towards ALT support	2014-09-14 16:41:14 -04:00
Heng Li	8116bcc786	Merge branch 'dev' into alt	2014-09-14 15:40:52 -04:00
Heng Li	8d2b93156b	r821: more relax on containing seeds	2014-09-12 10:35:49 -04:00
Heng Li	6739b713dd	Merge branch 'hotfix-utgaln' into dev Conflicts: main.c	2014-09-08 12:44:42 -04:00
Heng Li	f4aedddee6	r819: bugfix - added too many sub-SMEMs	2014-09-08 11:32:48 -04:00
Heng Li	ca61fe3ad5	code backup	2014-09-08 08:52:02 -04:00
Heng Li	1934f0cf24	code backup	2014-09-05 13:20:52 -04:00
Heng Li	35ac99b4f7	r815: optionally output ref fasta header Also fixed a bug in reading .ann files	2014-08-29 10:51:23 -04:00
Heng Li	b5cba257c1	r809: new strategy for the -a mode	2014-08-25 11:59:27 -04:00
Heng Li	7fd6a11569	r788: segfault when the last ref is "weird" mem_patch_reg() did not check if two hits are on the same strand, which may lead to an alignment bridging the forward-backward boundary.	2014-07-10 10:53:56 -04:00
Heng Li	cffff4338f	r787: use mem_seed_sw() also for non-PacBio reads In the previous version, mem_seed_sw() is only used for PacBio reads to filter bad seeds. For non-PacBio long queries, bwa-mem uses mem_chain2aln_short() for a similar purpose. However, it turns out that mem_chain2aln_short() is not effective given long near-tandem repeats. Bwa-mem still wastes a lot of time of futile ref substring and extensions. In this commit, mem_chain2aln_short() has been removed. mem_seed_sw() is used if the query sequence is long enough (~700bp). For shorter reads, the results should be almost identical to the previous version.	2014-07-10 10:30:22 -04:00
Heng Li	e4752b321b	Release bwa-0.7.9-r782	2014-05-19 09:08:07 -04:00
Heng Li	f00cc94e1d	r779: fixed a memory leak in SE	2014-05-16 00:06:34 -04:00
Heng Li	a5ad0cff7f	r778: reduced the number of alloc() calls a bit	2014-05-15 23:23:04 -04:00
Heng Li	061c63f36a	r766: removed useless code	2014-05-13 13:09:29 -04:00
Heng Li	39a6cd5bb0	r762: cleanup for the new release; unfinished It will take to make the documentation ready.	2014-05-11 15:15:44 -04:00
Heng Li	cfe6996173	r760: removed commented code It is slow and is not very effective. And I hate useless code.	2014-05-09 14:59:07 -04:00
Heng Li	43b498a37e	r759: bugfix - frac_rep not working Also added commented code for a 3rd round seeding. Not used.	2014-05-09 14:56:59 -04:00
Heng Li	c9b33502f3	r758: fixed a typo mostly negligible in practice	2014-05-07 15:07:29 -04:00
Heng Li	ce3c198245	r749: max_hits tunable on CMD; default to 5	2014-05-04 10:17:03 -04:00
Heng Li	f21d6498bc	r748: reduced the default -m to 50	2014-05-02 16:49:19 -04:00
Heng Li	e8f28cb529	r747: fixed a minor issue in the last (mis)commit	2014-05-02 16:17:50 -04:00
Heng Li	6db761e269	r746: tuned heuristic for GRCh38 Reduced -c to 500 by default. As a compensation, we choose up to 1000 positions if a seed has 500 or more occurrences. In addition, a read with big portion from such seeds will have lower mapping quality.	2014-05-02 16:06:27 -04:00
Heng Li	fa20c71920	r742: further control the max bandwidth I am looking at 6kb bandwidth...	2014-05-01 14:27:38 -04:00
Heng Li	4b2441069f	r740: don't attempt merge if bandwidth too large Sometimes the bandwidth can be >10k.	2014-05-01 11:01:52 -04:00
Heng Li	c6c943f9d7	r738: output multi-map in the XA tag (SE only) ... PE support coming soon	2014-04-30 16:46:05 -04:00
Heng Li	88f89be60e	r736: improved in low-complexity regions Example: GGAGGGGAAGGGTGGGCTGGAGGGGACGGGTGGGCTGGAGGGGAAGGGTGTGCTGGAGGGAAAAGGTGGACTGGAGGGGAAGGGTGGGCTGGAGGGGAAGG This read has 5 chains, two of which are: weight=80 26;26;0,4591439948(10:-3095894) 23;23;27,4591439957(10:-3095888) 31;31;70,4591439964(10:-3095873) weight=50 45;45;51,4591440017(10:-3095806) 50;50;51,4591440017(10:-3095801) 31;31;70,4591440090(10:-3095747) Extension from the 26bp seed in the 1st chain gives an alignment [0,101) <=> [4591439948,4591440067), which contains the 50bp seed in the second chain. However, if we extend the 50bp seed, it yields a better alignment [0,101) <=> [4591439966,4591440067) with a different starting position. The 26bp seed is wrong. This commit adds a heuristic to fix this issue.	2014-04-30 14:14:20 -04:00


259 Commits (c2efebefa7031ef04a1bff87700377b506cfa629)