fast-bwa

Commit Graph

Author	SHA1	Message	Date
Heng Li	66c9783daf	r345: bugfix in mem - wrong mate strand for unmap. Received a clean bill from Picard.	2013-03-08 13:15:43 -05:00
Heng Li	274c0ac96c	r343: bugfix in mem - wrong mate info for unmap. SAM generation is always among the nastiest bits. I would need to refactor at some point (hardly happening).	2013-03-08 12:40:31 -05:00
Heng Li	017be45407	r342: bugfix in bwasw - AS is off by one but I do not understand why the old code does not have the same problem.	2013-03-08 12:06:45 -05:00
Heng Li	b5b50ac8da	r341: bugfix - wrong mate position when one end is mapped with a score less than -T. Caused by the -T option.	2013-03-07 21:35:57 -05:00
Heng Li	b0a76884e8	r340: feature freeze; updated the manpage. I will stop adding new features to bwa and prepare for the next release. I will briefly evaluate the variant calling accuracy before the release.	2013-03-07 11:51:23 -05:00
Heng Li	503ca9ed2e	r339: pemerge - expose some settings to CLI	2013-03-07 11:22:19 -05:00
Heng Li	1cadfa1552	r338: pemerge - fixed memory leaks; multithreading pemerge is actually quite slow.	2013-03-07 11:14:52 -05:00
Heng Li	3e3236dfc4	r337: mem - always read an even number of reads. In the old code, we may read an odd number of reads from an interleaved fastq.	2013-03-07 11:00:15 -05:00
Heng Li	72817b664e	r336: fine tuning pemerge	2013-03-06 23:38:07 -05:00
Heng Li	557d50c7e1	r335: fixed a compile error. Caused by the last change.	2013-03-06 21:57:13 -05:00
Heng Li	042e1f4442	r334: added pemerge to bwa	2013-03-06 21:55:02 -05:00
Heng Li	5fbd454682	r332: added output threshold. Otherwise there are far too many short hits.	2013-03-05 22:49:38 -05:00
Heng Li	6476343a83	r331: rewrote CIGAR generation for bwa-short. When backtracking, bwa-short does not keep the detailed alignment or the exact start and end positions. To find the boundary and the CIGAR, the old code does a global alignment with a small end-gap penalty. It then deals with a lot of special cases to derive the right position and CIGAR, which are actually not always right. It is a mess. As the new ksw.{c,h} does not support a different end-gap penalty, the old strategy does not work. But we get something better. The new code finds the boundaries with ksw_extend(). It is cleaner and gives more accurate CIGAR in most cases.	2013-03-05 19:56:37 -05:00
Heng Li	98f8966750	r329: ditch stdaln.{c,h}; no changes to bwa-mem. stdaln.{c,h} was written ten years ago. Its local and SW extension code are actually buggy (though that rarely happens and usually does not affect the results too much). ksw.{c,h} is more concise, potentially faster, less buggy, and richer in features.	2013-03-05 12:00:24 -05:00
Heng Li	efd9769b07	r324: a little code cleanup. The changes after r317 aim to improve the performance and accuracy for very long query alignment. The short-read alignment should not be affected. The changes include: 1) Z-dropoff. This is a variant of blast's X-dropoff. I originally thought this heuristic only improves speed, but now I realize it also reduces poor alignment with long good flanking alignments. The difference from blast's X-dropoff is that Z-dropoff allows big gaps, but X-dropoff does not. 2) Band width doubling. When band width is too small, we will get a poor alignment in the middle. Sometimes such alignments cannot be fully excluded with Z-dropoff. Band width doubling is an alternative heuristic. It is based on the observation that the existence of a close-to-boundary high score possibly implies inadequate band width. When we see such a signal, we double the band width.	2013-03-05 00:57:16 -05:00
Heng Li	e0991d6a45	r323: added Z-dropoff, a variant of blast's X-drop	2013-03-05 00:34:33 -05:00
Heng Li	733410b50d	r320: speed up very long sequence alignment. 100-200bp read alignment should not be affected at all.	2013-03-04 14:43:49 -05:00
Heng Li	7e00dbcac5	r317: bugfix - out-of-range extension. This happens when target region crosses the forward-reverse boundary. This will almost never happen to short-read alignment.	2013-03-04 11:35:23 -05:00
Heng Li	d35f33b513	r316: don't allocate zero-length memory. It is not a bug, but Electric Fence does not like that.	2013-03-04 10:22:18 -05:00
Heng Li	35fb7f9fdf	r315: move kopen.o out of libbwa.a	2013-03-01 11:47:51 -05:00
Heng Li	3e4a178e08	r314: cleanup bwamem API. Don't modify input sequences; more documentation.	2013-03-01 11:14:51 -05:00
Heng Li	c5434ac865	r313: release bwa-0.7.0	2013-02-28 15:56:05 -05:00
Heng Li	f3cff1c609	r311: even tighter bw for CIGAR	2013-02-27 23:59:50 -05:00
Heng Li	6a4d8c79d8	r309: bugfix - soft clipping missing in example.c	2013-02-27 22:45:18 -05:00
Heng Li	df7c3f0000	r308: added a new API to convert region to CIGAR and an example program demonstrating how to do single-end alignment in <50 lines of C code.	2013-02-27 22:28:29 -05:00
Heng Li	4bb0bdddca	r306: introduce clipping penalty. More clipping leads to more severe reference bias. We should not clip the alignment unless necessary.	2013-02-27 21:13:39 -05:00
Heng Li	292e92b602	r303: bugfix - wrong band width when CIGAR	2013-02-27 15:39:15 -05:00
Heng Li	e620f0ff4e	r302: updated the manpage	2013-02-27 13:16:22 -05:00
Heng Li	b621d3ae38	r301: left-align indels. Don't know why the change is working...	2013-02-27 00:42:19 -05:00
Heng Li	65e099df34	r300: fixed an out-of-boundary bug in rare cases	2013-02-27 00:37:17 -05:00
Heng Li	0b533385ef	r299: better way to exclude seed	2013-02-27 00:29:11 -05:00
Heng Li	acd1ab607b	r297: reduce wasteful SW extension. This is particularly important for long sequences.	2013-02-26 16:26:46 -05:00
Heng Li	98787f0ae0	r295: generate NM	2013-02-26 13:36:01 -05:00
Heng Li	32f2d60a2e	r294: bugfix - -M not working	2013-02-26 13:14:33 -05:00
Heng Li	619ac4f93d	r293: bugfix - wrong RG type in SAM output	2013-02-26 13:03:35 -05:00
Heng Li	c6b226d719	r292: fixed a very stupid bug on CLI. I was thinking 0x10 or 16, but wrote 0x16...	2013-02-26 12:49:48 -05:00
Heng Li	bfb2583d7f	r291: summary - bwt.c micro optimization	2013-02-26 12:10:19 -05:00
Heng Li	e70c7c2a71	r284: amend cross-reference hit. I really hate this: complex and twisted logic for a nasty scenario that almost never happens to short reads - but it may become serious when the reference genome consists of many contigs. On toy examples, the code seems to work. Don't know if it really works...	2013-02-26 00:03:49 -05:00
Heng Li	61dd3bf13a	r283: prepare for fixing cross-ref aln	2013-02-25 22:49:15 -05:00
Heng Li	77b5b586ad	r282: set min split_len to read length	2013-02-25 17:29:35 -05:00
Heng Li	30cc8a95d1	fixed an unimportant memory leak	2013-02-25 16:34:19 -05:00
Heng Li	d19e834d84	r280: align two ends in the same thread. Otherwise odd-number threads may be of different speed from even-number threads.	2013-02-25 15:40:15 -05:00
Heng Li	20aa848b3c	r279: for PE mapq, consider the number of pairs. If there are a lot of proper pairs, it is more likely that the best pair is wrong.	2013-02-25 13:00:35 -05:00
Heng Li	9957e04590	r278: don't perform too many mate-sw	2013-02-25 11:56:02 -05:00
Heng Li	e9e5ee6a3d	r277: updated the revision number	2013-02-25 11:34:06 -05:00
Heng Li	0b4a40dc25	updated revision number; to merge into master	2013-02-24 13:34:20 -05:00
Heng Li	545fb87feb	removed another part related to color-space	2013-02-22 17:15:57 -05:00
Heng Li	6ad5a3c086	removed color-space support which has been broken since 0.6.x	2013-02-12 10:21:17 -05:00
Heng Li	91debf412b	move smem iterators to bwamem.{c,h}	2013-01-31 13:59:48 -05:00
Heng Li	292f9061ab	r132: optionally copy FASTA/Q comment to SAM	2012-10-26 12:54:32 -04:00
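
The difference between X-dropoff and Z-dropoff described in r323/r324 can be sketched in C as below. This is a minimal illustration, not the code in ksw.{c,h}: the function names `xdrop_stop` and `zdrop_stop`, their parameters, and the exact form of the gap allowance (diagonal shift times a gap-extension penalty) are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: X-dropoff-style test. Stop extending once the
 * current score has fallen more than `xdrop` below the best score so far. */
static int xdrop_stop(int best, int cur, int xdrop)
{
    return best - cur > xdrop;
}

/* Hypothetical helper: Z-dropoff-style test. The score drop is first
 * reduced by the cost of extending a gap that spans the diagonal shift
 * between the best cell (best_i, best_j) and the current cell (i, j).
 * A drop that is explained by one long gap therefore does not stop the
 * extension; a drop along the same diagonal still does. */
static int zdrop_stop(int best, int best_i, int best_j,
                      int cur, int i, int j, int gap_ext, int zdrop)
{
    int diag_shift;
    assert(gap_ext >= 0); /* the penalty is given as a non-negative magnitude */
    diag_shift = abs((i - best_i) - (j - best_j));
    return best - cur - diag_shift * gap_ext > zdrop;
}
```

With a best score of 100 at cell (50, 50), a current score of 60 at cell (100, 60), `gap_ext = 1` and a threshold of 20, `xdrop_stop` reports a stop because the drop of 40 exceeds 20. `zdrop_stop` does not, because the whole drop is accounted for by a 40-base diagonal shift.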


94 Commits (66c9783dafccd0b202eef2c331c4a9dc8d44fcba)
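
The band width doubling heuristic mentioned in r324 can be sketched as a retry loop. This is only an illustration under stated assumptions: `band_result_t`, `band_align_fn`, `align_with_band_doubling`, and the toy aligner `toy_align` are hypothetical names that do not come from the bwa source, and the "close to the boundary" test (best score at least three quarters of the way to the band edge) is a threshold chosen for the example.

```c
#include <assert.h>

/* Hypothetical result of one banded extension. */
typedef struct {
    int score;   /* best score found inside the band */
    int max_off; /* distance of that best score from the main diagonal */
} band_result_t;

/* Hypothetical aligner callback: runs one banded alignment of width w. */
typedef band_result_t (*band_align_fn)(int w, void *data);

/* Retry with a doubled band while the best score sits close to the band
 * boundary, which suggests the band was too narrow. *final_w receives
 * the width used by the last alignment that was run. */
static band_result_t align_with_band_doubling(band_align_fn align, void *data,
                                              int init_w, int max_tries,
                                              int *final_w)
{
    band_result_t r = {0, 0};
    int t, w = init_w;
    assert(init_w > 0 && max_tries > 0);
    for (t = 0; t < max_tries; ++t) {
        r = align(w, data);
        *final_w = w;
        if (r.max_off < w - w / 4) break; /* best score well inside the band */
        w *= 2;                           /* signal seen: double the band */
    }
    return r;
}

/* Toy stand-in for a real aligner: the true alignment needs an offset of
 * *need from the diagonal. A narrower band yields a worse score whose
 * best cell is pinned at the band edge. */
static band_result_t toy_align(int w, void *data)
{
    int need = *(int *)data;
    band_result_t r;
    if (w >= need) { r.score = 100; r.max_off = need; }
    else { r.score = 100 - (need - w); r.max_off = w; }
    return r;
}
```

Calling `align_with_band_doubling(toy_align, &need, 10, 5, &w)` with `need = 25` tries widths 10, 20 and 40, and stops at 40 because the best score then lies well inside the band.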