358 Commits (5581cb9152876913685fae153dad976c24552ca2)

Author SHA1 Message Date
Heng Li 5581cb9152 Release bwa-0.7.2-r351
For the TLEN sign fix. Sorry for the significant bug in 0.7.0/0.7.1
2013-03-09 18:15:41 -05:00
Heng Li 2d01a297fb Improving 'properly paired' flag.
If one end has a low-quality tail that happens to have a score-20 hit,
the pair won't be flagged as properly paired because bwa-mem thinks that end
has multiple hits. By filtering with -T, we won't have this problem.
2013-03-09 18:05:50 -05:00
Heng Li 740d2c1314 Match to 'N' costs -1, instead of 0.
This is to prevent alignment through 'N'.
2013-03-09 18:03:57 -05:00
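To illustrate the scoring change in the commit above, here is a minimal sketch of a 5x5 substitution matrix over A, C, G, T and N in which every comparison involving N scores -1. With a score of 0, extending through a run of N would cost nothing; a small negative score makes the alignment score decay instead. The function name fill_scmat_sketch and the match/mismatch values used with it are assumptions for illustration, not the code from this commit.

```c
#include <stdint.h>

/* Fill a row-major 5x5 scoring matrix for base codes 0..3 = A,C,G,T and 4 = N.
 * match and mismatch are given as positive magnitudes.
 * Hypothetical helper, written only to illustrate the commit message. */
void fill_scmat_sketch(int match, int mismatch, int8_t mat[25])
{
    int i, j, k = 0;
    for (i = 0; i < 4; ++i) {
        for (j = 0; j < 4; ++j)
            mat[k++] = (int8_t)(i == j ? match : -mismatch);
        mat[k++] = -1; /* regular base against N */
    }
    for (j = 0; j < 5; ++j)
        mat[k++] = -1; /* N against anything, including N */
}
```

Calling fill_scmat_sketch(1, 4, mat) gives +1 on the diagonal for A/C/G/T, -4 for mismatches, and -1 in the whole N row and N column.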
Heng Li 9ea7f83974 Emergency bugfix: wrong TLEN sign
It is interesting that Picard did not find the issue.
2013-03-09 18:03:15 -05:00
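For context on the TLEN sign, the SAM specification gives the leftmost segment of a template a positive TLEN and the rightmost a negative one. The sketch below computes a signed TLEN for one read of a pair under that convention. It assumes 0-based, half-open reference intervals, and the name tlen_sketch and the tie-breaking rule for mates starting at the same position are illustrative assumptions, not bwa's actual code.

```c
#include <stdint.h>

/* Signed template length for one read of a pair.
 * [beg, end) is this read's reference interval, [mate_beg, mate_end) the mate's.
 * is_first breaks ties when both reads start at the same position
 * (an assumption of this sketch; the sign is otherwise arbitrary there). */
int64_t tlen_sketch(int64_t beg, int64_t end,
                    int64_t mate_beg, int64_t mate_end, int is_first)
{
    int64_t left  = beg < mate_beg ? beg : mate_beg;
    int64_t right = end > mate_end ? end : mate_end;
    int64_t span  = right - left;            /* leftmost to rightmost base */

    if (beg < mate_beg) return span;         /* this read is leftmost: plus */
    if (beg > mate_beg) return -span;        /* this read is rightmost: minus */
    return is_first ? span : -span;          /* same start: break the tie */
}
```

The two reads of a pair should then always carry TLEN values of equal magnitude and opposite sign, which is the property a sign bug would violate.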
Heng Li 1d132a546d Release 0.7.1-r347 2013-03-08 15:30:06 -05:00
Heng Li 5370bb23a3 Updated NEWS; added stddef.h for size_t
I thought size_t is defined in stdlib.h, but it is not always.
2013-03-08 14:14:42 -05:00
Heng Li 66c9783daf r345: bugfix in mem - wrong mate strand for unmap
Received a clean bill from Picard
2013-03-08 13:15:43 -05:00
Heng Li af7b4d8980 gcc wrongly thinks a variable may be uninitialized
It should always be initialized. To avoid a warning, made a change.
2013-03-08 12:45:50 -05:00
Heng Li 274c0ac96c r343: bugfix in mem - wrong mate info for unmap
SAM generation is always among the nastiest bits. I would need to refactor at
some point (hardly happening).
2013-03-08 12:40:31 -05:00
Heng Li 017be45407 r342: bugfix in bwasw - AS is off by one
but I do not understand why the old code does not have the same problem.
2013-03-08 12:06:45 -05:00
Heng Li b5b50ac8da r341: bugfix - wrong mate position
when one end is mapped with a score below the -T threshold. Caused by the -T option.
2013-03-07 21:35:57 -05:00
Heng Li b0a76884e8 r340: feature freeze; updated the manpage
I will stop adding new features to bwa and prepare for the next release. I will
briefly evaluate the variant calling accuracy before the release.
2013-03-07 11:51:23 -05:00
Heng Li 503ca9ed2e r339: pemerge - expose some settings to CLI 2013-03-07 11:22:19 -05:00
Heng Li 1cadfa1552 r338: pemerge - fixed memory leaks; multithreading
pemerge is actually quite slow.
2013-03-07 11:14:52 -05:00
Heng Li 3e3236dfc4 r337: mem - always read even number of reads
In the old code, we might read an odd number of reads from an interleaved fastq.
2013-03-07 11:00:15 -05:00
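The idea in the commit above can be sketched as follows: when reads are consumed in chunks of roughly a fixed number of bases, the chunk boundary is only allowed after an even number of reads, so the two mates of an interleaved pair are never split. The function name chunk_len_sketch and its parameters are hypothetical; this is a sketch of the idea, not the reader used in bwa.

```c
#include <stddef.h>

/* Return how many reads to take for the next chunk.
 * read_len[i] is the length of the i-th pending read, chunk_bases the target
 * chunk size in bases. The count is even unless the input runs out first. */
size_t chunk_len_sketch(const int *read_len, size_t n_reads, long chunk_bases)
{
    size_t n = 0;
    long bases = 0;
    while (n < n_reads) {
        bases += read_len[n++];
        /* stop only when the chunk is full AND the last pair is complete */
        if (bases >= chunk_bases && (n & 1) == 0)
            break;
    }
    return n;
}
```

With six 100 bp reads and a 250-base target, the size limit is reached at the third read, but the function takes a fourth so that the second pair stays together.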
Heng Li 72817b664e r336: fine tuning pemerge 2013-03-06 23:38:07 -05:00
Heng Li 557d50c7e1 r335: fixed a compiling error
Caused by the last change
2013-03-06 21:57:13 -05:00
Heng Li 042e1f4442 r334: added pemerge to bwa 2013-03-06 21:55:02 -05:00
Heng Li 773b86331b De-overlap paired-end reads 2013-03-06 19:23:45 -05:00
Heng Li 5fbd454682 r332: added output threshold
Otherwise there are far too many short hits
2013-03-05 22:49:38 -05:00
Heng Li 6476343a83 r331: rewrote CIGAR generation for bwa-short
When backtracking, bwa-short does not keep the detailed alignment or the exact
start and end positions. To find the boundary and the CIGAR, the old code does
a global alignment with a small end-gap penalty. It then deals with a lot of
special cases to derive the right position and CIGAR, which are actually not
always right. It is a mess.

As the new ksw.{c,h} does not support a different end-gap penalty, the old
strategy does not work. But we get something better. The new code finds the
boundaries with ksw_extend(). It is cleaner and gives more accurate CIGAR in
most cases.
2013-03-05 19:56:37 -05:00
Heng Li a76b75f41e Merge pull request #14 from drkeoni/master
Small fix for possible compile problem on Ubuntu systems
2013-03-05 12:57:10 -08:00
Jon Sorenson 25366c7220 Fixing a problem with linking to libm on some Ubuntu systems (I see this on a machine running 11.04, kernel 3.0.0-14-virtual). Changing the order of -lm on the command line seems to do the trick and should be tolerated in other environments. 2013-03-05 20:48:16 +00:00
Heng Li 98f8966750 r329: ditch stdaln.{c,h}; no changes to bwa-mem
stdaln.{c,h} was written ten years ago. Its local and SW extension code are
actually buggy (though that rarely happens and usually does not affect the
results too much). ksw.{c,h} is more concise, potentially faster, less buggy,
and richer in features.
2013-03-05 12:00:24 -05:00
Heng Li bb37e14d02 replace aln_global in bwase.c 2013-03-05 10:38:47 -05:00
Heng Li e6c262594f bwa-sw: ditch stdaln 2013-03-05 10:12:38 -05:00
Heng Li 086c9d0e7d bwa-sw: use bwa_gen_cigar() for cigar generation 2013-03-05 09:54:49 -05:00
Heng Li 07921659cf move mem_fill_scmat() to bwa.{h,c} 2013-03-05 09:38:12 -05:00
Heng Li efd9769b07 r324: a little code cleanup
The changes after r317 aim to improve the performance and accuracy for very
long query alignment. The short-read alignment should not be affected. The
changes include:

1) Z-dropoff. This is a variant of blast's X-dropoff. I originally thought this
   heuristic only improved speed, but now I realize it also reduces poor
   alignments flanked by long good alignments. The difference from blast's
   X-dropoff is that Z-dropoff allows big gaps, but X-dropoff does not.

2) Band width doubling. When the band width is too small, we will get a poor
   alignment in the middle. Sometimes such alignments cannot be fully excluded
   with Z-dropoff. Band width doubling is an alternative heuristic. It is based
   on the observation that the existence of a high score close to the band
   boundary possibly implies inadequate band width. When we see such a signal,
   we double the band width.
2013-03-05 00:57:16 -05:00
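The two heuristics described in the commit above can be sketched roughly as below. This is one plausible formulation under stated assumptions, not the ksw.{c,h} implementation: the Z-dropoff test discounts the score drop by the gap-extension cost of the diagonal shift since the best cell (which is what lets a big gap through), and the band is doubled whenever the best cell lies beyond three quarters of the band width. All names, the callback type extend_fn, the 3/4 threshold and toy_extend are hypothetical.

```c
#include <stdlib.h>

/* X-dropoff: stop once the row's best score is more than xdrop below the
 * best score seen so far. */
int xdrop_stop(int best, int row_best, int xdrop)
{
    return best - row_best > xdrop;
}

/* Z-dropoff sketch: same test, but the part of the drop that a single long
 * gap would explain (diagonal shift times gap extension) is forgiven. */
int zdrop_stop(int best, int best_i, int best_j,
               int row_best, int i, int j, int gap_ext, int zdrop)
{
    int diag_shift = abs((i - best_i) - (j - best_j));
    return best - row_best - diag_shift * gap_ext > zdrop;
}

/* Hypothetical banded-extension callback: returns the score and stores how
 * far from the main diagonal the best cell was found. */
typedef int (*extend_fn)(int band_width, int *max_off, void *data);

/* Retry the extension with w0, 2*w0, 4*w0, ... until the best cell sits well
 * inside the band or max_tries is reached. */
int extend_with_band_doubling(extend_fn extend, void *data,
                              int w0, int max_tries, int *w_used)
{
    int score = 0, t;
    for (t = 0; t < max_tries; ++t) {
        int w = w0 << t, max_off = 0;
        score = extend(w, &max_off, data);
        *w_used = w;
        if (max_off < w / 2 + w / 4) /* not close to the band boundary */
            break;
    }
    return score;
}

/* Toy callback for demonstration: the true optimum needs an offset of
 * *data; narrower bands get stuck at the boundary with a lower score. */
int toy_extend(int band_width, int *max_off, void *data)
{
    int need = *(int *)data;
    if (band_width >= need) { *max_off = need; return 100; }
    *max_off = band_width;
    return 50;
}
```

For example, with a best score of 100 at cell (50, 50) and a row best of 60 at (60, 90), a threshold of 20 and a gap extension of 1, xdrop_stop stops the extension while zdrop_stop lets it continue, because a 30-base gap accounts for most of the drop.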
Heng Li e0991d6a45 r323: added Z-dropoff, a variant of blast's X-drop 2013-03-05 00:34:33 -05:00
Heng Li d6096c3f99 bugfix: caused by the latest change 2013-03-04 18:41:57 -05:00
Heng Li 59bc9341f6 code backup; more changes coming later 2013-03-04 17:29:07 -05:00
Heng Li 733410b50d r320: speed up very long sequence alignment
100-200bp read alignment should not be affected at all.
2013-03-04 14:43:49 -05:00
Heng Li 40f1214736 change to debugging code only 2013-03-04 11:52:11 -05:00
Heng Li 7e00dbcac5 r317: bugfix - out-of-range extension
This happens when target region crosses the forward-reverse boundary. This will
almost never happen to short-read alignment.
2013-03-04 11:35:23 -05:00
Heng Li 1a451df800 prepare to ditch stdaln.{h,c} 2013-03-04 10:32:33 -05:00
Heng Li d35f33b513 r316: don't allocate zero-length memory
It is not a bug, but Electric Fence does not like that.
2013-03-04 10:22:18 -05:00
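As background for the commit above: in standard C, the result of a zero-size allocation is implementation-defined (either a null pointer or a pointer that must not be dereferenced), which is why memory debuggers may flag it. A minimal sketch of avoiding the call altogether is shown below; the helper name alloc_or_null is hypothetical and not part of bwa.

```c
#include <stdlib.h>

/* Allocate n zero-initialized elements of the given size, or return NULL
 * without calling the allocator when n is 0. Hypothetical helper. */
void *alloc_or_null(size_t n, size_t size)
{
    return n ? calloc(n, size) : NULL;
}
```

Callers then treat NULL with n == 0 as an empty array rather than an allocation failure, and free(NULL) remains safe.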
Heng Li 35fb7f9fdf r315: move kopen.o out of libbwa.a 2013-03-01 11:47:51 -05:00
Heng Li 3e4a178e08 r314: cleanup bwamem API
Don't modify input sequences; more documentation
2013-03-01 11:14:51 -05:00
Heng Li c5434ac865 r313: release bwa-0.7.0 2013-02-28 15:56:05 -05:00
Heng Li 39fcde9c19 updated NEWS further 2013-02-28 00:58:24 -05:00
Heng Li f3cff1c609 r311: even tighter bw for CIGAR 2013-02-27 23:59:50 -05:00
Heng Li a33b9c0633 tighter bw for cigar SW 2013-02-27 23:40:46 -05:00
Heng Li 6a4d8c79d8 r309: bugfix - soft clipping missing in example.c 2013-02-27 22:45:18 -05:00
Heng Li df7c3f0000 r308: added a new API to convert region to CIGAR
and an example program demonstrating how to do single-end alignment in <50
lines of C code.
2013-02-27 22:28:29 -05:00
Heng Li 64d92d26df more documentation in ksw.h 2013-02-27 21:40:46 -05:00
Heng Li 4bb0bdddca r306: introduce clipping penalty
More clipping leads to more severe reference bias. We should not clip the
alignment unless necessary.
2013-02-27 21:13:39 -05:00
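One way to picture a clipping penalty, as described in the commit above, is as a decision between a clipped local alignment and an extension that reaches the end of the read: the clipped alignment is kept only if it beats the end-to-end score by more than the penalty. The sketch below assumes this formulation; the function name prefer_end_to_end and its parameters are illustrative and not taken from the bwa source.

```c
/* Decide whether to extend to the end of the read instead of soft clipping.
 * local_score: best score of the clipped (local) extension
 * end_score:   score when the extension is forced to reach the read end
 * pen_clip:    penalty charged for clipping
 * Returns 1 to keep the end-to-end alignment, 0 to clip. */
int prefer_end_to_end(int local_score, int end_score, int pen_clip)
{
    return end_score > 0 && end_score > local_score - pen_clip;
}
```

With a penalty of 5, an end-to-end score of 47 is preferred over a clipped score of 50, whereas an end-to-end score of 40 is not.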
Heng Li b7791105bc r305: in NEWS, convert TAB to space 2013-02-27 16:56:54 -05:00
Heng Li aef179a580 r304: prepare release notes (not released yet) 2013-02-27 16:55:07 -05:00
Heng Li 292e92b602 r303: bugfix - wrong band width when generating CIGAR 2013-02-27 15:39:15 -05:00