Heng Li
bde5005f39
r396: er... the new tag is named SA not SP
2013-05-23 12:48:18 -04:00
Heng Li
3d2450ed97
r395: bugfix - hard clipping not applied on revaln
2013-05-23 12:45:14 -04:00
Heng Li
9441bb7f2a
r394: added future plan
2013-05-22 20:02:53 -04:00
Heng Li
9a6abe51b6
r391: better method to resolve xref alignment
...
The old method does not work when the alignment bridges three chr. This may
actually happen often. The new method does not work all the time, either, but
should be better than the old one. It is also simpler, arguably.
2013-05-22 18:57:51 -04:00
Rob Davies
e88529687f
Merge branch 'master' into master_fixes. Merged up to r389.
...
Conflicts:
bwamem.c
kopen.c
2013-04-29 12:09:30 +01:00
Heng Li
1a2bd2cf91
r389: return non-zero upon errors
2013-04-27 10:08:01 -04:00
Heng Li
19cb7cd7ed
r388: cleanup mem_process_seqs() interface
...
Print output outside the function and allow feeding in the insert size distribution.
2013-04-26 12:31:18 -04:00
Heng Li
8896cb942e
r386: bugfix - samse/pe segfault
...
This happens when a read is aligned across the forward-reverse boundary.
2013-04-24 16:00:02 -04:00
Rob Davies
b3d0a13b32
Merge branch 'master' into master_fixes. Merged up to release bwa-0.7.4-r385.
2013-04-23 17:31:34 +01:00
Heng Li
c14aaad1ce
Released bwa-0.7.4-r385
2013-04-23 11:40:56 -04:00
Heng Li
2f6897c72b
r384: don't compile bwamem-lite by default
2013-04-23 11:27:30 -04:00
Heng Li
78ed00021f
r384: updated NEWS
2013-04-23 11:25:46 -04:00
Rob Davies
4cb5110d03
Merge branch 'master' into master_fixes
2013-04-22 09:51:07 +01:00
Heng Li
f6ae0d4d0f
r382: similar treatment in bwa-sw (see r381)
2013-04-19 17:52:06 -04:00
Heng Li
3f8caef33c
r381: fixed a bug when upper bound < max read len
2013-04-19 17:44:35 -04:00
Heng Li
db7a98636f
r380: er... another compiling error
2013-04-19 12:04:44 -04:00
Heng Li
f0c94d80d1
r379: fixed compiling error
2013-04-19 12:04:00 -04:00
Heng Li
be11e27e12
r378: bugfix - wrong CIGAR
...
This is actually caused by a bug in SSE2-SW, where the query begin may be
smaller than the true one if there is an exact tandem repeat.
2013-04-19 12:00:37 -04:00
Heng Li
2087dc162f
r377: increased unpaired penalty from 9 to 17
...
This leads to more aggressive pairing - more properly paired reads. I have
found a few cases where, for example, read1 is unambiguously mapped to chr20
while its 100bp mate has a perfect match to another chr but has 3 mismatches
and 1 deletion when it is paired with read1 on chr20. With longer reads, it
seems that the chr20 hit is correct, although it is not obvious how this
happened in evolution.
2013-04-17 16:50:20 -04:00
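The effect of the unpaired penalty in r377 can be illustrated with a minimal sketch. It assumes the decision reduces to comparing the combined score of the paired placement with the sum of the best independent hits minus the penalty. The function name and the scores in the usage note are hypothetical and are not taken from the BWA source.

```c
#include <stdbool.h>

/* Hypothetical helper (an assumption, not BWA's code): prefer the paired
 * placement when its combined score beats the sum of the two best
 * independent hits after the unpaired penalty has been subtracted. */
bool prefer_paired(int pair_score, int best1, int best2, int pen_unpaired)
{
    int score_unpaired = best1 + best2 - pen_unpaired;
    return pair_score > score_unpaired;
}
```

With made-up scores (paired placement 185, independent best hits 100 and 100), a penalty of 9 keeps the reads unpaired (185 vs 191), while a penalty of 17 selects the pair (185 vs 183). That is the sense in which a larger penalty makes pairing more aggressive.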
Rob Davies
3dd10bd7db
Merge branch 'master' into master_fixes
2013-04-12 16:20:13 +01:00
Rob Davies
90ecd344ba
Merge branch 'master' into master_fixes. Merged up to master r375.
...
Conflicts:
bwt.c
2013-04-11 11:15:39 +01:00
Heng Li
499cf4c00d
r376: reduce wasteful seed extension
...
mainly for contig alignment
2013-04-10 12:18:56 -04:00
Heng Li
47520134e7
r375: fixed compiling errors by the last change
2013-04-10 11:04:32 -04:00
Heng Li
3d8a8c1e37
r374: fix - clipping penalty not always working
...
This only happens to gaps where mem underestimates the bandwidth without
considering the clipping penalty.
2013-04-10 01:09:37 -04:00
Heng Li
53bb846407
r373: optionally disable mate rescue
2013-04-09 16:13:55 -04:00
Heng Li
d64eaa851d
fixed an issue caused by a Mac/Darwin bug
...
On Mac/Darwin, it is not possible to read >2GB data with one fread().
2013-04-09 15:17:04 -04:00
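A common workaround for the single-fread() limit described above is to read the data in bounded chunks. The sketch below shows the idea; the function name, its signature, and the chunk size passed by the caller are assumptions for illustration and do not reproduce the actual fix in this commit.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: read `size` bytes into `buf` using several fread()
 * calls of at most `chunk` bytes each, so that no single call asks for more
 * than the platform can handle. Returns the number of bytes actually read. */
size_t fread_chunked(void *buf, size_t size, size_t chunk, FILE *fp)
{
    unsigned char *p = (unsigned char *)buf;
    size_t done = 0;
    if (chunk == 0) return 0;
    while (done < size) {
        size_t want = size - done < chunk ? size - done : chunk;
        size_t got = fread(p + done, 1, want, fp);
        done += got;
        if (got < want) break; /* EOF or read error: stop early */
    }
    return done;
}
```

A caller would pass a chunk size well below 2GB (for instance a few megabytes) and compare the return value with the requested size to detect a short read.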
Heng Li
d7ca0885eb
r371: extend overlapping seeds
...
to avoid misalignment in tandem repeats
2013-04-04 00:43:43 -04:00
Heng Li
1e118e0823
r370: suppress "D" at the end of a cigar
...
This is caused by seeds in tandem repeats, in which case, bwa-mem may not
extend the true seed. The change in this commit is only a temporary cure.
2013-04-03 23:57:19 -04:00
Rob Davies
c89756e2b0
Merge branch 'master' into master_fixes
2013-03-19 12:11:51 +00:00
Heng Li
8437cd4edd
r369: bugfix - segfault caused by the last change
...
Sigh... Even the simplest change can lead to new bugs.
2013-03-19 01:04:57 -04:00
Heng Li
1e3cadbfc2
r368: bugfix - wrong CIGAR when bridging 3 contigs
...
In this case, bwa_fix_xref() will return insane coordinates. The old version
did not check the return status and wrote a wrong CIGAR. This bug only happens
with very short assembly contigs.
2013-03-18 20:49:32 -04:00
Rob Davies
c862a1a396
Merge branch 'master' into master_fixes
2013-03-18 13:35:12 +00:00
Heng Li
9346acde1b
Release bwa-0.7.3a-r367
...
In 0.7.3, the wrong CIGAR bug was only fixed in one scenario, but not fixed
in another corner case.
2013-03-15 21:26:37 -04:00
Heng Li
7dec00c217
Release BWA-0.7.3-r366
2013-03-15 12:51:53 -04:00
Heng Li
dd51177837
r365: bugfix - wrong alignment (right mapping)
...
The bug only happens when there is a 1bp del and 1bp ins which are close to the
end and there are no other substitutions or indels. In this case, bwa mem gave
a wrong band width.
2013-03-15 11:59:05 -04:00
Heng Li
e5355fe3a0
r364: bug in mem pairing (no effect with -A=1)
...
Forgot to adjust for matching score. This bug has no effect when -A takes the
default value.
2013-03-14 22:01:26 -04:00
Rob Davies
cca27c1ef5
Merge branch 'master' into master_fixes
...
Conflicts:
bwamem.c
bwamem_pair.c
example.c
2013-03-13 12:12:28 +00:00
Heng Li
bdf34f6ce7
r363: XA=>XP; output mapQ in XP
...
In BWA, XA gives hits "shadowed" by the primary hit. In BWA-MEM, we output
primary hits only. Primary hits may have non-zero mapping quality.
2013-03-12 09:56:04 -04:00
Heng Li
c29b176cb6
r362: bugfix - occasionally wrong TLEN
...
Use the 0.7.2 way to compute TLEN
2013-03-12 00:14:36 -04:00
Heng Li
aa7cdf4bb3
r361: flag proper pair even if multi-primary
...
Up to here, all the features in my checklist have been implemented.
2013-03-12 00:00:04 -04:00
Heng Li
dab5b17c1a
r360: output alternative primary alignments in XA
2013-03-11 23:43:58 -04:00
Heng Li
6c665189ad
r359: identical output to 0.7.2 (without -a)
2013-03-11 23:16:18 -04:00
Rob Davies
9228e48efd
Merge branch 'master' into master_fixes
...
Conflicts:
Makefile
2013-03-11 13:50:49 +00:00
Heng Li
5581cb9152
Release bwa-0.7.2-r351
...
For the TLEN sign fix. Sorry for the significant bug in 0.7.0/0.7.1
2013-03-09 18:15:41 -05:00
Heng Li
2d01a297fb
Improving 'properly paired' flag.
...
If one end has a low quality tail that happens to have a score-20 hit,
the pair won't be flagged as properly paired because bwa-mem thought it has
multiple hits. By filtering with -T, we won't have this problem.
2013-03-09 18:05:50 -05:00
Heng Li
1d132a546d
Release 0.7.1-r347
2013-03-08 15:30:06 -05:00
Heng Li
66c9783daf
r345: bugfix in mem - wrong mate strand for unmap
...
Received a clean bill from Picard
2013-03-08 13:15:43 -05:00
Heng Li
274c0ac96c
r343: bugfix in mem - wrong mate info for unmap
...
SAM generation is always among the nastiest bits. I would need to refactor at
some point (hardly happening).
2013-03-08 12:40:31 -05:00
Heng Li
017be45407
r342: bugfix in bwasw - AS is off by one
...
but I do not understand why the old code does not have the same problem.
2013-03-08 12:06:45 -05:00
Rob Davies
aabd990e8f
Merge branch 'master' into master_fixes
...
Conflicts:
Makefile
bwape.c
bwase.c
bwtsw2_aux.c
stdaln.c
2013-03-08 16:46:45 +00:00
Heng Li
b5b50ac8da
r341: bugfix - wrong mate position
...
when one end is mapped with a score less than -T. Caused by the -T option.
2013-03-07 21:35:57 -05:00
Heng Li
b0a76884e8
r340: feature freeze; updated the manpage
...
I will stop adding new features to bwa and prepare for the next release. I will
briefly evaluate the variant calling accuracy before the release.
2013-03-07 11:51:23 -05:00
Heng Li
503ca9ed2e
r339: pemerge - expose some settings to CLI
2013-03-07 11:22:19 -05:00
Heng Li
1cadfa1552
r338: pemerge - fixed memory leaks; multithreading
...
pemerge is actually quite slow.
2013-03-07 11:14:52 -05:00
Heng Li
3e3236dfc4
r337: mem - always read even number of reads
...
In the old code, we may read an odd number of reads from an interleaved fastq.
2013-03-07 11:00:15 -05:00
Heng Li
72817b664e
r336: fine tuning pemerge
2013-03-06 23:38:07 -05:00
Heng Li
557d50c7e1
r335: fixed a compiling error
...
Caused by the last change
2013-03-06 21:57:13 -05:00
Heng Li
042e1f4442
r334: added pemerge to bwa
2013-03-06 21:55:02 -05:00
Heng Li
5fbd454682
r332: added output threshold
...
Otherwise there are far too many short hits
2013-03-05 22:49:38 -05:00
Heng Li
6476343a83
r331: rewrote CIGAR generation for bwa-short
...
When backtracking, bwa-short does not keep the detailed alignment or the exact
start and end positions. To find the boundary and the CIGAR, the old code does
a global alignment with a small end-gap penalty. It then deals with a lot of
special cases to derive the right position and CIGAR, which are actually not
always right. It is a mess.
As the new ksw.{c,h} does not support a different end-gap penalty, the old
strategy does not work. But we get something better. The new code finds the
boundaries with ksw_extend(). It is cleaner and gives more accurate CIGAR in
most cases.
2013-03-05 19:56:37 -05:00
Heng Li
98f8966750
r329: ditch stdaln.{c,h}; no changes to bwa-mem
...
stdaln.{c,h} was written ten years ago. Its local and SW extension code are
actually buggy (though that rarely happens and usually does not affect the
results too much). ksw.{c,h} is more concise, potentially faster, less buggy,
and richer in features.
2013-03-05 12:00:24 -05:00
Rob Davies
8a078cc16d
Merge branch 'master' into master_fixes
...
Conflicts:
bntseq.c
bwamem.c
2013-03-05 10:21:07 +00:00
Heng Li
efd9769b07
r324: a little code cleanup
...
The changes after r317 aim to improve the performance and accuracy for very
long query alignment. The short-read alignment should not be affected. The
changes include:
1) Z-dropoff. This is a variant of blast's X-dropoff. I originally thought this
heuristic only improves speed, but now I realize it also reduces poor
alignment with long good flanking alignments. The difference from blast's
X-dropoff is that Z-dropoff allows big gaps, but X-dropoff does not.
2) Band width doubling. When band width is too small, we will get a poor
alignment in the middle. Sometimes such alignments cannot be fully excluded
with Z-dropoff. Band width doubling is an alternative heuristic. It is based
on the observation that the existence of a close-to-boundary high score
possibly implies inadequate band width. When we see such a signal, we double
the band width.
2013-03-05 00:57:16 -05:00
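The two heuristics described in r324 can be sketched as follows. This is an illustration under stated assumptions, not the code in ksw.c or bwamem.c: all function names are hypothetical, the Z-dropoff test forgives only the gap-extension cost of the diagonal shift, and the "close to boundary" signal is taken to be a best cell lying beyond three quarters of the band width.

```c
#include <stdbool.h>
#include <stdlib.h>

/* X-dropoff: stop once the current score is more than x below the best. */
bool xdrop_stop(int best, int cur, int x)
{
    return best - cur > x;
}

/* Z-dropoff sketch: before comparing with z, forgive the part of the drop
 * that one long gap would explain. The gap length is estimated from how far
 * cell (i, j) has moved off the diagonal of the best cell (best_i, best_j). */
bool zdrop_stop(int best, int best_i, int best_j,
                int cur, int i, int j, int gap_ext, int z)
{
    int diag_shift = abs((i - best_i) - (j - best_j));
    return best - cur - diag_shift * gap_ext > z;
}

/* Hypothetical banded-alignment callback: aligns with band width w, returns
 * the score and stores in *max_off the offset of the best cell from the
 * main diagonal. */
typedef int (*band_align_fn)(int w, int *max_off, void *data);

/* Band width doubling sketch: retry with w0, 2*w0, 4*w0, ... while the best
 * cell stays close to the band boundary and the score keeps changing. */
int align_with_band_doubling(band_align_fn align, void *data,
                             int w0, int max_tries, int *final_w)
{
    int score = -1;
    for (int t = 0; t < max_tries; ++t) {
        int w = w0 << t, max_off = 0, prev = score;
        score = align(w, &max_off, data);
        *final_w = w;
        if (score == prev || max_off < (w >> 1) + (w >> 2)) break;
    }
    return score;
}

/* Toy callback for demonstration only: the true alignment needs an offset of
 * *(int *)data; a narrower band pins the best cell at the band edge. */
int toy_align(int w, int *max_off, void *data)
{
    int needed = *(int *)data;
    if (w >= needed) { *max_off = needed; return 100; }
    *max_off = w;
    return 50;
}
```

For example, with a best score of 100 at cell (50, 50) and a current score of 4 at cell (60, 160), X-dropoff with x = 50 stops, whereas Z-dropoff with z = 50 and a gap extension penalty of 1 continues, because a 100bp gap accounts for the drop. With the toy callback and a required offset of 120, starting from a band width of 100, the doubling loop settles on a band width of 200.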
Heng Li
e0991d6a45
r323: added Z-dropoff, a variant of blast's X-drop
2013-03-05 00:34:33 -05:00
Heng Li
733410b50d
r320: speed up very long sequence alignment
...
100-200bp read alignment should not be affected at all.
2013-03-04 14:43:49 -05:00
Heng Li
7e00dbcac5
r317: bugfix - out-of-range extension
...
This happens when target region crosses the forward-reverse boundary. This will
almost never happen to short-read alignment.
2013-03-04 11:35:23 -05:00
Heng Li
d35f33b513
r316: don't allocate zero-length memory
...
It is not a bug, but Electric Fence does not like that.
2013-03-04 10:22:18 -05:00
Heng Li
35fb7f9fdf
r315: move kopen.o out of libbwa.a
2013-03-01 11:47:51 -05:00
Heng Li
3e4a178e08
r314: cleanup bwamem API
...
Don't modify input sequences; more documentation
2013-03-01 11:14:51 -05:00
Rob Davies
6beab5f765
Merge branch 'master' into master_fixes
...
Merge changes to commit c5434ac (0.7.0 release)
Conflicts:
Makefile
bwamem.c
2013-03-01 10:22:49 +00:00
Rob Davies
3d33ab063e
Merge branch 'master' into master_fixes
...
Merged to master version b621d3a
Conflicts:
Makefile
bntseq.c
bwa.c
bwase.c
bwaseqio.c
bwtaln.c
bwtindex.c
bwtio.c
bwtmisc.c
bwtsw2_aux.c
cs2nt.c
fastmap.c
khash.h
kseq.h
ksw.c
kvec.h
simple_dp.c
utils.c
utils.h
2013-03-01 09:37:46 +00:00
Heng Li
c5434ac865
r313: release bwa-0.7.0
2013-02-28 15:56:05 -05:00
Heng Li
f3cff1c609
r311: even tighter bw for CIGAR
2013-02-27 23:59:50 -05:00
Heng Li
6a4d8c79d8
r309: bugfix - soft clipping missing in example.c
2013-02-27 22:45:18 -05:00
Heng Li
df7c3f0000
r308: added a new API to convert region to CIGAR
...
and an example program demonstrating how to do single-end alignment in <50
lines of C code.
2013-02-27 22:28:29 -05:00
Heng Li
4bb0bdddca
r306: introduce clipping penalty
...
More clipping leads to more severe reference bias. We should not clip the
alignment unless necessary.
2013-02-27 21:13:39 -05:00
Heng Li
292e92b602
r303: bugfix - wrong band width when CIGAR
2013-02-27 15:39:15 -05:00
Heng Li
e620f0ff4e
r302: updated the manpage
2013-02-27 13:16:22 -05:00
Heng Li
b621d3ae38
r301: left-align indels
...
Don't know why the change is working...
2013-02-27 00:42:19 -05:00
Heng Li
65e099df34
r300: fixed an out-of-boundary bug in rare cases
2013-02-27 00:37:17 -05:00
Heng Li
0b533385ef
r299: better way to exclude seed
2013-02-27 00:29:11 -05:00
Heng Li
acd1ab607b
r297: reduce wasteful SW extension
...
This is particularly important for long sequences
2013-02-26 16:26:46 -05:00
Heng Li
98787f0ae0
r295: generate NM
2013-02-26 13:36:01 -05:00
Heng Li
32f2d60a2e
r294: bugfix - -M not working
2013-02-26 13:14:33 -05:00
Heng Li
619ac4f93d
r293: bugfix - wrong RG type in SAM output
2013-02-26 13:03:35 -05:00
Heng Li
c6b226d719
r292: fixed a very stupid bug on CLI
...
I was thinking 0x10 or 16, but wrote 0x16...
2013-02-26 12:49:48 -05:00
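The slip described in r292 is easy to make because 0x16 looks like "hexadecimal 16" but is decimal 22 (binary 10110), which sets three bits, whereas 0x10 is decimal 16 and sets exactly one. A small check like the one below can catch such mistakes in flag constants. The helper name is hypothetical and the sketch does not show which flag the commit actually fixed.

```c
#include <stdbool.h>

/* True when v has exactly one bit set, i.e. it is usable as a single flag. */
bool is_single_bit_flag(unsigned v)
{
    return v != 0 && (v & (v - 1)) == 0;
}
```

Here is_single_bit_flag(0x10) holds, while is_single_bit_flag(0x16) does not.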
Heng Li
bfb2583d7f
r291: summary - bwt.c micro optimization
2013-02-26 12:10:19 -05:00
Heng Li
e70c7c2a71
r284: amend cross-reference hit
...
I really hate this: complex and twisted logic for a nasty scenario that almost
never happens to short reads - but it may become serious when the reference
genome consists of many contigs.
On toy examples, the code seems to work. Don't know if it really works...
2013-02-26 00:03:49 -05:00
Heng Li
61dd3bf13a
r283: prepare for fixing cross-ref aln
2013-02-25 22:49:15 -05:00
Heng Li
77b5b586ad
r282: set min split_len to read length
2013-02-25 17:29:35 -05:00
Heng Li
30cc8a95d1
fixed an unimportant memory leak
2013-02-25 16:34:19 -05:00
Heng Li
d19e834d84
r280: align two ends in the same thread
...
Otherwise odd-number threads may be of different speed from even-number threads.
2013-02-25 15:40:15 -05:00
Heng Li
20aa848b3c
r279: for PE mapq, consider the number of pairs
...
If there are a lot of proper pairs, it is more likely that the best pair is
wrong.
2013-02-25 13:00:35 -05:00
Heng Li
9957e04590
r278: don't perform too many mate-sw
2013-02-25 11:56:02 -05:00
Heng Li
e9e5ee6a3d
r277: updated the revision number
2013-02-25 11:34:06 -05:00
Heng Li
0b4a40dc25
updated revision number; to merge into master
2013-02-24 13:34:20 -05:00
Heng Li
545fb87feb
removed another part related to color-space
2013-02-22 17:15:57 -05:00
Heng Li
6ad5a3c086
removed color-space support
...
which has been broken since 0.6.x
2013-02-12 10:21:17 -05:00
Heng Li
91debf412b
move smem iterators to bwamem.{c,h}
2013-01-31 13:59:48 -05:00
Heng Li
292f9061ab
r132: optionally copy FASTA/Q comment to SAM
2012-10-26 12:54:32 -04:00
Heng Li
3abfd0743a
r131: r128 plus remote changes
2012-06-28 14:52:18 -04:00
Heng Li
f44edd4fc9
r128: more conservative chaining filter
2012-06-28 14:51:02 -04:00
Heng Li
09ee115dcc
r126: release bwa-0.6.2
2012-06-19 13:29:44 -04:00
Heng Li
29ed2d8287
rename the "api" branch as "master"
2012-06-19 13:13:29 -04:00
Heng Li
d97ff6bf72
r124: updated version number
2012-04-17 20:45:07 -04:00
Heng Li
790df95e1a
updated revision number
2012-04-02 11:43:32 -04:00
Heng Li
bdc953cad9
Tim's suggestion: suffix file name with .64
2012-03-29 12:22:51 -04:00
Heng Li
91a4a0c8ea
Release bwa-0.6.1
2011-11-28 09:52:07 -05:00
Heng Li
bf65b6463a
fastmap: optionally output the original query seq
2011-11-24 19:44:21 -05:00
Heng Li
b5170e0efa
output the NM tag
2011-11-24 11:51:38 -05:00
Heng Li
196b50dde3
optionally mark multi-part hits as secondary
2011-11-23 23:39:59 -05:00
Heng Li
182cb2e89c
use standard SW when no SSE2
2011-11-19 19:38:21 -05:00
Heng Li
dc4008936c
avoid duplicated XA tags
2011-11-19 14:52:47 -05:00
Heng Li
8f89f55484
fixed a segfault when there are too few good bases.
2011-11-17 22:13:38 -05:00
Heng Li
770a5f2ae0
Release BWA-0.6.0
2011-11-12 20:04:39 -05:00
Heng Li
7544aca718
updated revision number
2011-11-12 16:56:21 -05:00
Heng Li
8060693411
multithreading works again
2011-11-12 16:50:58 -05:00
Heng Li
fa8cfe5567
bugfix: wrong mapping quality
2011-11-12 12:12:45 -05:00
Heng Li
b42910ada6
proper mate information
2011-11-12 00:49:21 -05:00
Heng Li
e06685db45
bwa-sw PE seems to be working (SAM is incorrect)
2011-11-07 00:51:43 -05:00
Heng Li
673ae4aaf8
throw an error if insufficient memory during index
2011-10-31 13:26:24 -04:00
Heng Li
02946df28a
fixed an off-by-1 bug
2011-10-27 13:55:48 -04:00
Heng Li
7babb54e4c
drop smem based mapping algorithm
...
While we can compute smems very efficiently, there is still a long way to get
the alignment. On simulated data, this smem-based algorithm is 4X faster than
bwasw and twice as fast as bowtie2, but the accuracy is far lower than bwasw
and even lower than bowtie2 in the high-mapQ range. I am kind of sure that if
we continue to increase the mapping accuracy, the speed will approach that of bwasw,
if not slower.
Smem-based mapping algorithm is still interesting, but given that I am short of
time, I will not explore it further.
2011-10-27 10:56:09 -04:00
Heng Li
7664795ffb
fixed a minor issue about +/-1
2011-10-25 13:00:41 -04:00
Heng Li
7168f5c10a
updated revision number
2011-10-25 12:50:19 -04:00
Heng Li
22c2252e15
added bidirectional bwt; seems buggy
2011-10-25 00:22:28 -04:00
Heng Li
7b4266a6e5
bugfix: integer overflow and strand error in sampe
2011-10-24 17:07:12 -04:00
Heng Li
b59fd2bf47
fixed an integer overflow
2011-10-24 14:39:57 -04:00
Heng Li
8f3c780552
fixed a potential int overflow
2011-10-24 14:22:39 -04:00
Heng Li
1f970b4557
updated revision number
2011-10-24 14:14:42 -04:00
Heng Li
26b77eabef
updated version number
2011-10-21 12:32:00 -04:00
Heng Li
46123639cf
removed reverse pac; bwa is not working right now
2011-10-20 12:09:35 -04:00
Heng Li
d70754e234
update revision number
2011-10-14 10:32:31 -04:00
Heng Li
72563c38f3
automatically choose the algorithm for BWT
2011-06-09 17:33:25 -04:00
Heng Li
a74523a68d
increase maximum barcode length limit to 63bp
2011-06-09 17:17:13 -04:00
Heng Li
243e735431
applied patches from Alec Wysoker
2011-05-04 09:46:50 -04:00
Heng Li
87664941b0
Release bwa-0.5.9 (r16)
2011-01-24 22:00:24 -05:00
Heng Li
7fd8948689
Added recommendation for PacBio reads
2011-01-22 13:20:11 -05:00
Heng Li
1d7d8be9e8
Put BC: to both ends
2011-01-18 20:16:57 -05:00
Heng Li
51d354cd28
Added barcode support
2011-01-15 15:35:39 -05:00
Heng Li
10721ca602
Added an option to accept Illumina 1.3+ fastq
2011-01-15 14:07:08 -05:00
Heng Li
f335b33624
fixed a bug in bwase: no RG for unmapped read pairs
2011-01-15 10:32:45 -05:00
Heng Li
5e30884730
Update to the latest modification 0.5.9rc1-2. Update ChangeLog
2011-01-13 20:54:10 -05:00
Heng Li
007c3eb75d
Imported from my local bwa repository, the master repository.
2011-01-13 20:52:12 -05:00