Rob Davies
4cb5110d03
Merge branch 'master' into master_fixes
2013-04-22 09:51:07 +01:00
Heng Li
2087dc162f
r377: increased unpaired penalty from 9 to 17
...
This leads to more aggressive pairing - more properly paired reads. I have
found a few cases where, for example, read1 is unambiguously mapped to chr20
while its 100bp mate has a perfect match to another chr but has 3 mismatches
and 1 deletion when it is paired with read1 on chr20. With longer reads, it
seems that the chr20 hit is correct, although it is not obvious how this
happened in evolution.
2013-04-17 16:50:20 -04:00
Rob Davies
3dd10bd7db
Merge branch 'master' into master_fixes
2013-04-12 16:20:13 +01:00
Rob Davies
90ecd344ba
Merge branch 'master' into master_fixes. Merged up to master r375.
...
Conflicts:
bwt.c
2013-04-11 11:15:39 +01:00
Heng Li
499cf4c00d
r376: reduce wasteful seed extension
...
mainly for contig alignment
2013-04-10 12:18:56 -04:00
Heng Li
3d8a8c1e37
r374: fix - clipping penalty not always working
...
This only happens to gaps where mem underestimates the bandwidth without
considering the clipping penalty.
2013-04-10 01:09:37 -04:00
Heng Li
d7ca0885eb
r371: extend overlapping seeds
...
to avoid misalignment in tandem repeats
2013-04-04 00:43:43 -04:00
Heng Li
1e118e0823
r370: suppress "D" at the end of a cigar
...
This is caused by seeds in tandem repeats, in which case, bwa-mem may not
extend the true seed. The change in this commit is only a temporary cure.
2013-04-03 23:57:19 -04:00
Rob Davies
c89756e2b0
Merge branch 'master' into master_fixes
2013-03-19 12:11:51 +00:00
Heng Li
8437cd4edd
r369: bugfix - segfault caused by the last change
...
Sigh... Even the simplest change can lead to new bugs.
2013-03-19 01:04:57 -04:00
Heng Li
1e3cadbfc2
r368: bugfix - wrong CIGAR when bridging 3 contigs
...
In this case, bwa_fix_xref() will return insane coordinates. The old version
did not check the return status and wrote a wrong CIGAR. This bug only happens
with very short assembly contigs.
2013-03-18 20:49:32 -04:00
Rob Davies
c862a1a396
Merge branch 'master' into master_fixes
2013-03-18 13:35:12 +00:00
Heng Li
9346acde1b
Release bwa-0.7.3a-r367
...
In 0.7.3, the wrong CIGAR bug was only fixed in one scenario, but not fixed
in another corner case.
2013-03-15 21:26:37 -04:00
Heng Li
dd51177837
r365: bugfix - wrong alignment (right mapping)
...
The bug only happens when there is a 1bp del and 1bp ins which are close to the
end and there are no other substitutions or indels. In this case, bwa mem gave
a wrong band width.
2013-03-15 11:59:05 -04:00
Rob Davies
cca27c1ef5
Merge branch 'master' into master_fixes
...
Conflicts:
bwamem.c
bwamem_pair.c
example.c
2013-03-13 12:12:28 +00:00
Heng Li
bdf34f6ce7
r363: XA=>XP; output mapQ in XP
...
In BWA, XA gives hits "shadowed" by the primary hit. In BWA-MEM, we output
primary hits only. Primary hits may have non-zero mapping quality.
2013-03-12 09:56:04 -04:00
Heng Li
c29b176cb6
r362: bugfix - occasionally wrong TLEN
...
Use the 0.7.2 way to compute TLEN
2013-03-12 00:14:36 -04:00
Heng Li
dab5b17c1a
r360: output alternative primary alignments in XA
2013-03-11 23:43:58 -04:00
Heng Li
6c665189ad
r359: identical output to 0.7.2 (without -a)
2013-03-11 23:16:18 -04:00
Heng Li
0f88103d2a
SAM almost identical to 0.7.2
2013-03-11 23:01:51 -04:00
Heng Li
26f4c704ed
drop the old SAM writer
2013-03-11 22:24:54 -04:00
Heng Li
ebb45dc42e
new code works for SE
2013-03-11 21:59:15 -04:00
Heng Li
c7edaa8e84
to test the new sam writer...
2013-03-11 21:55:52 -04:00
Heng Li
47952b6f3f
drop an unnecessary member from mem_aln_t
2013-03-11 21:35:32 -04:00
Heng Li
8f0d439913
prepare to replace the SAM printing code
...
This move is dangerous as SAM printing is very complex, but it will pay off in
the long run. The planned change will reduce redundancy, improve clarity and,
most importantly, make it much easier to output multiple primary hits in an
optional tag.
2013-03-11 21:25:17 -04:00
Rob Davies
9228e48efd
Merge branch 'master' into master_fixes
...
Conflicts:
Makefile
2013-03-11 13:50:49 +00:00
Heng Li
9ea7f83974
Emergency bugfix: wrong TLEN sign
...
It is interesting that Picard did not find the issue.
2013-03-09 18:03:15 -05:00
Heng Li
66c9783daf
r345: bugfix in mem - wrong mate strand for unmap
...
Received a clean bill from Picard
2013-03-08 13:15:43 -05:00
Heng Li
af7b4d8980
gcc wrongly thinks a variable may be uninitialized
...
It should always be initialized. To avoid a warning, made a change.
2013-03-08 12:45:50 -05:00
Heng Li
274c0ac96c
r343: bugfix in mem - wrong mate info for unmap
...
SAM generation is always among the nastiest bits. I would need to refactor at
some point (hardly happening).
2013-03-08 12:40:31 -05:00
Rob Davies
aabd990e8f
Merge branch 'master' into master_fixes
...
Conflicts:
Makefile
bwape.c
bwase.c
bwtsw2_aux.c
stdaln.c
2013-03-08 16:46:45 +00:00
Heng Li
5fbd454682
r332: added output threshold
...
Otherwise there are far too many short hits
2013-03-05 22:49:38 -05:00
Heng Li
07921659cf
move mem_fill_scmat() to bwa.{h,c}
2013-03-05 09:38:12 -05:00
Rob Davies
8a078cc16d
Merge branch 'master' into master_fixes
...
Conflicts:
bntseq.c
bwamem.c
2013-03-05 10:21:07 +00:00
Heng Li
efd9769b07
r324: a little code cleanup
...
The changes after r317 aim to improve the performance and accuracy for very
long query alignment. The short-read alignment should not be affected. The
changes include:
1) Z-dropoff. This is a variant of blast's X-dropoff. I originally thought this
heuristic only improves speed, but now I realize it also reduces poor
alignment with long good flanking alignments. The difference from blast's
X-dropoff is that Z-dropoff allows big gaps, but X-dropoff does not.
2) Band width doubling. When band width is too small, we will get a poor
alignment in the middle. Sometimes such alignments cannot be fully excluded
with Z-dropoff. Band width doubling is an alternative heuristic. It is based
on the observation that the existence of a close-to-boundary high score
possibly implies inadequate band width. When we see such a signal, we double
the band width.
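
The two heuristics above can be sketched roughly as follows. This is an
illustration only, not the code from this commit: the function names, the
single gap-extension parameter and the "best cell in the outer quarter of the
band" rule are all assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* X-dropoff (blast style): stop as soon as the current score has fallen
 * more than xdrop below the best score seen so far. */
static int xdrop_stop(int best, int cur, int xdrop)
{
    assert(xdrop > 0);
    return best - cur > xdrop;
}

/* Z-dropoff sketch: like X-dropoff, but the part of the drop that a gap
 * between the best cell (best_i, best_j) and the current cell (i, j) would
 * explain is forgiven first. gap_ext is a hypothetical per-base gap
 * extension penalty. */
static int zdrop_stop(int best, int best_i, int best_j,
                      int cur, int i, int j, int gap_ext, int zdrop)
{
    int diag_shift;
    assert(zdrop > 0 && gap_ext >= 0);
    diag_shift = abs((i - best_i) - (j - best_j)); /* implied gap length */
    return best - cur - diag_shift * gap_ext > zdrop;
}

/* Band width doubling sketch. max_off[t] is the distance from the main
 * diagonal of the best-scoring cell found on try t, which used band width
 * w0 << t. Assumed signal: a best cell in the outer quarter of the band
 * suggests the band was too narrow, so the next try doubles it. */
static int pick_band_width(int w0, const int *max_off, int n_tries)
{
    int t, w = w0;
    assert(w0 > 0 && n_tries > 0);
    for (t = 0; t < n_tries; ++t) {
        w = w0 << t;
        if (max_off[t] < (w >> 1) + (w >> 2)) break; /* well inside band */
    }
    return w;
}
```

With best = 100 at cell (50,50), a current score of 60 at cell (60,90) and a
threshold of 20, xdrop_stop() fires while zdrop_stop() with gap_ext = 1 does
not, because a 30bp gap accounts for most of the drop. At cell (60,60), where
no gap is implied, both fire.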
2013-03-05 00:57:16 -05:00
Heng Li
e0991d6a45
r323: added Z-dropoff, a variant of blast's X-drop
2013-03-05 00:34:33 -05:00
Heng Li
d6096c3f99
bugfix: caused by the latest change
2013-03-04 18:41:57 -05:00
Heng Li
59bc9341f6
code backup; more changes coming later
2013-03-04 17:29:07 -05:00
Heng Li
733410b50d
r320: speed up very long sequence alignment
...
100-200bp read alignment should not be affected at all.
2013-03-04 14:43:49 -05:00
Heng Li
40f1214736
change to debugging code only
2013-03-04 11:52:11 -05:00
Heng Li
7e00dbcac5
r317: bugfix - out-of-range extension
...
This happens when the target region crosses the forward-reverse boundary. This
will almost never happen in short-read alignment.
2013-03-04 11:35:23 -05:00
Heng Li
3e4a178e08
r314: cleanup bwamem API
...
Don't modify input sequences; more documentation
2013-03-01 11:14:51 -05:00
Rob Davies
6beab5f765
Merge branch 'master' into master_fixes
...
Merge changes to commit c5434ac (0.7.0 release)
Conflicts:
Makefile
bwamem.c
2013-03-01 10:22:49 +00:00
Rob Davies
3d33ab063e
Merge branch 'master' into master_fixes
...
Merged to master version b621d3a
Conflicts:
Makefile
bntseq.c
bwa.c
bwase.c
bwaseqio.c
bwtaln.c
bwtindex.c
bwtio.c
bwtmisc.c
bwtsw2_aux.c
cs2nt.c
fastmap.c
khash.h
kseq.h
ksw.c
kvec.h
simple_dp.c
utils.c
utils.h
2013-03-01 09:37:46 +00:00
Heng Li
f3cff1c609
r311: even tighter bw for CIGAR
2013-02-27 23:59:50 -05:00
Heng Li
a33b9c0633
tighter bw for cigar SW
2013-02-27 23:40:46 -05:00
Heng Li
6a4d8c79d8
r309: bugfix - soft clipping missing in example.c
2013-02-27 22:45:18 -05:00
Heng Li
df7c3f0000
r308: added a new API to convert region to CIGAR
...
and an example program demonstrating how to do single-end alignment in <50
lines of C code.
2013-02-27 22:28:29 -05:00
Heng Li
4bb0bdddca
r306: introduce clipping penalty
...
More clipping leads to more severe reference bias. We should not clip the
alignment unless necessary.
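
One way to picture the rule is the sketch below. It is a simplified
illustration, not the code from this commit: should_clip(), its parameters and
the tie-breaking are assumptions made for the example.

```c
#include <assert.h>

/* Decide whether to soft-clip an extension.
 *   local_score  - best score of any local extension
 *   to_end_score - best score of an extension reaching the end of the read,
 *                  or <= 0 if no such extension was found
 *   pen_clip     - clipping penalty (hypothetical parameter name)
 * Returns 1 to clip at the local optimum, 0 to keep the read unclipped. */
static int should_clip(int local_score, int to_end_score, int pen_clip)
{
    assert(pen_clip >= 0);
    if (to_end_score <= 0) return 1;       /* nothing reaches the read end */
    return to_end_score <= local_score - pen_clip;
}
```

For example, with a penalty of 5, a local score of 50 against an end-to-end
score of 47 keeps the read unclipped, while a local score of 60 against the
same 47 clips it.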
2013-02-27 21:13:39 -05:00
Heng Li
65e099df34
r300: fixed an out-of-boundary bug in a rare case
2013-02-27 00:37:17 -05:00