Heng Li
ccbbe48c4f
dev-470: don't stop on bwa_fix_xref2() failures
...
Peter Field has sent me an example caused by an alignment bridging three
adjacent chromosomes/contigs. Bwa-mem always aligns the query to the contig
covering the middle point of the alignment. In this example, it chooses the
middle contig, which should not be aligned. This makes bwa_fix_xref2() fail in
unexpected ways, which cannot be fixed unless we build the contig boundaries
into the FM-index.
In the old code, bwa-mem halts when bwa_fix_xref2() fails. With this commit,
bwa-mem will give a warning instead of halting.
2014-04-10 11:43:17 -04:00
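The midpoint rule described in dev-470 can be illustrated with a minimal sketch. It is not the actual bwa_fix_xref2() code: the `contig_t` record and the `contig_of()` and `pick_contig()` helpers are hypothetical names, and the sketch assumes contigs are stored sorted and back to back in one concatenated coordinate space. It picks the contig covering the middle point of the alignment and, when the alignment crosses a contig boundary, prints a warning rather than aborting.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical contig record: start offset in the concatenated reference
 * and length. Contigs are assumed sorted by offset and non-overlapping. */
typedef struct { int64_t offset, len; } contig_t;

/* Binary search for the contig containing pos; returns -1 if none does. */
static int contig_of(const contig_t *c, int n, int64_t pos)
{
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (pos < c[mid].offset) hi = mid - 1;
        else if (pos >= c[mid].offset + c[mid].len) lo = mid + 1;
        else return mid;
    }
    return -1;
}

/* Choose the contig covering the middle point of the half-open interval
 * [rb, re). *crosses is set to 1 when the alignment does not fit inside
 * that contig; in that case a warning is printed and execution continues. */
static int pick_contig(const contig_t *c, int n, int64_t rb, int64_t re, int *crosses)
{
    int id;
    assert(rb < re);
    id = contig_of(c, n, rb + (re - rb) / 2);
    *crosses = id < 0 || rb < c[id].offset || re > c[id].offset + c[id].len;
    if (*crosses)
        fprintf(stderr, "[W::pick_contig] alignment [%lld,%lld) crosses a contig boundary\n",
                (long long)rb, (long long)re);
    return id;
}
```

With three contigs where the middle one is very short, an alignment spanning all three has its midpoint inside the short middle contig, which is the situation the commit message describes.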
Heng Li
8220008564
an attempt to layout tool
2014-04-09 16:11:52 -04:00
Heng Li
db58392e9b
dev-469: fixed wrong command line prompt
2014-04-09 13:20:04 -04:00
Heng Li
d766591c1e
dev-468: fixed a segfault caused by NULL
2014-04-08 22:11:36 -04:00
Heng Li
99f6f9a0d1
dev-467: limit the max #chains to extend
2014-04-08 21:45:49 -04:00
Heng Li
c0a308a8b6
dev-466: simplified chain filtering
2014-04-08 17:33:07 -04:00
Heng Li
f12dfae772
dev-465: a new output format for read overlap
...
Also moved a few functions to bwamem_extra.c. File bwamem.c is becoming far too
long.
2014-04-08 16:29:36 -04:00
Heng Li
b45aeb87e1
dev-464: preset for pacbio read2read aln
2014-04-08 11:40:54 -04:00
Heng Li
172ba83241
dev-463: added option -x to change multiple params
...
I hate to copy-paste long command line options.
2014-04-07 11:29:36 -04:00
Heng Li
114901b005
dev-r462: refined setting for PacBio; weight flt
...
The recommended setting in the last commit is wrong. If we can extend a random
seed hit to the full length, we will force the read to align through
breakpoints, which is wrong. The new setting is better, but it may lead to a
small fraction of fragmented alignments.
In addition, I added a filter on the minimum chain weight and tied
min_HSP_score to this filter. It doubles the mapping speed.
2014-04-04 17:01:04 -04:00
Heng Li
41f720dfa7
dev-461: added a heuristic for PacBio data
...
See the comment above mem_test_chain_sw() for details.
2014-04-04 16:05:41 -04:00
Heng Li
066ec4aa95
dev-460: disallow a cigar 20M2D2I30M in extension
...
Global alignment does not allow contiguous insertions and deletions, but local
alignment and extension allow such CIGARs. The optimal global alignment may
have a lower score than extension, which actually happens often for PacBio
data. This commit disallows a CIGAR like 20M2D2I30M to fix this inconsistency.
Local alignment has not been changed.
2014-04-04 10:44:34 -04:00
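To make the CIGAR pattern from dev-460 concrete, here is a small sketch that detects an insertion directly adjacent to a deletion in a CIGAR string. The function name `cigar_has_adjacent_indel()` is hypothetical and is not part of bwa; the commit itself changes the extension code rather than post-checking CIGARs, so this only illustrates which CIGARs are affected.

```c
#include <assert.h>
#include <ctype.h>

/* Return 1 if the CIGAR string contains an 'I' operation immediately
 * followed by a 'D' operation or vice versa (e.g. 20M2D2I30M), else 0.
 * Hypothetical helper for illustration only. */
static int cigar_has_adjacent_indel(const char *cigar)
{
    char prev = 0;
    const char *p = cigar;
    assert(cigar != 0);
    while (*p) {
        while (isdigit((unsigned char)*p)) ++p; /* skip the operation length */
        if (*p == 0) break;                     /* trailing digits, no operator */
        if ((prev == 'I' && *p == 'D') || (prev == 'D' && *p == 'I'))
            return 1;
        prev = *p++;
    }
    return 0;
}
```

The function returns 1 for 20M2D2I30M, where the deletion and insertion are contiguous, and 0 for 20M2D10M2I30M, where matches separate them.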
Heng Li
b6bd33b26c
dev-459: don't hard code the drop ratio
...
In the old code, a secondary alignment that is 50% worse is not output.
2014-04-03 18:58:49 -04:00
Heng Li
b3225581be
dev-458: simplified the smem iterator
...
Simpler but less powerful.
2014-04-03 15:23:48 -04:00
Heng Li
acfe7613db
dev-457: separated interval collection and seeding
2014-04-03 15:10:50 -04:00
Heng Li
3efb7c0e91
r455: release bwa-0.7.8
2014-03-31 15:27:23 -04:00
Heng Li
127c00cc96
dev-454: wording change in command line prompt
2014-03-31 12:03:27 -04:00
Heng Li
b27bdf1ae0
dev-453: change of -A scales -TdBOELU
...
These parameters are all proportional to -A.
2014-03-31 11:52:52 -04:00
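The scaling in dev-453 can be pictured with the sketch below. The `opt_t` struct, its field names, and `scale_by_match_score()` are hypothetical and do not mirror bwa's mem_opt_t. The sketch assumes the stored values were chosen for a matching score of 1 and scales all of them unconditionally; whether the real code skips options the user set explicitly is not stated in the commit message.

```c
#include <assert.h>

/* Hypothetical option record; field names are illustrative only. */
typedef struct {
    int a;        /* -A: matching score */
    int b;        /* -B: mismatch penalty */
    int o, e;     /* -O, -E: gap open and gap extension penalties */
    int clip;     /* -L: clipping penalty */
    int unpaired; /* -U: penalty for an unpaired read pair */
    int zdrop;    /* -d: off-diagonal X-dropoff */
    int T;        /* -T: minimum score to output */
} opt_t;

/* Scale every score-dependent option by the new matching score a,
 * assuming the current values correspond to a matching score of 1. */
static void scale_by_match_score(opt_t *opt, int a)
{
    assert(opt != 0 && a > 0);
    opt->b *= a;
    opt->o *= a;
    opt->e *= a;
    opt->clip *= a;
    opt->unpaired *= a;
    opt->zdrop *= a;
    opt->T *= a;
    opt->a = a;
}
```

Scaling everything together keeps the ratios between match score and penalties unchanged, so alignments are ranked the same way and only the absolute scores change.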
Heng Li
b7076d9023
dev-r452: allow to specify insert size at cmd
...
This is also very useful for debugging.
2014-03-31 11:21:03 -04:00
Heng Li
417c6d66c7
dev-r451: fixed a few bugs when -A!=1
...
Something is still wrong.
2014-03-31 10:52:45 -04:00
Heng Li
9ce50a4e5e
dev-450: support diff ins/del penalties. NO TEST!!
2014-03-28 14:54:06 -04:00
Heng Li
578bb55c38
dev-449: unequal ins/del in global() and extend()
2014-03-28 14:15:38 -04:00
Heng Li
0c783399e8
dev-448: different ins/del penalties
2014-03-28 10:54:23 -04:00
Heng Li
2e9463ebf1
dev-r442: suppress exact full-length matches
2014-02-26 22:04:19 -05:00
Heng Li
1c19bc630f
Released bwa-0.7.7-r441
2014-02-25 01:05:37 -05:00
Heng Li
e879817373
r440: a condition did not work due to a typo
2014-02-20 13:06:40 -05:00
Heng Li
ce026a07fc
r439: expose mem_opt_t::max_matesw
2014-02-19 13:10:33 -05:00
Heng Li
17fb85a227
r438: still an issue in MD
...
It occurs when the global alignment disagrees with the local alignment.
2014-02-19 11:31:54 -05:00
Heng Li
52391a9855
r437: print timing for each batch of reads
2014-02-19 10:54:26 -05:00
Heng Li
bdd14d2946
r436: fix rare MD/NM-CIGAR inconsistencies
2014-02-19 10:08:43 -05:00
Heng Li
4adc34eccb
r435: bugfix - base not complemented on the rev
2014-02-18 10:32:24 -05:00
Heng Li
14aa43cca0
r434: added the missing bwasw/aln commands!
2014-02-12 15:39:02 -05:00
Heng Li
7c50bad567
Release bwa-0.7.6a-r433
2014-01-31 12:58:21 -05:00
Heng Li
5fdab3ae13
Released bwa-0.7.6-r432
2014-01-31 11:12:59 -05:00
Heng Li
f524c7d3d8
r431: added the MD tag to bwa-mem
2014-01-29 12:05:11 -05:00
Heng Li
ea3dc2f003
r430: fix a bug producing incorrect alignment
...
Ksw uses two rounds of SSE2-SW to find the boundaries of an alignment. If the
second round gives a different score from the first round, it will fail. The
fix checks if this happens, though I have not dug into an example to understand
why this may happen in the first place.
2014-01-29 10:51:02 -05:00
Bradford Powell
c26ba4e376
fix duplicate PG lines in bwape and bwase
2014-01-05 14:54:48 -05:00
Heng Li
10cb6b0507
r428: allow to change the default chain_drop_ratio
2013-12-30 16:18:45 -05:00
Heng Li
f70d80a5a2
r427: fixed bugs in backtrack
...
See comments in ksw_global() for details.
2013-12-30 15:40:18 -05:00
Heng Li
8b6ec74907
r424: fixed a bw bug in samse/pe
2013-11-25 15:48:04 -05:00
Heng Li
4219e58623
r423: bugfix - SE hits not random
2013-11-23 09:36:26 -05:00
Heng Li
29aa855432
r422: matesw hits not sorted
2013-11-21 14:43:50 -05:00
Heng Li
ff4762f3c7
r421: bw doubling in the final alignment
...
In some cases, the band width used in the final alignment needs to be larger
than the band width in extension.
2013-11-20 10:04:16 -05:00
Heng Li
6e3fa0515a
r420: inferred bandwidth is not used in the final
2013-11-20 09:50:46 -05:00
Heng Li
ff6faf811a
r419: print the @PG line
2013-11-19 11:08:45 -05:00
Heng Li
deb19593aa
r418: use the new mapQ estimator by default
2013-11-02 12:25:53 -04:00
Heng Li
c564653b40
r416: removed a line of debugging code
2013-09-12 10:41:43 -04:00
Heng Li
7144a0cefc
r415: bug in the new (optional) mapQ computation
...
I may use the new method as the default. Testing needed.
2013-09-09 17:51:05 -04:00
Heng Li
ebb7b02e9b
r414: fixed a bug caused by the last commit
2013-09-09 16:57:55 -04:00
Heng Li
b51a66e4c1
r413: fixed an issue causing redundant alignment
...
I have seen a fosmid aligned to the same position but with two slightly
different CIGARs: 30000M and 29900M50D100M, possibly caused by tandem repeats.
0.7.5a regards them as two distinct alignments and generates a very small
mapping quality. However, these two are essentially the same. Although there is
ambiguity in aligning the end of the fosmid, we should not penalize the entire
alignment with a small mapQ. This commit fixes this issue. More testing is
needed, though.
2013-09-09 11:36:50 -04:00
Heng Li
1346f03ff1
use the old mapQ by default
...
The new mapQ overestimates.
2013-09-06 14:04:41 -04:00
Heng Li
ed78df9184
Merge branch 'master' into clip2
2013-08-28 16:00:34 -04:00
Heng Li
3b84c03c1e
r406: allow to use diff clipping penalties
...
for 5'-end or for 3'-end
2013-08-28 15:59:05 -04:00
John Marshall
b88718d8f4
Reformat note for 80 columns, and fix typo
2013-06-14 14:03:08 +01:00
Heng Li
7ec8b5c9e7
Release bwa-0.7.5a
2013-05-30 16:20:16 -04:00
Heng Li
ef18cb91cb
Release bwa-0.7.5-r404
2013-05-29 11:49:08 -04:00
Heng Li
73619754f8
r401: bugfix - forgot to change sampe
...
some changes to samse should also be applied to sampe
2013-05-27 22:24:35 -04:00
Heng Li
599e840779
r397: multi changes/bugfixes to bwa-backtrack
...
1. Check .sai versioning
2. Keep track of #ins and #del during backtrack
3. Use info above to get accurate aligned regions; don't call SW extension any more
4. Identify alignment crossing the for-rev boundary
5. Fixed a bug in printing the XA tag: ungapped alignments missing
2013-05-24 16:28:18 -04:00
Heng Li
bde5005f39
r396: er... the new tag is named SA not SP
2013-05-23 12:48:18 -04:00
Heng Li
3d2450ed97
r395: bugfix - hard clipping not applied on revaln
2013-05-23 12:45:14 -04:00
Heng Li
9441bb7f2a
r394: added future plan
2013-05-22 20:02:53 -04:00
Heng Li
9a6abe51b6
r391: better method to resolve xref alignment
...
The old method does not work when the alignment bridges three chr. This may
actually happen often. The new method does not work all the time, either, but
should be better than the old one. It is also simpler, arguably.
2013-05-22 18:57:51 -04:00
Rob Davies
e88529687f
Merge branch 'master' into master_fixes. Merged up to r389.
...
Conflicts:
bwamem.c
kopen.c
2013-04-29 12:09:30 +01:00
Heng Li
1a2bd2cf91
r389: return non-zero upon errors
2013-04-27 10:08:01 -04:00
Heng Li
19cb7cd7ed
r388: cleanup mem_process_seqs() interface
...
Print output outside the function and allow feeding in the insert size distribution.
2013-04-26 12:31:18 -04:00
Heng Li
8896cb942e
r386: bugfix - samse/pe segfault
...
This happens when a read is aligned across the forward-reverse boundary.
2013-04-24 16:00:02 -04:00
Rob Davies
b3d0a13b32
Merge branch 'master' into master_fixes. Merged up to release bwa-0.7.4-r385.
2013-04-23 17:31:34 +01:00
Heng Li
c14aaad1ce
Released bwa-0.7.4-r385
2013-04-23 11:40:56 -04:00
Heng Li
2f6897c72b
r384: don't compile bwamem-lite by default
2013-04-23 11:27:30 -04:00
Heng Li
78ed00021f
r384: updated NEWS
2013-04-23 11:25:46 -04:00
Rob Davies
4cb5110d03
Merge branch 'master' into master_fixes
2013-04-22 09:51:07 +01:00
Heng Li
f6ae0d4d0f
r382: similar treatment in bwa-sw (see r381)
2013-04-19 17:52:06 -04:00
Heng Li
3f8caef33c
r381: fixed a bug when upper bound < max read len
2013-04-19 17:44:35 -04:00
Heng Li
db7a98636f
r380: er... another compiling error
2013-04-19 12:04:44 -04:00
Heng Li
f0c94d80d1
r379: fixed compiling error
2013-04-19 12:04:00 -04:00
Heng Li
be11e27e12
r378: bugfix - wrong CIGAR
...
This is actually caused by a bug in SSE2-SW, where the query begin may be
smaller than the true one if there is an exact tandem repeat.
2013-04-19 12:00:37 -04:00
Heng Li
2087dc162f
r377: increased unpaired penalty from 9 to 17
...
This leads to more aggressive pairing - more properly paired reads. I have
found a few cases where, for example, read1 is unambiguously mapped to chr20
while its 100bp mate has a perfect match to another chr but has 3 mismatches
and 1 deletion when it is paired with read1 on chr20. With longer reads, it
seems that the chr20 hit is correct, although it is not obvious how this
happened in evolution.
2013-04-17 16:50:20 -04:00
Rob Davies
3dd10bd7db
Merge branch 'master' into master_fixes
2013-04-12 16:20:13 +01:00
Rob Davies
90ecd344ba
Merge branch 'master' into master_fixes. Merged up to master r375.
...
Conflicts:
bwt.c
2013-04-11 11:15:39 +01:00
Heng Li
499cf4c00d
r376: reduce wasteful seed extension
...
mainly for contig alignment
2013-04-10 12:18:56 -04:00
Heng Li
47520134e7
r375: fixed compiling errors by the last change
2013-04-10 11:04:32 -04:00
Heng Li
3d8a8c1e37
r374: fix - clipping penalty not always working
...
This only happens to gaps where mem underestimates the bandwidth without
considering the clipping penalty.
2013-04-10 01:09:37 -04:00
Heng Li
53bb846407
r373: optionally disable mate rescue
2013-04-09 16:13:55 -04:00
Heng Li
d64eaa851d
fixed an issue caused by a Mac/Darwin bug
...
On Mac/Darwin, it is not possible to read >2GB data with one fread().
2013-04-09 15:17:04 -04:00
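A common way to work around a per-call read limit like the one described above is to split one large read into smaller fread() calls. The sketch below assumes that approach; `fread_chunked()` and its `chunk` parameter are hypothetical names, and this is not necessarily how the commit implements the fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Read up to size bytes from fp into buf, issuing fread() calls of at
 * most chunk bytes each. Returns the number of bytes actually read,
 * which is smaller than size on end-of-file or on a read error. */
static size_t fread_chunked(void *buf, size_t size, FILE *fp, size_t chunk)
{
    size_t done = 0;
    assert(buf != 0 && fp != 0 && chunk > 0);
    while (done < size) {
        size_t want = size - done < chunk ? size - done : chunk;
        size_t got = fread((char *)buf + done, 1, want, fp);
        done += got;
        if (got < want) break; /* short read: EOF or error */
    }
    return done;
}
```

A caller would pass a chunk size comfortably below 2GB, and can use feof() or ferror() on the stream to tell why a short count was returned.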
Heng Li
d7ca0885eb
r371: extend overlapping seeds
...
to avoid misalignment in tandem repeats
2013-04-04 00:43:43 -04:00
Heng Li
1e118e0823
r370: suppress "D" at the end of a cigar
...
This is caused by seeds in tandem repeats, in which case, bwa-mem may not
extend the true seed. The change in this commit is only a temporary cure.
2013-04-03 23:57:19 -04:00
Rob Davies
c89756e2b0
Merge branch 'master' into master_fixes
2013-03-19 12:11:51 +00:00
Heng Li
8437cd4edd
r369: bugfix - segfault caused by the last change
...
Sigh... Even the simplest change can lead to new bugs.
2013-03-19 01:04:57 -04:00
Heng Li
1e3cadbfc2
r368: bugfix - wrong CIGAR when bridging 3 contigs
...
In this case, bwa_fix_xref() will return insane coordinates. The old version
did not check the return status and wrote a wrong CIGAR. This bug only happens
with very short assembly contigs.
2013-03-18 20:49:32 -04:00
Rob Davies
c862a1a396
Merge branch 'master' into master_fixes
2013-03-18 13:35:12 +00:00
Heng Li
9346acde1b
Release bwa-0.7.3a-r367
...
In 0.7.3, the wrong CIGAR bug was only fixed in one scenario, but not fixed
in another corner case.
2013-03-15 21:26:37 -04:00
Heng Li
7dec00c217
Release BWA-0.7.3-r366
2013-03-15 12:51:53 -04:00
Heng Li
dd51177837
r365: bugfix - wrong alignment (right mapping)
...
The bug only happens when there is a 1bp del and 1bp ins which are close to the
end and there are no other substitutions or indels. In this case, bwa mem gave
a wrong band width.
2013-03-15 11:59:05 -04:00
Heng Li
e5355fe3a0
r364: bug in mem pairing (no effect with -A=1)
...
Forgot to adjust for matching score. This bug has no effect when -A takes the
default value.
2013-03-14 22:01:26 -04:00
Rob Davies
cca27c1ef5
Merge branch 'master' into master_fixes
...
Conflicts:
bwamem.c
bwamem_pair.c
example.c
2013-03-13 12:12:28 +00:00
Heng Li
bdf34f6ce7
r363: XA=>XP; output mapQ in XP
...
In BWA, XA gives hits "shadowed" by the primary hit. In BWA-MEM, we output
primary hits only. Primary hits may have non-zero mapping quality.
2013-03-12 09:56:04 -04:00
Heng Li
c29b176cb6
r362: bugfix - occasionally wrong TLEN
...
Use the 0.7.2 way to compute TLEN
2013-03-12 00:14:36 -04:00
Heng Li
aa7cdf4bb3
r361: flag proper pair even if multi-primary
...
Up to here, all the features in my checklist have been implemented.
2013-03-12 00:00:04 -04:00
Heng Li
dab5b17c1a
r360: output alternative primary alignments in XA
2013-03-11 23:43:58 -04:00
Heng Li
6c665189ad
r359: identical output to 0.7.2 (without -a)
2013-03-11 23:16:18 -04:00