Heng Li
efd9769b07
r324: a little code cleanup
...
The changes after r317 aim to improve the performance and accuracy of very
long query alignment. Short-read alignment should not be affected. The
changes include:
1) Z-dropoff. This is a variant of blast's X-dropoff. I originally thought this
heuristic only improved speed, but now I realize it also reduces poor
alignments flanked by long good alignments. The difference from blast's
X-dropoff is that Z-dropoff allows big gaps, but X-dropoff does not.
2) Band width doubling. When the band width is too small, we get a poor
alignment in the middle. Sometimes such alignments cannot be fully excluded
with Z-dropoff. Band width doubling is an alternative heuristic. It is based
on the observation that a high score close to the band boundary possibly
implies inadequate band width. When we see such a signal, we double
the band width.
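
The two heuristics above can be sketched roughly as follows. This is only an
illustration, not the code from this commit: the function names, the
single gap_ext penalty, and the 3/4 band threshold are all assumptions made
for the example.

```c
#include <stdlib.h>

/* X-dropoff: stop once the current score m is more than xdrop below the
 * best score seen so far (max). */
static int xdrop_stop(int max, int m, int xdrop)
{
    return max - m > xdrop;
}

/* Z-dropoff (sketch): the same test, but the drop is first reduced by the
 * cost of a gap spanning the diagonal shift between the best cell
 * (max_i, max_j) and the current cell (i, j). gap_ext is an assumed
 * per-base gap extension penalty. */
static int zdrop_stop(int max, int max_i, int max_j, int m, int i, int j,
                      int gap_ext, int zdrop)
{
    int shift = abs((i - max_i) - (j - max_j));
    return max - m - shift * gap_ext > zdrop;
}

/* Band width doubling (sketch): max_off is how far the best-scoring cell
 * lies from the main diagonal. If it is close to the band boundary, the
 * band is doubled. The 3/4 threshold is an assumption. */
static int next_band_width(int w, int max_off)
{
    return max_off >= w - (w >> 2) ? w << 1 : w;
}
```

With a 50-base diagonal shift and gap_ext = 1, a score drop of 60 stops an
X-dropoff extension at threshold 20 but not a Z-dropoff extension, because
the gap accounts for 50 of the 60 lost points.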
2013-03-05 00:57:16 -05:00
Heng Li
e0991d6a45
r323: added Z-dropoff, a variant of blast's X-drop
2013-03-05 00:34:33 -05:00
Heng Li
59bc9341f6
code backup; more changes coming later
2013-03-04 17:29:07 -05:00
Heng Li
35fb7f9fdf
r315: move kopen.o out of libbwa.a
2013-03-01 11:47:51 -05:00
Heng Li
3e4a178e08
r314: cleanup bwamem API
...
Don't modify input sequences; more documentation
2013-03-01 11:14:51 -05:00
Heng Li
6a4d8c79d8
r309: bugfix - soft clipping missing in example.c
2013-02-27 22:45:18 -05:00
Heng Li
df7c3f0000
r308: added a new API to convert region to CIGAR
...
and an example program demonstrating how to do single-end alignment in <50
lines of C code.
2013-02-27 22:28:29 -05:00
Heng Li
4bb0bdddca
r306: introduce clipping penalty
...
More clipping leads to more severe reference bias. We should not clip the
alignment unless necessary.
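
One way to picture the rule, as a sketch under assumptions rather than the
code in this commit: clip only when the best local score beats the
end-to-end score by more than the penalty. The names should_clip,
local_score, end_score and pen_clip are hypothetical.

```c
/* Returns 1 if the alignment should be clipped (keep the local alignment),
 * 0 if it should be extended to the end of the query.
 * local_score: best score of a local extension (hypothetical name)
 * end_score:   best score of an extension reaching the query end
 * pen_clip:    clipping penalty */
static int should_clip(int local_score, int end_score, int pen_clip)
{
    return end_score <= 0 || end_score <= local_score - pen_clip;
}
```

With pen_clip = 5, an end-to-end score of 48 against a local score of 50 is
kept unclipped, while an end-to-end score of 40 is clipped.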
2013-02-27 21:13:39 -05:00
Heng Li
c6b226d719
r292: fixed a very stupid bug on CLI
...
I was thinking 0x10 or 16, but wrote 0x16...
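
To see why the mix-up matters, assume the constant is used as a bit mask
(the commit does not say exactly where it appears). The helper names below
are hypothetical.

```c
/* 0x10 is decimal 16 and sets a single bit; 0x16 is decimal 22 and sets
 * three bits (0x10 | 0x04 | 0x02). */
static int set_intended(int flag)
{
    return flag | 0x10;
}

static int set_mistyped(int flag)
{
    return flag | 0x16;
}
```

The mistyped version silently turns on the 0x04 and 0x02 bits in addition to
the intended 0x10 bit.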
2013-02-26 12:49:48 -05:00
Heng Li
20aa848b3c
r279: for PE mapq, consider the number of pairs
...
If there are a lot of proper pairs, it is more likely that the best pair is
wrong.
2013-02-25 13:00:35 -05:00
Heng Li
9957e04590
r278: don't perform too many mate-sw
2013-02-25 11:56:02 -05:00
Heng Li
5ead86acd3
optionally mark split hit as secondary
2013-02-25 11:18:35 -05:00
Heng Li
85775c3384
output multiple hits
2013-02-24 13:23:43 -05:00
Heng Li
6bdccf2a8a
added a bit of documentation
2013-02-24 13:09:29 -05:00
Heng Li
ee59a13109
simplified bwamem.h
...
Hide mem_seed_t and mem_chain_t. Don't expose unnecessary routines.
2013-02-24 12:17:29 -05:00
Heng Li
e613195e17
moved some common code to bwa.{c,h}
2013-02-23 15:30:46 -05:00
Heng Li
17c123d65a
print paired-end SAM
2013-02-22 16:38:48 -05:00
Heng Li
a578688fa8
generate multiple alignments from one chain
2013-02-21 14:58:51 -05:00
Heng Li
54da54ffd4
extend more seeds (and thus slower...)
2013-02-21 12:52:00 -05:00
Heng Li
5626fe29b7
Well, at least output sth
2013-02-20 19:11:44 -05:00
Heng Li
688872fb1b
code backup
2013-02-19 00:50:39 -05:00
Heng Li
66585b7982
code backup
2013-02-18 16:33:06 -05:00
Heng Li
df1ff2b36e
better and proper way to infer orientation
2013-02-14 12:59:32 -05:00
Heng Li
604e3d8da1
code backup; to upgrade ksw.{c,h}
2013-02-12 16:15:26 -05:00
Heng Li
cd0969332f
keep track of the "parent" of a secondary
2013-02-12 15:52:23 -05:00
Heng Li
22b79b3475
mark primary, instead of dropping secondary
2013-02-12 15:34:44 -05:00
Heng Li
95d18449b3
merge bseq.{h,c} to utils.{h,c}
...
I do not like many small files.
2013-02-12 10:36:15 -05:00
Heng Li
99907c98fb
separated and improved SAM printing code
...
This is for the PE mode. The routines may also be useful for bwa-sw, but
probably I won't change the old code.
2013-02-11 15:29:03 -05:00
Heng Li
59eaf650ac
code backup
2013-02-11 10:59:38 -05:00
Heng Li
829664d6b5
missing identical hits; improved sub_n
2013-02-08 17:55:35 -05:00
Heng Li
b2c7148dc9
consider the number of suboptimal hits
2013-02-08 17:20:44 -05:00
Heng Li
39607065e0
allow more seeds to be seen (thus slower..)
2013-02-08 16:56:28 -05:00
Heng Li
fdb0a7405f
better dealing with microrepeat
2013-02-08 14:46:57 -05:00
Heng Li
1bf1a674a8
minor improvement to mapQ
2013-02-08 13:43:15 -05:00
Heng Li
bfeb37c4de
code backup
2013-02-07 13:29:01 -05:00
Heng Li
5dc398cdef
start to write CLI
2013-02-07 13:13:43 -05:00
Heng Li
5a0b32bfd2
updated to the latest kseq.h
2013-02-06 14:38:40 -05:00
Heng Li
a9292d674d
a bit of code cleanup
2013-02-06 13:59:32 -05:00
Heng Li
e65b2096f7
removed useless members
2013-02-06 12:25:49 -05:00
Heng Li
a61288c768
separate CIGAR generation
2013-02-05 21:49:19 -05:00
Heng Li
d6a73c9171
chain filtering apparently working
2013-02-05 00:17:20 -05:00
Heng Li
9d0cdb2d3c
unfinished chain filter
2013-02-04 17:23:06 -05:00
Heng Li
f27bd18f20
check if every seed is included; not used for now
2013-02-04 15:09:47 -05:00
Heng Li
5bfa45a69b
write the mem_aln_t struct
2013-02-04 15:02:56 -05:00
Heng Li
ba18db1a9f
sw extension works for the simplest case
2013-02-04 12:37:38 -05:00
Heng Li
d25a87cc50
code backup
2013-02-02 15:14:24 -05:00
Heng Li
00e5302219
routine to get subsequence from 2-bit pac
2013-02-01 16:39:50 -05:00
Heng Li
f8f3b7577a
code cleanup; added a missing file
2013-02-01 14:38:44 -05:00
Heng Li
620ad6e5b9
reseed long SMEMs
2013-02-01 14:20:38 -05:00
Heng Li
8977737460
basic chaining working
...
Definitely suboptimal in a lot of corner cases...
2013-01-31 16:26:05 -05:00