Heng Li
47952b6f3f
drop an unnecessary member from mem_aln_t
2013-03-11 21:35:32 -04:00
Heng Li
8f0d439913
prepare to replace the SAM printing code
...
This move is risky because SAM printing is very complex, but it will pay off in
the long run. The planned change will reduce redundancy, improve clarity
and, most importantly, make it much easier to output multiple primary hits in an
optional tag.
2013-03-11 21:25:17 -04:00
Heng Li
5fbd454682
r332: added output threshold
...
Otherwise there are far too many short hits
2013-03-05 22:49:38 -05:00
Heng Li
efd9769b07
r324: a little code cleanup
...
The changes after r317 aim to improve the performance and accuracy for very
long query alignment. The short-read alignment should not be affected. The
changes include:
1) Z-dropoff. This is a variant of blast's X-dropoff. I originally thought this
heuristic only improves speed, but now I realize it also reduces poor
alignments that have long good flanking alignments. The difference from blast's
X-dropoff is that Z-dropoff allows big gaps, but X-dropoff does not.
2) Band width doubling. When band width is too small, we will get a poor
alignment in the middle. Sometimes such alignments cannot be fully excluded
with Z-dropoff. Band width doubling is an alternative heuristic. It is based
on the observation that the existence of a high score close to the band
boundary possibly implies inadequate band width. When we see such a signal,
we double the band width.
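
The two heuristics above can be sketched in C as follows. This is only an
illustration under stated assumptions, not the code in this commit: the
function names, the parameter names, the gap-extension discount in
zdrop_stop() and the "outer quarter of the band" trigger in next_band_width()
are all hypothetical choices made for the sketch.

```c
#include <stdlib.h>

/* Plain X-dropoff: stop extending once the best score m in the current row
 * has fallen more than xdrop below the best score max seen so far. */
static int xdrop_stop(int max, int m, int xdrop)
{
    return max - m > xdrop;
}

/* Z-dropoff sketch (assumed form): before comparing the drop with zdrop,
 * subtract the gap-extension cost of the diagonal shift between the best
 * cell (max_i, max_j) and the current row's best cell (i, j). A drop that
 * one long gap can explain therefore does not stop the extension. */
static int zdrop_stop(int max, int max_i, int max_j,
                      int m, int i, int j, int gap_ext, int zdrop)
{
    int diag_shift = abs((i - max_i) - (j - max_j));
    return max - m - diag_shift * gap_ext > zdrop;
}

/* Band width doubling sketch: max_off is the largest distance from the main
 * diagonal at which a high score was seen. If it lies in the outer quarter
 * of the band (an assumed threshold), the band was possibly too narrow, so
 * double it, capped at max_w. Otherwise keep the current width. */
static int next_band_width(int w, int max_off, int max_w)
{
    if (max_off * 4 >= w * 3 && w < max_w) {
        w *= 2;
        if (w > max_w) w = max_w;
    }
    return w;
}
```

With a 40-base gap between the best cell and the current one, xdrop_stop()
gives up on a 60-point drop while zdrop_stop() keeps extending, which is the
"allows big gaps" difference described in 1).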
2013-03-05 00:57:16 -05:00
Heng Li
e0991d6a45
r323: added Z-dropoff, a variant of blast's X-drop
2013-03-05 00:34:33 -05:00
Heng Li
59bc9341f6
code backup; more changes coming later
2013-03-04 17:29:07 -05:00
Heng Li
35fb7f9fdf
r315: move kopen.o out of libbwa.a
2013-03-01 11:47:51 -05:00
Heng Li
3e4a178e08
r314: cleanup bwamem API
...
Don't modify input sequences; more documentation
2013-03-01 11:14:51 -05:00
Heng Li
6a4d8c79d8
r309: bugfix - soft clipping missing in example.c
2013-02-27 22:45:18 -05:00
Heng Li
df7c3f0000
r308: added a new API to convert region to CIGAR
...
and an example program demonstrating how to do single-end alignment in <50
lines of C code.
2013-02-27 22:28:29 -05:00
Heng Li
4bb0bdddca
r306: introduce clipping penalty
...
More clipping leads to more severe reference bias. We should not clip the
alignment unless necessary.
2013-02-27 21:13:39 -05:00
Heng Li
c6b226d719
r292: fixed a very stupid bug in the CLI
...
I was thinking 0x10 or 16, but wrote 0x16...
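
A minimal sketch of why the typo matters when the value is used as a bit
flag. The names FLAG_INTENDED, FLAG_TYPO and has_flag() are hypothetical and
do not come from the bwa source; only the two constants are from the message
above.

```c
/* 0x10 is a single bit (decimal 16); 0x16 is decimal 22 and has three
 * bits set: 0x10 | 0x4 | 0x2. */
enum { FLAG_INTENDED = 0x10, FLAG_TYPO = 0x16 };

/* Returns nonzero if any bit of mask is set in flags. */
static int has_flag(int flags, int mask)
{
    return (flags & mask) != 0;
}
```

Testing against the mistyped mask also fires for the unrelated bits 0x4 and
0x2, so the option can appear to be set when it was never requested.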
2013-02-26 12:49:48 -05:00
Heng Li
20aa848b3c
r279: for PE mapq, consider the number of pairs
...
If there are a lot of proper pairs, it is more likely that the best pair is
wrong.
2013-02-25 13:00:35 -05:00
Heng Li
9957e04590
r278: don't perform too many mate-sw
2013-02-25 11:56:02 -05:00
Heng Li
5ead86acd3
optionally mark split hit as secondary
2013-02-25 11:18:35 -05:00
Heng Li
85775c3384
output multiple hits
2013-02-24 13:23:43 -05:00
Heng Li
6bdccf2a8a
added a bit of documentation
2013-02-24 13:09:29 -05:00
Heng Li
ee59a13109
simplified bwamem.h
...
Hide mem_seed_t and mem_chain_t. Don't expose unnecessary routines.
2013-02-24 12:17:29 -05:00
Heng Li
e613195e17
moved some common code to bwa.{c,h}
2013-02-23 15:30:46 -05:00
Heng Li
17c123d65a
print paired-end SAM
2013-02-22 16:38:48 -05:00
Heng Li
a578688fa8
generate multiple alignments from one chain
2013-02-21 14:58:51 -05:00
Heng Li
54da54ffd4
extend more seeds (and thus slower...)
2013-02-21 12:52:00 -05:00
Heng Li
5626fe29b7
Well, at least output something
2013-02-20 19:11:44 -05:00
Heng Li
688872fb1b
code backup
2013-02-19 00:50:39 -05:00
Heng Li
66585b7982
code backup
2013-02-18 16:33:06 -05:00
Heng Li
df1ff2b36e
better and proper way to infer orientation
2013-02-14 12:59:32 -05:00
Heng Li
604e3d8da1
code backup; to upgrade ksw.{c,h}
2013-02-12 16:15:26 -05:00
Heng Li
cd0969332f
keep track of the "parent" of a secondary
2013-02-12 15:52:23 -05:00
Heng Li
22b79b3475
mark primary, instead of dropping secondary
2013-02-12 15:34:44 -05:00
Heng Li
95d18449b3
merge bseq.{h,c} to utils.{h,c}
...
I do not like many small files.
2013-02-12 10:36:15 -05:00
Heng Li
99907c98fb
separated and improved SAM printing code
...
This is for the PE mode. The routines may also be useful for bwa-sw, but
probably I won't change the old code.
2013-02-11 15:29:03 -05:00
Heng Li
59eaf650ac
code backup
2013-02-11 10:59:38 -05:00
Heng Li
829664d6b5
missing identical hits; improved sub_n
2013-02-08 17:55:35 -05:00
Heng Li
b2c7148dc9
consider the number of suboptimal hits
2013-02-08 17:20:44 -05:00
Heng Li
39607065e0
allow more seeds to be seen (thus slower..)
2013-02-08 16:56:28 -05:00
Heng Li
fdb0a7405f
better handling of microrepeats
2013-02-08 14:46:57 -05:00
Heng Li
1bf1a674a8
minor improvement to mapQ
2013-02-08 13:43:15 -05:00
Heng Li
bfeb37c4de
code backup
2013-02-07 13:29:01 -05:00
Heng Li
5dc398cdef
start to write CLI
2013-02-07 13:13:43 -05:00
Heng Li
5a0b32bfd2
updated to the latest kseq.h
2013-02-06 14:38:40 -05:00
Heng Li
a9292d674d
a bit of code cleanup
2013-02-06 13:59:32 -05:00
Heng Li
e65b2096f7
removed useless members
2013-02-06 12:25:49 -05:00
Heng Li
a61288c768
separate CIGAR generation
2013-02-05 21:49:19 -05:00
Heng Li
d6a73c9171
chain filtering apparently working
2013-02-05 00:17:20 -05:00
Heng Li
9d0cdb2d3c
unfinished chain filter
2013-02-04 17:23:06 -05:00
Heng Li
f27bd18f20
check if every seed is included; not used for now
2013-02-04 15:09:47 -05:00
Heng Li
5bfa45a69b
write the mem_aln_t struct
2013-02-04 15:02:56 -05:00
Heng Li
ba18db1a9f
sw extension works for the simplest case
2013-02-04 12:37:38 -05:00
Heng Li
d25a87cc50
code backup
2013-02-02 15:14:24 -05:00
Heng Li
00e5302219
routine to get subsequence from 2-bit pac
2013-02-01 16:39:50 -05:00
Heng Li
f8f3b7577a
code cleanup; added a missing file
2013-02-01 14:38:44 -05:00
Heng Li
620ad6e5b9
reseed long SMEMs
2013-02-01 14:20:38 -05:00
Heng Li
8977737460
basic chaining working
...
Definitely suboptimal in a lot of corner cases...
2013-01-31 16:26:05 -05:00
Heng Li
6c19c9640c
code backup
2013-01-31 15:55:22 -05:00
Heng Li
91debf412b
move smem iterators to bwamem.{c,h}
2013-01-31 13:59:48 -05:00