When backtracking, bwa-short does not keep the detailed alignment or the exact
start and end positions. To find the boundaries and the CIGAR, the old code did
a global alignment with a small end-gap penalty. It then handled a lot of
special cases to derive the position and CIGAR, which were still not always
right. It was a mess.

As the new ksw.{c,h} does not support a different end-gap penalty, the old
strategy no longer works. But we get something better: the new code finds the
boundaries with ksw_extend(). It is cleaner and gives a more accurate CIGAR in
most cases.
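
To illustrate the idea of finding a boundary by extension, here is a minimal
sketch. It is not the ksw_extend() implementation and does not use its
signature: the names extend_sketch() and ext_result_t are hypothetical, and a
linear gap penalty stands in for affine gaps to keep the example short. The
alignment is anchored at the start of both sequences with an initial score h0,
the end is left free, and the cell with the highest score gives the boundary.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for this sketch; not part of ksw.h. */
typedef struct {
    int score; /* best extension score found */
    int qle;   /* query bases consumed at the best cell */
    int tle;   /* target bases consumed at the best cell */
} ext_result_t;

/*
 * Extension alignment: anchored at the start, free at the end.
 * match is added for equal bases; mismatch and gap are subtracted.
 * A cell with score 0 is treated as dead and cannot be extended,
 * so the alignment never restarts away from the anchor.
 */
ext_result_t extend_sketch(const char *query, const char *target,
                           int h0, int match, int mismatch, int gap)
{
    int qlen = (int)strlen(query), tlen = (int)strlen(target);
    ext_result_t best = { h0, 0, 0 };
    int *prev = malloc((size_t)(qlen + 1) * sizeof(int));
    int *curr = malloc((size_t)(qlen + 1) * sizeof(int));
    int i, j;

    if (prev == NULL || curr == NULL) { free(prev); free(curr); return best; }

    /* Row 0: only query bases consumed, i.e. a leading gap. */
    for (j = 0; j <= qlen; ++j) {
        int s = h0 - j * gap;
        prev[j] = s > 0 ? s : 0;
    }
    for (i = 1; i <= tlen; ++i) {
        int *tmp;
        int s = h0 - i * gap; /* column 0: only target bases consumed */
        curr[0] = s > 0 ? s : 0;
        for (j = 1; j <= qlen; ++j) {
            int sub = target[i - 1] == query[j - 1] ? match : -mismatch;
            int h = prev[j - 1] > 0 ? prev[j - 1] + sub : 0; /* diagonal */
            int u = prev[j] - gap;     /* gap that consumes a target base */
            int l = curr[j - 1] - gap; /* gap that consumes a query base */
            if (u > h) h = u;
            if (l > h) h = l;
            if (h < 0) h = 0;
            curr[j] = h;
            /* Strict '>' keeps the first (shortest) best-scoring boundary. */
            if (h > best.score) { best.score = h; best.qle = j; best.tle = i; }
        }
        tmp = prev; prev = curr; curr = tmp;
    }
    free(prev);
    free(curr);
    return best;
}
```

With this sketch, the end boundary is simply (qle, tle) at the maximum score,
so no end-gap penalty and no special-case handling are needed. If no cell
beats h0, the extension is empty and both lengths stay 0.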

Released packages can be downloaded from SourceForge.net:

  http://sourceforge.net/projects/bio-bwa/files/

Introduction and FAQ are available at:

  http://bio-bwa.sourceforge.net

Manual page at:

  http://bio-bwa.sourceforge.net/bwa.shtml

Mailing list:

  bio-bwa-help@lists.sourceforge.net

To sign up:

  http://sourceforge.net/mail/?group_id=276243

Publications (Open Access):

  http://www.ncbi.nlm.nih.gov/pubmed/20080505
  http://www.ncbi.nlm.nih.gov/pubmed/19451168

Incomplete list of citations (via HubMed.org):

  http://www.hubmed.org/references.cgi?uids=20080505
  http://www.hubmed.org/references.cgi?uids=19451168

Related projects:

  http://pbwa.sourceforge.net/
  http://www.many-core.group.cam.ac.uk/projects/lam.shtml
  http://biodoop-seal.sourceforge.net/
  http://gitorious.org/bwa-cuda