Commit Graph

131 Commits (08764215c6615ea52894e1ce9cd10d2a2faa37a6)

Author SHA1 Message Date
Heng Li 4219e58623 r423: bugfix - SE hits not random 2013-11-23 09:36:26 -05:00
Heng Li c564653b40 r416: removed a line of debugging code 2013-09-12 10:41:43 -04:00
Heng Li 623da055e1 alternative way to estimate mapQ
the old mapQ estimate is too conservative
2013-09-06 12:31:47 -04:00
Heng Li ed78df9184 Merge branch 'master' into clip2 2013-08-28 16:00:34 -04:00
Heng Li 3b84c03c1e r406: allow to use diff clipping penalties
for 5'-end or for 3'-end
2013-08-28 15:59:05 -04:00
John Marshall 128ffc089b Complain when bwa mem is given too many filenames
Reads in extra .fq filenames beyond "bwa mem index one.fq two.fq"
will not be aligned, so complain about such invalid usage instead.
2013-06-14 14:00:24 +01:00
Heng Li 9735d7a31a conform to the latest (unpublished) SAM spec
for chimeric alignments
2013-05-22 19:45:16 -04:00
Rob Davies 0aa7e0a402 Ensure exit status of 1 if given invalid options or index files are not found.
Added missing default cases in option scanning.
Ensure exit value is 1 if bwa_idx_load or bwa_idx_infer_prefix fail.
These changes extend the previous one, which only fixed the mem aligner.
2013-04-29 13:58:28 +01:00
Rob Davies e88529687f Merge branch 'master' into master_fixes. Merged up to r389.
Conflicts:
	bwamem.c
	kopen.c
2013-04-29 12:09:30 +01:00
Heng Li 1a2bd2cf91 r389: return non-zero upon errors 2013-04-27 10:08:01 -04:00
Heng Li 19cb7cd7ed r388: cleanup mem_process_seqs() interface
Print output outside the function and allow to feed insert size distribution.
2013-04-26 12:31:18 -04:00
Rob Davies 90ecd344ba Merge branch 'master' into master_fixes. Merged up to master r375.
Conflicts:
	bwt.c
2013-04-11 11:15:39 +01:00
Heng Li 53bb846407 r373: optionally disable mate rescue 2013-04-09 16:13:55 -04:00
Rob Davies aabd990e8f Merge branch 'master' into master_fixes
Conflicts:
	Makefile
	bwape.c
	bwase.c
	bwtsw2_aux.c
	stdaln.c
2013-03-08 16:46:45 +00:00
Heng Li b0a76884e8 r340: feature freeze; updated the manpage
I will stop adding new features to bwa and prepare for the next release. I will
briefly evaluate the variant calling accuracy before the release.
2013-03-07 11:51:23 -05:00
Heng Li 3e3236dfc4 r337: mem - always read even number of reads
In the old code, we may read odd number of reads from an interleaved fastq.
2013-03-07 11:00:15 -05:00
Heng Li 5fbd454682 r332: added output threshold
Otherwise there are far too many short hits
2013-03-05 22:49:38 -05:00
Heng Li 07921659cf move mem_fill_scmat() to bwa.{h,c} 2013-03-05 09:38:12 -05:00
Rob Davies 8a078cc16d Merge branch 'master' into master_fixes
Conflicts:
	bntseq.c
	bwamem.c
2013-03-05 10:21:07 +00:00
Heng Li efd9769b07 r324: a little code cleanup
The changes after r317 aim to improve the performance and accuracy for very
long query alignment. The short-read alignment should not be affected. The
changes include:

1) Z-dropoff. This is a variant of blast's X-dropoff. I originally thought this
   heuristic only improves speed, but now I realize it also reduces poor
   alignment with long good flanking alignments. The difference from blast's
   X-dropoff is that Z-dropoff allows big gaps, but X-dropoff does not.

2) Band width doubling. When band width is too small, we will get a poor
   alignment in the middle. Sometimes such alignments cannot be fully excluded
   with Z-dropoff. Band width doubling is an alternative heuristic. It is based
   on the observation that the existence of a close-to-boundary high score
   possibly implies inadequate band width. When we see such a signal, we double
   the band width.
2013-03-05 00:57:16 -05:00
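
The two heuristics described in r324 can be illustrated with a minimal sketch. It is written in Python for brevity and is not the bwa code, which is C. All function names, parameters, and the 0.75 "near the boundary" threshold are assumptions made for illustration. The Z-dropoff test shown here subtracts the extension cost of an implied gap before comparing against the threshold, which is one plausible reading of "allows big gaps"; `align` is a hypothetical callback standing in for a banded extension routine.

```python
def x_drop_stop(best, cur, xdrop):
    """Plain X-dropoff: stop once the score falls more than xdrop below the best."""
    return best - cur > xdrop


def z_drop_stop(best, best_pos, cur, cur_pos, gap_ext, zdrop):
    """Z-dropoff sketch: forgive the extension cost of one long gap.

    best_pos and cur_pos are (query, target) coordinates. The difference
    between the two coordinate offsets approximates the gap length.
    """
    dq = cur_pos[0] - best_pos[0]
    dt = cur_pos[1] - best_pos[1]
    gap_len = abs(dq - dt)
    return best - cur - gap_len * gap_ext > zdrop


def extend_with_band_doubling(align, band, max_tries=3, near_edge=0.75):
    """Re-run a banded extension, doubling the band while the best-scoring
    cell sits close to the band boundary. max_tries must be at least 1.

    align(band) is a hypothetical callback returning (score, max_off), where
    max_off is the distance of the best-scoring cell from the main diagonal.
    """
    for attempt in range(max_tries):
        score, max_off = align(band)
        if max_off < near_edge * band or attempt == max_tries - 1:
            return score, band
        band *= 2


# A 30-base gap (open 6, extend 1) after a best score of 100 at (50, 50).
cur_score = 100 - 6 - 30 * 1
x_stops = x_drop_stop(100, cur_score, 30)
z_stops = z_drop_stop(100, (50, 50), cur_score, (50, 80), 1, 30)


def toy_align(band):
    # Pretend a band of 100 is too narrow and a band of 200 is adequate.
    return (40, band) if band < 200 else (95, 120)


result = extend_with_band_doubling(toy_align, 100)
```

In this toy case the X-dropoff test stops the extension at the gap while the Z-dropoff test lets it continue, and the band is doubled once because the first attempt's best cell lies on the band boundary.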
Heng Li e0991d6a45 r323: added Z-dropoff, a variant of blast's X-drop 2013-03-05 00:34:33 -05:00
Rob Davies 6beab5f765 Merge branch 'master' into master_fixes
Merge changes to commit c5434ac (0.7.0 release)

Conflicts:
	Makefile
	bwamem.c
2013-03-01 10:22:49 +00:00
Rob Davies 3d33ab063e Merge branch 'master' into master_fixes
Merged to master version b621d3a

Conflicts:
	Makefile
	bntseq.c
	bwa.c
	bwase.c
	bwaseqio.c
	bwtaln.c
	bwtindex.c
	bwtio.c
	bwtmisc.c
	bwtsw2_aux.c
	cs2nt.c
	fastmap.c
	khash.h
	kseq.h
	ksw.c
	kvec.h
	simple_dp.c
	utils.c
	utils.h
2013-03-01 09:37:46 +00:00
Heng Li 4bb0bdddca r306: introduce clipping penalty
More clipping leads to more severe reference bias. We should not clip the
alignment unless necessary.
2013-02-27 21:13:39 -05:00
Heng Li e620f0ff4e r302: updated the manpage 2013-02-27 13:16:22 -05:00
Heng Li 98787f0ae0 r295: generate NM 2013-02-26 13:36:01 -05:00
Heng Li 9957e04590 r278: don't perform too many mate-sw 2013-02-25 11:56:02 -05:00
Heng Li 5092211d75 controllable scoring matrix 2013-02-25 11:24:21 -05:00
Heng Li 5ead86acd3 optionally mark split hit as secondary 2013-02-25 11:18:35 -05:00
Heng Li 4dc982a3c7 support interleaved fastq 2013-02-25 00:13:32 -05:00
Heng Li 0b4a40dc25 updated revision number; to merge into master 2013-02-24 13:34:20 -05:00
Heng Li 85775c3384 output multiple hits 2013-02-24 13:23:43 -05:00
Heng Li 6bdccf2a8a added a bit documentation 2013-02-24 13:09:29 -05:00
Heng Li 6e7903e9f3 added kopen support 2013-02-23 17:09:23 -05:00
Heng Li b4c38bcc1c append fasta/q comment 2013-02-23 16:57:34 -05:00
Heng Li ee4540c394 support read group in bwa-mem 2013-02-23 16:41:44 -05:00
Heng Li 67543f19a1 code refactoring 2013-02-23 15:55:55 -05:00
Heng Li 3c330d5049 for another round of code cleanup 2013-02-23 15:12:26 -05:00
Heng Li d460f2ec9e bugfix in multi-threaded bwa-mem 2013-02-23 14:48:54 -05:00
Heng Li f122fad562 minor code clean up
bwtio.c is merged to bwt.c
2013-02-22 17:09:40 -05:00
Heng Li 54da54ffd4 extend more seeds (and thus slower...) 2013-02-21 12:52:00 -05:00
Heng Li 66585b7982 code backup 2013-02-18 16:33:06 -05:00
Heng Li 95d18449b3 merge bseq.{h,c} to utils.{h,c}
I do not like many small files.
2013-02-12 10:36:15 -05:00
Heng Li 987d4b4205 fixed a stupid bug in fastq reading 2013-02-11 11:27:35 -05:00
Heng Li 59eaf650ac code backup 2013-02-11 10:59:38 -05:00
Heng Li cb55617f50 added a new line 2013-02-08 22:12:18 -05:00
Heng Li 95a79afe71 command-line prompt 2013-02-08 22:11:44 -05:00
Heng Li 39607065e0 allow more seeds to be seen (thus slower..) 2013-02-08 16:56:28 -05:00
Heng Li cd6bd524d4 discard internal seeds shorter than half 2013-02-07 19:50:37 -05:00
Heng Li ff3fea115c write soft clip; added debugging code 2013-02-07 16:27:11 -05:00
Heng Li 1fd51fc3f7 code backup 2013-02-07 14:36:18 -05:00
Heng Li 5dc398cdef start to write CLI 2013-02-07 13:13:43 -05:00
Heng Li 5a0b32bfd2 updated to the latest kseq.h 2013-02-06 14:38:40 -05:00
Heng Li a9292d674d a bit code cleanup 2013-02-06 13:59:32 -05:00
Heng Li a61288c768 separate CIGAR generation 2013-02-05 21:49:19 -05:00
Heng Li 1e16f3e701 calling ksw_global(); ksw_extend() is buggy! 2013-02-05 17:13:12 -05:00
Heng Li d6a73c9171 chain filtering apparently working 2013-02-05 00:17:20 -05:00
Heng Li 9d0cdb2d3c unfinished chain filter 2013-02-04 17:23:06 -05:00
Heng Li 788e9d1e3d fixed a couple of leaks; buggy atm 2013-02-04 15:40:26 -05:00
Heng Li ba18db1a9f sw extension works for the simplest case 2013-02-04 12:37:38 -05:00
Heng Li d25a87cc50 code backup 2013-02-02 15:14:24 -05:00
Heng Li 00e5302219 routine to get subsequence from 2-bit pac 2013-02-01 16:39:50 -05:00
Heng Li f8f3b7577a code cleanup; added a missing file 2013-02-01 14:38:44 -05:00
Heng Li 620ad6e5b9 reseed long SMEMs 2013-02-01 14:20:38 -05:00
Heng Li 8977737460 basic chaining working
Definitely suboptimal in a lot of corner cases...
2013-01-31 16:26:05 -05:00
Heng Li 91debf412b move smem iterators to bwamem.{c,h} 2013-01-31 13:59:48 -05:00
Heng Li 5a4a0c4173 a bit refactoring for further changes 2013-01-31 12:34:05 -05:00
Heng Li 6641788d38 preparation for further changes 2013-01-31 11:42:31 -05:00
Rob Davies 4f4e998d7f Added wrappers for fputc and fputs; more efficient sequence printing
Added wrappers err_fputc and err_fputs to catch failures in fputc and fputs.
Macros err_putchar and err_puts call the new wrappers and can be used in
place of putchar and puts.

To avoid having to make millions of function calls when printing out
sequences, the code to print them in bwa_print_sam1 using putchar has
been replaced by a new version in bwa_print_seq that puts the sequence
into a buffer and then outputs the lot with err_fwrite.  In testing, the
new code was slightly faster than the old version, with the added benefit
that it will stop promptly if IO problems are detected.
2013-01-09 14:43:36 +00:00
Rob Davies 55f1b36534 New wrapper for gzclose; added err_fflush calls and made it call fsync too.
Added a new utils.c wrapper err_gzclose and changed gzclose calls to use it.

Put in some more err_fflush calls before files being written are closed.

Made err_fflush call fsync.  This is useful for remote filesystems where
errors may not be reported on fflush or fclose as problems at the server
end may only be detected after they have returned.  If bwa is being used
only to write to local filesystems, calling fsync is not really necessary.
To disable it, comment out #define FSYNC_ON_FLUSH in utils.c.
2013-01-03 16:57:37 +00:00
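
The flush-then-fsync idea from this commit can be sketched as follows. This is a Python illustration, not the C wrapper in utils.c; `flush_and_sync` and the module-level `FSYNC_ON_FLUSH` flag are hypothetical stand-ins for `err_fflush` and the C macro of the same name.

```python
import os
import tempfile

FSYNC_ON_FLUSH = True  # hypothetical stand-in for the C macro; set False to skip fsync


def flush_and_sync(stream):
    """Flush the user-space buffer, then ask the OS to commit data to storage.

    Either call raises OSError on failure, so a write problem surfaces here
    instead of going unnoticed when the file is closed.
    """
    stream.flush()
    if FSYNC_ON_FLUSH:
        os.fsync(stream.fileno())


# Usage: write one record, then force it out before closing the file.
with tempfile.NamedTemporaryFile("w", delete=False) as out:
    out.write("@SQ\tSN:chr1\tLN:1000\n")
    flush_and_sync(out)
    path = out.name

with open(path) as check:
    written = check.read()
os.remove(path)
```

As the commit message notes, the extra sync mainly matters on remote filesystems; on local disks it costs time without catching many additional errors.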
Rob Davies b081ac9b8b Use wrapper functions to catch system errors
Use the wrapper functions in utils.c plus a few extra bits of error
checking code to catch system errors and exit non-zero when they occur.
2012-12-16 10:34:57 +00:00
Heng Li bf65b6463a fastmap: optionally output the original query seq 2011-11-24 19:44:21 -05:00
Heng Li 150bfbdef4 fixed a deadlock; SMEM iterator 2011-11-24 19:15:14 -05:00
Heng Li 7babb54e4c drop smem based mapping algorithm
While we can compute smems very efficiently, there is still a long way to get
the alignment. On simulated data, this smem-based algorithm is 4X faster than
bwasw and twice as fast as bowtie2, but the accuracy is far lower than bwasw
and even lower than bowtie2 in the high-mapQ range. I am kind of sure that if
we continue to increase the mapping accuracy, the speed will approach that of bwasw,
if not slower.

Smem-based mapping algorithm is still interesting, but given that I am short of
time, I will not explore it further.
2011-10-27 10:56:09 -04:00
Heng Li 7467671c30 minor change 2011-10-25 21:39:38 -04:00
Heng Li e890b8ac2e preliminary code to generate fake sam 2011-10-25 19:45:55 -04:00
Heng Li 55059443bd print msg to stderr; output more in fastmap 2011-10-25 15:06:13 -04:00
Heng Li 4813257d4f remove debugging code 2011-10-25 12:38:33 -04:00
Heng Li f56edd07dd forward-backward search seems working 2011-10-25 12:31:36 -04:00
Heng Li 7626595e3a backup the current debugging code; more changes 2011-10-25 10:03:57 -04:00
Heng Li 22c2252e15 added bidirectional bwt; seems buggy 2011-10-25 00:22:28 -04:00