Commit Graph

160 Commits (856a0e0c0140050feae396d88c2a6b8efce09df1)

Author SHA1 Message Date
Heng Li a35a6c2580 updated maual page 2014-05-12 12:52:16 -04:00
Heng Li 39a6cd5bb0 r762: cleanup for the new release; unfinished
It will take to make the documentation ready.
2014-05-11 15:15:44 -04:00
Heng Li 6ac8dd5840 r754: added command msg for -h 2014-05-06 16:15:14 -04:00
Heng Li ce3c198245 r749: max_hits tunable on CMD; default to 5 2014-05-04 10:17:03 -04:00
Heng Li 11698fc4e5 r735: fixed a bug caused by merge 2014-04-30 13:12:43 -04:00
Heng Li 76bb49e01b r729: halved band width; doubled patch band width 2014-04-24 16:06:01 -04:00
Heng Li b92bbb47e5 Merge branch '0.7.7-softclip' into layout
Conflicts:
	Makefile
	bwamem.h
	fastmap.c
	main.c
2014-04-24 12:24:49 -04:00
Heng Li 8c12ec4a4b r725: optionally disable hard clipping
as is reqested by the cancer group
2014-04-24 11:56:43 -04:00
Heng Li 00a07f61bf r721: merge overlapping hits by default 2014-04-15 16:16:04 -04:00
Heng Li 4e22270eba r718: merge alnregs overlapping on both query/ref 2014-04-14 17:01:17 -04:00
Heng Li 6d4a6debdc r716: changed -x pbread 2014-04-14 16:04:29 -04:00
Heng Li bbcabfe342 r707: change params for pacbio-to-pacbio 2014-04-10 21:53:52 -04:00
Heng Li db58392e9b dev-469: fixed wrong command line prompt 2014-04-09 13:20:04 -04:00
Heng Li d766591c1e dev-468: fixed a segfault caused by NULL 2014-04-08 22:11:36 -04:00
Heng Li 99f6f9a0d1 dev-467: limit the max #chains to extend 2014-04-08 21:45:49 -04:00
Heng Li f12dfae772 dev-465: a new output format for read overlap
Also moved a few functions to bwamem_extra.c. File bwamem.c is becoming far too
long.
2014-04-08 16:29:36 -04:00
Heng Li b45aeb87e1 dev-464: preset for pacbio read2read aln 2014-04-08 11:40:54 -04:00
Heng Li 172ba83241 dev-463: added option -x to change multiple params
I hate to copy-paste long command line options.
2014-04-07 11:29:36 -04:00
Heng Li 114901b005 dev-r462: refined setting for PacBio; weight flt
The recommended setting in the last commit is wrong. If we can extend a random
seed hit to the full length, we will force the read aligned through break
points, which is wrong. The new setting is better but it may lead to a small
fraction of fragmented alignments.

In addition, I added a filter on the minimum chain weight and tied
min_HSP_score to this filter. It doubles the mapping speed.
2014-04-04 17:01:04 -04:00
Heng Li 41f720dfa7 dev-461: added a heuristic for PacBio data
See the comment above mem_test_chain_sw() for details.
2014-04-04 16:05:41 -04:00
Heng Li b3225581be dev-458: simplified the smem iterator
simpler but less powful.
2014-04-03 15:23:48 -04:00
Heng Li 3efb7c0e91 r455: release bwa-0.7.8 2014-03-31 15:27:23 -04:00
Heng Li 127c00cc96 dev-454: wording change in command line prompt 2014-03-31 12:03:27 -04:00
Heng Li b27bdf1ae0 dev-453: change of -A scales -TdBOELU
These paramemters are all proportional to -A.
2014-03-31 11:52:52 -04:00
Heng Li b7076d9023 dev-r452: allow to specify insert size at cmd
This is also very useful for debugging.
2014-03-31 11:21:03 -04:00
Heng Li 9ce50a4e5e dev-450: support diff ins/del penalties. NO TEST!! 2014-03-28 14:54:06 -04:00
Heng Li 2e9463ebf1 dev-r442: suppress exact full-length matches 2014-02-26 22:04:19 -05:00
Heng Li ce026a07fc r439: expose mem_opt_t::max_matesw 2014-02-19 13:10:33 -05:00
Heng Li 10cb6b0507 r428: allow to change the default chain_drop_ratio 2013-12-30 16:18:45 -05:00
Heng Li 4219e58623 r423: bugfix - SE hits not random 2013-11-23 09:36:26 -05:00
Heng Li c564653b40 r416: removed a line of debugging code 2013-09-12 10:41:43 -04:00
Heng Li 623da055e1 alternative way to estimate mapQ
the old mapQ estimate is too conservative
2013-09-06 12:31:47 -04:00
Heng Li ed78df9184 Merge branch 'master' into clip2 2013-08-28 16:00:34 -04:00
Heng Li 3b84c03c1e r406: allow to use diff clipping penalties
for 5'-end or for 3'-end
2013-08-28 15:59:05 -04:00
John Marshall 128ffc089b Complain when bwa mem is given too many filenames
Reads in extra .fq filenames beyond "bwa mem index one.fq two.fq"
will not be aligned, so complain about such invalid usage instead.
2013-06-14 14:00:24 +01:00
Heng Li 9735d7a31a conform to the latest (unpublished) SAM spec
for chimeric alignments
2013-05-22 19:45:16 -04:00
Rob Davies 0aa7e0a402 Ensure exit status of 1 if given invalid options or index files are not found.
Added missing default cases in option scanning.
Ensure exit value is 1 if bwa_idx_load or bwa_idx_infer_prefix fail.
These changes extend the previous one, which only fixed the mem aligner.
2013-04-29 13:58:28 +01:00
Rob Davies e88529687f Merge branch 'master' into master_fixes. Merged up to r389.
Conflicts:
	bwamem.c
	kopen.c
2013-04-29 12:09:30 +01:00
Heng Li 1a2bd2cf91 r389: return non-zero upon errors 2013-04-27 10:08:01 -04:00
Heng Li 19cb7cd7ed r388: cleanup mem_process_seqs() interface
Print output outside the function and allow to feed insert size distribution.
2013-04-26 12:31:18 -04:00
Rob Davies 90ecd344ba Merge branch 'master' into master_fixes. Merged up to master r375.
Conflicts:
	bwt.c
2013-04-11 11:15:39 +01:00
Heng Li 53bb846407 r373: optionally distable mate rescue 2013-04-09 16:13:55 -04:00
Rob Davies aabd990e8f Merge branch 'master' into master_fixes
Conflicts:
	Makefile
	bwape.c
	bwase.c
	bwtsw2_aux.c
	stdaln.c
2013-03-08 16:46:45 +00:00
Heng Li b0a76884e8 r340: feature freeze; updated the manpage
I will stop adding new features to bwa and prepare for the next release. I will
briefly evaluate the variant calling accuracy before the release.
2013-03-07 11:51:23 -05:00
Heng Li 3e3236dfc4 r337: mem - always read even number of reads
In the old code, we may read odd number of reads from an interleaved fastq.
2013-03-07 11:00:15 -05:00
Heng Li 5fbd454682 r332: added output threshold
Otherwise there are far too many short hits
2013-03-05 22:49:38 -05:00
Heng Li 07921659cf move mem_fill_scmat() to bwa.{h,c} 2013-03-05 09:38:12 -05:00
Rob Davies 8a078cc16d Merge branch 'master' into master_fixes
Conflicts:
	bntseq.c
	bwamem.c
2013-03-05 10:21:07 +00:00
Heng Li efd9769b07 r324: a little code cleanup
The changes after r317 aim to improve the performance and accuracy for very
long query alignment. The short-read alignment should not be affected. The
changes include:

1) Z-dropoff. This is a variant of blast's X-dropoff. I orginally thought this
   heuristic only improves speed, but now I realize it also reduces poor
   alignment with long good flanking alignments. The difference from blast's
   X-dropoff is that Z-dropoff allows big gaps, but X-dropoff does not.

2) Band width doubling. When band width is too small, we will get a poor
   alignment in the middle. Sometimes such alignments cannot be fully excluded
   with Z-dropoff. Band width doubling is an alternative heuristic. It is based
   on the observation that the existing of close-to-boundary high score
   possibly implies inadequate band width. When we see such a signal, we double
   the band width.
2013-03-05 00:57:16 -05:00
Heng Li e0991d6a45 r323: added Z-dropoff, a variant of blast's X-drop 2013-03-05 00:34:33 -05:00
Rob Davies 6beab5f765 Merge branch 'master' into master_fixes
Merge changes to commit c5434ac (0.7.0 release)

Conflicts:
	Makefile
	bwamem.c
2013-03-01 10:22:49 +00:00
Rob Davies 3d33ab063e Merge branch 'master' into master_fixes
Merged to master version b621d3a

Conflicts:
	Makefile
	bntseq.c
	bwa.c
	bwase.c
	bwaseqio.c
	bwtaln.c
	bwtindex.c
	bwtio.c
	bwtmisc.c
	bwtsw2_aux.c
	cs2nt.c
	fastmap.c
	khash.h
	kseq.h
	ksw.c
	kvec.h
	simple_dp.c
	utils.c
	utils.h
2013-03-01 09:37:46 +00:00
Heng Li 4bb0bdddca r306: introduce clipping penalty
More clipping leads to more severe reference bias. We should not clip the
alignment unless necessary.
2013-02-27 21:13:39 -05:00
Heng Li e620f0ff4e r302: updated the manpage 2013-02-27 13:16:22 -05:00
Heng Li 98787f0ae0 r295: generate NM 2013-02-26 13:36:01 -05:00
Heng Li 9957e04590 r278: don't perform too many mate-sw 2013-02-25 11:56:02 -05:00
Heng Li 5092211d75 controllable scoring matrix 2013-02-25 11:24:21 -05:00
Heng Li 5ead86acd3 optionally mark split hit as secondary 2013-02-25 11:18:35 -05:00
Heng Li 4dc982a3c7 support interleaved fastq 2013-02-25 00:13:32 -05:00
Heng Li 0b4a40dc25 updated revision number; to merge into master 2013-02-24 13:34:20 -05:00
Heng Li 85775c3384 output multiple hits 2013-02-24 13:23:43 -05:00
Heng Li 6bdccf2a8a added a bit documentation 2013-02-24 13:09:29 -05:00
Heng Li 6e7903e9f3 added kopen support 2013-02-23 17:09:23 -05:00
Heng Li b4c38bcc1c append fasta/q comment 2013-02-23 16:57:34 -05:00
Heng Li ee4540c394 support read group in bwa-mem 2013-02-23 16:41:44 -05:00
Heng Li 67543f19a1 code refactoring 2013-02-23 15:55:55 -05:00
Heng Li 3c330d5049 for another round of code cleanup 2013-02-23 15:12:26 -05:00
Heng Li d460f2ec9e bugfix in multi-threaded bwa-mem 2013-02-23 14:48:54 -05:00
Heng Li f122fad562 minor code clean up
bwtio.c is merged to bwt.c
2013-02-22 17:09:40 -05:00
Heng Li 54da54ffd4 extend more seeds (and thus slower...) 2013-02-21 12:52:00 -05:00
Heng Li 66585b7982 code backup 2013-02-18 16:33:06 -05:00
Heng Li 95d18449b3 merge bseq.{h,c} to utils.{h,c}
I do not like many small files.
2013-02-12 10:36:15 -05:00
Heng Li 987d4b4205 fixed a stupid bug in fastq reading 2013-02-11 11:27:35 -05:00
Heng Li 59eaf650ac code backup 2013-02-11 10:59:38 -05:00
Heng Li cb55617f50 added a new line 2013-02-08 22:12:18 -05:00
Heng Li 95a79afe71 command-line prompt 2013-02-08 22:11:44 -05:00
Heng Li 39607065e0 allow more seeds to be seen (thus slower..) 2013-02-08 16:56:28 -05:00
Heng Li cd6bd524d4 discard internal seeds shorter than half 2013-02-07 19:50:37 -05:00
Heng Li ff3fea115c write soft clip; added debugging code 2013-02-07 16:27:11 -05:00
Heng Li 1fd51fc3f7 code backup 2013-02-07 14:36:18 -05:00
Heng Li 5dc398cdef start to write CLI 2013-02-07 13:13:43 -05:00
Heng Li 5a0b32bfd2 updated to the latest kseq.h 2013-02-06 14:38:40 -05:00
Heng Li a9292d674d a bit code cleanup 2013-02-06 13:59:32 -05:00
Heng Li a61288c768 separate CIGAR generation 2013-02-05 21:49:19 -05:00
Heng Li 1e16f3e701 calling ksw_global(); ksw_extend() is buggy! 2013-02-05 17:13:12 -05:00
Heng Li d6a73c9171 chain filtering apparently working 2013-02-05 00:17:20 -05:00
Heng Li 9d0cdb2d3c unfinished chain filter 2013-02-04 17:23:06 -05:00
Heng Li 788e9d1e3d fixed a couple of leaks; buggy atm 2013-02-04 15:40:26 -05:00
Heng Li ba18db1a9f sw extension works for the simplest case 2013-02-04 12:37:38 -05:00
Heng Li d25a87cc50 code backup 2013-02-02 15:14:24 -05:00
Heng Li 00e5302219 routine to get subsequence from 2-bit pac 2013-02-01 16:39:50 -05:00
Heng Li f8f3b7577a code cleanup; added a missing file 2013-02-01 14:38:44 -05:00
Heng Li 620ad6e5b9 reseed long SMEMs 2013-02-01 14:20:38 -05:00
Heng Li 8977737460 basic chaining working
Definitely suboptimal in a lot of corner cases...
2013-01-31 16:26:05 -05:00
Heng Li 91debf412b move smem iterators to bwamem.{c,h} 2013-01-31 13:59:48 -05:00
Heng Li 5a4a0c4173 a bit refactoring for further changes 2013-01-31 12:34:05 -05:00
Heng Li 6641788d38 preparation for further changes 2013-01-31 11:42:31 -05:00
Rob Davies 4f4e998d7f Added wrappers for fputc and fputs; more efficient sequence printing
Added wrappers err_fputc and err_fputs to catch failures in fput and fputs.
Macros err_putchar and err_puts call the new wrappers and can be used in
place of putchar and puts.

To avoid having to make millions of function calls when printing out
sequences, the code to print them in bwa_print_sam1 using putchar has
been replaced by a new version in bwa_print_seq that puts the sequence
into a buffer and then outputs the lot with err_fwrite.  In testing, the
new code was slightly faster than the old version, with the added benefit
that it will stop promptly if IO problems are detected.
2013-01-09 14:43:36 +00:00
Rob Davies 55f1b36534 New wrapper for gzclose; added err_fflush calls and made it call fsync too.
Added a new utils.c wrapper err_gzclose and changed gzclose calls to use it.

Put in some more err_fflush calls before files being written are closed.

Made err_fflush call fsync.  This is useful for remote filesystems where
errors may not be reported on fflush or fclose as problems at the server
end may only be detected after they have returned.  If bwa is being used
only to write to local filesystems, calling fsync is not really necessary.
To disable it, comment out #define FSYNC_ON_FLUSH in utils.c.
2013-01-03 16:57:37 +00:00
Rob Davies b081ac9b8b Use wrapper functions to catch system errors
Use the wrapper functions in utils.c plus a few extra bits of error
checking code to catch system errors and exit non-zero when they occur.
2012-12-16 10:34:57 +00:00