Commit Graph

59 Commits (0976272d993738a78b5c0560fee99724dc82dd4e)

Author SHA1 Message Date
John Marshall 690649872b Copy the whole kstring_t even if it contains NULs
FASTQ files containing NULs are invalid but should not cause bwa to
crash, as it does if the quality line contains a NUL.  Fixes #122.
2017-06-30 12:46:56 +01:00
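The idea behind this fix can be sketched as below. This is a minimal illustration, not the actual patch: `demo_str_t` is a hypothetical stand-in for a length-counted string such as kstring_t, and `demo_dup_counted` is an assumed helper name. The point is that the copy is driven by the stored length rather than by the first NUL byte, which is where a `strdup()`-style copy would stop.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a length-counted string such as kstring_t. */
typedef struct {
    size_t l;   /* number of bytes in use, excluding the terminator */
    char *s;    /* buffer holding at least l bytes */
} demo_str_t;

/* Copy all l bytes, including any embedded NULs, and terminate the copy.
 * Returns NULL if the allocation fails; the caller frees the result. */
char *demo_dup_counted(const demo_str_t *str)
{
    char *copy;
    assert(str != NULL && str->s != NULL);
    copy = malloc(str->l + 1);
    if (copy == NULL) return NULL;
    memcpy(copy, str->s, str->l);   /* length-driven, unlike strdup() */
    copy[str->l] = '\0';
    return copy;
}
```

With a copy made this way, later code that indexes the quality string up to the sequence length stays inside the allocated buffer even when the input line contains a NUL.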
John Marshall ab3a92bc73 Prevent Clang warnings on abs() and fabs() calls
In the bwa.c and bwase.c calls, rlen is an int64_t returned from
bns_get_seq() and is the number of reference bases covered by the
alignment; l_query/len is an int and the query length of the alignment;
and the result is an int given to an int parameter of ksw_global[2]().

As even the result is int and as rlen is effectively bounded by the
maximum length of a reference sequence, we maintain the status quo in
this code and simply cast rlen to int to silence Clang's "use llabs()"
warning (llabs() would not be a great answer given an int64_t anyway).

The bwtsw2_pair.c call needs to remain fabs() so both divisions are
done in floating point; cast to double to prevent Clang suggesting
changing the call to integer abs().
2017-06-26 10:45:13 +01:00
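The two kinds of cast described in this message can be sketched as follows. Both functions are hypothetical helpers written for illustration; they are not the bwa call sites, and the names `demo_band_width` and `demo_rel_dev` are assumptions. The first narrows an int64_t to int before calling `abs()`, on the stated assumption that the value fits in an int. The second casts to double before calling `fabs()` so the division that follows is done in floating point.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <stdlib.h>

/* Band width derived from the reference/query length difference.
 * rlen is assumed to fit in an int, so it is narrowed before abs(). */
int demo_band_width(int64_t rlen, int l_query, int min_w)
{
    int diff = abs((int)rlen - l_query);
    return diff > min_w ? diff : min_w;
}

/* Relative deviation of a from b. The explicit cast to double keeps the
 * call as fabs() and makes the division floating point. */
double demo_rel_dev(int64_t a, int64_t b)
{
    assert(b != 0);
    return fabs((double)(a - b)) / (double)b;
}
```

With integer arithmetic throughout, a deviation such as 10 out of 100 would truncate to 0, which is why the second helper must stay in floating point.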
Heng Li 7ec3261877 r1134: use AH:* instead of AH:Y 2016-05-03 11:28:58 -04:00
Heng Li 3c038250f9 r1130: changed "ah" to "AH" 2016-04-28 15:39:18 -04:00
Heng Li c561759222 r1027: segfault caused by the last commit 2014-12-12 16:56:54 -05:00
Heng Li 925ddfb697 r1025: accept file with -H; allow to replace @SQ 2014-12-11 10:38:36 -05:00
Heng Li b5f6ed3020 r1005: insert arbitrary header lines 2014-11-19 10:59:05 -05:00
Heng Li 80e4ecfa79 r998: smart pairing; allow mixture of SE/PE reads 2014-11-18 14:30:22 -05:00
Heng Li a06646493b r915: fixed broken example.c 2014-10-17 16:17:28 -04:00
Heng Li e318d8e7e5 r905: lower peak RAM for "shm -f" 2014-10-16 11:22:09 -04:00
Heng Li bfd5e1840f shm works on small files, but not large ones
I don't know why. SHMMAX, SHMALL and SHMMNI are large enough.
2014-10-15 15:44:06 -04:00
Heng Li 6a0952948d shared memory 2014-10-15 14:44:08 -04:00
Heng Li c5e859b49f r898: read the index into a single memory block
Prepare for shared memory. Not used now.
2014-10-15 12:27:45 -04:00
Heng Li 71277f0fea r896: more flexible ALT reading 2014-10-14 23:37:24 -04:00
Heng Li 7954e77a1b r741: fixed segfault in rare cases 2014-05-01 11:13:05 -04:00
Heng Li b93fca2b2e r723: merge adjacent hits 2014-04-16 16:38:50 -04:00
Heng Li 8638cfadc8 dev-472: get rid of bwa_fix_xref()
This function causes all kinds of problems when the reference genome consists
of many short reads/contigs/chromosomes. Some of the problems are nearly
unfixable at the point where bwa_fix_xref() gets called. This commit attempts
to fix the problem at the root. It disallows chains spanning multiple contigs
and never retrieves sequences bridging two adjacent contigs. Thus all the
chaining, extension, SW and global alignments are confined to one contig only.

This commit brings many changes. I have tested it on a couple examples
including Peter Field's PacBio example. It works well so far.
2014-04-10 20:54:27 -04:00
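The "confined to one contig" rule can be illustrated with a small check. This is only a sketch under stated assumptions: contigs are taken to be concatenated on a single coordinate axis, `starts` is a hypothetical array of contig start offsets with the total length as a final sentinel, and the function names are invented here. It does not reproduce how bwa stores or queries contig boundaries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the index of the contig containing pos, or -1 if out of range.
 * starts has n_contigs + 1 entries; the last one is the total length. */
int demo_pos2contig(const int64_t *starts, int n_contigs, int64_t pos)
{
    int lo = 0, hi = n_contigs;
    assert(starts != NULL && n_contigs > 0);
    if (pos < starts[0] || pos >= starts[n_contigs]) return -1;
    while (hi - lo > 1) {               /* invariant: starts[lo] <= pos < starts[hi] */
        int mid = lo + (hi - lo) / 2;
        if (starts[mid] <= pos) lo = mid;
        else hi = mid;
    }
    return lo;
}

/* Return the contig index if the half-open interval [rb, re) lies entirely
 * inside one contig, or -1 if it is empty, out of range, or bridges contigs. */
int demo_interval_contig(const int64_t *starts, int n_contigs,
                         int64_t rb, int64_t re)
{
    int id;
    if (rb >= re) return -1;
    id = demo_pos2contig(starts, n_contigs, rb);
    if (id < 0 || re > starts[id + 1]) return -1;
    return id;
}
```

A caller could reject or clip any chain or sequence fetch for which the check returns -1, so no later stage ever sees an interval that bridges two contigs.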
Heng Li ccbbe48c4f dev-470: don't stop on bwa_fix_xref2() failures
Peter Field has sent me an example caused by an alignment bridging three
adjacent chromosomes/contigs. Bwa-mem always aligns the query to the contig
covering the middle point of the alignment. In this example, it chooses the
middle contig, which should not be aligned. This leads to weird things failing
bwa_fix_xref2(), which cannot be fixed unless we build the contig boundaries
into the FM-index.

In the old code, bwa-mem halts when bwa_fix_xref2() fails. With this commit,
bwa-mem will give a warning instead of halting.
2014-04-10 11:43:17 -04:00
Heng Li 9ce50a4e5e dev-450: support diff ins/del penalties. NO TEST!! 2014-03-28 14:54:06 -04:00
Heng Li 7d63e76245 r444: more debugging output in CIGAR generation
Also found a potential issue which should not affect accuracy but may hurt
speed. Will investigate later.
2014-03-16 23:25:04 -04:00
Heng Li e879817373 r440: a condition did not work due to a typo 2014-02-20 13:06:40 -05:00
Heng Li 17fb85a227 r438: still an issue in MD
It occurs when the global alignment disagrees with the local alignment.
2014-02-19 11:31:54 -05:00
Heng Li bdd14d2946 r436: fix rare MD/NM-CIGAR inconsistencies 2014-02-19 10:08:43 -05:00
Heng Li 4adc34eccb r435: bugfix - base not complemented on the rev 2014-02-18 10:32:24 -05:00
Heng Li 7c50bad567 Release bwa-0.7.6a-r433 2014-01-31 12:58:21 -05:00
Heng Li f524c7d3d8 r431: added the MD tag to bwa-mem 2014-01-29 12:05:11 -05:00
Heng Li ff6faf811a r419: print the @PG line 2013-11-19 11:08:45 -05:00
Heng Li 9735d7a31a conform to the latest (unpublished) SAM spec
for chimeric alignments
2013-05-22 19:45:16 -04:00
Heng Li 9a6abe51b6 r391: better method to resolve xref alignment
The old method does not work when the alignment bridges three chr. This may
actually happen often. The new method does not work all the time, either, but
should be better than the old one. It is also simpler, arguably.
2013-05-22 18:57:51 -04:00
Rob Davies 96e445d9e4 Reduce dependency on utils.h - new malloc wrapping scheme.
Remove xmalloc, xcalloc, xrealloc and xstrdup from utils.h and revert calls
to the normal malloc, calloc, realloc, strdup.  Add new files malloc_wrap.[ch]
with the wrapper functions.  malloc_wrap.h #defines malloc etc. to the
wrapper, but only if USE_MALLOC_WRAPPERS has been defined.

Put #include "malloc_wrap.h" in any file that uses *alloc or strdup.  This
is also in a #ifdef USE_MALLOC_WRAPPERS ... #endif block to make using the
wrappers optional.  Add -DUSE_MALLOC_WRAPPERS into the makefile so they
should normally get added.

This is an improvement on the previous method as we now don't need to
worry about stray function calls that were not changed to the wrapped version
and the code will still work even if the wrapping is disabled.

Other possible methods of doing this are using malloc_hook (glibc-specific),
adding -include malloc_wrap.h to the gcc command-line (somewhat
gcc-specific) or making our own malloc function and using dlopen (scary).
This way is probably the most portable.
2013-05-02 15:12:01 +01:00
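The wrapping scheme described in this message can be sketched in a single file as follows. It is an illustration only: `demo_wrap_malloc` and `demo_make_counts` are hypothetical names, and the real layout splits the wrapper and the macro across malloc_wrap.c and malloc_wrap.h. Here the wrapper is defined before the macro so that its own call reaches the real `malloc()`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: call the real malloc() and exit with a message
 * naming the call site if the allocation fails. */
void *demo_wrap_malloc(size_t size, const char *file, unsigned int line,
                       const char *func)
{
    void *p = malloc(size);
    if (p == NULL) {
        fprintf(stderr, "[%s] failed to allocate %lu bytes at %s line %u\n",
                func, (unsigned long)size, file, line);
        exit(EXIT_FAILURE);
    }
    return p;
}

/* Normally supplied by the build as -DUSE_MALLOC_WRAPPERS. */
#define USE_MALLOC_WRAPPERS

#ifdef USE_MALLOC_WRAPPERS
#  undef malloc
#  define malloc(s) demo_wrap_malloc((s), __FILE__, __LINE__, __func__)
#endif

/* An ordinary-looking caller: its malloc() call is redirected by the macro,
 * and the code still compiles unchanged if the macro is not defined. */
int *demo_make_counts(size_t n)
{
    size_t i;
    int *counts;
    assert(n > 0);                      /* malloc(0) may legitimately return NULL */
    counts = malloc(n * sizeof(int));
    for (i = 0; i < n; i++) counts[i] = 0;
    return counts;
}
```

Because call sites keep the standard names, a call that was never converted is still wrapped whenever the header is included, and removing the define falls back to the plain library functions.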
Rob Davies c89756e2b0 Merge branch 'master' into master_fixes 2013-03-19 12:11:51 +00:00
Heng Li 1e3cadbfc2 r368: bugfix - wrong CIGAR when bridging 3 contigs
In this case, bwa_fix_xref() will return insane coordinates. The old version
did not check the return status and wrote a wrong CIGAR. This bug only happens
with very short assembly contigs.
2013-03-18 20:49:32 -04:00
Rob Davies 9228e48efd Merge branch 'master' into master_fixes
Conflicts:
	Makefile
2013-03-11 13:50:49 +00:00
Heng Li 740d2c1314 Match to 'N' costs -1, instead of 0.
This is to prevent alignment through 'N'.
2013-03-09 18:03:57 -05:00
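A scoring matrix with this property can be sketched as below. The layout is an assumption made for illustration: bases are encoded 0 to 3 for A, C, G, T and 4 for 'N', and the matrix is a flat 5x5 array. `demo_fill_scmat` is a hypothetical name, with `a` as the match score and `b` as the mismatch penalty.

```c
#include <assert.h>
#include <stdint.h>

/* Fill a 5x5 scoring matrix (row-major). Codes 0..3 are A, C, G, T and
 * code 4 is the ambiguous base 'N'. Any pairing that involves 'N' scores -1
 * rather than 0, so extending an alignment through a run of Ns lowers the score. */
void demo_fill_scmat(int a, int b, int8_t mat[25])
{
    int i, j, k = 0;
    assert(mat != 0 && a > 0 && b > 0);
    for (i = 0; i < 4; ++i) {
        for (j = 0; j < 4; ++j)
            mat[k++] = (int8_t)(i == j ? a : -b);
        mat[k++] = -1;                  /* base i against 'N' */
    }
    for (j = 0; j < 5; ++j)
        mat[k++] = -1;                  /* 'N' against anything */
}
```

With a score of 0, a stretch of Ns would be free to cross; a small negative score makes the aligner prefer to stop or clip instead.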
Rob Davies aabd990e8f Merge branch 'master' into master_fixes
Conflicts:
	Makefile
	bwape.c
	bwase.c
	bwtsw2_aux.c
	stdaln.c
2013-03-08 16:46:45 +00:00
Heng Li 3e3236dfc4 r337: mem - always read even number of reads
In the old code, we may read an odd number of reads from an interleaved fastq.
2013-03-07 11:00:15 -05:00
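One way to guarantee an even count is to let the batch end only on an even read boundary. The sketch below assumes batches are sized by total bases; `demo_chunk_reads`, `lens` and `chunk_size` are hypothetical names and the function is not the bwa reader itself.

```c
#include <assert.h>
#include <stddef.h>

/* Decide how many reads go into one batch. lens[i] is the length of read i,
 * n_avail is the number of reads still available, and chunk_size is the
 * target number of bases per batch. The batch is closed only when the size
 * target is met AND the count is even, so mates of an interleaved pair are
 * never split across batches. Returns n_avail if the input runs out first. */
size_t demo_chunk_reads(const int *lens, size_t n_avail, long chunk_size)
{
    size_t n = 0;
    long size = 0;
    assert(lens != NULL || n_avail == 0);
    while (n < n_avail) {
        size += lens[n++];
        if (size >= chunk_size && (n & 1) == 0) break;
    }
    return n;
}
```

If the size target is reached after an odd number of reads, the loop takes one more read before stopping, so the batch slightly overshoots the target instead of splitting a pair.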
Heng Li 07921659cf move mem_fill_scmat() to bwa.{h,c} 2013-03-05 09:38:12 -05:00
Rob Davies 6beab5f765 Merge branch 'master' into master_fixes
Merge changes to commit c5434ac (0.7.0 release)

Conflicts:
	Makefile
	bwamem.c
2013-03-01 10:22:49 +00:00
Rob Davies 3d33ab063e Merge branch 'master' into master_fixes
Merged to master version b621d3a

Conflicts:
	Makefile
	bntseq.c
	bwa.c
	bwase.c
	bwaseqio.c
	bwtaln.c
	bwtindex.c
	bwtio.c
	bwtmisc.c
	bwtsw2_aux.c
	cs2nt.c
	fastmap.c
	khash.h
	kseq.h
	ksw.c
	kvec.h
	simple_dp.c
	utils.c
	utils.h
2013-03-01 09:37:46 +00:00
Heng Li f3cff1c609 r311: even tighter bw for CIGAR 2013-02-27 23:59:50 -05:00
Heng Li 292e92b602 r303: bugfix - wrong band width when CIGAR 2013-02-27 15:39:15 -05:00
Heng Li 98787f0ae0 r295: generate NM 2013-02-26 13:36:01 -05:00
Heng Li e70c7c2a71 r284: amend cross-reference hit
I really hate this: complex and twisted logic for a nasty scenario that almost
never happens to short reads - but it may become serious when the reference
genome consists of many contigs.

On toy examples, the code seems to work. Don't know if it really works...
2013-02-26 00:03:49 -05:00
Heng Li 61dd3bf13a r283: prepare for fixing cross-ref aln 2013-02-25 22:49:15 -05:00
Heng Li b4c38bcc1c append fasta/q comment 2013-02-23 16:57:34 -05:00
Heng Li 33236de32e a bit more error message 2013-02-23 16:44:02 -05:00
Heng Li ee4540c394 support read group in bwa-mem 2013-02-23 16:41:44 -05:00
Heng Li cfa7165036 cleanup index loading code 2013-02-23 16:10:48 -05:00
Heng Li 67543f19a1 code refactoring 2013-02-23 15:55:55 -05:00
Heng Li e613195e17 moved some common code to bwa.{c,h} 2013-02-23 15:30:46 -05:00