Commit Graph

49 Commits (7d085962a26cab160a07a42a461378f575ff011a)

Author SHA1 Message Date
zzh 7d085962a2 Start switching to sbwa-style batch mode 2024-03-07 18:23:21 +08:00
John Marshall aeff0eed7a Use native SSE2 intrinsics on i386 as well as x86-64
Make the native SSE2 code conditional on __SSE2__, which is defined
by GCC/Clang/etc on x86-64 by default and on i386 with -msse2 etc.
2022-06-27 14:15:59 +01:00
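
The guard described in this commit could look roughly like the sketch below. It is an illustration only, not the code in ksw.c: the helper name hmax_u8_16 is hypothetical, and the shift-and-max reduction is just one common way to take a horizontal maximum with SSE2. When __SSE2__ is not defined, the same function falls back to a scalar loop.

```c
#include <stdint.h>

#ifdef __SSE2__
#include <emmintrin.h>  /* SSE2 intrinsics; only pulled in when the compiler targets SSE2 */
#endif

/* Hypothetical helper: maximum of the 16 unsigned bytes starting at p. */
static int hmax_u8_16(const uint8_t *p)
{
#ifdef __SSE2__
    __m128i x = _mm_loadu_si128((const __m128i *)p);  /* unaligned load */
    /* Fold the vector onto itself: 16 -> 8 -> 4 -> 2 -> 1 candidate bytes. */
    x = _mm_max_epu8(x, _mm_srli_si128(x, 8));
    x = _mm_max_epu8(x, _mm_srli_si128(x, 4));
    x = _mm_max_epu8(x, _mm_srli_si128(x, 2));
    x = _mm_max_epu8(x, _mm_srli_si128(x, 1));
    return _mm_extract_epi16(x, 0) & 0xff;  /* lowest byte holds the maximum */
#else
    /* Scalar fallback for targets without SSE2. */
    int i, m = 0;
    for (i = 0; i < 16; ++i)
        if (p[i] > m) m = p[i];
    return m;
#endif
}
```

On i386 the SSE2 branch is compiled only when a flag such as -msse2 is passed. Without it the scalar branch is used and the result is the same.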
John Marshall 50f99b6890 On other platforms, emulate SSE2 SIMD calls using scalar code 2022-06-26 19:38:20 +01:00
John Marshall b64ccddda7 On ARM, rewrite SSE2 SIMD calls using Neon intrinsics
Many Intel intrinsics have a corresponding Neon equivalent.
Other cases are more interesting:

* Neon's vmaxvq directly selects the maximum entry in a vector,
  so can be used to implement both the __max_16/__max_8 macros
  and the _mm_movemask_epi8 early loop exit. Introduce additional
  helper macros alongside __max_16/__max_8 so that the early loop
  exit can similarly be implemented differently on the two platforms.

* Full-width shifts can be done via vextq. This is defined close to
  the ksw_u8()/ksw_i16() functions (rather than in neon_sse.h) as it
  implicitly uses one of their local variables.

* ksw_i16() uses saturating *signed* 16-bit operations apart from
  _mm_subs_epu16; presumably the data is effectively still signed but
  we wish to keep it non-negative. The ARM intrinsics are more careful
  about type checking, so this requires an extra U16() helper macro.
2022-06-20 20:43:17 +01:00
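
The first bullet above (one test, implemented differently per platform) can be illustrated with a small sketch. The helper name all_zero_u8_16 is hypothetical and the code is not taken from ksw.c. It assumes AArch64 for the Neon branch, since vmaxvq_u8 is an A64 intrinsic, and shows only the general idea of an "is every lane zero?" check that could drive an early loop exit.

```c
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#elif defined(__ARM_NEON) && defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: returns 1 if all 16 bytes at p are zero, else 0. */
static int all_zero_u8_16(const uint8_t *p)
{
#if defined(__SSE2__)
    /* Compare each byte with zero, then gather the 16 result bits. */
    __m128i v = _mm_loadu_si128((const __m128i *)p);
    __m128i eq = _mm_cmpeq_epi8(v, _mm_setzero_si128());
    return _mm_movemask_epi8(eq) == 0xffff;
#elif defined(__ARM_NEON) && defined(__aarch64__)
    /* Unsigned bytes are all zero exactly when their maximum is zero. */
    return vmaxvq_u8(vld1q_u8(p)) == 0;
#else
    /* Scalar fallback. */
    int i;
    for (i = 0; i < 16; ++i)
        if (p[i] != 0) return 0;
    return 1;
#endif
}
```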
John Marshall b5f4bdae91 Make _mm_load_si128() explicit
The previous code implicitly caused a load; change it so the load
intrinsic is explicitly invoked, as the others are. (This in fact
makes no difference to the generated code.)
2022-06-17 18:42:07 +01:00
Heng Li b7076848ab r744: int overflow given MB query 2014-05-01 15:30:36 -04:00
Heng Li fa20c71920 r742: further control the max bandwidth
I am looking at 6kb bandwidth...
2014-05-01 14:27:38 -04:00
Heng Li df65893fb5 r727: extend seeds with SW 2014-04-24 14:28:40 -04:00
Heng Li 954cfd766d improved ksw_extend2()
1. the first cell in a row is not always right
2. prevent H->H extension from H=0 cells
3. replaced the band narrowing heuristic with an always-correct one
2014-04-23 15:14:52 -04:00
Heng Li 066ec4aa95 dev-460: disallow a cigar 20M2D2I30M in extension
Global alignment does not allow contiguous insertions and deletions, but local
alignment and extension allow such CIGARs. The optimal global alignment may
have a lower score than extension, which actually happens often for PacBio
data. This commit disallows a CIGAR like 20M2D2I30M to fix this inconsistency.
Local alignment has not been changed.
2014-04-04 10:44:34 -04:00
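
To make the pattern concrete, the sketch below detects a CIGAR string in which an insertion sits directly next to a deletion, as in 20M2D2I30M. It is only a checker written for illustration. The commit itself changes the alignment code so that such CIGARs are not produced, and the helper name cigar_has_adjacent_indels is hypothetical.

```c
#include <ctype.h>

/* Hypothetical helper: returns 1 if the CIGAR string contains an I operation
 * immediately followed by a D operation, or the reverse; returns 0 otherwise. */
static int cigar_has_adjacent_indels(const char *cigar)
{
    char prev = 0;
    while (*cigar) {
        while (isdigit((unsigned char)*cigar)) ++cigar;  /* skip the length */
        if (*cigar == '\0') break;                        /* trailing digits only */
        char op = *cigar++;                               /* operation letter */
        if ((prev == 'I' && op == 'D') || (prev == 'D' && op == 'I'))
            return 1;
        prev = op;
    }
    return 0;
}
```

For example, 20M2D2I30M is flagged, while 20M2D5M2I30M is not because a match separates the two gaps.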
Heng Li 578bb55c38 dev-449: unequal ins/del in global() and extend() 2014-03-28 14:15:38 -04:00
Heng Li 0c783399e8 dev-448: different ins/del penalties 2014-03-28 10:54:23 -04:00
Heng Li f70d80a5a2 r427: fixed bugs in backtrack
See comments in ksw_global() for details.
2013-12-30 15:40:18 -05:00
Rob Davies 96e445d9e4 Reduce dependency on utils.h - new malloc wrapping scheme.
Remove xmalloc, xcalloc, xrealloc and xstrdup from utils.h and revert calls
to the normal malloc, calloc, realloc, strdup.  Add new files malloc_wrap.[ch]
with the wrapper functions.  malloc_wrap.h #defines malloc etc. to the
wrapper, but only if USE_MALLOC_WRAPPERS has been defined.

Put #include "malloc_wrap.h" in any file that uses *alloc or strdup.  This
is also in a #ifdef USE_MALLOC_WRAPPERS ... #endif block to make using the
wrappers optional.  Add -DUSE_MALLOC_WRAPPERS into the makefile so they
should normally get added.

This is an improvement on the previous method as we now don't need to
worry about stray function calls that were not changed to the wrapped version
and the code will still work even if the wrapping is disabled.

Other possible methods of doing this are using malloc_hook (glibc-specific),
adding -include malloc_wrap.h to the gcc command-line (somewhat
gcc-specific) or making our own malloc function and using dlopen (scary).
This way is probably the most portable.
2013-05-02 15:12:01 +01:00
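
A single-file sketch of this kind of wrapping scheme is shown below. It is an approximation under stated assumptions, not the contents of malloc_wrap.[ch]: the wrapper name wrap_malloc, its signature, and the helper dup_string are chosen for illustration, and USE_MALLOC_WRAPPERS is defined in the source here rather than via -DUSE_MALLOC_WRAPPERS in a makefile.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define USE_MALLOC_WRAPPERS  /* normally supplied by the build system */

/* Illustrative wrapper: report the call site and exit if allocation fails.
 * The malloc macro is not defined yet, so this calls the real malloc. */
static void *wrap_malloc(size_t size, const char *file, unsigned int line,
                         const char *func)
{
    void *p = malloc(size);
    if (p == NULL) {
        fprintf(stderr, "[%s] %s:%u: failed to allocate %lu bytes\n",
                func, file, line, (unsigned long)size);
        exit(EXIT_FAILURE);
    }
    return p;
}

#ifdef USE_MALLOC_WRAPPERS
/* From here on, plain malloc() calls are routed through the wrapper. */
#define malloc(s) wrap_malloc((s), __FILE__, __LINE__, __func__)
#endif

/* Hypothetical caller: written against plain malloc, so it still compiles
 * and works if USE_MALLOC_WRAPPERS is not defined. */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    memcpy(copy, s, n);
    return copy;
}
```

Because callers keep using the standard names, a call that nobody remembered to convert is wrapped anyway, which is the advantage the commit message describes.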
Rob Davies 4cb5110d03 Merge branch 'master' into master_fixes 2013-04-22 09:51:07 +01:00
Heng Li db7a98636f r380: er... another compiling error 2013-04-19 12:04:44 -04:00
Heng Li f0c94d80d1 r379: fixed compiling error 2013-04-19 12:04:00 -04:00
Heng Li be11e27e12 r378: bugfix - wrong CIGAR
This is actually caused by a bug in SSE2-SW, where the query begin may be
smaller than the true one if there is an exact tandem repeat.
2013-04-19 12:00:37 -04:00
Rob Davies 90ecd344ba Merge branch 'master' into master_fixes. Merged up to master r375.
Conflicts:
	bwt.c
2013-04-11 11:15:39 +01:00
Heng Li 3d8a8c1e37 r374: fix - clipping penalty not always working
This only happens to gaps where mem underestimates the bandwidth without
considering the clipping penalty.
2013-04-10 01:09:37 -04:00
Rob Davies aabd990e8f Merge branch 'master' into master_fixes
Conflicts:
	Makefile
	bwape.c
	bwase.c
	bwtsw2_aux.c
	stdaln.c
2013-03-08 16:46:45 +00:00
Heng Li e6c262594f bwa-sw: ditch stdaln 2013-03-05 10:12:38 -05:00
Rob Davies 8a078cc16d Merge branch 'master' into master_fixes
Conflicts:
	bntseq.c
	bwamem.c
2013-03-05 10:21:07 +00:00
Heng Li e0991d6a45 r323: added Z-dropoff, a variant of blast's X-drop 2013-03-05 00:34:33 -05:00
Heng Li 59bc9341f6 code backup; more changes coming later 2013-03-04 17:29:07 -05:00
Rob Davies 6beab5f765 Merge branch 'master' into master_fixes
Merge changes to commit c5434ac (0.7.0 release)

Conflicts:
	Makefile
	bwamem.c
2013-03-01 10:22:49 +00:00
Rob Davies 3d33ab063e Merge branch 'master' into master_fixes
Merged to master version b621d3a

Conflicts:
	Makefile
	bntseq.c
	bwa.c
	bwase.c
	bwaseqio.c
	bwtaln.c
	bwtindex.c
	bwtio.c
	bwtmisc.c
	bwtsw2_aux.c
	cs2nt.c
	fastmap.c
	khash.h
	kseq.h
	ksw.c
	kvec.h
	simple_dp.c
	utils.c
	utils.h
2013-03-01 09:37:46 +00:00
Heng Li 4bb0bdddca r306: introduce clipping penalty
More clipping leads to more severe reference bias. We should not clip the
alignment unless necessary.
2013-02-27 21:13:39 -05:00
Heng Li b621d3ae38 r301: left-align indels
Don't know why the change is working...
2013-02-27 00:42:19 -05:00
Heng Li ea8f4f4d34 clean bill from valgrind 2013-02-20 20:26:57 -05:00
Heng Li 557daabf38 bugfix: bug in the new ksw.c
On my test data, one alignment is different, caused by polyA
2013-02-12 17:48:46 -05:00
Heng Li 28a7d501f2 updated to the latest ksw; NOT TESTED YET!!! 2013-02-12 16:35:05 -05:00
Heng Li d8e4d57956 Don't use narrow band.
I may retry this feature if the profiler indicates that this greatly helps.
2013-02-07 21:22:54 -05:00
Heng Li 14e6a7bdb9 fixed a silly bug in ksw_extend()
Query return value is assigned to the target variable and vice versa...
2013-02-05 17:29:03 -05:00
Heng Li 1e16f3e701 calling ksw_global(); ksw_extend() is buggy! 2013-02-05 17:13:12 -05:00
Heng Li 86caae811e added comments 2013-02-05 16:58:35 -05:00
Heng Li 1bc9712cd8 explicitly use bit to keep bt matrix
This also simplifies backtracking.
2013-02-05 16:28:15 -05:00
Heng Li 7e1466c885 implemented NW backtrack 2013-02-05 16:05:53 -05:00
Heng Li d91e320972 towards reimplementing banded NW alignment 2013-02-05 12:06:56 -05:00
Heng Li 788e9d1e3d fixed a couple of leaks; buggy atm 2013-02-04 15:40:26 -05:00
Heng Li ba18db1a9f sw extension works for the simplest case 2013-02-04 12:37:38 -05:00
Heng Li f83dea36d8 no effective changes 2013-02-03 18:16:43 -05:00
Heng Li 2093398231 bugfix: the first line is wrong 2013-02-03 17:47:57 -05:00
Heng Li e8a1962efe code backup; it is wrong 2013-02-03 17:25:40 -05:00
Heng Li 92b084e553 reimplemented SW extension; not tested yet 2013-02-02 16:38:21 -05:00
Rob Davies 55f1b36534 New wrapper for gzclose; added err_fflush calls and made it call fsync too.
Added a new utils.c wrapper err_gzclose and changed gzclose calls to use it.

Put in some more err_fflush calls before files being written are closed.

Made err_fflush call fsync.  This is useful for remote filesystems where
errors may not be reported on fflush or fclose as problems at the server
end may only be detected after they have returned.  If bwa is being used
only to write to local filesystems, calling fsync is not really necessary.
To disable it, comment out #define FSYNC_ON_FLUSH in utils.c.
2013-01-03 16:57:37 +00:00
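
The flush-then-sync idea can be sketched as follows on a POSIX system. This is not the err_fflush from utils.c: the helper name flush_and_sync is hypothetical, and it returns an error code instead of reporting the error and exiting.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() under strict C modes */
#include <stdio.h>
#include <unistd.h>

#define FSYNC_ON_FLUSH  /* comment out to skip the fsync step */

/* Hypothetical helper: returns 0 on success, -1 on failure. */
static int flush_and_sync(FILE *fp)
{
    /* Push stdio's user-space buffer to the kernel. */
    if (fflush(fp) != 0) return -1;
#ifdef FSYNC_ON_FLUSH
    /* Ask the kernel to commit the data to storage, so that errors on a
     * remote filesystem surface here rather than being lost after close. */
    if (fsync(fileno(fp)) != 0) return -1;
#endif
    return 0;
}
```

fsync can fail on descriptors that do not support synchronization, such as pipes or terminals, so a fuller version would probably check the file type before calling it.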
Rob Davies b081ac9b8b Use wrapper functions to catch system errors
Use the wrapper functions in utils.c plus a few extra bits of error
checking code to catch system errors and exit non-zero when they occur.
2012-12-16 10:34:57 +00:00
Heng Li 182cb2e89c use standard SW when no SSE2 2011-11-19 19:38:21 -05:00
Heng Li c8c79ef024 mate rescue seems to be working (not MT) 2011-11-06 16:20:40 -05:00