Commit Graph

12950 Commits (d081c19178bfd8fdfb78eaf7643b7179444b16b2)

Author SHA1 Message Date
Karthik Gururaj d081c19178 Minor: added support in C++ sandbox to choose implementation and check
from command line
2014-02-09 18:05:35 -08:00
Karthik Gururaj a03d83579b Matrices in baseline C++ (no vector) implementation of PairHMM are now
allocated on heap using "new". Stack allocation led to program crashes
for large matrix sizes.
2014-02-07 23:22:05 -08:00
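The stack-versus-heap issue described in the commit above can be illustrated with a minimal sketch. The type `HeapMatrix` and its members are hypothetical and not taken from the repository. The commit mentions plain "new"; this sketch wraps the same allocation in `std::unique_ptr` so the memory is released automatically.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical helper, not taken from the repository: a rows x cols matrix
// of floats stored contiguously on the heap instead of the stack.
struct HeapMatrix {
    std::size_t rows;
    std::size_t cols;
    std::unique_ptr<float[]> data;  // freed automatically on destruction

    HeapMatrix(std::size_t r, std::size_t c)
        : rows(r), cols(c), data(new float[r * c]()) {}  // () zero-initializes

    // Row-major element access; no bounds checking in this sketch.
    float& at(std::size_t r, std::size_t c) { return data[r * cols + c]; }
};
```

A local array such as `float m[2000][2000]` would need about 16 MB of stack, which exceeds typical default stack limits, whereas `HeapMatrix m(2000, 2000)` is limited only by available heap memory.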
Karthik Gururaj 20a46e4098 Check only for SSE 4.1 (rather than SSE 4.2) when trying to use the SSE
implementation of PairHMM
2014-02-07 15:19:55 -08:00
Karthik Gururaj dc44b64ad8 1. Added support for building the PairHMM vector library into build.xml.
The library is compiled using a makefile and copied into the directory:
build/java/classes/org/broadinstitute/sting/utils/pairhmm/
2. Bundled the library into StingUtils.jar. Unpacked and loaded at
runtime without the need to set java.library.path

Caveats:
Platform independence has probably been thrown out of the window.
Assumptions:
a. make command exists at /usr/bin/make
b. rsync command exists at /usr/bin/rsync
c. icc is in the PATH of the user
2014-02-07 13:13:59 -08:00
mghodrat 7815c30df8 Adding comments to pairhmm-template-kernel 2014-02-06 20:13:06 -08:00
Karthik Gururaj b729fc0136 1. Split main JNI function into initializeTestcases, compute_testcases
and releaseReads
2. FTZ enabled
3. Cleaner profiling code
2014-02-06 14:35:32 -08:00
Karthik Gururaj 166f91d698 Merge branch 'test_branch'
Conflicts:
	public/c++/VectorPairHMM/LoadTimeInitializer.cc
	public/c++/VectorPairHMM/pairhmm-1-base.cc
	public/c++/VectorPairHMM/utils.cc
	public/c++/VectorPairHMM/utils.h

Merged test_branch with intel_pairhmm
2014-02-06 11:18:18 -08:00
Karthik Gururaj fab6f57e97 1. Enabled FTZ in LoadTimeInitializer.cc
2. Added Sandbox.java for testing
3. Moved compute to utils.cc (inside library)
4. Added flag for disabling FTZ in Makefile
2014-02-06 11:01:33 -08:00
Karthik Gururaj 78642944c0 1. Moved break statement in utils.cc to correct position
2. Tested sandbox with regions
3. Lots of profiling code from previous commit exists
2014-02-06 09:32:56 -08:00
Karthik Gururaj acda6ca27b 1. Whew, finally debugged the source of performance issues with PairHMM
JNI. See copied text from email below.
2. This commit contains all the code used in profiling, detecting FP
exceptions, dumping intermediate results. All flagged off using ifdefs,
but it's there.
--------------Text from email
As we discussed before, it's the denormal numbers that are causing the
slowdown - the core executes some microcode uops (called FP assists)
when denormal numbers are detected for FP operations (even in
un-vectorized code).
The C++ compiler by default enables flush to zero (FTZ) - when set, the
hardware simply converts denormal numbers to 0. The Java binary
(executable provided by Oracle, not the native library) seems to be
compiled without FTZ (sensible choice, they want to be conservative).
Hence, the JNI invocation sees a large slowdown. Disabling FTZ in C++
slows the C++ sandbox down to the speed of the JNI version (fortunately,
the reverse also holds :)).
Not sure how to show the overhead for these FP assists easily - measured
a couple of counters.
FP_ASSISTS:ANY - shows the number of uops executed as part of the FP
assists. When FTZ is enabled, this is 0 (both C++ and JNI); when FTZ is
disabled, this value is around 203540557 (both C++ and JNI).
IDQ:MS_UOPS_CYCLES - shows the number of cycles the decoder was issuing
uops when the microcode sequencing engine was busy. When FTZ is enabled,
this is around 1.77M cycles (both C++ and JNI); when FTZ is disabled,
this value is around 4.31B cycles (both C++ and JNI). This number is
still small with respect to total cycles (~40B), but it only reflects
the cycles in the decode stage. The total overhead of the microcode
assist ops could be larger.
As suggested by Mustafa, I compared intermediate values (matrices M,X,Y)
and final output of compute_full_prob. The values produced by C++ and
Java are identical to the last bit (as long as both use FTZ or no-FTZ).
Comparing the outputs of compute_full_prob for the cases no-FTZ and FTZ,
there are differences for very small values (denormal numbers).
Examples:
Diff values 1.952970E-33 1.952967E-33
Diff values 1.135071E-32 1.135070E-32
Diff values 1.135071E-32 1.135070E-32
Diff values 1.135071E-32 1.135070E-32
For this test case (low coverage NA12878), all these values would be
recomputed using the double precision version. Enabling FTZ should be
fine.
-------------------End text from email
2014-02-05 17:09:57 -08:00
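The flush-to-zero behaviour discussed in the email above can be reproduced with a small sketch. It assumes an x86 target where floating-point arithmetic uses SSE (the default on x86-64). The helper names `set_flush_to_zero`, `flush_to_zero_enabled` and `tiny_product` are hypothetical; only the `_MM_*` macros come from the standard intrinsics header `<xmmintrin.h>`. How the repository itself toggles FTZ is not shown in this log.

```cpp
#include <xmmintrin.h>  // _MM_SET_FLUSH_ZERO_MODE, _MM_GET_FLUSH_ZERO_MODE

// Hypothetical helper: switch FTZ on or off for the calling thread by
// updating the FTZ bit of the MXCSR register.
inline void set_flush_to_zero(bool enabled) {
    _MM_SET_FLUSH_ZERO_MODE(enabled ? _MM_FLUSH_ZERO_ON : _MM_FLUSH_ZERO_OFF);
}

// Hypothetical helper: report whether FTZ is currently enabled.
inline bool flush_to_zero_enabled() {
    return _MM_GET_FLUSH_ZERO_MODE() == _MM_FLUSH_ZERO_ON;
}

// Multiplies two normal floats whose exact product (about 1e-40) is below
// the smallest normal float (about 1.18e-38), so the result is denormal.
// volatile prevents compile-time folding and keeps the multiplication
// between the surrounding MXCSR updates.
inline float tiny_product() {
    volatile float a = 1.0e-20f;
    volatile float b = 1.0e-20f;
    volatile float result = a * b;
    return result;
}
```

With FTZ enabled, `tiny_product()` returns exactly zero; with FTZ disabled, it returns a small non-zero denormal. The MXCSR setting is per thread, so a thread created by the JVM does not inherit a mode set elsewhere, which is consistent with enabling FTZ inside the native library as the later commits do.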
Karthik Gururaj 24f8aef344 Contains profiling, exception tracking, PAPI code
Contains Sandbox Java
2014-02-04 16:27:29 -08:00
Karthik Gururaj 6d4d776633 Includes all debug code for obtaining profiling info 2014-01-30 12:08:06 -08:00
Karthik Gururaj 5c7427e48c Temporary commit containing debug profiling code - commented out 2014-01-29 12:10:29 -08:00
Karthik Gururaj 0c63d6264f 1. Added synchronization block around loadLibrary in
VectorLoglessPairHMM
2. Edited Makefile to use static libraries where possible
2014-01-27 15:34:58 -08:00
Karthik Gururaj a15137a667 Modified run.sh 2014-01-27 14:56:46 -08:00
Karthik Gururaj 2c0d70c863 Moved vector JNI code to public/c++/VectorPairHMM 2014-01-27 14:52:59 -08:00
Karthik Gururaj 85a748860e 1. Added more profiling code
2. Modified JNI_README
2014-01-27 14:32:44 -08:00
Karthik Gururaj a14a11c0cf Pulled Mohammad's changes for creating variable sized arrays
Merge branch 'master' of /home/mghodrat/PairHMM/shared-repository into intel_pairhmm

Conflicts:
	PairHMM_JNI/org_broadinstitute_sting_utils_pairhmm_VectorLoglessPairHMM.cc
2014-01-26 19:40:43 -08:00
Karthik Gururaj 018e9e2c5f 1. Cleaned up code
2. Split into DebugJNILoglessPairHMM and VectorLoglessPairHMM with base
class JNILoglessPairHMM. DebugJNILoglessPairHMM can, in principle,
invoke any other child class of JNILoglessPairHMM.
3. Added more profiling code for Java parts of LoglessPairHMM
2014-01-26 19:18:12 -08:00
mghodrat e7598dde8b Clean up 2014-01-26 11:36:06 -08:00
Karthik Gururaj 81bdfbd00d Temporary commit before moving to new native library 2014-01-24 16:29:35 -08:00
Intel Repocontact f7fa79e561 Merge branch 'intel_pairhmm' of /home/karthikg/broad/gsa-unstable into intel_pairhmm
Committing into central_repo
2014-01-22 23:08:32 -08:00
Karthik Gururaj 936e9e175e 1. Converted q,i,d,c in C++ from int* to char*
2. Use clock_gettime to measure performance
3. Disabled OpenMP
4. Moved LoadTimeInitializer to different file
2014-01-22 22:57:32 -08:00
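Timing with clock_gettime, as mentioned in item 2 of the commit above, might look like the following sketch. The helper names `now_ns` and `time_ns` are hypothetical, and the choice of `CLOCK_MONOTONIC` is an assumption; the commit does not say which clock the repository uses.

```cpp
#include <time.h>   // clock_gettime, CLOCK_MONOTONIC (POSIX)
#include <cstdint>

// Hypothetical helper: current monotonic time in nanoseconds.
inline std::uint64_t now_ns() {
    timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return static_cast<std::uint64_t>(ts.tv_sec) * 1000000000ull +
           static_cast<std::uint64_t>(ts.tv_nsec);
}

// Hypothetical helper: run a callable once and return the elapsed
// wall-clock time in nanoseconds.
template <typename F>
std::uint64_t time_ns(F&& work) {
    const std::uint64_t start = now_ns();
    work();
    return now_ns() - start;
}
```

A monotonic clock is a reasonable choice for interval measurement because it is not affected by adjustments to the system time.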
Karthik Gururaj 733a84e4f9 Added support to transfer haplotypes once per region to the JNI
Re-use transferred haplotypes (stored in GlobalRef) across calls to
computeLikelihoods
2014-01-22 10:52:41 -08:00
Karthik Gururaj 868a8394f7 Deleted libJNILoglessPairHMM.so from git tracking 2014-01-21 15:01:09 -08:00
Karthik Gururaj 217f6948f1 Merge branch 'master' of /home/mozdal/git/hmm into intel_pairhmm
Conflicts:
	PairHMM_JNI/pairhmm-1-base.cc
	PairHMM_JNI/pairhmm-template-kernel.cc
	PairHMM_JNI/utils.cc
2014-01-21 12:43:16 -08:00
mozdal 1b1c0c8e76 Split the inner loop to avoid the overhead incurred when the -fPIC flag is enabled. 2014-01-21 11:47:30 -08:00
Karthik Gururaj 88c08e78e7 1. Inserted #define in sandbox pairhmm-template-main.cc
2. Wrapped _mm_empty() with ifdef SIMD_TYPE_SSE
3. OpenMP disabled
4. Added code for initializing PairHMM's data inside initializePairHMM -
not used yet
2014-01-21 09:57:14 -08:00
mozdal 0170d4f3d5 Got rid of the MMX instructions in the SSE version of the code. Mask operations are now handled in a class that is defined separately for each SSE and AVX implementation. 2014-01-21 09:30:15 -08:00
Karthik Gururaj 28891117e2 Fixed bug in JNI interface release_array
Disabled OpenMP
2014-01-20 11:07:44 -08:00
Karthik Gururaj f614d7b0d8 1. Enabled OpenMP
2. Enabled AVX - earlier commit had disabled AVX
2014-01-20 08:51:53 -08:00
Karthik Gururaj 7180c392af 1. Integrated Mohammad's SSE4.2 code, Mustafa's bug fix and code to fix the
SSE compilation warning.
2. Added code to dynamically select between AVX, SSE4.2 and normal C++ (in
that order)
3. Created multiple files to compile with different compilation flags:
avx_function_prototypes.cc is compiled with -xAVX while
sse_function_instantiations.cc is compiled with -xSSE4.2 flag.
4. Added jniClose() and support in Java (HaplotypeCaller,
PairHMMLikelihoodCalculationEngine) to call this function at the end of
the program.
5. Removed debug code, kept assertions and profiling in C++
6. Disabled OpenMP for now.
2014-01-20 08:03:42 -08:00
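The selection order described in item 2 of the commit above (AVX, then SSE4.2, then plain C++) can be sketched as follows. The names `Impl`, `choose_impl` and `detect_impl` are hypothetical. The runtime query uses the GCC/Clang builtins `__builtin_cpu_init` and `__builtin_cpu_supports`, which are x86-only; this is an assumption for illustration, and the repository may detect CPU features differently.

```cpp
// Hypothetical tags for the available implementations.
enum class Impl { AVX, SSE42, Scalar };

// Pure selection logic: prefer AVX, then SSE4.2, then un-vectorized C++.
inline Impl choose_impl(bool has_avx, bool has_sse42) {
    if (has_avx) return Impl::AVX;
    if (has_sse42) return Impl::SSE42;
    return Impl::Scalar;
}

// Runtime query via GCC/Clang builtins (x86 targets only).
inline Impl detect_impl() {
    __builtin_cpu_init();  // populate the CPU feature data before querying
    return choose_impl(__builtin_cpu_supports("avx") != 0,
                       __builtin_cpu_supports("sse4.2") != 0);
}
```

Keeping the priority logic in `choose_impl` separate from the hardware query makes the ordering testable on any machine. Compiling each implementation in its own translation unit with its own flags, as item 3 describes, keeps AVX instructions out of code paths that may run on CPUs without AVX.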
Karthik Gururaj 25aecb96e0 Added support for dynamic selection between AVX and un-vectorized C++,
still to include SSE code from Mohammad.
Debug flags turned on in this commit.
2014-01-18 11:07:23 -08:00
Intel Repocontact d53e2fbe66 Uncommenting download option in build.xml 2014-01-16 21:55:04 -08:00
Karthik Gururaj f1c772ceea Same log message as before - forgot -a option
1. Moved computeLikelihoods from PairHMM to native implementation
2. Disabled debug - debug code still left (hopefully, not part of
bytecode)
3. Added directory PairHMM_JNI in the root which holds the C++
library that contains the PairHMM AVX implementation. See
PairHMM_JNI/JNI_README first
2014-01-16 21:40:04 -08:00
Karthik Gururaj d7ba1f1c28 1. Moved computeLikelihoods from PairHMM to native implementation
2. Disabled debug - debug code still left (hopefully, not part of
bytecode)
3. Added directory PairHMM_JNI in the root which holds the C++ library
that contains the PairHMM AVX implementation. See PairHMM_JNI/JNI_README
first
2014-01-16 21:36:15 -08:00
Karthik Gururaj b57de8eec1 Merge branch 'master' of /home/karthikg/broad/archive/hmm_intra into intel_pairhmm 2014-01-16 20:29:51 -08:00
Karthik Gururaj e6c6f8e313 Renamed directory 2014-01-16 20:28:50 -08:00
Karthik Gururaj 532485ca59 Removed unnecessary files 2014-01-16 20:26:41 -08:00
Karthik Gururaj 90938b8610 Minor typo in comments fixed 2014-01-16 19:58:04 -08:00
Karthik Gururaj e90405cd1f 1. Nested loops over reads and haplotypes moved to C++ through JNI
2. OpenMP support added
3. Using direct access to Java primitive arrays
4. Debug messages disabled
2014-01-16 19:53:50 -08:00
Karthik Gururaj e8a5022777 1. Added support for JNI integration for LoglessCaching PairHMM AVX
implementation.
2. Contains lots of debug code
3. Only invokes JNI for subComputeReadLikelihoodGivenHaplotypeLog10
2014-01-15 11:07:09 -08:00
Karthik Gururaj 8240ea826e Changes:
1. Added TRISTATE_CORRECTION in pairhmm-template-kernel.cc (function
stripINITIALIZATION)
2. Added VEC_DIV macros to define-double.h and define-float.h
3. Edited initializeVectors to match Java
C++ original:
*(ptr_p_MY+r-1) = (r == ROWS - 1) ? ctx._(1.0) : ctx.ph2pr[_d];
*(ptr_p_YY+r-1) = (r == ROWS - 1) ? ctx._(1.0) : ctx.ph2pr[_c];
Modified:
*(ptr_p_MY+r-1) = ctx.ph2pr[_d];
*(ptr_p_YY+r-1) = ctx.ph2pr[_c];
2014-01-15 10:48:58 -08:00
Karthik Gururaj 5fab96b7ee First import of AVX-JNI to git 2014-01-14 17:26:55 -08:00
Eric Banks f6a44afa3a Merge pull request #464 from broadinstitute/eb_rev_variant_jar_for_bcf_fixes
Rev'ing the Variant jar to incorporate some patches to the BCF encoder t...
2014-01-02 21:05:13 -08:00
Eric Banks 856c17868b Rev'ing the Variant jar to incorporate some patches to the BCF encoder that Menachem needs. 2014-01-02 23:33:17 -05:00
Ryan Poplin 5c32ad174a Merge pull request #452 from broadinstitute/rp_vqsr_aggregate_model
Allow for additional input data to be used in the VQSR for clustering bu...
2014-01-02 12:54:45 -08:00
Ryan Poplin 856c1f87c1 Allow for additional input data to be used in the VQSR for clustering but don't carry it forward into the output VCF file.
-- New -a argument in the VQSR for specifying additional data to be used in the clustering
-- New NA12878KB walker which creates ROC curves by partitioning the data along VQSLOD and calculating how many KB TP/FP's are called.
2014-01-02 14:46:04 -05:00
Ryan Poplin c82501ac35 Merge pull request #462 from broadinstitute/rp_SingleSampleHC_exome_scala
Adding SingleSampleHC_exome.scala for Valentin to use as a jumping off p...
2014-01-02 08:57:27 -08:00
Ryan Poplin 15372c4873 Adding SingleSampleHC_exome.scala for Valentin to use as a jumping off point. 2014-01-02 11:56:17 -05:00