gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Karthik Gururaj	15fe244e4b	Now has PAPI values	2014-02-26 11:47:42 -08:00
Karthik Gururaj	d081c19178	Minor: added support in C++ sandbox to choose implementation and check from command line	2014-02-09 18:05:35 -08:00
Karthik Gururaj	20a46e4098	Check only for SSE 4.1 (rather than SSE 4.2) when trying to use the SSE implementation of PairHMM	2014-02-07 15:19:55 -08:00
Karthik Gururaj	b729fc0136	1. Split main JNI function into initializeTestcases, compute_testcases and releaseReads 2. FTZ enabled 3. Cleaner profiling code	2014-02-06 14:35:32 -08:00
Karthik Gururaj	166f91d698	Merge branch 'test_branch' Conflicts: public/c++/VectorPairHMM/LoadTimeInitializer.cc public/c++/VectorPairHMM/pairhmm-1-base.cc public/c++/VectorPairHMM/utils.cc public/c++/VectorPairHMM/utils.h Merged test_branch with intel_pairhmm	2014-02-06 11:18:18 -08:00
Karthik Gururaj	fab6f57e97	1. Enabled FTZ in LoadTimeInitializer.cc 2. Added Sandbox.java for testing 3. Moved compute to utils.cc (inside library) 4. Added flag for disabling FTZ in Makefile	2014-02-06 11:01:33 -08:00
Karthik Gururaj	78642944c0	1. Moved break statement in utils.cc to correct position 2. Tested sandbox with regions 3. Lots of profiling code from previous commit exists	2014-02-06 09:32:56 -08:00
Karthik Gururaj	acda6ca27b	1. Whew, finally debugged the source of performance issues with PairHMM JNI. See copied text from email below. 2. This commit contains all the code used in profiling, detecting FP exceptions, dumping intermediate results. All flagged off using ifdefs, but it's there. --------------Text from email As we discussed before, it's the denormal numbers that are causing the slowdown - the core executes some microcode uops (called FP assists) when denormal numbers are detected for FP operations (even un-vectorized code). The C++ compiler by default enables flush to zero (FTZ) - when set, the hardware simply converts denormal numbers to 0. The Java binary (executable provided by Oracle, not the native library) seems to be compiled without FTZ (sensible choice, they want to be conservative). Hence, the JNI invocation sees a large slowdown. Disabling FTZ in C++ slows down the C++ sandbox performance to the JNI version (fortunately, the reverse also holds :)). Not sure how to show the overhead for these FP assists easily - measured a couple of counters. FP_ASSISTS:ANY - shows number of uops executed as part of the FP assists. When FTZ is enabled, this is 0 (both C++ and JNI), when FTZ is disabled this value is around 203540557 (both C++ and JNI) IDQ:MS_UOPS_CYCLES - shows the number of cycles the decoder was issuing uops when the microcode sequencing engine was busy. When FTZ is enabled, this is around 1.77M cycles (both C++ and JNI), when FTZ is disabled this value is around 4.31B cycles (both C++ and JNI). This number is still small with respect to total cycles (~40B), but it only reflects the cycles in the decode stage. The total overhead of the microcode assist ops could be larger. As suggested by Mustafa, I compared intermediate values (matrices M,X,Y) and final output of compute_full_prob. The values produced by C++ and Java are identical to the last bit (as long as both use FTZ or no-FTZ). 
Comparing the outputs of compute_full_prob for the cases no-FTZ and FTZ, there are differences for very small values (denormal numbers). Examples: Diff values 1.952970E-33 1.952967E-33 Diff values 1.135071E-32 1.135070E-32 Diff values 1.135071E-32 1.135070E-32 Diff values 1.135071E-32 1.135070E-32 For this test case (low coverage NA12878), all these values would be recomputed using the double precision version. Enabling FTZ should be fine. -------------------End text from email	2014-02-05 17:09:57 -08:00
Karthik Gururaj	24f8aef344	Contains profiling, exception tracking, PAPI code Contains Sandbox Java	2014-02-04 16:27:29 -08:00
Karthik Gururaj	6d4d776633	Includes code for all debug code for obtaining profiling info	2014-01-30 12:08:06 -08:00
Karthik Gururaj	5c7427e48c	Temporary commit containing debug profiling code - commented out	2014-01-29 12:10:29 -08:00
Karthik Gururaj	2c0d70c863	Moved vector JNI code to public/c++/VectorPairHMM	2014-01-27 14:52:59 -08:00

12 Commits (15fe244e4b0ac7fb6921c8a03fe0245cc4d1c7de)
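
Note on commit acda6ca27b: the flush-to-zero (FTZ) behaviour described in that message can be reproduced in a few lines of C++. The sketch below is illustrative only and is not code from this repository. It assumes an x86-64 build where float arithmetic uses SSE, and the helper name halve_smallest_normal is hypothetical. It halves the smallest normal float, which yields a denormal result when FTZ is off and 0 when FTZ is on.

```cpp
#include <xmmintrin.h>  // _MM_SET_FLUSH_ZERO_MODE, _MM_GET_FLUSH_ZERO_MODE (x86 SSE only)
#include <cfloat>       // FLT_MIN

// Hypothetical helper (not from the repository): halve the smallest normal
// float with FTZ switched on or off, then restore the caller's FTZ setting.
float halve_smallest_normal(bool flush_to_zero) {
    const unsigned int saved_mode = _MM_GET_FLUSH_ZERO_MODE();
    _MM_SET_FLUSH_ZERO_MODE(flush_to_zero ? _MM_FLUSH_ZERO_ON
                                          : _MM_FLUSH_ZERO_OFF);

    // volatile keeps the compiler from folding the multiply at compile time,
    // so the hardware FTZ setting decides the result at run time.
    volatile float smallest_normal = FLT_MIN;
    volatile float half = 0.5f;
    volatile float result = smallest_normal * half;  // denormal unless flushed

    const float out = result;
    _MM_SET_FLUSH_ZERO_MODE(saved_mode);
    return out;
}
```

Calling halve_smallest_normal(true) should return exactly 0, while halve_smallest_normal(false) should return a small positive denormal. The FTZ bit lives in the per-thread MXCSR register, which may be why a library-load-time initializer (as in fab6f57e97) is one plausible place to set it; whether that covers every thread that later calls into the library is an assumption worth checking.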