gatk-3.8

Commit Graph

Author	SHA1	Message	Date
mghodrat	7815c30df8	Adding comments to pairhmm-template-kernel	2014-02-06 20:13:06 -08:00
Karthik Gururaj	b729fc0136	1. Split main JNI function into initializeTestcases, compute_testcases and releaseReads 2. FTZ enabled 3. Cleaner profiling code	2014-02-06 14:35:32 -08:00
Karthik Gururaj	166f91d698	Merge branch 'test_branch' Conflicts: public/c++/VectorPairHMM/LoadTimeInitializer.cc public/c++/VectorPairHMM/pairhmm-1-base.cc public/c++/VectorPairHMM/utils.cc public/c++/VectorPairHMM/utils.h Merged test_branch with intel_pairhmm	2014-02-06 11:18:18 -08:00
Karthik Gururaj	fab6f57e97	1. Enabled FTZ in LoadTimeInitializer.cc 2. Added Sandbox.java for testing 3. Moved compute to utils.cc (inside library) 4. Added flag for disabling FTZ in Makefile	2014-02-06 11:01:33 -08:00
Karthik Gururaj	78642944c0	1. Moved break statement in utils.cc to correct position 2. Tested sandbox with regions 3. Lots of profiling code from previous commit exists	2014-02-06 09:32:56 -08:00
Karthik Gururaj	acda6ca27b	1. Whew, finally debugged the source of performance issues with PairHMM JNI. See copied text from email below. 2. This commit contains all the code used in profiling, detecting FP exceptions, dumping intermediate results. All flagged off using ifdefs, but it's there. --------------Text from email As we discussed before, it's the denormal numbers that are causing the slowdown - the core executes some microcode uops (called FP assists) when denormal numbers are detected for FP operations (even un-vectorized code). The C++ compiler by default enables flush to zero (FTZ) - when set, the hardware simply converts denormal numbers to 0. The Java binary (executable provided by Oracle, not the native library) seems to be compiled without FTZ (sensible choice, they want to be conservative). Hence, the JNI invocation sees a large slowdown. Disabling FTZ in C++ slows down the C++ sandbox performance to the JNI version (fortunately, the reverse also holds :)). Not sure how to show the overhead for these FP assists easily - measured a couple of counters. FP_ASSISTS:ANY - shows number of uops executed as part of the FP assists. When FTZ is enabled, this is 0 (both C++ and JNI), when FTZ is disabled this value is around 203540557 (both C++ and JNI) IDQ:MS_UOPS_CYCLES - shows the number of cycles the decoder was issuing uops when the microcode sequencing engine was busy. When FTZ is enabled, this is around 1.77M cycles (both C++ and JNI), when FTZ is disabled this value is around 4.31B cycles (both C++ and JNI). This number is still small with respect to total cycles (~40B), but it only reflects the cycles in the decode stage. The total overhead of the microcode assist ops could be larger. As suggested by Mustafa, I compared intermediate values (matrices M,X,Y) and final output of compute_full_prob. The values produced by C++ and Java are identical to the last bit (as long as both use FTZ or no-FTZ). Comparing the outputs of compute_full_prob for the cases no-FTZ and FTZ, there are differences for very small values (denormal numbers). Examples: Diff values 1.952970E-33 1.952967E-33 Diff values 1.135071E-32 1.135070E-32 Diff values 1.135071E-32 1.135070E-32 Diff values 1.135071E-32 1.135070E-32 For this test case (low coverage NA12878), all these values would be recomputed using the double precision version. Enabling FTZ should be fine. -------------------End text from email	2014-02-05 17:09:57 -08:00
Karthik Gururaj	24f8aef344	Contains profiling, exception tracking, PAPI code Contains Sandbox Java	2014-02-04 16:27:29 -08:00
Karthik Gururaj	6d4d776633	Includes code for all debug code for obtaining profiling info	2014-01-30 12:08:06 -08:00
Karthik Gururaj	5c7427e48c	Temporary commit containing debug profiling code - commented out	2014-01-29 12:10:29 -08:00
Karthik Gururaj	0c63d6264f	1. Added synchronization block around loadLibrary in VectorLoglessPairHMM 2. Edited Makefile to use static libraries where possible	2014-01-27 15:34:58 -08:00
Karthik Gururaj	a15137a667	Modified run.sh	2014-01-27 14:56:46 -08:00
Karthik Gururaj	2c0d70c863	Moved vector JNI code to public/c++/VectorPairHMM	2014-01-27 14:52:59 -08:00
Karthik Gururaj	85a748860e	1. Added more profiling code 2. Modified JNI_README	2014-01-27 14:32:44 -08:00
Karthik Gururaj	a14a11c0cf	Pulled Mohammad's changes for creating variable sized arrays Merge branch 'master' of /home/mghodrat/PairHMM/shared-repository into intel_pairhmm Conflicts: PairHMM_JNI/org_broadinstitute_sting_utils_pairhmm_VectorLoglessPairHMM.cc	2014-01-26 19:40:43 -08:00
Karthik Gururaj	018e9e2c5f	1. Cleaned up code 2. Split into DebugJNILoglessPairHMM and VectorLoglessPairHMM with base class JNILoglessPairHMM. DebugJNILoglessPairHMM can, in principle, invoke any other child class of JNILoglessPairHMM. 3. Added more profiling code for Java parts of LoglessPairHMM	2014-01-26 19:18:12 -08:00
mghodrat	e7598dde8b	Clean up	2014-01-26 11:36:06 -08:00
Karthik Gururaj	81bdfbd00d	Temporary commit before moving to new native library	2014-01-24 16:29:35 -08:00
Intel Repocontact	f7fa79e561	Merge branch 'intel_pairhmm' of /home/karthikg/broad/gsa-unstable into intel_pairhmm Committing into central_repo	2014-01-22 23:08:32 -08:00
Karthik Gururaj	936e9e175e	1. Converted q,i,d,c in C++ from int* to char* 2. Use clock_gettime to measure performance 3. Disabled OpenMP 4. Moved LoadTimeInitializer to different file	2014-01-22 22:57:32 -08:00
Karthik Gururaj	733a84e4f9	Added support to transfer haplotypes once per region to the JNI Re-use transferred haplotypes (stored in GlobalRef) across calls to computeLikelihoods	2014-01-22 10:52:41 -08:00
Karthik Gururaj	868a8394f7	Deleted libJNILoglessPairHMM.so from git tracking	2014-01-21 15:01:09 -08:00
Karthik Gururaj	217f6948f1	Merge branch 'master' of /home/mozdal/git/hmm into intel_pairhmm Conflicts: PairHMM_JNI/pairhmm-1-base.cc PairHMM_JNI/pairhmm-template-kernel.cc PairHMM_JNI/utils.cc	2014-01-21 12:43:16 -08:00
mozdal	1b1c0c8e76	Split the inner loop to avoid the overhead incurred when -fPIC flag is enabled.	2014-01-21 11:47:30 -08:00
Karthik Gururaj	88c08e78e7	1. Inserted #define in sandbox pairhmm-template-main.cc 2. Wrapped _mm_empty() with ifdef SIMD_TYPE_SSE 3. OpenMP disabled 4. Added code for initializing PairHMM's data inside initializePairHMM - not used yet	2014-01-21 09:57:14 -08:00
mozdal	0170d4f3d5	Got rid of the MMX instructions in the SSE version of the code. Handling the mask operations in a class, which is defined for each version of SSE and AVX implementations separately.	2014-01-21 09:30:15 -08:00
Karthik Gururaj	28891117e2	Fixed bug in JNI interface release_array Disabled OpenMP	2014-01-20 11:07:44 -08:00
Karthik Gururaj	f614d7b0d8	1. Enabled OpenMP 2. Enabled AVX - earlier commit had disabled AVX	2014-01-20 08:51:53 -08:00
Karthik Gururaj	7180c392af	1. Integrated Mohammad's SSE4.2 code, Mustafa's bug fix and code to fix the SSE compilation warning. 2. Added code to dynamically select between AVX, SSE4.2 and normal C++ (in that order) 3. Created multiple files to compile with different compilation flags: avx_function_prototypes.cc is compiled with -xAVX while sse_function_instantiations.cc is compiled with -xSSE4.2 flag. 4. Added jniClose() and support in Java (HaplotypeCaller, PairHMMLikelihoodCalculationEngine) to call this function at the end of the program. 5. Removed debug code, kept assertions and profiling in C++ 6. Disabled OpenMP for now.	2014-01-20 08:03:42 -08:00
Karthik Gururaj	25aecb96e0	Added support for dynamic selection between AVX and un-vectorized C++, still to include SSE code from Mohammad. Debug flags turned on in this commit.	2014-01-18 11:07:23 -08:00
Intel Repocontact	d53e2fbe66	Uncommenting download option in build.xml	2014-01-16 21:55:04 -08:00
Karthik Gururaj	f1c772ceea	Same log message as before - forgot -a option 1. Moved computeLikelihoods from PairHMM to native implementation 2. Disabled debug - debug code still left (hopefully, not part of bytecode) 3. Added directory PairHMM_JNI in the root which holds the C++ library that contains the PairHMM AVX implementation. See PairHMM_JNI/JNI_README first	2014-01-16 21:40:04 -08:00
Karthik Gururaj	d7ba1f1c28	1. Moved computeLikelihoods from PairHMM to native implementation 2. Disabled debug - debug code still left (hopefully, not part of bytecode) 3. Added directory PairHMM_JNI in the root which holds the C++ library that contains the PairHMM AVX implementation. See PairHMM_JNI/JNI_README first	2014-01-16 21:36:15 -08:00
Karthik Gururaj	b57de8eec1	Merge branch 'master' of /home/karthikg/broad/archive/hmm_intra into intel_pairhmm	2014-01-16 20:29:51 -08:00
Karthik Gururaj	e6c6f8e313	Renamed directory	2014-01-16 20:28:50 -08:00
Karthik Gururaj	532485ca59	Removed unnecessary files	2014-01-16 20:26:41 -08:00
Karthik Gururaj	90938b8610	Minor typo in comments fixed	2014-01-16 19:58:04 -08:00
Karthik Gururaj	e90405cd1f	1. Nested loops over reads and haplotypes moved to C++ through JNI 2. OpenMP support added 3. Using direct access to Java primitive arrays 4. Debug messages disabled	2014-01-16 19:53:50 -08:00
Karthik Gururaj	e8a5022777	1. Added support for JNI integration for LoglessCaching PairHMM AVX implementation. 2. Contains lots of debug code 3. Only invokes JNI for subComputeReadLikelihoodGivenHaplotypeLog10	2014-01-15 11:07:09 -08:00
Karthik Gururaj	8240ea826e	Changes: 1. Added TRISTATE_CORRECTION in pairhmm-template-kernel.cc (function stripINITIALIZATION) 2. Added VEC_DIV macros to define-double.h and define-float.h 3. Edited initializeVectors to match Java C++ original: (ptr_p_MY+r-1) = (r == ROWS - 1) ? ctx._(1.0) : ctx.ph2pr[_d]; (ptr_p_YY+r-1) = (r == ROWS - 1) ? ctx._(1.0) : ctx.ph2pr[_c]; Modified: (ptr_p_MY+r-1) = ctx.ph2pr[_d]; (ptr_p_YY+r-1) = ctx.ph2pr[_c];	2014-01-15 10:48:58 -08:00
Karthik Gururaj	5fab96b7ee	First import of AVX-JNI to git	2014-01-14 17:26:55 -08:00
Eric Banks	f6a44afa3a	Merge pull request #464 from broadinstitute/eb_rev_variant_jar_for_bcf_fixes Rev'ing the Variant jar to incorporate some patches to the BCF encoder t...	2014-01-02 21:05:13 -08:00
Eric Banks	856c17868b	Rev'ing the Variant jar to incorporate some patches to the BCF encoder that Menachem needs.	2014-01-02 23:33:17 -05:00
Ryan Poplin	5c32ad174a	Merge pull request #452 from broadinstitute/rp_vqsr_aggregate_model Allow for additional input data to be used in the VQSR for clustering bu...	2014-01-02 12:54:45 -08:00
Ryan Poplin	856c1f87c1	Allow for additional input data to be used in the VQSR for clustering but don't carry it forward into the output VCF file. -- New -a argument in the VQSR for specifying additional data to be used in the clustering -- New NA12878KB walker which creates ROC curves by partitioning the data along VQSLOD and calculating how many KB TP/FP's are called.	2014-01-02 14:46:04 -05:00
Ryan Poplin	c82501ac35	Merge pull request #462 from broadinstitute/rp_SingleSampleHC_exome_scala Adding SingleSampleHC_exome.scala for Valentin to use as a jumping off p...	2014-01-02 08:57:27 -08:00
Ryan Poplin	15372c4873	Adding SingleSampleHC_exome.scala for Valentin to use as a jumping off point.	2014-01-02 11:56:17 -05:00
amilev	f81a38f596	Merge pull request #446 from broadinstitute/ami-RNAseq-tools Write a new tool for splitting reads that have N cigar string.	2014-01-01 21:06:25 -08:00
MauricioCarneiro	1223345726	Merge pull request #459 from broadinstitute/eb_fix_bad_hmm_clipping Fixed up edge condition for clipping long reads in the HMM.	2014-01-01 20:00:34 -08:00
Ami Levy-Moonshine	6da53aea09	Write a new tool for splitting reads that have N cigar string. For example, this tool can be used for processing bowtie RNA-seq data. Each read with k N-cigar elements is split into k+1 reads. The split is done by hard clipping the rest of the bases. In order to do it, a few changes were introduced to some other clipping methods: - make a significant change in ClippingOp.hardClip() that prevents the splitting of a read with cigar: 1M2I1N1M3I. - change getReadCoordinateForReferenceCoordinate in ReadUtils to recognize Ns create unitTests for that walker: - change ReadClipperTestUtils to be more general in order to use its code and avoid code duplication - move some useful methods from ReadClipperTestUtils to CigarUtils create integration test for that class small change in a comment in FullProcessingPipeline last commit: Address review comments: - move to protected under walkers/rnaseq - change the read splitting methods to be more readable and more efficient - change (minor changes) some methods in ReadClipper to allow the changes in split reads - add (minor change) one method to CigarUtils to allow the changes in split reads - change ReadUtils.getReadCoordinateForReferenceCoordinate to include possible N in the cigar - address the rest of the review comments (minor changes) - fix ReadUtilsUnitTest.testReadWithNs according to the default behaviour of getReadCoordinateForReferenceCoordinate (in case of a reference index that falls into a deletion, return the read index of the base before the deletion). - add another test to ReadUtilsUnitTest.testReadWithNs - Allow the user to print the split positions (not working properly currently)	2014-01-01 22:21:36 -05:00
Eric Banks	bb4c4b1fcd	Fixed up edge condition for clipping long reads in the HMM. MD5s change because some reads were incorrectly getting clipped before. [delivers #62584746]	2014-01-01 19:05:09 -05:00
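
Commit acda6ca27b above attributes the JNI slowdown to denormal (subnormal) floating-point values and describes flush-to-zero (FTZ) as the hardware converting such values to 0. The following is a minimal sketch of that visible effect, emulated in portable software rather than by setting the CPU's floating-point control register. The helper names `is_subnormal`, `flush_to_zero` and `mul_ftz` are hypothetical and do not come from the repository.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: true when x is a subnormal (denormal) float,
// i.e. non-zero but smaller in magnitude than the smallest normal float.
inline bool is_subnormal(float x) {
    return std::fpclassify(x) == FP_SUBNORMAL;
}

// Hypothetical helper: software emulation of the result FTZ produces.
// Subnormal values become zero (keeping their sign); all others pass through.
// Real FTZ is a hardware mode; this only mimics what the caller would observe.
inline float flush_to_zero(float x) {
    return is_subnormal(x) ? std::copysign(0.0f, x) : x;
}

// Hypothetical helper: multiply, then flush a subnormal product to zero.
inline float mul_ftz(float a, float b) {
    return flush_to_zero(a * b);
}
```

With these helpers, `mul_ftz(std::numeric_limits<float>::min(), 0.25f)` yields zero, whereas the plain product is a small non-zero subnormal. This is the kind of last-digit difference in very small values that the commit message reports between the FTZ and no-FTZ runs.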


12946 Commits (7815c30df8c618ccaceffb663f42c87ee052ab3a)
