13678 Commits (bcf6be0b08f8cd87dfdde05aa4e7fb6a370663cb)

Author SHA1 Message Date
Karthik Gururaj 733a84e4f9 Added support to transfer haplotypes once per region to the JNI
Re-use transferred haplotypes (stored in GlobalRef) across calls to
computeLikelihoods
2014-01-22 10:52:41 -08:00
Karthik Gururaj 868a8394f7 Deleted libJNILoglessPairHMM.so from git tracking 2014-01-21 15:01:09 -08:00
Karthik Gururaj 217f6948f1 Merge branch 'master' of /home/mozdal/git/hmm into intel_pairhmm
Conflicts:
	PairHMM_JNI/pairhmm-1-base.cc
	PairHMM_JNI/pairhmm-template-kernel.cc
	PairHMM_JNI/utils.cc
2014-01-21 12:43:16 -08:00
mozdal 1b1c0c8e76 Split the inner loop to avoid the overhead incurred when the -fPIC flag is enabled. 2014-01-21 11:47:30 -08:00
Karthik Gururaj 88c08e78e7 1. Inserted #define in sandbox pairhmm-template-main.cc
2. Wrapped _mm_empty() with ifdef SIMD_TYPE_SSE
3. OpenMP disabled
4. Added code for initializing PairHMM's data inside initializePairHMM -
not used yet
2014-01-21 09:57:14 -08:00
mozdal 0170d4f3d5 Got rid of the MMX instructions in the SSE version of the code. Handling the mask operations in a class, which is defined for each version of SSE and AVX implementations separately. 2014-01-21 09:30:15 -08:00
Ryan Poplin bdd06ebfc2 Merge pull request #478 from broadinstitute/eb_generalize_hc_values_as_args
Pulled out some hard-coded values from the read-threading and isActive c...
2014-01-21 09:01:54 -08:00
Eric Banks 8812278c2c Merge pull request #479 from broadinstitute/eb_move_test_up_one_level
Moving this test up one level to where it actually belongs.
2014-01-21 06:45:55 -08:00
Karthik Gururaj 28891117e2 Fixed bug in JNI interface release_array
Disabled OpenMP
2014-01-20 11:07:44 -08:00
Karthik Gururaj f614d7b0d8 1. Enabled OpenMP
2. Enabled AVX - earlier commit had disabled AVX
2014-01-20 08:51:53 -08:00
Karthik Gururaj 7180c392af 1. Integrated Mohammad's SSE4.2 code, Mustafa's bug fix and code to fix the
SSE compilation warning.
2. Added code to dynamically select between AVX, SSE4.2 and normal C++ (in
that order)
3. Created multiple files to compile with different compilation flags:
avx_function_prototypes.cc is compiled with -xAVX while
sse_function_instantiations.cc is compiled with -xSSE4.2 flag.
4. Added jniClose() and support in Java (HaplotypeCaller,
PairHMMLikelihoodCalculationEngine) to call this function at the end of
the program.
5. Removed debug code, kept assertions and profiling in C++
6. Disabled OpenMP for now.
2014-01-20 08:03:42 -08:00
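The selection order named in item 2 of the commit above (AVX first, then SSE4.2, then plain C++) can be sketched as follows. This is a minimal illustration and not the code from the commit: the names `Kernel`, `selectKernel`, `detectKernel`, and `kernelName` are hypothetical, and the runtime check assumes GCC or Clang on x86, using the `__builtin_cpu_supports` builtin as a stand-in for whatever detection the commit actually performs.

```cpp
#include <string>

// Hypothetical names for illustration; the commit's real identifiers
// are not shown in this log.
enum class Kernel { Avx, Sse42, Scalar };

// Pure selection policy: prefer AVX, then SSE4.2, then un-vectorized C++.
Kernel selectKernel(bool hasAvx, bool hasSse42) {
    if (hasAvx) return Kernel::Avx;
    if (hasSse42) return Kernel::Sse42;
    return Kernel::Scalar;
}

// Runtime detection. Assumption: GCC or Clang targeting x86.
// On any other compiler or architecture this falls back to the scalar path.
Kernel detectKernel() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return selectKernel(__builtin_cpu_supports("avx") != 0,
                        __builtin_cpu_supports("sse4.2") != 0);
#else
    return Kernel::Scalar;
#endif
}

// Human-readable label, e.g. for a startup log line.
std::string kernelName(Kernel k) {
    switch (k) {
        case Kernel::Avx:    return "AVX";
        case Kernel::Sse42:  return "SSE4.2";
        case Kernel::Scalar: return "scalar C++";
    }
    return "unknown";
}
```

Keeping the policy (`selectKernel`) separate from the hardware query (`detectKernel`) lets the ordering be tested on any machine; a caller would typically run `detectKernel()` once at library initialization and cache the result.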
Eric Banks 9e858270d7 Moving this test up one level to where it actually belongs. 2014-01-19 02:33:11 -05:00
Eric Banks 64d5bf650e Pulled out some hard-coded values from the read-threading and isActive code of the HC, and made them into a single argument.
In unifying the arguments it was clear that the values were inconsistent throughout the code, so now there's a
single value that is intended to be more liberal in what it allows in (in an attempt to increase sensitivity).

Very little code actually changes here, but just about every md5 in the HC integration tests is different (as
expected).  Added another integration test for the new argument.

To be used by David R to test his per-branch QC framework: does this commit make the HC look better against the KB?
2014-01-19 01:15:13 -05:00
Karthik Gururaj 25aecb96e0 Added support for dynamic selection between AVX and un-vectorized C++,
still to include SSE code from Mohammad.
Debug flags turned on in this commit.
2014-01-18 11:07:23 -08:00
Eric Banks abd4f552ba Merge pull request #476 from broadinstitute/yf_logging_all_input_SAMFiles
Added an info log containing the SAM/BAM files that were eventually found.
2014-01-17 08:54:33 -08:00
Yossi Farjoun c79e8ca53e Added an info log containing the SAM/BAM files that were eventually found from the commandline (useful for when there are files hiding inside bam.lists which may or may not have been constructed correctly...)
Added a @hidden option controlling the appearance of the full BamList in the log
2014-01-17 11:25:21 -05:00
Intel Repocontact d53e2fbe66 Uncommenting download option in build.xml 2014-01-16 21:55:04 -08:00
Karthik Gururaj f1c772ceea Same log message as before - forgot -a option
1. Moved computeLikelihoods from PairHMM to native implementation
2. Disabled debug - debug code still left (hopefully, not part of
    bytecode)
3. Added directory PairHMM_JNI in the root which holds the C++
library that contains the PairHMM AVX implementation. See
PairHMM_JNI/JNI_README first
2014-01-16 21:40:04 -08:00
Karthik Gururaj d7ba1f1c28 1. Moved computeLikelihoods from PairHMM to native implementation
2. Disabled debug - debug code still left (hopefully, not part of
bytecode)
3. Added directory PairHMM_JNI in the root which holds the C++ library
that contains the PairHMM AVX implementation. See PairHMM_JNI/JNI_README
first
2014-01-16 21:36:15 -08:00
Karthik Gururaj b57de8eec1 Merge branch 'master' of /home/karthikg/broad/archive/hmm_intra into intel_pairhmm 2014-01-16 20:29:51 -08:00
Karthik Gururaj e6c6f8e313 Renamed directory 2014-01-16 20:28:50 -08:00
Karthik Gururaj 532485ca59 Removed unnecessary files 2014-01-16 20:26:41 -08:00
Karthik Gururaj 90938b8610 Minor typo in comments fixed 2014-01-16 19:58:04 -08:00
Karthik Gururaj e90405cd1f 1. Nested loops over reads and haplotypes moved to C++ through JNI
2. OpenMP support added
3. Using direct access to Java primitive arrays
4. Debug messages disabled
2014-01-16 19:53:50 -08:00
Eric Banks 3b6b7626aa Merge pull request #472 from broadinstitute/eb_extend_private_simulate_reads_tool
Fixed up and refactored what seems to be a useful private tool to create...
2014-01-15 17:51:07 -08:00
Karthik Gururaj e8a5022777 1. Added support for JNI integration for LoglessCaching PairHMM AVX
implementation.
2. Contains lots of debug code
3. Only invokes JNI for subComputeReadLikelihoodGivenHaplotypeLog10
2014-01-15 11:07:09 -08:00
Eric Banks de56134579 Fixed up and refactored what seems to be a useful private tool to create simulated reads around a VCF.
It didn't completely work before (it was hard-coded for a particular long-lost data set) but it should work now.
Since I thought that it might prove useful to others, I moved it to protected and added integration tests.

GERALDINE: NEW TOOL ALERT!
2014-01-15 13:49:31 -05:00
Karthik Gururaj 8240ea826e Changes:
1. Added TRISTATE_CORRECTION in pairhmm-template-kernel.cc (function
stripINITIALIZATION)
2. Added VEC_DIV macros to define-double.h and define-float.h
3. Edited initializeVectors to match Java
C++ original:
*(ptr_p_MY+r-1) = (r == ROWS - 1) ? ctx._(1.0) : ctx.ph2pr[_d];
*(ptr_p_YY+r-1) = (r == ROWS - 1) ? ctx._(1.0) : ctx.ph2pr[_c];
Modified:
*(ptr_p_MY+r-1) = ctx.ph2pr[_d];
*(ptr_p_YY+r-1) = ctx.ph2pr[_c];
2014-01-15 10:48:58 -08:00
Eric Banks e2c2aa7b05 Merge pull request #475 from broadinstitute/eb_fix_null_alleles_bug_PT63551060
Added in a check for what would be an empty allele after trimming.
2014-01-15 08:05:21 -08:00
Eric Banks 9f1ab0087a Added in a check for what would be an empty allele after trimming. 2014-01-15 11:04:19 -05:00
Karthik Gururaj 5fab96b7ee First import of AVX-JNI to git 2014-01-14 17:26:55 -08:00
Ryan Poplin 201ad398ac Merge pull request #473 from broadinstitute/eb_fix_qd_indel_normalization
The QD normalization for indels was busted and is now fixed.
2014-01-14 08:56:19 -08:00
Eric Banks e4fdc5ac44 Merge pull request #474 from broadinstitute/eb_fix_haplotype_resolver_PT63333488
Fixing the Haplotype Resolver so that it doesn't complain about missing header lines
2014-01-14 07:36:53 -08:00
Geraldine Van der Auwera f67c33919b Merge pull request #468 from broadinstitute/gg_fixSAMPileup
Updated SAMPileup codec and pileup-related docs
2014-01-14 06:30:04 -08:00
Geraldine Van der Auwera edf5880022 Updated SAMPileup codec and pileup-related docs
Problem: the codec was written to take in consensus pileups produced with the pileup -c option (which consist of 10 or 13 fields per line depending on the variant type) but errored out on the basic pileup format (which only has 6 fields per line). This was inconsistent and confusing to users.

Solution: I added a switch in the parsing to recognize and handle both cases more appropriately, and updated related docs. While I was at it I also improved error messages in CheckPileup, which now emits User Error: Bad Input exceptions when reporting mismatches. That may not be the best thing to do (ultimately they're not really errors, they're just reporting unwelcome results), but it beats emitting Runtime Exceptions.

Tested by CheckPileupIntegrationTest, which tests both format cases.
2014-01-14 09:14:16 -05:00
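The field-count distinction described in the commit above (6 fields for basic pileup, 10 or 13 for consensus pileup) could be checked along these lines. This is a rough sketch rather than the codec's actual logic: the names `PileupFormat`, `countFields`, and `classifyPileupLine` are hypothetical, and it assumes fields are tab-separated.

```cpp
#include <cstddef>
#include <string>

// Hypothetical names for illustration; not the SAMPileup codec's API.
enum class PileupFormat { Basic, Consensus, Unknown };

// Count tab-separated fields in one line.
// Assumption: fields are delimited by single tab characters.
std::size_t countFields(const std::string& line) {
    std::size_t fields = 1;
    for (char c : line) {
        if (c == '\t') ++fields;
    }
    return fields;
}

// Classify a line by field count alone: 6 fields is basic pileup,
// 10 or 13 fields is consensus pileup, anything else is unrecognized.
PileupFormat classifyPileupLine(const std::string& line) {
    switch (countFields(line)) {
        case 6:
            return PileupFormat::Basic;
        case 10:
        case 13:
            return PileupFormat::Consensus;
        default:
            return PileupFormat::Unknown;
    }
}
```

A parser built on this would branch on the returned format before decoding individual columns, and report `Unknown` lines as bad input rather than failing partway through a record.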
Eric Banks 16ecc53749 Merge pull request #469 from broadinstitute/gg_gatkdoc_fixes
Assorted fixes and improvements to gatkdocs
2014-01-14 05:56:07 -08:00
Eric Banks fd511d12a2 Fixing the Haplotype Resolver so that it doesn't complain about missing header lines.
The code comments very clearly state that INFO fields shouldn't be propagated into the output,
but someone must have accidentally changed it afterwards.  This is just a simple one-line fix
to make sure the code adhered to the comments.

Delivers #63333488.
2014-01-13 22:47:43 -05:00
droazen 347fab4717 Merge pull request #471 from broadinstitute/eb_output_log_info_for_tim
Adding more meta information about the user to the GATK logging output, per Tim F's request.
2014-01-13 17:48:40 -08:00
Geraldine Van der Auwera bdb3954eb3 removed maxRuntime minValue 2014-01-13 20:45:43 -05:00
Geraldine Van der Auwera 8fcad6680b Assorted fixes and improvements to gatkdocs
-Added docs for ERC mode in HC
-Moved RecalibrationPerformance walker to private since it is experimental and unsupported
-Updated VR docs and restored percentBad/numBad (but @Hidden) to enable deprecation alert if users try to use them
-Improved error msg for conflict between per-interval aggregation and -nt
-Minor clean up in exception docs
-Added Toy Walkers category for devs and dev supercat (to build out docs for developers)
-Added more detailed info to GenotypeConcordance doc based on Chris forum post
-Added system to include min/max argument values in gatkdocs (build gatkdocs with 'ant gatkdocs' to test it, see engine and DoC args for in situ examples)
-Added tentative min/max argument annotations to DepthOfCoverage and CommandLineGATK arguments (and improved docs while at it)
-Added gotoDev annotation to GATKDocumentedFeature to track who is the go-to person in GSA for questions & issues about specific walkers/tools (now discreetly indicated in each gatkdoc)
2014-01-13 17:46:22 -05:00
Eric Banks c7e08965d0 The QD normalization for indels was busted and is now fixed.
It is true that indels of length > 1 have higher QUALS than those of length = 1.  But for the HC those
QUALS are not that much higher, and they don't continue scaling up as the indels get larger.  So we no
longer normalize by indel length (which massively over-penalizes larger events and effectively drops their
QD to 0).

For the UG the previous normalization also wasn't perfect.  Now we divide the indel length by a factor
of 3 to make sure that QD is consistent over the range of indel lengths.

Integration tests change because QD is different for indels.
Also, got permission from Valentin to archive a failing test that no longer applies.

Thanks to Kurt on the GATK forum for pointing this all out.
2014-01-13 15:23:36 -05:00
Eric Banks 851ec67bdc Adding more meta information about the user to the GATK logging output, per Tim F's request. 2014-01-13 14:36:02 -05:00
droazen 7cd304fb41 Merge pull request #470 from broadinstitute/mf_new_RBP
Mf new rbp
2014-01-13 08:46:27 -08:00
Ryan Poplin 3b8209f3b2 Merge pull request #467 from broadinstitute/rp_fix_names_NA12878ROCCurve
The ROC Curve report lists the name as the name of the vcf file now inst...
2014-01-09 06:56:34 -08:00
MauricioCarneiro 50cd6781b3 Merge pull request #465 from broadinstitute/eb_improvements_to_ref_confidence_merger
Improvements to ref confidence merger
2014-01-08 10:51:01 -08:00
Ryan Poplin 8881926bc6 The ROC Curve report lists the name as the name of the vcf file now instead of project+name. 2014-01-08 09:44:21 -05:00
Ryan Poplin c86e36c909 Merge pull request #466 from broadinstitute/rp_phase3_vqsr_scala
Adding here the Qscript used to perform the VQSR for 1000 Genomes Projec...
2014-01-08 06:39:46 -08:00
Ryan Poplin 7d5a710ea6 Adding here the Qscript used to perform the VQSR for 1000 Genomes Project phase 3 2014-01-08 09:38:13 -05:00
Eric Banks 553b3e56bd Merge pull request #463 from broadinstitute/eb_fix_realigner_bugs_from_pearson
Fixed edge condition in the realigner where a realigned read can sometim...
2014-01-08 05:36:11 -08:00
Eric Banks 0323caefc8 Added some bug fixes to the gVCF merging code after finally getting some real data to play with.
Still under construction, awaiting more test data from Valentin.
2014-01-08 08:34:35 -05:00