The JNI treats shared memory as critical memory and doesn't allow any
parallel reads or writes to it until the native code finishes. This is
not a problem *per se* (it is the right thing to do), but we need to
enable **-nct** when running the HaplotypeCaller and with it have
multiple native PairHMM instances running for each map call.
Move to copy-based memory sharing, where the JNI simply copies the
memory over to C++ and then holds no blocked critical memory while
running, allowing -nct to work.
This version is slightly (almost unnoticeably) slower with -nct 1, but
scales better with -nct 2-4 (we haven't tested anything beyond that
because we know the GATK falls apart at higher levels of parallelism).
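The difference between the two sharing schemes can be sketched with a small simulation. This is not the actual VectorPairHMM JNI code (which is C++/Java); it is a hedged Python illustration of the locking pattern, with made-up class names: the "critical" style holds the lock for the whole native computation, while the copy-based style holds it only long enough to copy.

```python
import threading

class CriticalStyle:
    """Simulates critical-array access: the lock is held for the
    entire computation, so parallel workers serialize on it."""
    def __init__(self, data):
        self.data = data
        self.lock = threading.Lock()

    def compute(self):
        with self.lock:  # lock held for the full computation
            return sum(x * x for x in self.data)

class CopyStyle:
    """Simulates copy-based sharing: the lock is held only long
    enough to copy, then the computation runs on a private copy."""
    def __init__(self, data):
        self.data = data
        self.lock = threading.Lock()

    def compute(self):
        with self.lock:  # brief lock: just the copy
            local = list(self.data)
        return sum(x * x for x in local)  # no shared state touched

data = list(range(1000))
critical = CriticalStyle(data)
copied = CopyStyle(data)
```

Both styles produce the same result; the copy-based one simply keeps the shared region unblocked while the heavy work runs, which is what lets multiple -nct workers make progress concurrently.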
* Make VECTOR_LOGLESS_CACHING the default implementation for PairHMM.
* Changed version number in pom.xml under public/VectorPairHMM
* VectorPairHMM can now be compiled using gcc 4.8.x
* Modified define-* to get rid of gcc warnings for extra tokens after #undefs
* Added a Linux kernel version check for AVX - gcc's __builtin_cpu_supports function does not check whether the kernel supports AVX or not.
* Updated PairHMM profiling code to update and print numbers only in single-thread mode
* Edited README.md, pom.xml and Makefile for users to pass path to gcc 4.8.x if necessary
* Moved all cpuid inline assembly to a single function
* Changed info message to clog from cinfo
* Modified version in pom.xml in VectorPairHMM from 3.1 to 3.2
* Deleted some unnecessary code
* Modified C++ sandbox to print per interval timing
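The kernel-version guard mentioned above exists because cpuid (and gcc's __builtin_cpu_supports) only says the CPU has AVX; the OS must also save/restore the extended register state, which Linux gained with xsave support around 2.6.30. A hedged Python sketch of such a check (function names are made up; the real check lives in the C++ code):

```python
# First Linux release with xsave-based AVX state support (assumption
# used for illustration; the real cutoff lives in the native code).
MIN_AVX_KERNEL = (2, 6, 30)

def parse_kernel_version(release):
    """Turn a `uname -r` style string like '2.6.18-194.el5'
    into a comparable (major, minor, patch) tuple."""
    head = release.split('-')[0]
    return tuple(int(p) for p in head.split('.')[:3])

def kernel_supports_avx(release):
    """CPU support alone is not enough: the kernel must be new
    enough to save/restore the YMM register state."""
    return parse_kernel_version(release) >= MIN_AVX_KERNEL
```

On a kernel that fails this check, the implementation should fall back to a non-AVX PairHMM even when the CPU advertises AVX.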
Story:
https://www.pivotaltracker.com/story/show/68220438
Changes:
- PL-less input genotypes are now treated as uncalled, and therefore as non-variant sites, when combining GVCFs.
- HC GVCF/BP_RESOLUTION mode now outputs non-variant sites at positions covered by deletions.
- Fixed existing tests
Test:
- HaplotypeCallerGVCFIntegrationTest
- ReferenceConfidenceModelUnitTest
- CombineGVCFsIntegrationTest
story:
https://www.pivotaltracker.com/story/show/69648104
description:
This read transformer refactors cigar strings that contain N-D-N elements into one N element (with the total length of the three refactored elements).
This is intended primarily for users of RNA-Seq data handling programs such as TopHat2.
Currently we consider the internal N-D-N motif illegal and error out when we encounter it. By refactoring the cigar strings of
those specific reads, users of TopHat and other tools can circumvent this problem without affecting the rest of their dataset.
edit: addressed review comments - changed the tool's name and changed the tool to be a readTransformer instead of a read filter
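The refactoring step described above can be sketched as follows. This is a Python illustration (the actual GATK transformer is Java, and the function name here is invented): each N-D-N triple is merged into a single N whose length is the sum of the three elements, rescanning because a merge can expose a new N-D-N run.

```python
import re

def collapse_ndn(cigar):
    """Collapse every N-D-N run in a cigar string into a single N
    element whose length is the sum of the three merged elements."""
    tokens = [(int(n), op) for n, op in re.findall(r'(\d+)([MIDNSHPX=])', cigar)]
    changed = True
    while changed:
        changed = False
        for i in range(len(tokens) - 2):
            (l1, o1), (l2, o2), (l3, o3) = tokens[i:i + 3]
            if (o1, o2, o3) == ('N', 'D', 'N'):
                tokens[i:i + 3] = [(l1 + l2 + l3, 'N')]
                changed = True
                break  # rescan: the merge may create a new N-D-N
    return ''.join(f'{length}{op}' for length, op in tokens)
```

For example, `10M100N5D200N10M` becomes `10M305N10M`, while cigars without the motif pass through unchanged.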
CalculateGenotypePosteriors now only computes posterior probs for SNP sites with SNP priors
(other sites have flat priors applied)
CalibrateGenotypeLikelihoods had originally applied the HOM_REF/HET/HOM_VAR frequencies in the callset as priors before the empirical quality analysis. It now has an option (-noPriors) to skip those priors and apply flat priors instead. It also takes in new external probabilities files, such as those generated by CGP, from which the genotype posterior probability qualities will be read.
Integration tests were changed to account for the new SNP-only behavior and the new default behavior of not using missing priors.
(Also, the new numRefIfMissing is 0, which should only matter when using few samples, and you probably don't want to be doing that anyway!)
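The SNP-only posterior logic reduces to Bayes' rule per genotype: with a flat prior the posterior is just the renormalized likelihood vector, which is why non-SNP sites are effectively left alone. A hedged Python sketch in linear space (the real GATK code works with Phred-scaled PLs, and the names here are made up):

```python
def posteriors(likelihoods, priors=None):
    """Genotype posteriors from likelihoods and priors (linear space).
    priors=None models a flat prior: the posterior is simply the
    normalized likelihood vector -- the behavior applied to non-SNP sites."""
    if priors is None:
        priors = [1.0] * len(likelihoods)
    unnorm = [like * p for like, p in zip(likelihoods, priors)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Flat prior leaves the genotype probabilities (up to normalization) unchanged...
flat = posteriors([0.7, 0.2, 0.1])
# ...while an informative SNP prior can shift mass toward hom-ref.
informed = posteriors([0.7, 0.2, 0.1], priors=[0.9, 0.09, 0.01])
```

The flat-prior branch is what "other sites have flat priors applied" means in practice: the genotype calls at those sites are unaffected.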
The new tool takes a VCF file as input and creates a GATK report with the percentages of each mutation type (e.g. A->G, A->T...).
It allows the user to filter the sites that will be counted, based either on JEXL expressions or on the variant quals.
A user can also print 12 VCF files (one for each mutation type) containing the VCF lines of the mutations that were counted.
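The counting logic can be sketched like this. It is a hedged Python illustration with made-up names (the real tool is Java, emits a GATKReport, and supports JEXL filters; only a QUAL threshold is shown here): count biallelic SNP REF->ALT pairs, of which there are 12 possible types.

```python
from collections import Counter

BASES = 'ACGT'

def count_mutation_types(vcf_lines, min_qual=0.0):
    """Count biallelic SNP substitution types (12 possible REF->ALT
    pairs) from raw VCF data lines, skipping header lines and sites
    below a QUAL threshold. Returns {type: (count, percentage)}."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith('#'):
            continue
        fields = line.split('\t')
        ref, alt, qual = fields[3], fields[4], fields[5]
        if ref in BASES and alt in BASES and ref != alt and float(qual) >= min_qual:
            counts[f'{ref}->{alt}'] += 1
    total = sum(counts.values())
    return {k: (v, 100.0 * v / total) for k, v in counts.items()}

lines = [
    '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO',
    '1\t100\t.\tA\tG\t50\tPASS\t.',
    '1\t200\t.\tA\tG\t50\tPASS\t.',
    '1\t300\t.\tC\tT\t50\tPASS\t.',
    '1\t400\t.\tC\tT\t5\tPASS\t.',   # dropped by the QUAL filter
]
report = count_mutation_types(lines, min_qual=10.0)
```

Writing the per-type VCF files would then just mean routing each counted line into one of 12 output streams keyed by its REF->ALT pair.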
Description:
Transforms a delegation dependency from the HC to the UG genotyping engine into reuse by inheritance, where the HC and UG engines inherit from a common superclass, GenotypingEngine,
that implements the common parts. As a side effect, some of the code is now clearer and redundant code has been removed.
The changes have a few consequences for the end user. HC now has a few more user arguments: those that control the functionality that HC was borrowing directly from the UGE.
Added a -ploidy argument, although it is constrained to be 2 for now.
Added -out_mode EMIT_ALL_SITES|EMIT_VARIANTS_ONLY ...
Added -allSitePLs flag.
Stories:
https://www.pivotaltracker.com/story/show/68017394
Changes:
- Moved (HC's) GenotyperEngine to HaplotypeCallerGenotyperEngine (HCGE). Then created an engine superclass, GenotypingEngine (GE), that contains the common parts between HCGE and the UG counterpart 'UnifiedGenotypingEngine' (UGE). Simplified the code and applied the template pattern to accommodate small differences in behaviour between the two caller
engines. (There is still room for improvement, though.)
- Moved inner classes and enums to top-level components for various reasons, including giving them shorter and simpler names to refer to them by.
- Created a HomoSapiens class for human-specific constants; even if they are good defaults for most users, we need to clearly identify the human assumptions across the code if we want to make
GATK work with any species in general; i.e. any reference to HomoSapiens, except as a default value for a user argument, should smell.
- Fixed a bug deep in the genotyping calculation: we were taking fixed values for SNP and indel heterozygosity as the defaults for human, ignoring user arguments.
- Renamed GenotypeLikelihoodsCalculationModel.Model to Gen.*Like.*Calc.*Model.Name; not a definitive solution though, as the names are often used in conditionals that perhaps should be member methods of the
GenLikeCalc classes.
- Renamed LikelihoodCalculationEngine to ReadLikelihoodCalculationEngine to distinguish them clearly from Genotype likelihood calculation engines.
- Changed copy by explicit argument listing to a clone/reflection solution for casting between genotyper argument collection classes.
- Created GenotypeGivenAllelesUtils to collect methods needed nearly exclusively by the GGA mode.
Tests:
- StandardCallerArgumentCollectionUnitTest (checks copy by cloning/reflection).
- All existing integration and unit tests for modified classes.
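The template-pattern refactor described above can be sketched as follows. This is a hedged Python illustration (the real classes are Java, and the method bodies here are invented): the superclass owns the shared genotyping workflow and defers the small per-caller differences to an abstract hook.

```python
from abc import ABC, abstractmethod

class GenotypingEngine(ABC):
    """Superclass holding the shared genotyping workflow
    (the template method of the template pattern)."""

    def calculate_genotypes(self, likelihoods):
        # Common skeleton: the shared steps run here, and the small
        # caller-specific differences are deferred to the hook below.
        best = max(range(len(likelihoods)), key=likelihoods.__getitem__)
        return self.annotate_call(best)

    @abstractmethod
    def annotate_call(self, genotype_index):
        """Hook for per-caller behaviour (hypothetical example)."""

class HaplotypeCallerGenotyperEngine(GenotypingEngine):
    def annotate_call(self, genotype_index):
        return ('HC', genotype_index)

class UnifiedGenotypingEngine(GenotypingEngine):
    def annotate_call(self, genotype_index):
        return ('UG', genotype_index)
```

The point of the pattern is that the common algorithm lives once in GenotypingEngine, while each caller overrides only the hooks where its behaviour genuinely differs.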
-- We no longer use QUAL because it scales insidiously with AC.
-- By default we exclude sites in which NA12878 is polymorphic to prevent overfitting to the knowledgebase.
-- Tweaks to training parameters were required because of the QUAL change.
-- We now test for model convergence instead of specifying the number of iterations at the command line.
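Testing for convergence instead of taking an iteration count from the command line amounts to looping until the model stops improving. A generic hedged sketch in Python (hypothetical names; this is not the actual training code):

```python
def train_until_converged(step, init, tol=1e-6, max_iter=10_000):
    """Iterate `step` until successive values differ by less than
    `tol`, rather than running a fixed, user-supplied number of
    iterations. Raises if no convergence within `max_iter` steps."""
    current = init
    for _ in range(max_iter):
        nxt = step(current)
        if abs(nxt - current) < tol:
            return nxt
        current = nxt
    raise RuntimeError('model failed to converge')

# Toy example: the fixed-point iteration x -> (x + 2/x) / 2
# converges to sqrt(2), so the loop stops on its own.
root = train_until_converged(lambda x: (x + 2 / x) / 2, init=1.0)
```

The benefit is that well-behaved runs stop as soon as they stabilize, while pathological ones fail loudly instead of silently using too few or too many iterations.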