gatk-3.8

Commit Graph

Author	SHA1	Message	Date
rpoplin	7ec38bf2a2	Merge pull request #752 from broadinstitute/pd_cran_license Changed license of gsalib R package from BSD to MIT	2014-10-21 11:51:24 -04:00
rpoplin	312a833db3	Merge pull request #751 from broadinstitute/pd_bamout_ref_regions Added -disableOptimizations argument to HaplotypeCaller.	2014-10-21 11:50:46 -04:00
Phillip Dexheimer	f766608b4e	Changed license of gsalib R package from BSD to MIT - PT 75827540	2014-10-16 21:37:07 -04:00
Phillip Dexheimer	b348ce8f25	Added -disableOptimizations argument to HaplotypeCaller. * This argument is intended to be used in conjunction with -bamout, and disable early-exit optimizations to allow reference regions to be contained in the output bam * Also forcibly includes the reference haplotype in the set of haplotypes given to the BAMWriter * Made -dontTrimActiveRegions visible, as it is likely also desirable in this use case * Addresses PT 77731660	2014-10-16 21:11:20 -04:00
ldgauthier	5fa5724e4a	Merge pull request #750 from broadinstitute/ldg_uniqueSamplesInCombineVariants Throw UserException if input VCFs have duplicate samples but no genotype...	2014-10-16 07:57:27 -04:00
Laura Gauthier	0f08065ebc	Throw UserException if input VCFs have duplicate samples but no genotypemergeoption is specified	2014-10-15 16:03:10 -04:00
rpoplin	286f4cc8a8	Merge pull request #749 from broadinstitute/ldg_CGPfasterTests Decrease interval on CGP integration test to reduce test execution time	2014-10-15 15:53:48 -04:00
Laura Gauthier	81482138ca	Decrease interval on CGP integration test to reduce test execution time	2014-10-15 11:28:27 -04:00
rpoplin	0ad9daeac1	Merge pull request #747 from broadinstitute/gg_fixLicenseOmission Updated license files and text	2014-10-15 11:26:42 -04:00
Geraldine Van der Auwera	e7e8052f84	Updated license information - Updated license files (private/protected) for version, address and a couple of legal clauses - Updated license snippet throughout the codebase	2014-10-14 17:10:12 -04:00
rpoplin	32bf46d8bf	Merge pull request #748 from broadinstitute/rp_roc_curve_qual_field In ROCCurveNA12878, if VQSLOD value is not present default to using the ...	2014-10-14 14:30:52 -04:00
Ryan Poplin	9680511cc3	In ROCCurveNA12878, if VQSLOD value is not present default to using the QUAL field instead to rank the variants.	2014-10-14 12:15:23 -04:00
rpoplin	426907ddd0	Merge pull request #744 from broadinstitute/gg_gatkdocs_annots_and_GSON Output JSON version of docs for Galaxy	2014-10-14 11:41:16 -04:00
ldgauthier	d259f3c84f	Merge pull request #745 from broadinstitute/ldg_VariantAnnotatorDocs Added docs to VariantFiltration in accordance with new htsjdk changes. ...	2014-10-10 14:11:24 -04:00
jmthibault79	22a3c6e556	Merge pull request #742 from broadinstitute/rhl_prob_thresh_param_arg Rhl prob thresh param arg	2014-10-10 12:55:01 -04:00
Laura Gauthier	0ecb85d321	Added docs to VariantFiltration in accordance with new htsjdk changes. Fixed typo in VariantAnnotator docs.	2014-10-10 11:54:24 -04:00
Ron Levine	36c27155af	Made the threshold for the probability of a state being active a command line argument * remove TODO comment after activeProbThreshold * recover static ACTIVE_PROB_THRESHOLD for unit tests * Add min/max values for active_probability_threshold parameter * Move activeProbThreshold parameter to GATKArgumentCollection * define ACTIVE_PROB_THRESHOLD in unit tests * add construction of argCollection in ctor * Move arguments from GATKArgumentCollection to ActiveRegionWalker * Throw exception if threshold < 0 or > 1 in ActivityProfile ctor * max propagation distance parameter to ActiveRegionWalker for ActivityProfile * Use polymorphic getMaxProbPropagationDistance() so BandPassActivityProfile computes the correct region size cutoff * Get the maxProbPropagationDistance from the super class's method, instead of directly, this is safer * Removed extraneous command line imports and make maxProbPropagationDistance a hidden argument * remove limit check for activeProbThreshold, not necessary because the check is made when input as a command line arg * Remove extra 'region' in the doxygen param description for maxProbPropagationDistance	2014-10-10 10:36:02 -04:00
jmthibault79	4e099c1f44	Merge pull request #741 from broadinstitute/rhl_activiteregion_read_limits_args Changed hardcoded downsampling max/min coverage values to parameters	2014-10-09 18:49:22 -04:00
Ron Levine	645d418015	Changed hardcoded downsampling max/min coverage values to parameters * Rename parameters using camel case and add to integration test * Correct documentation for maxReadsInRegionPerSample and minReadsPerAlignmentStart * Change the argument --minReadsPerAlignmentStart in the integration test from 50 to 5 * 'each genomic location' only pertains to minReadsPerAlignmentStart, not maxReadsInRegionPerSample	2014-10-09 17:09:26 -04:00
Geraldine Van der Auwera	3f21f63161	Output JSON version of docs for Galaxy	2014-10-09 06:42:25 -04:00
rpoplin	1cbf37c539	Merge pull request #739 from broadinstitute/rp_KB_reviews Adding manual review.	2014-10-01 09:53:08 -04:00
rpoplin	b0481ced87	Merge pull request #738 from broadinstitute/vrr_slow_integration_test_due_to_general_ploidy_exact_af_calculator Reduce execution time of various integration tests	2014-10-01 09:52:04 -04:00
Valentin Ruano-Rubio	a3ad6f63bd	Reduce execution time of various integration tests Story: https://www.pivotaltracker.com/story/show/79461912	2014-09-30 13:28:55 -04:00
Ryan Poplin	d609b2cdbb	Adding manual review.	2014-09-30 12:53:55 -04:00
rpoplin	329bd081b7	Merge pull request #736 from broadinstitute/rhl_remove_line removed an unneeded import that broke maven	2014-09-29 15:03:55 -04:00
rpoplin	d21ea126c4	Merge pull request #735 from broadinstitute/vrr_fix_qual_calculation_independent_exact_af_calculator Fixed the QUAL calculation of the EXACT_INDEPENDENT	2014-09-29 14:57:04 -04:00
rpoplin	ff887f5674	Merge pull request #737 from broadinstitute/rp_VariantContextUtils_warning_message This warning message actually happens all the time in AssessNA12878 when...	2014-09-29 14:49:32 -04:00
Ron Levine	1c9d60c9a0	removed an unneeded import that broke maven	2014-09-29 12:57:33 -04:00
Ryan Poplin	ac1a397024	This warning message actually happens all the time in AssessNA12878 when we subset down to biallelic events but I've verified that it is working as intended. Moving the logging level up to debug.	2014-09-29 11:40:38 -04:00
Valentin Ruano-Rubio	311b6815b3	Fixed the QUAL calculation of the EXACT_INDEPENDENT. The QUAL value calculated by this Exact AF Calculator is greatly underestimated when there is more than one alternative allele (non-biallelic sites). The reason is that the QUAL was roughly calculated by adding the QUALs resulting from each alternative allele vs all other alleles, reference and alts, collapsed. This is ok for MLEAC calculations but not for QUAL. Now, for calculating the QUAL we collapse all the alternatives into one. This change improves sensitivity at the cost of additional false positives, but this is naturally expected. The resulting QUAL column is much closer to the one returned by the reference implementation. Story: https://www.pivotaltracker.com/story/show/75926368. Changes: Changed the QUAL calculation as described above. Updated MD5s. Fixed MD5s	2014-09-29 11:04:52 -04:00
Valentin Ruano Rubio	f7fc9cd839	Merge pull request #734 from broadinstitute/vrr_fix_mleac_general_ploidy_exact_af_calculator Fixed MLEAC and QUAL inaccuracy in GeneralPloidyExactAFCalculator.	2014-09-24 11:52:37 -04:00
Valentin Ruano-Rubio	0e52b8ba5a	Fixed MLEAC and QUAL inaccuracy in GeneralPloidyExactAFCalculator. The problem was that the MLE table calculation aborted "unlikely" genotype combinations too aggressively. This also uncovered another bug where GeneralPloidyExactAFCalculation makes a slightly different use of StateTracker as compared to DiploidExactAFCalculation. We have changed StateTracker, generalizing it to be able to work with both using code behaviors. Story: https://www.pivotaltracker.com/story/show/78920568 Changes: * Fixes in GeneralPloidyExactAFCalculator. * Needed changes in StateTracker API and its consequences in DiploidExactAFCalculation. * Updated affected integration tests' MD5s after fixing the GeneralPloidyExactAF.	2014-09-23 15:40:54 -04:00
Eric Banks	2da9bf7d09	Merge pull request #733 from broadinstitute/pd_allow_untrimmed_format Added -writeFullFormat engine-level argument	2014-09-17 23:34:09 -04:00
Phillip Dexheimer	1482a53aba	Added -writeFullFormat engine-level argument * This argument forces GATK to always write every record in the VCF format field, even if some records at the end are missing and could be removed * Revved htsjdk and picard * PT 70993484	2014-09-17 08:25:27 -04:00
Valentin Ruano Rubio	7cbb773c8f	Merge pull request #731 from broadinstitute/vrr_omniploidy_afcalc_base Support for mixed ploidy in GenotypeGVCFs and CombineGVCFs Story: * https://www.pivotaltracker.com/story/show/77891194	2014-09-12 16:30:31 -04:00
Valentin Ruano-Rubio	f6cb83d476	Renamed AFCalc to AFCalculator for a better class naming	2014-09-12 14:59:58 -04:00
Valentin Ruano-Rubio	95b45443ae	Updated tests according to changes in the AF calculator framework. Changes: * Updated current unit and integration tests to use the new API components. * Added unit tests for new classes AFPriorProvider and AFCalculatorProviders. * Added integration test for mixed ploidy GenotypeGVCFs and CombineGVCFs	2014-09-12 14:59:47 -04:00
Valentin Ruano-Rubio	3cdeab6e9e	GenotypingEngines and walkers now use AFCalc(ulator) providers rather than instantiating their own (fixed) calculators directly. Changes: * GenotypingEngine now uses an AFCalc provider instead of its own thread-local with one-time initialized and fixed AF calculator. * All walkers that use a GenotypingEngine now pass the appropriate AF calculator provider. For now most just use a fixed calculator (FixedAFCalculatorProvider) except GenotypeGVCFs, which can now cope with a mixture of ploidies by failing over to a general-ploidy calculator when the preferred implementation cannot handle a site's analysis.	2014-09-12 14:25:09 -04:00
Valentin Ruano-Rubio	935bd1394b	AFCalculatorProvider components to allow for dynamic instantiation of different AFCalc(ulators) to cope with dynamic ploidy and max-alt-allele counts (the latter not used for now).	2014-09-12 14:23:45 -04:00
Valentin Ruano-Rubio	ce8e93fa51	Made the AF prior probability distribution dynamic with respect to the total ploidy (ploidy added across samples). Changes: * Instead of calculating a fixed log10 prior array with a fixed total likelihood, we use a new component, the AFPriorProvider, to generate the priors for different total ploidies on demand; these are cached, however, so there is no unnecessary recomputation involved.	2014-09-12 14:23:37 -04:00
Valentin Ruano-Rubio	31e58ae4ec	Refactored AFCalc to remove unnecessary capability limits, allowing it to deal with mixed ploidies and max-alt-allele number changes dynamically. Changes: * Moved the AFCalcFactory.Calculation enum into a top-level class AFCalculatorImplementation. * Gave more responsibilities to the enum, like resolving the constructor method once per implementation and the best-model selection algorithm. * Removed test-code only fields and methods from AFCalc; these were used only for unit testing and provided no actual functionality of this component. * Removed the fixed ploidy constraint of GeneralPloidyExactAFCalc implementation... now can deal with mixed ploidies that may change per site and sample. * Removed the fixed maxAltAllele restriction by allowing resizing of the stateTracker structures. * Due to the previous two points, calls to the AFCalc object are now passed the default ploidy to assume in case some genotype in the input VC does not have it, and the max-alt-allele. * Also due to those changes, removed the now totally useless 3 int parameters from all AFCalc constructors. * Cleaned the code a bit of components and methods that are no longer used.	2014-09-12 14:17:36 -04:00
Eric Banks	7f6e526a87	Merge pull request #732 from broadinstitute/rp_ignore_all_filters_in_VQSR Added ignore all filters options to VQSR walkers	2014-09-12 00:17:20 -04:00
Ryan Poplin	48252897b4	Added ignore all filters options to VQSR walkers	2014-09-11 15:11:41 -04:00
Eric Banks	31cea25c36	Merge pull request #730 from broadinstitute/eb_inbreeding_coeff_unit_test Cleaned up and fleshed out unit tests for the Inbreeding Coefficient annotation class	2014-09-10 09:32:49 -04:00
Eric Banks	5e490362ca	Cleaned up and fleshed out unit tests for the Inbreeding Coefficient annotation class.	2014-09-08 11:40:39 -04:00
Eric Banks	78a61dc3e8	Merge pull request #729 from broadinstitute/eb_improve_dangling_head_merging_PT74221002 Improve the accuracy of dangling head merging in the HC assembler.	2014-09-08 10:28:51 -04:00
Eric Banks	cc175bad40	Improve the accuracy of dangling head merging in the HC assembler. Dangling head merging (like with tails) is now enabled by default. The --recoverDanglingHeads argument is now deprecated so that users know not to use it anymore. We now also allow the user to set the minimum branch length for merging. This will be different for exomes and RNA (see below). The other changes in the code itself: 1. We no longer allow an arbitrarily large number of mismatches in the dangling head for merging. 2. The max number of mismatches allowed in a dangling head is proportional to the kmer size. There will be a difference in the RNA calling pipeline. Instead of invoking '--recoverDanglingHeads' the user will want to use '--minDanglingBranchLength 0'. Below are the knowledgebase results of the master branch vs. this one. For NA12878 DNA Exome: master SNPS TRUE_POSITIVE 36722; master SNPS CALLED_NOT_IN_DB_AT_ALL 2699; master SNPS REASONABLE_FILTERS_WOULD_FILTER_FP_SITE 292; master SNPS FALSE_POSITIVE_SITE_IS_FP 70; branch SNPS TRUE_POSITIVE 36867; branch SNPS CALLED_NOT_IN_DB_AT_ALL 2952; branch SNPS REASONABLE_FILTERS_WOULD_FILTER_FP_SITE 387; branch SNPS FALSE_POSITIVE_SITE_IS_FP 94. As I discussed with Ryan in person, there are a good number of FPs that are called in the new code, but they nearly all have bad strand bias and should be easily filtered by VQSR. Note that there is no change for indels. For NA12878 RNA from Ami: master SNPS TRUE_POSITIVE 11055; master SNPS CALLED_NOT_IN_DB_AT_ALL 831; master SNPS REASONABLE_FILTERS_WOULD_FILTER_FP_SITE 44; master SNPS FALSE_POSITIVE_SITE_IS_FP 96; branch SNPS TRUE_POSITIVE 11113; branch SNPS CALLED_NOT_IN_DB_AT_ALL 874; branch SNPS REASONABLE_FILTERS_WOULD_FILTER_FP_SITE 47; branch SNPS FALSE_POSITIVE_SITE_IS_FP 92. Again, there's basically no change for indels.	2014-09-07 08:55:59 -04:00
Eric Banks	56a2554bb0	Merge pull request #720 from broadinstitute/pd_standardize_args Moved arguments controlling options in output files into the engine	2014-09-06 20:49:58 -04:00
Phillip Dexheimer	a35f5b8685	Moved arguments controlling options in output files into the engine * Arguments involved are --no_cmdline_in_header, --sites_only, and --bcf for VCF files and --bam_compression, --simplifyBAM, --disable_bam_indexing, and --generate_md5 for BAM files * PT 52740563 * Removed ReadUtils.createSAMFileWriterWithCompression(), replaced with ReadUtils.createSAMFileWriter(), which applies all appropriate engine-level arguments * Replaced hard-coded field names in ArgumentDefinitionField (Queue extension generator) with a Reflections-based lookup that will fail noisily during extension generation if there's an error	2014-09-05 21:18:11 -04:00
droazen	5c4a3eb89c	Merge pull request #727 from broadinstitute/ks_gatk_queue_package_test_updates Various fixes for package tests.	2014-09-05 10:17:32 -04:00

13675 Commits (7ec38bf2a28c3ef33c8034a3e3c6c45d4c3e6204)
