Commit Graph

13622 Commits (9705df2f7a13afec01744e9672324e0d67c9fabc)

Author SHA1 Message Date
rpoplin 9705df2f7a Merge pull request #726 from broadinstitute/rp_haplotypeCaller_typo_fix
fixing a few small typos in the HaplotypeCaller and related classes
2014-09-04 15:04:41 -04:00
Ryan Poplin 1b809268d5 fixing a few small typos in the HaplotypeCaller and related classes 2014-09-04 14:48:27 -04:00
kshakir f53ea3b456 Merge pull request #725 from broadinstitute/ks_remove_ipflibrary_symlink
Fixed old symlink reference in IPFLibraryQueueTest.
2014-09-04 22:14:02 +08:00
Khalid Shakir f7c37eff06 Fixed old symlink reference in IPFLibraryQueueTest. 2014-09-04 22:13:17 +08:00
droazen 5c087a6e1f Merge pull request #724 from broadinstitute/ks_remove_test_qscript_symbolic_links
Removed symlink creation for tests and qscripts
2014-09-04 09:10:54 -04:00
Eric Banks 538537dbf1 Merge pull request #718 from broadinstitute/mf_rbp_fix
Fix MNP merging code to work with explicit HP phase representation
2014-09-02 20:39:22 -04:00
Eric Banks 01e725cd1a Merge pull request #723 from broadinstitute/eb_fix_rna_splitting_PT77878554
Make sure that the OverhangFixingManager (used for splitting RNA reads) ...
2014-09-02 20:39:01 -04:00
Menachem Fromer 10f9001738 Fix MNP merging code to work with explicit HP phase representation 2014-09-02 17:25:08 -04:00
Eric Banks ff91ab8ba2 Make sure that the OverhangFixingManager (used for splitting RNA reads) handles unmapped reads. 2014-09-02 16:56:17 -04:00
Valentin Ruano Rubio c7925f6e5c Merge pull request #719 from broadinstitute/vrr_generalize_ploidy_in_genotype_gvcfs
Adds support for omniploidy to GenotypeGVCFs and CombineGVCFs.
2014-09-02 16:51:02 -04:00
Valentin Ruano-Rubio d363725b4b Adds support for omniploidy to GenotypeGVCFs and CombineGVCFs.
Same changes fixed the problem for GenotypeGVCFs and CombineGVCFs.

Stories:

  - https://www.pivotaltracker.com/story/show/77626044
  - https://www.pivotaltracker.com/story/show/77626854

Changes:

  - Generalized the code for the merging in GATKVariantContextUtils to cope
    with ploidy != 2.
  - GenotypeGVCFs now checks that the input's ploidy conforms to the '-ploidy'
    argument.
  - Moved out Reference Confidence VC merging code from GATKVariantContextUtils
    so that we can keep new code in protected.

Caveats:

  - GenotypeGVCFs can only deal with input files that have the same ploidy in
    all positions: the one that the user MUST indicate in the -ploidy argument
    (if different from the default of 2).
  - CombineGVCFs won't necessarily complain if it's passed mixed-ploidy
    inputs, but you won't be able to genotype the result with GenotypeGVCFs.

Test:

   - Removed deprecated unit tests for GATKVariantContextUtils.
   - Moved unit-tests regarding GVCF merging from GATKVariantContextUtilsUnitTest
     to ReferenceConfidenceVariantContextUtilsUnitTest.
   - Added unit test for new code for mapping genotype indices between allele
     index encoding in GenotypeLikelihoodCalculator.
   - GenotypeGVCFs and CombineGVCFs original integration tests are unaffected
     by the change.
   - Added tetraploid run integration tests to check on non-diploid execution
     of GenotypeGVCFs and CombineGVCFs.
2014-09-02 15:06:47 -04:00
Eric Banks fe86dafc41 Merge pull request #705 from broadinstitute/gg_simplify_gatkdocs_templates
Changed the GATKDocs format to PHP
2014-09-02 06:28:26 -04:00
Khalid Shakir fcb0eca203 Now passing in the path to the GATK directory to tests.
Changed tests and scripts to use gatkdir full path instead of relative testdata/qscripts symbolic links.
Although symlinks are no longer created, left the symlink deletion script execution in place with a comment about future removal.
Re-enabled example UG pipeline queue test.
Replaced all hardcoded strings of {public,private}/testdata with BaseTest variables.
Refactored temp list creation method from ListFileUtilsUnitTest to BaseTest.createTempListFile.
Removed list files with hardcoded paths, now using createTempListFile instead with private test dir variable.
2014-09-02 01:40:59 +08:00
kshakir 9477a6ab1a Merge pull request #722 from broadinstitute/ks_mlinderm_taggable_RodBindingCollection
Update extension generator to recognize RodBindingCollection as 'taggable'
2014-09-01 18:07:07 +08:00
Michael Linderman 380cd67146 Update extension generator to recognize RodBindingCollection as 'taggable'
Signed-off-by: Khalid Shakir <kshakir@broadinstitute.org>
2014-09-01 12:19:44 +08:00
Eric Banks 46b0c18603 Merge pull request #721 from broadinstitute/ks_save_analyze_covariates_integration_test_inputs
Don't delete "after" files in AnalyzeCovariatesIntegrationTest
2014-08-29 18:44:40 -04:00
Khalid Shakir 2d28972c88 The 'after' files are @Input files and committed in git, so don't delete them after tests. 2014-08-30 03:04:54 +08:00
Eric Banks b654590ed6 Merge pull request #717 from broadinstitute/eb_change_phasing_of_hom_vars
Changed the functionality of the physical phasing in the HC: now hom var...
2014-08-25 21:41:11 -04:00
Eric Banks 5b087c9897 Changed the functionality of the physical phasing in the HC: now hom vars are output as 0|1.
We do this for technical reasons, mostly because we don't genotype in the HC anymore; it's all
done downstream by GenotypeGVCFs so we can't be sure that the genotype will be hom var.  Also,
there are steps in the downstream pipeline where genotypes can change, so assuming anything in
the HC is a bad idea, and if we have phasing info in the het state, we want to propagate that forward.

Now, PGT tag fixing happens downstream in GenotypeGVCFs.
While I was in there I also cleaned up the code a bit and fixed a bug where annotation was happening
before genotype creation when using the --includeNonVariantSites argument.

Added tests accordingly.
2014-08-25 21:40:14 -04:00
Valentin Ruano Rubio 9324121af4 Merge pull request #716 from broadinstitute/vrr_broken_md5s
Fixes some mismerged md5 updates from a previous merge into master
2014-08-25 10:30:00 -04:00
Valentin Ruano-Rubio 6dc5cf0be0 Fixes some mismerged md5 updates from a previous merge into master 2014-08-24 20:47:07 -04:00
Eric Banks 9009c1e996 Merge pull request #715 from broadinstitute/vrr_disable_physical_phasing_for_nondiploid_hc
Disable physical phasing for non-diploid HC calling.
2014-08-23 20:58:51 -04:00
Eric Banks 34e5cce553 Merge pull request #714 from broadinstitute/pd_hc_samplename
Add the --sample_name argument to HaplotypeCaller
2014-08-23 20:57:44 -04:00
Valentin Ruano-Rubio 6695aeafd9 Disable physical phasing for non-diploid HC calling.
Story:

    https://www.pivotaltracker.com/story/show/77452256

Changes:

    If ploidy != 2, disable physical phasing and log an info message to let the user know.

Tests:

    Change md5s affected by this change.
2014-08-23 10:52:07 -04:00
Phillip Dexheimer 931890915f Add the --sample_name argument to HaplotypeCaller
* This is a shortcut for people who have multi-sample BAMs but would like to use GVCF mode.  Rather than creating single-sample BAMs with PrintReads, one could use the --sample_name argument to HaplotypeCaller to specify the single sample to make calls on
 * Completes PT 73075482
2014-08-22 23:22:03 -04:00
Valentin Ruano Rubio 2b129ba707 Merge pull request #713 from broadinstitute/vrr_mlpsacaf_standalone_annotations
Created the stand-alone AC and AF annotation AlleleCountsBySample
2014-08-22 22:19:28 -04:00
Valentin Ruano-Rubio fc5ce4b662 Created the stand-alone AC and AF annotation AlleleCountBySample
Story:

  https://www.pivotaltracker.com/story/show/77250524

Changes:

  - Remove the annotating code in GeneralPloidyExactAFCalc (GPEAFC) class.
  - Added the asAlleleList to GenotypeAlleleCounts class and made GPEAFC use that instead of implementing its own (nicer and more reusable code).
  - Removed the explicit addition of AlleleCountBySample fields to the VCF header by the walker initialize
  - Added utility methods in Utils to wrap an int[] array into a List<Integer>, and a double[] array into a List<Double>, efficiently.

Test:

  - Added unit-testing for asAlleleList in GenotypeAlleleCountsUnitTest (within testFirst and testNext).
  - Added unit-testing for new methods in Utils : asList(int[]) and asList(double[])
  - Changed UG General Ploidy test to add explicitly those annotations.
  - Non-trivial changes in integration tests involving non-diploid runs (namely haploid and tetraploid), as they no longer show
    those annotations, so the MD5s have been changed accordingly.
2014-08-22 20:33:25 -04:00
Eric Banks 36bdfa3918 Merge pull request #712 from broadinstitute/eb_physical_phasing_bug_PT77248992
Fixing bug in the physical phasing code, found by Valentin.
2014-08-21 15:25:51 -04:00
Eric Banks b1cb6196be Fixing bug in the physical phasing code, found by Valentin.
It turns out that there can be some really complex situations even with a single sample where
there are lots of unphasable hets around a hom.  Previously we were trying to phase each of the
hets against the hom, but that wasn't correct.  Instead we now detect that situation and don't
attempt to phase anything.
Added a unit test to cover this situation.
2014-08-21 15:24:09 -04:00
rpoplin e60dd77362 Merge pull request #711 from broadinstitute/ldg_deNovoAnnotation
Add bells and whistles for Genotype Refinement Pipeline
2014-08-21 15:06:42 -04:00
Laura Gauthier 9a5da41dd4 Add bells and whistles for Genotype Refinement Pipeline
New annotation for low- and high-confidence de novos (only annotates biallelics)
FamilyLikelihoodsUtils now add joint likelihood and joint posterior annotations
Restrict population priors based on discovered allele count to be valid for 10 or more samples.
2014-08-21 11:20:40 -04:00
Valentin Ruano Rubio 0c25eb7163 Merge pull request #709 from broadinstitute/vrr_fix_general_diploid_exact_af_index_out_of_bounds_bug
Fix for the GeneralPloidyExactAFCalc implementation that was preventing -ploidy != 2 GVCF/BP_RESOLUTION output from working.

Story:

  https://www.pivotaltracker.com/story/show/74471252

Tests:

    Enabled GVCF tests with ploidy != 2 and other checking for the original ArrayIndexOutOfBounds exception.
2014-08-20 16:57:51 -04:00
Valentin Ruano-Rubio d31c5536aa Fixed the bug, first by indicating the actual possible number of alternative alleles considering the extra <NON_REF>, and second by resizing the StateTracker capacity when invoked by GeneralPloidyExactAFCalc deep within its implementation of computeLog10PNonRef, which is ultimately what gets rid of the exception.
Story:

  https://www.pivotaltracker.com/story/show/74471252
2014-08-20 14:42:42 -04:00
Eric Banks 78c2da1fef Merge pull request #708 from broadinstitute/ldg_SBannotationWarnings
Refactor StrandBiasTest (using template method) and add warnings for whe...
2014-08-20 09:30:06 -04:00
Laura Gauthier b512c7eac9 Refactor StrandBiasTest (using template method) and add warnings for when annotations may not be calculated successfully.
VariantAnnotator/FS behavior changes slightly: VA used to output zeros for FS if there was no strand bias info, now skips FS output (but will still show FS in header)
2014-08-20 08:18:53 -04:00
Valentin Ruano Rubio 86cb88e121 Merge pull request #675 from broadinstitute/vrr_hc_omniploidy_general_likelihood_calculation
HC omniploidy general likelihood calculation

Stories:

   https://www.pivotaltracker.com/story/show/72090992
   https://www.pivotaltracker.com/story/show/72091202
2014-08-19 14:43:49 -04:00
Valentin Ruano-Rubio 8d9a55ae60 Moving new omniploidy likelihood calculation classes to their final package (as far as this pull-request is concerned) in org.broadinstitute.gatk.tools.walkers.genotyper 2014-08-19 11:54:29 -04:00
Valentin Ruano-Rubio 611b7f25ea Adds unit-test and integration test for new omniploidy likelihood calculation components
Added md5 to HaplotypeCallerIntegrationTest.testHaplotypeCallerSingleSampleWithDbsnp
2014-08-19 11:53:19 -04:00
Valentin Ruano-Rubio 9ee9da36bb Generalize the calculation of the genotype likelihoods in HC to cope with haploid and multiploidy
Changes in several walkers to use new sample and allele closed lists and new GenotypingEngine constructor signatures

Rebase adoption of new calculation system in walkers
2014-08-19 11:53:06 -04:00
Valentin Ruano-Rubio f08dcbc160 Added the genotype likelihoods model interface and implementation for the random specimen sample from an infinite population with homogeneous ploidy across samples. 2014-08-19 11:50:13 -04:00
Valentin Ruano-Rubio 4f993e8dbe Added read-likelihoods array base structure to substitute existing Map-of-Map-of-Maps. 2014-08-19 11:50:12 -04:00
Valentin Ruano-Rubio 242cd0e58f Added genotype allele counts and likelihood calculator utilities for arbitrary ploidy and number of alleles 2014-08-19 11:50:12 -04:00
Valentin Ruano-Rubio b0a4cb9f0c Added closed sample and allele list data-structures and utility classes 2014-08-19 11:50:12 -04:00
Geraldine Van der Auwera cdba069b02 changed the GATKDocs format to PHP 2014-08-18 18:04:07 -04:00
Eric Banks 1af78f707e Merge pull request #707 from broadinstitute/eb_improve_physical_phasing
Updated the physical phasing in the Haplotype Caller to address requests...
2014-08-18 16:27:40 -04:00
Eric Banks d3f06024f8 Updated the physical phasing in the Haplotype Caller to address requests from ATGU.
1. It is now turned on by default
2. It now phases homozygous variants
3. Most importantly, it also phases variants that are always on opposite haplotypes

Changed the INFO keys to be PID and PGT, as described in the header.
2014-08-18 14:38:29 -04:00
Eric Banks 7e0c326e1c Merge pull request #706 from broadinstitute/vrr_reduce_hc_integration_test_time
Reduce intervals of integration tests in HaplotypeCallerIntegrationTest ...
2014-08-15 17:37:57 -04:00
Eric Banks cab02f8401 Merge pull request #703 from broadinstitute/eb_keep_original_ac_for_multiallelics
Update the --keepOriginalAC functionality in SelectVariants to work for ...
2014-08-15 17:36:32 -04:00
Valentin Ruano-Rubio 2f79042dee Reduce intervals of integration tests in HaplotypeCallerIntegrationTest class
Story:

   https://www.pivotaltracker.com/story/show/74858854

Changes:

    Intervals have been shrunk so that the tests run in 15s or less.
2014-08-15 14:20:10 -04:00
Eric Banks eb84091702 Update the --keepOriginalAC functionality in SelectVariants to work for sites that lose alleles in the selection. 2014-08-14 15:34:09 -04:00