Commit Graph

13035 Commits (cc9477aedb688545376ee0b5155b1e12a112a52e)

Author SHA1 Message Date
Joel Thibault cc9477aedb Minimal test for the multi-allelic reordering bug 2014-02-12 13:38:32 -05:00
Eric Banks 6facf695ab Merge pull request #506 from broadinstitute/eb_quick_fixes_to_single_sample_pipeline
Several improvements to the single sample combining steps.
2014-02-12 11:30:48 -05:00
Eric Banks 300b474c96 Several improvements to the single sample combining steps.
1. updated QualByDepth not to use AD-restricted depth if it is zero.
Added unit test for this change.

2. Fixed small bug in CombineGVCFs where spanning deletions were not being treated consistently throughout.
Added test for this situation.

3. Make sure GenotypeGVCFs puts in the required headers.
Updated test files to make sure this is covered.

4. Have GenotypeGVCFs propagate up the MLEAC/AF (which were getting clobbered out).
Tests updated to account for this.
2014-02-12 10:15:12 -05:00
Eric Banks ff8c9575b0 Merge pull request #505 from broadinstitute/ks_build_na12878kb-utils
Ported na12878kb.jar assembly from Ant's build.xml to Maven
2014-02-12 10:10:52 -05:00
Khalid Shakir 403ce7bfb7 Ported na12878kb.jar assembly from Ant's build.xml to Maven at request of jsilter 2014-02-12 06:01:32 +08:00
droazen 3cfcfa4fa0 Merge pull request #504 from broadinstitute/dr_enable_kb_tests_in_maven
Add ability to run *KnowledgeBaseTests to maven
2014-02-11 15:32:24 -05:00
David Roazen 95e1402d21 Add ability to run *KnowledgeBaseTests to maven
Run with: mvn verify -Dsting.knowledgebasetests.skipped=false
2014-02-11 14:08:24 -05:00
droazen cc9a3b68fc Merge pull request #503 from broadinstitute/ks_maven_gatkdocs_patch
Patched PluginManager to ignore null classes
2014-02-11 13:28:24 -05:00
Eric Banks 96e29d3d94 Merge pull request #500 from broadinstitute/eb_make_qd_more_robust
Adding smarts to the QD annotation
2014-02-11 12:57:50 -05:00
Eric Banks 303a60c8c6 Adding smarts to the QD annotation:
when the AD annotation is present for a given genotype then we only use its depth for QD if the variant depth > 1.

Added new unit tests for QualByDepth.
2014-02-11 12:56:49 -05:00
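
The depth rule described in the commit above can be illustrated with a minimal sketch. This is not the actual QualByDepth code: the function name, the dict-based genotype representation, and the fallback to DP are assumptions made for illustration only.

```python
from typing import Iterable, Mapping, Optional


def qual_by_depth(qual: float, genotypes: Iterable[Mapping]) -> Optional[float]:
    """Hypothetical sketch of the QD depth rule, not the GATK implementation.

    Each genotype is a mapping with an optional 'AD' list (reference allele
    first) and an optional integer 'DP'.
    """
    depth = 0
    for genotype in genotypes:
        ad = genotype.get("AD")
        # Assumption: "variant depth" means the summed AD of the alt alleles.
        if ad is not None and sum(ad[1:]) > 1:
            # AD is present and the variant depth exceeds 1: use the AD depth.
            depth += sum(ad)
        else:
            # Otherwise fall back to the genotype's DP (assumed behaviour).
            depth += genotype.get("DP", 0)
    if depth == 0:
        return None  # QD is undefined without any depth
    return qual / depth
```

For example, a genotype with AD [10, 10] contributes an AD-based depth of 20, whereas one with AD [10, 1] falls back to its DP value.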
Khalid Shakir 1666bb7e3a Patched PluginManager to ignore null classes, which will allow gatkdocs to build successfully when running from the source root directory, due to its hardcoded paths. 2014-02-12 00:48:58 +08:00
Eric Banks b33d9d9105 Merge pull request #502 from broadinstitute/eb_make_combine_gvcfs_faster
Refactoring of CombineGVCFs to make it run a lot faster.
2014-02-11 09:03:16 -05:00
Eric Banks 2e36dd9001 Refactoring of CombineGVCFs to make it run a lot faster.
Creating new VariantContexts each time we broke up a block was very expensive because we break up
blocks so often.  Also, calling into GATKVariantContextUtils.simpleMerge was really hurting performance.

MD5 changes because we no longer propagate any INFO fields (except for END) for reference blocks; the tests
have the now unused BLOCK_SIZE field that now gets dropped.
2014-02-11 03:18:52 -05:00
Ryan Poplin b81494b704 Merge pull request #499 from broadinstitute/eb_fix_ad_updates
Fixed bug in generating AD values when new alleles are present for genot...
2014-02-09 17:55:00 -05:00
Eric Banks abb67cfa5e Fixed bug in generating AD values when new alleles are present for genotyping GVCFs.
This was a dumb mistake that wasn't well tested (but is now).
2014-02-09 15:15:19 -05:00
Eric Banks c2a2484a18 Merge pull request #498 from broadinstitute/ks_delete_pricard_private
Removed use of picard private.
2014-02-09 10:27:48 -05:00
Khalid Shakir 12bb6fd361 Removed use of picard private.
Updated picard-maven script to tag locally modified builds with -SNAPSHOT.
Removed old picard jars.
2014-02-09 17:08:52 +08:00
Eric Banks 597cc88f33 Merge pull request #497 from broadinstitute/eb_remove_ac0_alleles_PT65118652
Removing parameters that were incorrectly copied over from RegenotypeVariants
2014-02-08 23:46:23 -05:00
Eric Banks abef6cfcb6 Removing parameters that were incorrectly copied over from RegenotypeVariants. 2014-02-08 23:44:32 -05:00
Eric Banks e9189cd471 Merge pull request #496 from broadinstitute/eb_remove_test_for_blocksize
Removing the test for BLOCK_SIZE since we no longer emit it
2014-02-08 21:29:02 -05:00
Eric Banks 659a9f0e79 Removing the test for BLOCK_SIZE since we no longer emit it 2014-02-08 21:28:07 -05:00
Eric Banks a33d7ace11 Merge pull request #495 from broadinstitute/ks_increase_scala_memory
Made scala.maxmemory an argument, and defaulted it to 1g.
2014-02-08 21:13:26 -05:00
Eric Banks 4decb49ecb Merge pull request #494 from broadinstitute/vrr_reference_model_nocall_bugfix
Fixed nocall (./.) without PLs bug in GVCF output
2014-02-08 21:12:45 -05:00
Khalid Shakir 4e0f7521f2 Made scala.maxmemory an argument, and defaulted it to 1g. 2014-02-09 09:24:44 +08:00
Valentin Ruano-Rubio bf630abe88 Fixed nocall (./.) without PLs bug in GVCF output
Story:

https://www.pivotaltracker.com/story/show/65388246

Additional changes and notes:

1. The fix consists of forcing the output of all PLs by setting the standard flag for that, '-allSitePLs'.

2. BP_RESOLUTION was handled differently from GVCF in some aspects that should be common. That has been fixed.
2014-02-07 19:30:26 -05:00
Eric Banks 8c922be684 Merge pull request #491 from broadinstitute/eb_get_AD_back_from_gvcfs
Fixed up some of the genotype-level annotations being propagated in the ...
2014-02-07 12:50:01 -05:00
Eric Banks d689f61005 Fixed up some of the genotype-level annotations being propagated in the single sample HC pipeline.
1. AD values now propagate up (they weren't before).
2. MIN_DP gets transferred over to DP and removed.
3. SB gets removed after FS is calculated.

Also, added a bunch of new integration tests for GenotypeGVCFs.
2014-02-07 12:47:54 -05:00
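
Items 2 and 3 of the commit above amount to a small clean-up of the genotype-level attributes. The following is a rough sketch under stated assumptions: the function name and the plain-dict attribute representation are hypothetical, and FS is assumed to have been computed from SB before this step runs.

```python
def clean_genotype_attributes(attributes: dict) -> dict:
    """Hypothetical sketch of the genotype annotation clean-up, not GATK code."""
    cleaned = dict(attributes)  # work on a copy; leave the caller's dict untouched
    # MIN_DP gets transferred over to DP and removed.
    if "MIN_DP" in cleaned:
        cleaned["DP"] = cleaned.pop("MIN_DP")
    # SB is dropped; assumed to be no longer needed once FS has been calculated.
    cleaned.pop("SB", None)
    return cleaned
```

AD is simply left in place, which corresponds to item 1 (AD values propagating up).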
Eric Banks 0a1385a4d9 Merge pull request #493 from broadinstitute/eb_fix_failing_merge_for_tim
The UG engine can return a null VC if there are tons of alt alleles, cau...
2014-02-07 12:43:40 -05:00
Eric Banks 67ed0d2403 The UG engine can return a null VC if there are tons of alt alleles, causing Tim's merge jobs to fail.
Pushing the null check up so that it doesn't error out in such cases.
2014-02-07 12:41:20 -05:00
Eric Banks 335483bb53 Merge pull request #492 from broadinstitute/eb_fix_failing_unit_tests
Fixing failing unit tests
2014-02-07 12:25:10 -05:00
Eric Banks db68d3fa10 Fixing failing unit tests 2014-02-07 12:24:14 -05:00
Valentin Ruano Rubio 8e87e083ff Merge pull request #490 from broadinstitute/vrr_reference_model_unsorted_records_bugfix
Fixed out of order non-variant gVCF entries when trimming is active.
2014-02-07 12:21:12 -05:00
Valentin Ruano-Rubio 4a3c8e68fa Fixed out of order non-variant gVCF entries when trimming is active.
Story:

https://www.pivotaltracker.com/story/show/65319564
2014-02-07 11:03:26 -05:00
Eric Banks 369fbd4439 Merge pull request #489 from broadinstitute/eb_hierarchical_gvcf_merger
Eb hierarchical gvcf merger
2014-02-07 08:50:39 -05:00
Eric Banks eb463b505d Remove a whole bunch of unused annotations from gVCF output.
AC,AF,AN,FS,QD - they'll all be recomputed later.
BLOCK_SIZE and MIN_GQ were not necessary.

I also made the StrandBiasBySample annotation forced on when in gVCF mode.
It turns out that its output wasn't compatible with BCF so I patched it (and the variant jar too).
2014-02-07 08:49:36 -05:00
Eric Banks 2648219c42 Implementation of a hierarchical merger for gVCFs, called CombineGVCFs.
This tool will take any number of gVCFs and create a merged gVCF (as opposed to
GenotypeGVCFs which produces a standard VCF).

Added unit/integration tests and fixed up GATK docs.
2014-02-07 08:49:18 -05:00
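
The idea of hierarchical merging can be sketched generically: inputs are merged in small batches, and the intermediate results are merged again until one remains. This is only an illustration of the strategy, not how CombineGVCFs is implemented; `hierarchical_merge`, `merge`, and `batch_size` are hypothetical names, and the merge step itself is supplied by the caller.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def hierarchical_merge(
    inputs: Sequence[T],
    merge: Callable[[List[T]], T],
    batch_size: int = 2,
) -> T:
    """Merge inputs level by level until a single merged result remains."""
    if not inputs:
        raise ValueError("need at least one input")
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2")
    level = list(inputs)
    while len(level) > 1:
        next_level = []
        for start in range(0, len(level), batch_size):
            batch = level[start:start + batch_size]
            # A leftover single item is carried up to the next level unchanged.
            next_level.append(batch[0] if len(batch) == 1 else merge(batch))
        level = next_level
    return level[0]
```

With a stand-in merge step that concatenates lists of sample names, `hierarchical_merge([["a"], ["b"], ["c"]], lambda batch: sum(batch, []))` first combines the first two inputs and then folds in the third.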
Eric Banks 71b47a6148 Rename CombineReferenceCalculationVariants to GenotypeGVCFs 2014-02-06 15:46:19 -05:00
droazen b2b44c335f Merge pull request #488 from broadinstitute/ks_mvn_serial_test_updates
Ks mvn serial test updates
2014-02-06 15:12:08 -05:00
Khalid Shakir b21c35482e Packages link private/testdata, so that mvn test -Dsting.serialunittests.skipped=false works. 2014-02-06 08:25:38 -05:00
Khalid Shakir 3848159086 Added a set of serial tests to gatk/queue packages, which runs all tests under their package in one TestNG execution.
New properties to disable regenerating example resources artifact when each parallel test runs under packagetest.
Moved collection of packagetest parameters from shell scripts into maven profiles.
Fixed necessity of test-utils jar by removing incorrect dependenciesToScan element during packagetests.
When building picard libraries, run clean first.
Fixed tools jar dependency in picard pom.
Integration tests properly use the ant-bridge.sh test.debug.port variable, like unit tests.
2014-02-06 08:25:38 -05:00
Valentin Ruano Rubio 988e3b4890 Merge pull request #487 from broadinstitute/vrr_reference_model_with_trimming
Get gVCF to work without --dontTrimActiveRegions
2014-02-05 22:52:17 -05:00
Valentin Ruano-Rubio 98ffcf6833 Get gVCF to work without --dontTrimActiveRegions
Story:

https://www.pivotaltracker.com/story/show/65048706
https://www.pivotaltracker.com/story/show/65116908

Changes:

ActiveRegionTrimmer is now an argument collection and it returns not only the trimmed down active region but also the non-variant containing flanking regions.
HaplotypeCaller code has been simplified significantly, pushing some functionality to other classes like ActiveRegion and AssemblyResultSet.

Fixed a problem with the way the trimming was done causing some gVCF non-variant records to have conservative 0,0,0 PLs
2014-02-05 22:50:45 -05:00
Ryan Poplin 6a7a197362 Merge pull request #486 from broadinstitute/rp_fix_missing_annotations_CombineReferenceCalculationVariants
Bug fix for missing annotations in CombineReferenceCalculationVariants. ...
2014-02-05 14:22:59 -05:00
Ryan Poplin 693bfac341 Bug fix for missing annotations in CombineReferenceCalculationVariants. They were being dropped in the handoff between engines in a couple of places.
-- Updated single sample pipeline test data using Valentin's files and re-enabled CRCV tests
2014-02-05 12:58:48 -05:00
Eric Banks 8aa8acf81d Merge pull request #485 from broadinstitute/eb_more_combine_rc_variants_iterations
Eb more combine rc variants iterations
2014-02-05 11:32:30 -05:00
Eric Banks 740b33acbb We were never validating the sequence dictionary of tabix indexed VCFs for some reason. Fixed.
These changes happened in Tribble, but Joel clobbered them with his commit.
We can now change the logging priority on failures to validate the sequence dictionary to WARN.
Thanks to Tim F for indirectly pointing this out.
2014-02-05 10:12:38 -05:00
Eric Banks 9cac24d1e6 Moving logging status of VCF indexing to DEBUG instead of INFO, otherwise it's painful when reading in lots of files 2014-02-05 10:12:37 -05:00
Eric Banks 91bdf069d3 Some updates to CRCV.
1. Throw a user error when the input data for a given genotype does not contain PLs.
2. Add VCF header line for --dbsnp input
3. Need to check that the UG result is not null
4. Don't error out at positions with no gVCFs (which is possible when using a dbSNP rod)
2014-02-05 10:12:37 -05:00
droazen 22bcd10372 Merge pull request #484 from broadinstitute/jt_select_variants_nt_maven
Fix for the SelectVariants -nt race condition corruption of the AD and PL fields
2014-02-05 08:15:02 -05:00
Joel Thibault 7923e786e9 Rev Picard (public) to 1.107.1676
- Rename snappy to snappy-java
- Add maven-metadata-local.xml to .gitignore
2014-02-04 22:04:28 -05:00