Commit Graph

13003 Commits (4a3c8e68fa866bd4d5878e27f463d940fa20a6ea)

Author SHA1 Message Date
Valentin Ruano-Rubio 4a3c8e68fa Fixed out of order non-variant gVCF entries when trimming is active.
Story:

https://www.pivotaltracker.com/story/show/65319564
2014-02-07 11:03:26 -05:00
Eric Banks 369fbd4439 Merge pull request #489 from broadinstitute/eb_hierarchical_gvcf_merger
Eb hierarchical gvcf merger
2014-02-07 08:50:39 -05:00
Eric Banks eb463b505d Remove a whole bunch of unused annotations from gVCF output.
AC,AF,AN,FS,QD - they'll all be recomputed later.
BLOCK_SIZE and MIN_GQ were not necessary.

I also made the StrandBiasBySample annotation forced on when in gVCF mode.
It turns out that its output wasn't compatible with BCF so I patched it (and the variant jar too).
2014-02-07 08:49:36 -05:00
Eric Banks 2648219c42 Implementation of a hierarchical merger for gVCFs, called CombineGVCFs.
This tool will take any number of gVCFs and create a merged gVCF (as opposed to
GenotypeGVCFs which produces a standard VCF).

Added unit/integration tests and fixed up GATK docs.
2014-02-07 08:49:18 -05:00
Eric Banks 71b47a6148 Rename CombineReferenceCalculationVariants to GenotypeGVCFs 2014-02-06 15:46:19 -05:00
droazen b2b44c335f Merge pull request #488 from broadinstitute/ks_mvn_serial_test_updates
Ks mvn serial test updates
2014-02-06 15:12:08 -05:00
Khalid Shakir b21c35482e Packages link private/testdata, so that mvn test -Dsting.serialunittests.skipped=false works. 2014-02-06 08:25:38 -05:00
Khalid Shakir 3848159086 Added a set of serial tests to gatk/queue packages, which runs all tests under their package in one TestNG execution.
New properties to disable regenerating example resources artifact when each parallel test runs under packagetest.
Moved collection of packagetest parameters from shell scripts into maven profiles.
Fixed necessity of test-utils jar by removing incorrect dependenciesToScan element during packagetests.
When building picard libraries, run clean first.
Fixed tools jar dependency in picard pom.
Integration tests properly use the ant-bridge.sh test.debug.port variable, like unit tests.
2014-02-06 08:25:38 -05:00
Valentin Ruano Rubio 988e3b4890 Merge pull request #487 from broadinstitute/vrr_reference_model_with_trimming
Get gVCF to work without --dontTrimActiveRegions
2014-02-05 22:52:17 -05:00
Valentin Ruano-Rubio 98ffcf6833 Get gVCF to work without --dontTrimActiveRegions
Story:

https://www.pivotaltracker.com/story/show/65048706
https://www.pivotaltracker.com/story/show/65116908

Changes:

ActiveRegionTrimmer in now an argument collection and it returns not only the trimmed down active region but also the non-variant containing flanking regions
HaplotypeCaller code has been simplified significantly pushing some functionality two other classes like ActiveRegion and AssemblyResultSet.

Fixed a problem with the way the trimming was done causing some gVCF non-variant records no have conservative 0,0,0 PLs
2014-02-05 22:50:45 -05:00
Ryan Poplin 6a7a197362 Merge pull request #486 from broadinstitute/rp_fix_missing_annotations_CombineReferenceCalculationVariants
Bug fix for missing annotations in CombineReferenceCalculationVariants. ...
2014-02-05 14:22:59 -05:00
Ryan Poplin 693bfac341 Bug fix for missing annotations in CombineReferenceCalculationVariants. They were being dropped in the handoff between engines in a couple of places.
-- Updated single sample pipeline test data using Valentin's files and re-enabled CRCV tests
2014-02-05 12:58:48 -05:00
Eric Banks 8aa8acf81d Merge pull request #485 from broadinstitute/eb_more_combine_rc_variants_iterations
Eb more combine rc variants iterations
2014-02-05 11:32:30 -05:00
Eric Banks 740b33acbb We were never validating the sequence dictionary of tabix indexed VCFs for some reason. Fixed.
These changes happened in Tribble, but Joel clobbered them with his commit.
We can now change the logging priority on failures to validate the sequence dictionary to WARN.
Thanks to Tim F for indirectly pointing this out.
2014-02-05 10:12:38 -05:00
Eric Banks 9cac24d1e6 Moving logging status of VCF indexing to DEBUG instead of INFO, otherwise it's painful when reading in lots of files 2014-02-05 10:12:37 -05:00
Eric Banks 91bdf069d3 Some updates to CRCV.
1. Throw a user error when the input data for a given genotype does not contain PLs.
2. Add VCF header line for --dbsnp input
3. Need to check that the UG result is not null
4. Don't error out at positions with no gVCFs (which is possible when using a dbSNP rod)
2014-02-05 10:12:37 -05:00
droazen 22bcd10372 Merge pull request #484 from broadinstitute/jt_select_variants_nt_maven
Fix for the SelectVariants -nt race condition corruption of the AD and PL fields
2014-02-05 08:15:02 -05:00
Joel Thibault 7923e786e9 Rev Picard (public) to 1.107.1676
- Rename snappy to snappy-java
- Add maven-metadata-local.xml to .gitignore
2014-02-04 22:04:28 -05:00
Joel Thibault 0025fe190d Exclude sam's older TestNG 2014-02-04 22:04:27 -05:00
Joel Thibault 9eaee8c73c Integration test for the -nt race condition corrupting AD and PL fields 2014-02-04 22:04:27 -05:00
David Roazen 1de7a27471 Disable an additional test that is runtime dependent on one of the temporarily-disabled tests 2014-02-04 16:07:58 -05:00
David Roazen 76086f30b7 Temporarily disable tests that started failing post-maven
Joel is working on these failures in a separate branch. Since
maven (currently! we're working on this..) won't run the whole
test suite to completion if there's a failure early on, we need
to temporarily disable these tests in order to allow group members
to run tests on their branches again.
2014-02-04 15:31:24 -05:00
David Roazen 3b2f07990d Re-break the MWUnitTest for Joel to debug 2014-02-04 15:19:09 -05:00
David Roazen c9032f0b5c Fix failing unit tests 2014-02-04 03:05:30 -05:00
droazen 4eaa724be6 Merge pull request #483 from broadinstitute/ks_new_maven_build_system
New maven build system
2014-02-03 10:53:18 -08:00
David Roazen 60567c8d7e Minor ant-bridge.sh changes
-add "gatk" target to mimic old "ant gatk" target
-comment out release targets to prevent accidental releases
2014-02-03 13:50:47 -05:00
Khalid Shakir a4289711e2 Distinct failsafe summary reports, just like invoker report directories. 2014-02-03 13:50:47 -05:00
Khalid Shakir 5ab0b117d2 Re-split the invoked integration tests and verify into separate phases. 2014-02-03 13:50:47 -05:00
Khalid Shakir b10b42c10a Added a "dry" argument to ant-bridge.sh, that just prints the command that would run. 2014-02-03 13:50:47 -05:00
Khalid Shakir 857e6e0d6f Bumped version to 2.8-SNAPSHOT, using new update_pom_versions.sh script. 2014-02-03 13:50:46 -05:00
Khalid Shakir 20b471ef7b ant-bridge dist -> verify, test.compile -> test-compile
Added a utility script for running a single test, usually in parallel.
2014-02-03 13:50:46 -05:00
Khalid Shakir 9ca3004fc3 Setting the test-utils' type to test-jar, such that the multi-module build uses testClasses instead of classes as a directory dependency. 2014-02-03 13:50:46 -05:00
Khalid Shakir f968b8a58b Crash when integration tests fail by running install and verify at the same time, instead of trying to do them separately. 2014-02-03 13:50:46 -05:00
Khalid Shakir de13f41fc3 One step closer to a proper test-utils artifact. Using the maven-jar-plugin to create a test classifer, excluding actual tests, until we can properly separate the classes into separate artifacts/modules. 2014-02-03 13:50:46 -05:00
Khalid Shakir 25aee7164e Fixed missing "mvn" command execution in ant-bridge.
Added pom.xml workarounds for duplicate classpath error, due to gatk-framework dependency containing required BaseTest, and jarred *UnitTest/*IntegrationTest classes that also exist as files under target/test-classes.
2014-02-03 13:50:46 -05:00
Khalid Shakir caa76cdac4 Added maven pom.xmls for various artifacts. 2014-02-03 13:50:46 -05:00
Khalid Shakir d1a689af33 Added new utility files used by maven build, including the ant-bridge script. 2014-02-03 13:50:46 -05:00
Khalid Shakir 88150e0166 Switched commited dependency repository from ivy to maven. 2014-02-03 13:50:46 -05:00
Khalid Shakir 1e25a758f5 Moved files to maven directories.
Here are the git moved directories in case other files need to be moved during a merge:
  git-mv private/java/src/        private/gatk-private/src/main/java/
  git-mv private/R/scripts/       private/gatk-private/src/main/resources/
  git-mv private/java/test/       private/gatk-private/src/test/java/
  git-mv private/testdata/        private/gatk-private/src/test/resources/
  git-mv private/scala/qscript/   private/queue-private/src/main/qscripts/
  git-mv private/scala/src/       private/queue-private/src/main/scala/
  git-mv protected/java/src/      protected/gatk-protected/src/main/java/
  git-mv protected/java/test/     protected/gatk-protected/src/test/java/
  git-mv public/java/src/         public/gatk-framework/src/main/java/
  git-mv public/java/test/        public/gatk-framework/src/test/java/
  git-mv public/testdata/         public/gatk-framework/src/test/resources/
  git-mv public/scala/qscript/    public/queue-framework/src/main/qscripts/
  git-mv public/scala/src/        public/queue-framework/src/main/scala/
  git-mv public/scala/test/       public/queue-framework/src/test/scala/
2014-02-03 13:50:44 -05:00
Khalid Shakir faaef236ea Moved gsalib, R and other resources, Queue GATK extensions generator, Queue version java files. 2014-02-03 13:49:21 -05:00
Khalid Shakir eb52dc6a9b Moved build.xml, ivy.xml, ivysettings.xml, ivy properties, public/packages/*.xml into private/archive/ant 2014-02-03 13:49:20 -05:00
Eric Banks 83d07280ef Merge pull request #482 from broadinstitute/vrr_reference_model_alt_allele
gVCF <NON_REF> in all vcf lines including variant ones when –ERC gVCF is...
2014-01-30 08:25:43 -08:00
Valentin Ruano-Rubio 89c4e57478 gVCF <NON_REF> in all vcf lines including variant ones when –ERC gVCF is requested.
Changes:
-------

  <NON_REF> likelihood in variant sites is calculated as the maximum possible likelihood for an unseen alternative allele: for reach read is calculated as the second best likelihood amongst the reported alleles.

  When –ERC gVCF, stand_conf_emit and stand_conf_call are forcefully set to 0. Also dontGenotype is set to false for consistency sake.

  Integration test MD5 have been changed accordingly.

Additional fix:
--------------

  Specially after adding the <NON_REF> allele, but also happened without that, QUAL values tend to go to 0 (very large integer number in log 10) due to underflow when combining GLs (GenotypingEngine.combineGLs). To fix that combineGLs has been substituted by combineGLsPrecise that uses the log-sum-exp trick.

  In just a few cases this change results in genotype changes in integration tests but after double-checking using unit-test and difference between combineGLs and combineGLsPrecise in the affected integration test, the previous GT calls were either border-line cases and or due to the underflow.
2014-01-30 11:23:33 -05:00
Valentin Ruano Rubio 383a4f4a70 Merge pull request #481 from broadinstitute/vrr_pairhmm_log_probability_fix
Fix for the PairHMM transition probability miscalculation.
2014-01-27 10:59:08 -08:00
Valentin Ruano-Rubio 748d2fdf92 Added Integration test to verify the bugs are not there anymore as reported in pivotracker 2014-01-26 23:29:31 -05:00
Valentin Ruano-Rubio 9e7bf75e89 Fix for the PairHMM transition probability miscalculation.
Problem:

matchToMatch transition calculation was wrong resulting in transition probabilites coming out of the Match state that added more than 1.

Reports:

https://www.pivotaltracker.com/s/projects/793457/stories/62471780
https://www.pivotaltracker.com/s/projects/793457/stories/61082450

Changes:

The transition matrix update code has been moved to a common place in PairHMMModel to dry out its multiple copies.

MatchToMatch transtion calculation has been fixed and implemented in PairHMMModel.

Affected integration test md5 have been updated, there were no differences in GT fields and example differences always implied
small changes in likelihoods that is what is expected.
2014-01-26 16:30:36 -05:00
Ryan Poplin bdd06ebfc2 Merge pull request #478 from broadinstitute/eb_generalize_hc_values_as_args
Pulled out some hard-coded values from the read-threading and isActive c...
2014-01-21 09:01:54 -08:00
Eric Banks 8812278c2c Merge pull request #479 from broadinstitute/eb_move_test_up_one_level
Moving this test up one level to where it actually belongs.
2014-01-21 06:45:55 -08:00
Eric Banks 9e858270d7 Moving this test up one level to where it actually belongs. 2014-01-19 02:33:11 -05:00
Eric Banks 64d5bf650e Pulled out some hard-coded values from the read-threading and isActive code of the HC, and made them into a single argument.
In unifying the arguments it was clear that the values were inconsistent throughout the code, so now there's a
single value that is intended to be more liberal in what it allows in (in an attempt to increase sensitivity).

Very little code actually changes here, but just about every md5 in the HC integration tests are different (as
expected).  Added another integration test for the new argument.

To be used by David R to test his per-branch QC framework: does this commit make the HC look better against the KB?
2014-01-19 01:15:13 -05:00