Commit Graph

7796 Commits (01b16abc8dcc233e29258eaa4e1d1c0415180e91)

Author SHA1 Message Date
Laurent Francioli 01b16abc8d Genotype quality calculation modified to handle all genotypes the same way. This is inconsistent with GQ output by the UG but is correct even for cases of poor quality genotypes. 2011-10-24 10:24:41 +02:00
Laurent Francioli edea90786a Genotype quality is now recalculated for each of the phased Genotypes. A small problem is that we unnecessarily lose a little precision on the genotypes that do not change after assignment. 2011-10-20 17:04:19 +02:00
Laurent Francioli 1c61a57329 Original rewrite of PhaseByTransmission:
- Adapted to get the trio information from the SampleDB (i.e. from Pedigree file (ped)) => Multiple trios can be passed as argument
- Mendelian violations and trio phasing possibilities are pre-calculated and stored in Maps. => Runtime is ~3x faster
- Genotype combinations possible only given two MVs are now given a squared MV prior (e.g. 0/0+0/0=>1/1 is given 10^-16 prior if the MV prior is 10^-8)
- Corrected bug: In case the best genotype combination is Het/Het/Het, the genotypes are now set appropriately (previously, the original genotypes were left unchanged even if they weren't Het/Het/Het)
- Basic reporting added:
-- mvf argument lets the user specify a file to report remaining MVs
-- When the walker ends, some basic stats about the genotype reconfiguration and phasing are output

Known problems:
- GQ is not recalculated even if the genotype changes

Possible improvements:
- Phase partially typed trios
- Use standard Allele/Genotype Classes for the storage of the pre-calculated phase
2011-10-20 13:06:44 +02:00
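The squared-prior rule described in the PhaseByTransmission rewrite above can be sketched as follows. This is a minimal illustration, not the walker's code: the class `MvPriorSketch`, its method names, and the encoding of a genotype as a two-element array of allele indices are all hypothetical. Only the rule itself (one MV gets the MV prior, two MVs get its square) comes from the commit message; treating zero violations as a factor of 1.0 is an assumption.

```java
// Hypothetical sketch; class and method names are not from the GATK codebase.
final class MvPriorSketch {

    // Returns 0 if the parent carries the allele, 1 otherwise.
    static int mismatch(int allele, int[] parent) {
        return (parent[0] == allele || parent[1] == allele) ? 0 : 1;
    }

    // Minimum number of Mendelian violations needed to explain the child's
    // genotype, trying both ways of assigning its two alleles to the parents.
    static int countViolations(int[] mother, int[] father, int[] child) {
        int first = mismatch(child[0], mother) + mismatch(child[1], father);
        int second = mismatch(child[1], mother) + mismatch(child[0], father);
        return Math.min(first, second);
    }

    // One violation gets mvPrior, two get mvPrior squared.
    // Assumption: zero violations leaves the factor at 1.0.
    static double priorFactor(double mvPrior, int violations) {
        return Math.pow(mvPrior, violations);
    }

    public static void main(String[] args) {
        int[] homRef = {0, 0};
        int[] homVar = {1, 1};
        // 0/0 + 0/0 => 1/1 requires two violations.
        int mvs = countViolations(homRef, homRef, homVar);
        System.out.println(mvs + " violations, prior factor " + priorFactor(1e-8, mvs));
    }
}
```

With an MV prior of 10^-8, the 0/0+0/0=>1/1 combination ends up with a factor of about 10^-16, matching the example in the commit message.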
Laurent Francioli ef6a6fdfe4 Added getAsMap -> returns the likelihoods as an EnumMap with Genotypes as keys and likelihoods as values. 2011-10-20 12:49:18 +02:00
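A method of the kind described in the commit above might look roughly like this. It is a sketch under stated assumptions: the enum `GenotypeType`, the class `LikelihoodsSketch`, and the convention that the likelihood array is indexed in enum order are hypothetical and not taken from the GATK source.

```java
import java.util.EnumMap;

// Hypothetical sketch; the real getAsMap signature and enum may differ.
final class LikelihoodsSketch {

    enum GenotypeType { HOM_REF, HET, HOM_VAR }

    // Assumption: likelihoods[i] belongs to the genotype with ordinal i.
    static EnumMap<GenotypeType, Double> getAsMap(double[] likelihoods) {
        GenotypeType[] types = GenotypeType.values();
        if (likelihoods.length != types.length) {
            throw new IllegalArgumentException("expected " + types.length + " likelihoods");
        }
        EnumMap<GenotypeType, Double> map = new EnumMap<>(GenotypeType.class);
        for (GenotypeType type : types) {
            map.put(type, likelihoods[type.ordinal()]);
        }
        return map;
    }
}
```

An `EnumMap` iterates in enum declaration order, so callers see the genotypes in a stable order.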
Laurent Francioli 76dd816e70 Added getParents() -> returns an arrayList containing the sample's parent(s) if available 2011-10-20 12:47:27 +02:00
David Roazen aad99563be Merged bug fix from Stable into Unstable 2011-10-14 03:26:35 -04:00
David Roazen 442d33ba18 Enable testing of the jars produced by the packaging system.
-Added targets to run unit and integration tests on the fully-packaged GATK jar,
and pipeline tests on the fully-packaged Queue jar. Once enabled in Bamboo,
these will provide greatly-enhanced protection against breakage in the binary
release.

-Unconditionally include all of the subset of org.broadinstitute.sting
included in the intermediate jars GenomeAnalysisTK.jar, StingUtils.jar,
etc. in the final, fully-packaged jar. This:
    * is necessary to get tests to run on the fully-packaged jar
    * decreases the chances of a class that is a runtime-only
      dependency getting left out of the binary release
    * only slightly increases the size of the binary release
      (before: 9352465 bytes, after: 10985482 bytes)
2011-10-14 03:08:28 -04:00
David Roazen 4f01a742cb Merged bug fix from Stable into Unstable 2011-10-13 21:39:52 -04:00
David Roazen edfd6f8a06 Removing a public -> private dependency from the test suite.
The public integration test VariantContextIntegrationTest was dependent on the
private walker TestVariantContextWalker. Moved this walker to public/java/test
(NOT public/java/src, since this walker is only used by the test suite) to avoid
errors during public-only tests.
2011-10-13 21:32:52 -04:00
Mark DePristo 404ef741f1 Merged bug fix from Stable into Unstable 2011-10-13 18:02:06 -04:00
Mark DePristo 2ebdff074c Update MD5s for SOLiD recalibration
-- MD5 db had spelling error; fixed
-- Bug in AlignmentUtils resulted in some bases not being color space corrected. The integration test caught the change, and it's clear that the new version is correct, as the prev. version was not considering the last N qualities for reads with an ND operation.
2011-10-13 18:01:51 -04:00
Mark DePristo 5a881360df Merged bug fix from Stable into Unstable 2011-10-13 15:54:43 -04:00
Mark DePristo 7cab6f6bb0 Bug fixes for thread unsafe simple timer and bad Ns treatment in AlignmentUtils
-- SimpleTimer is now threadsafe using synchronized method keywords
-- Bug fix for alignmentToByteArray() where the N case was refPos++ not the now correct refPos += elementLength
2011-10-13 15:53:12 -04:00
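The refPos fix above comes down to advancing the reference position by the full length of an N element instead of by one. The sketch below shows that idea on a generic CIGAR walk. It is not the body of alignmentToByteArray(): the class `CigarWalkSketch`, the method `referenceSpan`, and the parallel-array representation of the CIGAR are hypothetical, chosen only to keep the example self-contained.

```java
// Hypothetical sketch of a CIGAR walk; not the AlignmentUtils implementation.
final class CigarWalkSketch {

    // Returns how many reference bases the alignment covers.
    static int referenceSpan(char[] ops, int[] lengths) {
        int refPos = 0;
        for (int i = 0; i < ops.length; i++) {
            int elementLength = lengths[i];
            switch (ops[i]) {
                case 'M': case '=': case 'X': case 'D': case 'N':
                    // These operators consume reference bases, so the whole
                    // element length is added (refPos++ would undercount).
                    refPos += elementLength;
                    break;
                case 'I': case 'S': case 'H': case 'P':
                    // These operators do not consume reference bases.
                    break;
                default:
                    throw new IllegalArgumentException("unknown CIGAR operator: " + ops[i]);
            }
        }
        return refPos;
    }
}
```

For a read with CIGAR 10M100N5M, this walk reports a span of 115 reference bases; incrementing by one on the N element would give 16.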
Eric Banks 9aecd50473 Adding ability to exclude annotations from the VA and UG lists. As described in the docs, this argument trumps all others (including -all) so that we can get around the SnpEff issue brought up by Menachem. Added integration test for it. 2011-10-12 15:44:54 -04:00
Mauricio Carneiro e53a952aeb Added ION Torrent support to CountCovariates. 2011-10-12 01:57:02 -04:00
Mauricio Carneiro a2733a451f Added NotCalled feature to GAV
Added "not called" and "no status" to the truth table. Very useful.
2011-10-11 19:31:45 -04:00
David Roazen ae83420637 Merged bug fix from Stable into Unstable 2011-10-11 12:26:08 -04:00
David Roazen 794f275871 SnpEff is now marked as a RodRequiringAnnotation instead of an ExperimentalAnnotation.
Having SnpEff grouped with the Experimental annotations was proving problematic, since it
requires a rod. Placing it in its own group should improve the situation somewhat, making it
easier to request "all annotations except for SnpEff".
2011-10-11 12:08:56 -04:00
David Roazen cfd0ac8410 Merged bug fix from Stable into Unstable
Conflicts:
	public/java/test/org/broadinstitute/sting/gatk/walkers/genotyper/UnifiedGenotyperIntegrationTest.java
2011-10-11 12:03:51 -04:00
David Roazen 24b72334b3 UnifiedGenotyper now correctly initializes the VariantAnnotator engine.
This allows the annotation classes to perform any necessary initialization/validation.
For example, it allows the SnpEff annotator to (among other things) validate its rod binding.
This will prevent a NullPointerException when SnpEff annotation is requested but no rod binding
is present.

Added an integration test to cover this case so that it doesn't break again.
2011-10-11 12:02:05 -04:00
Guillermo del Angel 0429b38021 Merged bug fix from Stable into Unstable 2011-10-11 11:19:38 -04:00
Guillermo del Angel 1c485d8b5e Forgot that no matter how trivial a change it's a good idea to compile first 2011-10-11 11:18:41 -04:00
Guillermo del Angel 6418f4d69b Merged bug fix from Stable into Unstable 2011-10-11 11:13:18 -04:00
Guillermo del Angel 1975de1b32 Second try: hide --do_indel_quality in AnalyzeCovariates 2011-10-11 11:11:29 -04:00
Guillermo del Angel 6506ea83e8 Revert "Hide --do_indel_quality argument in AnalyzeCovariates. This shouldn't be documented nor used by external users"... a hidden passenger change made it through.
This reverts commit 70e10ccb1be90dcff8f4485ae6ee036db2d1ac86.
2011-10-11 11:03:12 -04:00
Guillermo del Angel 4c1d8c8d44 Hide --do_indel_quality argument in AnalyzeCovariates. This shouldn't be documented nor used by external users 2011-10-11 11:01:06 -04:00
Eric Banks cffc959e58 Moving to archive instead, since no one owns it 2011-10-10 23:55:57 -04:00
Eric Banks 77c983c5b5 No one claimed this walker and it doesn't have integration tests or GATKdocs so it doesn't belong in public. 2011-10-10 15:17:54 -04:00
Mark DePristo fb72bcf732 DiffObjects no longer prints out the file name in the status so MD5 are stable 2011-10-10 15:10:57 -04:00
Mark DePristo ac41b303fd Merge branch 'vcfAlleles' 2011-10-10 13:23:52 -04:00
Mark DePristo 0c0a619b08 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-10-10 13:23:40 -04:00
Mark DePristo b1124b2b61 Trivial Qscript to evaluate the impact of max deletion fraction in UG 2011-10-10 13:22:59 -04:00
Mark DePristo e3ff4f4266 Failing MD5 because output now contains absolute path 2011-10-10 11:05:02 -04:00
Mark DePristo 3e6c16d961 CombineVariants preserves allele order 2011-10-10 11:04:38 -04:00
Mark DePristo a4bb842958 RankSum tests have slightly different MD5 results based on allele order
-- UG GENOTYPE_GIVEN_ALLELES now uses the order of alleles in the VCF, so this changes the MD5
2011-10-10 11:04:07 -04:00
Mark DePristo 46e7370128 this.allele, getAlleles(), and getAltAlleles() now return List, not Set
-- Changes associated code throughout the codebase
-- Updated necessary (but minimal) UnitTests to reflect new behavior
-- Much better makealleles() function in VC.java that enforces a lot of key constraints in VC
2011-10-09 11:45:55 -07:00
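The practical difference behind the move from Set to List above is that a List keeps and compares element order, while a Set does not. The following sketch shows this with plain strings standing in for alleles. The class `AlleleOrderSketch` and its methods are hypothetical and do not use the VariantContext API.

```java
import java.util.HashSet;
import java.util.List;

// Hypothetical sketch; strings stand in for Allele objects.
final class AlleleOrderSketch {

    // List equality is order-sensitive: [A, C, T] differs from [A, T, C].
    static boolean sameAsLists(List<String> first, List<String> second) {
        return first.equals(second);
    }

    // Set equality ignores order, so the same two allele lists compare equal.
    static boolean sameAsSets(List<String> first, List<String> second) {
        return new HashSet<>(first).equals(new HashSet<>(second));
    }
}
```

Comparing A, C, T against A, T, C returns false from `sameAsLists` and true from `sameAsSets`, which is why a Set cannot carry the allele order that the commits in this series set out to preserve.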
Mark DePristo 822654b119 UnitTests for allele getting functions in VC in prep for move from set to list 2011-10-09 10:36:14 -07:00
Matt Hanna e19dba86a6 Merged bug fix from Stable into Unstable 2011-10-08 22:37:12 -04:00
Matt Hanna 0016c707a3 Codecs other than VCF accidentally bled into vcf.jar. Unfortunately, vcf.jar
turned out to be the *only* home for these other codecs.  This change pushes
them back into StingUtils.jar.
2011-10-08 22:24:35 -04:00
Mark DePristo c67f6c076b simpleMerge now preserves allele order
-- UnitTests for dangerous PL merging cases in the multi-allelic case.  The new behavior is correct
2011-10-08 17:39:53 -07:00
Mark DePristo e94e6ba101 A UnitTest to ensure that the order of alleles is maintained
-> A, C, T and A, T, C are different and must be maintained.  The constructors were doing this appropriately, so nothing needed to be changed
2011-10-08 08:47:58 -07:00
Mark DePristo ff3dccd062 Fixing errors in queueJobReport runtime unit 2011-10-07 12:04:53 -07:00
Mark DePristo ec14a4a606 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-10-07 08:38:50 -07:00
Matt Hanna 6fbd41724a Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-10-07 11:20:00 -04:00
Matt Hanna 4514bc350f More reliable way of finding the Tribble jar. 2011-10-07 11:19:29 -04:00
Eric Banks 181c76750e Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-10-06 22:38:55 -04:00
Eric Banks ca9cd9b688 Minor fix for merging intervals which hadn't been necessary when only merging from the left to right. Added integration tests to cover the parallelization of RTC. 2011-10-06 22:38:44 -04:00
Khalid Shakir f91b015e0e Made the BaseTest.testDir absolute 2011-10-06 22:33:21 -04:00
Mark DePristo c7864c7256 Filter application order is now deterministic, in the order defined by the walker
-- For no apparent reason we were using a HashSet to store the ReadFilters, so the filters were applied in an arbitrary order. The order now is:

(1) the order of the walker intrinsic filters
(2) read group black list (if provided)
(3) command line filters (if provided)
2011-10-06 18:51:40 -07:00
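The three-step ordering above can be sketched with an insertion-ordered list in place of a HashSet. This is an illustration only: the class `FilterOrderSketch`, the method `orderedFilters`, and the use of filter names as strings are hypothetical and do not reflect the engine's actual types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; filters are represented by their names only.
final class FilterOrderSketch {

    // Builds the filter list in a fixed order:
    // (1) walker intrinsic filters, (2) read group black list if provided,
    // (3) command line filters if provided.
    static List<String> orderedFilters(List<String> walkerFilters,
                                       String readGroupBlackList,
                                       List<String> commandLineFilters) {
        List<String> ordered = new ArrayList<>(walkerFilters);
        if (readGroupBlackList != null) {
            ordered.add(readGroupBlackList);
        }
        if (commandLineFilters != null) {
            ordered.addAll(commandLineFilters);
        }
        return ordered;
    }
}
```

An ArrayList iterates in insertion order, so the filters run in the same sequence on every invocation, whereas HashSet iteration order depends on hash codes.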
Mark DePristo 0b88af4af9 Counts of records failing filters are displayed sorted
-- Stops random ordering of the output, as the counts are returned sorted by string name of the class
-- Deleted now-unused sh*tty accessors in Utils
2011-10-06 18:42:26 -07:00
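Sorting the per-filter counts by class name, as the commit above describes, can be sketched with a TreeMap. The class `FilterCountSketch`, the method `sortedByName`, and the filter names used in `main` are hypothetical and serve only as an illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sketch; keys are filter class names, values are failure counts.
final class FilterCountSketch {

    // Copies the counts into a TreeMap, which iterates in the natural
    // (case-sensitive, lexicographic) order of its String keys.
    static SortedMap<String, Long> sortedByName(Map<String, Long> counts) {
        return new TreeMap<>(counts);
    }

    public static void main(String[] args) {
        Map<String, Long> counts = new HashMap<>();
        counts.put("UnmappedRead", 12L);
        counts.put("DuplicateRead", 340L);
        counts.put("MalformedRead", 1L);
        for (Map.Entry<String, Long> entry : sortedByName(counts).entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

Because the output order no longer depends on hash iteration, repeated runs print the counts in the same sequence.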