Commit Graph

223 Commits (85626e7a5dbae8a263b8e2ff2e64bd25656d6e9c)

Author SHA1 Message Date
Eric Banks 639a01f382 Updating integration test now that VE has been updated 2011-08-12 07:15:08 -04:00
Eric Banks 41f3da75d7 Implementation in VE was confusing 'variant' status vs. 'polymorphic' status. This led to issues because we now match types of eval and comp; specifically, subsetting a VC to a monomorphic sample can't change the 'variant' status of the VC (it's still a variant site or otherwise we'll never match the comps, which breaks GenotypeConcordance). CountVariants really got this wrong. Fixed. VE now passes all integration tests. 2011-08-12 02:22:44 -04:00
Eric Banks 45f973ab1f Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-12 00:40:18 -04:00
Eric Banks eba316621d Finish moving VE over to new rod system and fixing up the type inconsistency between eval and comp rods. Now the novel count is always 0 under the known stratification. :) 2011-08-12 00:40:08 -04:00
Menachem Fromer 9de06560df Update to new RodBinding system 2011-08-11 17:54:16 -04:00
Ryan Poplin f1d1252be2 Fixing syntax of BQSR and UG performance tests. 2011-08-11 17:04:09 -04:00
Ryan Poplin 902eb0c61e Adding dbsnp annotation back into the UG integration tests 2011-08-11 13:55:03 -04:00
Ryan Poplin c7b9a9ef0a Updating UnifiedGenotyper to use the new rod binding system. 2011-08-11 11:02:11 -04:00
Ryan Poplin 79c86e211f Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-11 09:59:20 -04:00
Ryan Poplin ea42ee4a95 Updating BQSR for the new rod binding system. 2011-08-11 09:58:42 -04:00
Mark DePristo 8cdc0cbd9c Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-11 08:58:49 -04:00
Mark DePristo 40e06f9afb Fixed broken RodBinding defaults.
-- Verified now to be correct at runtime
-- UnitTest covers this
-- createTypeDefault now takes a Type, not a Class, so that parameterized classes can have their parameter fetched in the defaults.
2011-08-11 08:58:30 -04:00
Eric Banks bdb1da30fd Better interface for getting RodBindings to the VariantAnnotatorEngine and its annotations: pass around an AnnotatorCompatibleWalker (interface) object. Updating VA to use the new rod system. 2011-08-10 22:43:08 -04:00
Eric Banks 07ad8c78a9 More tools moved over. Fixed the VariantContextIntegrationTest which was not useful because the md5s were all removed. In the future, instead of removing md5s (putting it in 'parameterization' mode), you should instead use @Test{enabled=false} since it's easier to track. 2011-08-10 14:24:40 -04:00
Eric Banks 8d14d32a62 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-10 13:42:37 -04:00
Eric Banks 749c8bfbcd Moving more tools over to the new rod system 2011-08-10 13:42:35 -04:00
David Roazen 0497170bc9 SnpEffCodec now implements SelfScopingFeatureCodec so that we no longer have to specify the codec name on the command line for SnpEff files. 2011-08-10 13:12:09 -04:00
Eric Banks a42f90db11 Moving more tools over to use the standard VC arg collection. Also, while I'm in there, I removed all of the empty references to @Requires given that it's no longer relevant. 2011-08-10 12:20:18 -04:00
Ryan Poplin c60cf52f73 Updating VQSR for new RodBinding syntax. Cleaning up indel specific parts of VQSR. 2011-08-10 10:20:37 -04:00
Eric Banks 1ea5ec276b Minor cleanup 2011-08-09 23:28:59 -04:00
Eric Banks bc2d4f554d Bringing Indel Realigner up to speed with the new rod binding syntax; now use -known to specify the known indels track. 2011-08-09 23:21:17 -04:00
Eric Banks 489e5cffc1 Missed a few 'variants' 2011-08-09 14:29:15 -04:00
Eric Banks b20c4d5286 Thanks to Mark for agreeing to transition from 'variants' back to 'variant'. I think I got them all but I've been jumping all around the code, so there might be a straggler or two. 2011-08-09 12:04:55 -04:00
Eric Banks 7afb5c9f1c More updates to be consistent with the new rod syntax. 2011-08-09 10:11:37 -04:00
Eric Banks 1e490e0dec Bringing up to speed with new syntax 2011-08-09 09:26:06 -04:00
Eric Banks 70b3daf689 VariantsToVCF is up and running again; integration tests are reenabled (and added one for dbSNP).ant 2011-08-09 03:03:43 -04:00
David Roazen 2efa376619 Made the necessary changes to get SnpEff support working with the new rodbinding system. 2011-08-08 23:29:39 -04:00
David Roazen b180a1311a Merge branch 'snpEff' 2011-08-08 22:12:14 -04:00
David Roazen 28d8c8fcbc Modified the SnpEff integration test to run on a much smaller interval. 2011-08-08 21:51:16 -04:00
David Roazen a13bc7b929 Added an integration test for the SnpEff annotation support, as well as some extra safety checks and comments. 2011-08-08 20:01:24 -04:00
Mark DePristo 80924d24de Single positional arguments are now treated as names unless they actually match a tribble feature 2011-08-08 19:26:27 -04:00
Mark DePristo f8a56bc64b Merge branch 'master' into rodRefactor 2011-08-08 16:58:18 -04:00
Mark DePristo f8ad91b16f Reverting a bunch of bad -B type drops 2011-08-08 16:57:38 -04:00
David Roazen 5e288136e0 Added unit tests for the SnpEff codec, and made minor adjustments to the codec itself. 2011-08-08 16:51:43 -04:00
Eric Banks d7813db217 Combine Variants was actually outputting invalid VCFs in cases where it was combining Variant Contexts with different alternate alleles: if any of the genotypes had PLs they were no longer valid/correct. Added a check for such cases (the combined VC has more alleles than an original VC) and strip out the PLs when triggered; added integration test to cover it. I also added the check to Select Variants, although it currently doesn't remove unused alleles so it should never trigger. Is there any reason not to strip out unused alleles after a select? 2011-08-08 16:25:35 -04:00
Mark DePristo 4f8fc0f2f1 VCF3 now dynamically determined 2011-08-08 15:05:47 -04:00
Mark DePristo ba7353c561 Updated IntegrationTests to use the new type free format for VCF files 2011-08-08 15:04:38 -04:00
Mark DePristo 0810c42309 GATK now does dynamic type determination for VCF files
Added UnitTests covering all of the cases.
2011-08-08 14:45:46 -04:00
Mark DePristo e36994e36b Refactored a FeatureManager class from RMDTrackBuilder
New class handles (vastly more cleanly) the db of tribble codecs, features, and names for use throughout the GATK.
Added SelfScopingFeatureCodec interface that allows a FeatureCodec to examine a file and determine if the file can be parsed.  This is the first step towards allowing the GATK to dynamically determine the type of a RodBinding.
2011-08-08 14:04:46 -04:00
Mark DePristo e5fde0d16b Merge branch 'master' into rodRefactor 2011-08-08 10:08:43 -04:00
Mark DePristo 526b524c3c CombineVariants with new RodBinding. Bugfix
-- CombineVariants now uses the new RodBinding syntax, -V / --variants.  Passed all integration tests on first run
-- Exposed gapping bug in the List<RodBinding<T>> system now fixed.  ParserEngine now has a addRodBinding() that is called by RodBindingArgumentTypeDescriptor when it encounters each RodBinding.  This allows the system to work with collection types that are recursively parsed by the system.
2011-08-07 20:16:51 -04:00
Ryan Poplin 6693407bd8 Merged bug fix from Stable into Unstable 2011-08-07 17:39:03 -04:00
Mark DePristo 1d8b1bae0a Need to rename the integration test argument -mask to -maskName 2011-08-07 13:32:26 -04:00
Mark DePristo ece8f0db5e Added b37dbSNP129, needed for Queue 2011-08-07 11:26:07 -04:00
Mark DePristo b0e91f85cf fix merge from Khalid's Queue fix 2011-08-07 10:33:20 -04:00
Mark DePristo 4d88e72958 Merge remote-tracking branch 'remotes/khalid/rodRefactor' into rodRefactor
Conflicts:
	public/java/src/org/broadinstitute/sting/gatk/walkers/variantutils/SelectVariants.java
	public/java/test/org/broadinstitute/sting/BaseTest.java
2011-08-07 10:32:27 -04:00
Khalid Shakir f049461120 Changed @Argument to @Input on input RodBindings.
Changed shortname collision with longname.
Restored scala builds.
Updated HSP to use new syntax.
2011-08-06 20:44:19 -04:00
Mark DePristo 573700d18d Adding missing import 2011-08-04 21:57:00 -04:00
Mark DePristo 14e43c3382 Final fix to RodBindingUnitTest to reset global counter variable 2011-08-04 21:52:39 -04:00
Mark DePristo 9308fbe3fb VariantEval Integration Test parameterized for new novelty stratification 2011-08-04 18:08:47 -04:00
Ryan Poplin 98a96f07c1 Updated standard deviation parameter in VQSR to our current recommended value 2011-08-04 14:06:26 -04:00
Mark DePristo 58a60d4901 Merge branch 'master' into rodRefactor 2011-08-04 12:48:56 -04:00
Mark DePristo d2078f09b2 Minor fixes to ITs 2011-08-04 12:47:55 -04:00
Eric Banks f10588420c Fixing path to dbSNP file as the other one was replaced 2011-08-04 12:36:24 -04:00
Mark DePristo f0d798d47c Bug fix: call RodBinding.resetNameCounter() in new ParsingEngine() so that we don't magically misnumber arguments in the integration tests where the GATK is only instantiated once. 2011-08-04 12:06:10 -04:00
Mark DePristo 490ca475fc Replacing hardcoded dbsnp129 with BaseTest variable 2011-08-03 22:15:22 -04:00
Eric Banks a831af1166 Another misprint when removing the references to -D 2011-08-03 21:29:21 -04:00
Mark DePristo d0279bb28c RodBinding names are now defaulting to the ArgumentTypeDescriptor fullname
Nearly all of the tools are passing integrationtests
2011-08-03 20:48:11 -04:00
Mark DePristo d8f1ebf8c6 Parameterized RecalibrationWalkers with clean unstable database 2011-08-03 20:06:00 -04:00
Mark DePristo 41b3840d26 Took latest VEIT and updated to use dbsnp132 vcf 2011-08-03 18:40:32 -04:00
Mark DePristo 0ef85647f7 A working version of a GATKReportDiffableReader for the diffEngine! 2011-08-03 18:21:18 -04:00
Mark DePristo acbd3d0922 Fixing up integration tests so more 2011-08-03 17:26:35 -04:00
Mark DePristo 8f696c7731 Continuing progress towards RodBinding 1.0
-- Cleaning up old interface to RMDT, docs and contracts added
-- Proper type checking for RodBinding for cases where the Tribble type isn't found or is the wrong type
2011-08-03 17:19:28 -04:00
Mark DePristo 800bb97f0b Removed getFeaturesAsGATKFeature and created createGenomeLoc(Feature) in genomeLocParser
Updated all walkers that used the now deleted methods.
2011-08-03 16:04:51 -04:00
Mark DePristo 79e4a8f6d3 Merge
Conflicts:
	private/java/src/org/broadinstitute/sting/gatk/walkers/qc/TestVariantContextWalker.java
	public/java/src/org/broadinstitute/sting/gatk/walkers/phasing/PhaseByTransmission.java
	public/java/src/org/broadinstitute/sting/gatk/walkers/variantrecalibration/VariantDataManager.java
	public/java/src/org/broadinstitute/sting/gatk/walkers/variantutils/SelectVariants.java
	public/java/src/org/broadinstitute/sting/gatk/walkers/variantutils/VariantValidationAssessor.java
	public/java/test/org/broadinstitute/sting/gatk/walkers/recalibration/RecalibrationWalkersIntegrationTest.java
	public/java/test/org/broadinstitute/sting/gatk/walkers/recalibration/RecalibrationWalkersPerformanceTest.java
	public/java/test/org/broadinstitute/sting/gatk/walkers/varianteval/VariantEvalIntegrationTest.java
	public/java/test/org/broadinstitute/sting/utils/variantcontext/VariantContextIntegrationTest.java
2011-08-03 15:09:47 -04:00
Mark DePristo b25140db83 Contracts and documentation for some of RefMetaDataTracker
Continuing to fix integration tests that don't pass / run
2011-08-03 13:34:20 -04:00
Eric Banks 3de10b1ef8 Fixing misprint from Ryan's commit 2011-08-03 12:37:50 -04:00
Eric Banks db2e0aaa1a Darn, forgot to update unit tests. 2011-08-03 12:31:08 -04:00
Eric Banks 020b2408a8 Adding integration test for left alignment of indels 2011-08-03 12:19:44 -04:00
Eric Banks 5dc324ff35 Dealing with merge confict 2011-08-03 11:03:47 -04:00
Eric Banks 7c89fe01b3 Instead of having the padded reference base be some hackish attribute it is now an actual variable in the Variant Context class. More importantly, we now always require that it be present when padding is necessary - and validate as such upon construction of the VC. This cleans up the interface significantly because we no longer require that a reference base be passed in when writing a VC/VCF record. 2011-08-03 11:00:36 -04:00
Mark DePristo d9bc673ff2 Fixed bad constructor in RMDTUnitTest 2011-08-03 09:42:43 -04:00
Khalid Shakir 5dcac7b064 GATKReport v0.2:
- Floating point column widths are measured correctly
- Using fixed width columns instead of white space separated which allows spaces embedded in cell values
- Legacy support for parsing white space separated v0.1 tables where the columns may not be fixed width
- Enforcing that table descriptions do not contain newlines so that tables can be parsed correctly
Replaced GATKReportTableParser with existing functionality in GATKReport
2011-08-03 00:24:47 -04:00
Mark DePristo 2874835997 Bug fix for type checking RodBindings
Now compares the feature class not the codec class.
UnitTests improvements
integrationtests on their way to actually running
2011-08-02 22:25:41 -04:00
Mark DePristo b5e843f8f0 Approaching the end for the new RodBinding system
-- support for explicit naming of bindings (-X:name,type x)
-- support for automatic naming of bindings in lists (-X:vcf foo.vcf -X:vcf bar.vcf will generate internal names X and X2)
-- ParserEngineUnitTest expanded to cover all of the Rodbinding cases
-- RodBindingUnitTest tests all of the low-level accessors
-- Parsing engine throws UserExceptions when bad bindings are provided on the command line
2011-08-02 22:00:06 -04:00
Mark DePristo 83891271b5 --variants throughout integrationtests 2011-08-02 20:28:47 -04:00
Mark DePristo 3a27a25cfc Validates that the tribble binding provides the right object types at startup
Tests to ensure this remains working
2011-08-02 20:11:24 -04:00
Ryan Poplin b2cde87378 Removing --DBSNP syntax from BQSR integration tests 2011-08-02 15:34:38 -04:00
Mark DePristo e4a67f3df1 RefMetaDataTracker has complete set of get() functions for List<RodBinding<T>>
Including unit tests
2011-08-02 14:28:35 -04:00
Mark DePristo 03741fb640 Merge branch 'master' into rodRefactor
Conflicts:
	public/java/src/org/broadinstitute/sting/gatk/walkers/annotator/VariantAnnotatorEngine.java
	public/java/test/org/broadinstitute/sting/gatk/walkers/indels/IndelRealignerIntegrationTest.java
	public/java/test/org/broadinstitute/sting/gatk/walkers/indels/IndelRealignerPerformanceTest.java
	public/java/test/org/broadinstitute/sting/utils/variantcontext/VariantContextIntegrationTest.java
2011-08-02 14:21:58 -04:00
Mark DePristo a366f9a18d Updating tools to use the RodBinding<T> syntax 2011-08-02 14:05:51 -04:00
Eric Banks b9d0d2af22 Adding back temporarily removed integration test now that the file permissions have been fixed. 2011-08-02 12:39:11 -04:00
Eric Banks 1c387848de No more use of -D in the integration tests but instead stick with VCFs only. Since all of these tests were duplicated (one each for dbSNP format and for VCF), we don't actually lose coverage in the integration tests. 2011-08-02 10:39:50 -04:00
Eric Banks 2c5e526eb7 Don't use the mismatch fraction by default in the RealignerTargetCreator (since it's only useful when using SW in the indel realigner). Also, no more use of -D but instead move over to using VCFs. One integration test is temporarily commented out while I wait for a VCF file to get fixed. 2011-08-02 10:34:46 -04:00
Eric Banks 5626199bb6 The Unified Genotyper now does NOT emit SLOD/SB by default; to compute SB use --computeSLOD 2011-08-02 10:14:21 -04:00
Mark DePristo 8b1adb8c95 Removed getVariantContext() code 2011-08-01 13:41:09 -04:00
Mark DePristo f69bff5dd6 Commented out, because these fail the now removed dbSNP conversion. 2011-08-01 13:34:25 -04:00
Mark DePristo 7b07c4e04e RefMetaDataTracker now has get() methods accepting RodBindings
RodBinding no longer duplicates the get() methods in RMDT.  This is just an object now that connects the command line system to the RMDT.
Updated programs to use new style
Added UnitTests for the RodBinding accessors.
2011-07-30 15:34:11 -04:00
Mark DePristo 3b799db61a RefMetaDataTracker cleanup and unit tests
You know have to provide an explicit list of RODRecordLists upfront to the constructor.  RefMetaDataTracker is now immutable.  Changes in engine to incorporate these differences
Extensive UnitTests for RefMetaDataTracker now.
2011-07-29 13:23:17 -04:00
Mark DePristo 39b4e76fde Continuing refactoring of RefMetaDataTracker.
On the path towards converging getVariantContext() and getValues() in tracker so that we can have a single approach to get values from RODs with the new RodBinding() types
2011-07-28 17:48:28 -04:00
Mark DePristo 7c5c656b46 Uncovered fundamental accounting bug in VariantEval. Will be fixed by dev. team
Problem is that Novelty sees multiple records at a site (SNP, INDEL) to calculate whether a site is novel, but VariantEvalWalker makes an arbitrary decision which to use for analysis and CompOverlap may not see a comp record of the same type as eval.  So you get lines where the stratification is known but there are 10 novel sites!
2011-07-28 14:19:27 -04:00
Eric Banks 33b32c4211 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-28 13:57:22 -04:00
Eric Banks 7a2a65155f Merged bug fix from Stable into Unstable 2011-07-28 13:56:43 -04:00
Eric Banks 1afc49a297 There are some really 'interesting' (but apparently valid) records in the Mus musculus dbSNP file. Generalized the handling of complex cases in the dbSNP adaptor to handle it all. I just grabbed the actual Mus musculus dbSNP file as a test, ran it whole genome, and confirmed that we finally produce a valid VCF on it. Should be the last commit needed on this adaptor. 2011-07-28 13:55:58 -04:00
Mark DePristo c83f9432eb Cleaned up RefMetaDataTracker
Renamed many functions to more clearly state what they are actually doing
Removed unnecessary / unused functionality, reducing interface complexity
Updated all uses of this code in GATK
Added generic, type-safe accessors to RefMetaDataTracker such as public <T> List<T> getValues(final String name, Class<T> clazz)
Added standard refMetaDataTracker accessors to RodBinding, so you can do everything you can for generic rods with the tracker directly with with the RodBinding
2011-07-27 23:25:52 -04:00
Eric Banks 1865211b6d Merged bug fix from Stable into Unstable 2011-07-27 22:52:06 -04:00
Eric Banks 6230315ff2 Along with my half-written commit message from earlier, I also forgot to commit the integration test updates. This is what happens when you try to do things 30 seconds before you leave for the day. To finish up from before: complex events weren't being padded with the reference base as per the VCF spec. They are now. 2011-07-27 22:51:21 -04:00
Mark DePristo f3ad4ec94b Removed annoying FastaSequenceIndexBuilderProgressListener infrastructure that was just a boolean switch on whether to print progress or not. 2011-07-27 22:06:23 -04:00
Eric Banks ff31fa7990 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-27 16:15:23 -04:00
Mark DePristo 15be383d5b Merge branch 'master' into rodRefactor 2011-07-27 15:36:49 -04:00