Commit Graph

237 Commits (edfd6f8a06bfbaaddfbb26677dc98f33af9f6318)

Author SHA1 Message Date
Guillermo del Angel 269ed1206c Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-19 09:32:20 -04:00
Mark DePristo a5e279d697 Dynamic typing of vcf.gz files
-- CombineVariantsIntegrationTests now use dynamic typing of vcf.gz files
-- FeatureManagerUnitTests tests for correctness.
2011-08-19 09:05:11 -04:00
Guillermo del Angel 58560a6d50 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-18 16:17:52 -04:00
Guillermo del Angel 3dfb60a46e Fixing up and refactoring usage of indel categories. On a variant context, isInsertion() and isDeletion() are now removed because behavior before was wrong in case of multiallelic sites. Now, methods isSimpleInsertion() and isSimpleDeletion() will return true only if sites are biallelic. For multiallelic sites, isComplex() will return true in all cases.
VariantEval module CountVariants is corrected and an additional column is added so that we log mixed events and complex indels separately (before they were being conflated).
VariantEval module IndelStatistics is considerably simplified as the sample stratification was wrong and redundant; now it should work with the VE-generic Sample stratification. Several columns are renamed or removed since they're not really useful.
2011-08-18 16:17:38 -04:00
Mark DePristo c2287c93d7 Cleanup of codec locations. No more dbSNPHelper
-- refdata/features now in utils/codecs with the other codecs
-- Deleted dbsnpHelper.  rsID function now in VCFutils.  Remaining code either deleted or put into VariantContextAdaptors
-- Many associated import updates due to code move
2011-08-18 10:02:46 -04:00
Eric Banks b75a1807e3 Adding integration test to cover sample exclusion 2011-08-17 22:40:09 -04:00
David Roazen 53006da9a5 Improved descriptions for the SnpEff annotations in the VCF header
(based on Eric's feedback).
2011-08-17 16:09:10 -04:00
Mark DePristo 6e828260a0 Removed -B support. Now explodes with error if -B provided. 2011-08-16 16:13:47 -04:00
David Roazen 9d2cda3d41 Removed a public -> private dependency in our test suite. 2011-08-12 17:29:10 -04:00
Menachem Fromer 9121b8ed65 Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-12 12:24:19 -04:00
Menachem Fromer 7ed120361d Fixed bug that required symbolic alleles to be padded with reference base and added integration test to test parsing and output of symbolic alleles 2011-08-12 12:23:44 -04:00
Eric Banks 27f0748b33 Renaming the HapMap codec and feature to RawHapMap so that we don't get esoteric errors when trying to bind a rod with the name 'hapmap' (since it was also a feature). 2011-08-12 11:11:56 -04:00
Eric Banks 005bd71be3 Working too quickly earlier. Fixing syntax. 2011-08-12 10:29:36 -04:00
Menachem Fromer c7ca33cbff Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-12 10:12:09 -04:00
Eric Banks 639a01f382 Updating integration test now that VE has been updated 2011-08-12 07:15:08 -04:00
Eric Banks 41f3da75d7 Implementation in VE was confusing 'variant' status vs. 'polymorphic' status. This led to issues because we now match types of eval and comp; specifically, subsetting a VC to a monomorphic sample can't change the 'variant' status of the VC (it's still a variant site or otherwise we'll never match the comps, which breaks GenotypeConcordance). CountVariants really got this wrong. Fixed. VE now passes all integration tests. 2011-08-12 02:22:44 -04:00
Eric Banks 45f973ab1f Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-12 00:40:18 -04:00
Eric Banks eba316621d Finish moving VE over to new rod system and fixing up the type inconsistency between eval and comp rods. Now the novel count is always 0 under the known stratification. :) 2011-08-12 00:40:08 -04:00
Menachem Fromer 9de06560df Update to new RodBinding system 2011-08-11 17:54:16 -04:00
Ryan Poplin f1d1252be2 Fixing syntax of BQSR and UG performance tests. 2011-08-11 17:04:09 -04:00
Ryan Poplin 902eb0c61e Adding dbsnp annotation back into the UG integration tests 2011-08-11 13:55:03 -04:00
Ryan Poplin c7b9a9ef0a Updating UnifiedGenotyper to use the new rod binding system. 2011-08-11 11:02:11 -04:00
Ryan Poplin 79c86e211f Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-11 09:59:20 -04:00
Ryan Poplin ea42ee4a95 Updating BQSR for the new rod binding system. 2011-08-11 09:58:42 -04:00
Mark DePristo 8cdc0cbd9c Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-11 08:58:49 -04:00
Mark DePristo 40e06f9afb Fixed broken RodBinding defaults.
-- Verified now to be correct at runtime
-- UnitTest covers this
-- createTypeDefault now takes a Type, not a Class, so that parameterized classes can have their parameter fetched in the defaults.
2011-08-11 08:58:30 -04:00
Eric Banks bdb1da30fd Better interface for getting RodBindings to the VariantAnnotatorEngine and its annotations: pass around an AnnotatorCompatibleWalker (interface) object. Updating VA to use the new rod system. 2011-08-10 22:43:08 -04:00
Eric Banks 07ad8c78a9 More tools moved over. Fixed the VariantContextIntegrationTest which was not useful because the md5s were all removed. In the future, instead of removing md5s (putting it in 'parameterization' mode), you should instead use @Test(enabled=false) since it's easier to track. 2011-08-10 14:24:40 -04:00
Eric Banks 8d14d32a62 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-10 13:42:37 -04:00
Eric Banks 749c8bfbcd Moving more tools over to the new rod system 2011-08-10 13:42:35 -04:00
David Roazen 0497170bc9 SnpEffCodec now implements SelfScopingFeatureCodec so that we no longer have to specify the codec name on the command line for SnpEff files. 2011-08-10 13:12:09 -04:00
Eric Banks a42f90db11 Moving more tools over to use the standard VC arg collection. Also, while I'm in there, I removed all of the empty references to @Requires given that it's no longer relevant. 2011-08-10 12:20:18 -04:00
Ryan Poplin c60cf52f73 Updating VQSR for new RodBinding syntax. Cleaning up indel specific parts of VQSR. 2011-08-10 10:20:37 -04:00
Eric Banks 1ea5ec276b Minor cleanup 2011-08-09 23:28:59 -04:00
Eric Banks bc2d4f554d Bringing Indel Realigner up to speed with the new rod binding syntax; now use -known to specify the known indels track. 2011-08-09 23:21:17 -04:00
Eric Banks 489e5cffc1 Missed a few 'variants' 2011-08-09 14:29:15 -04:00
Eric Banks b20c4d5286 Thanks to Mark for agreeing to transition from 'variants' back to 'variant'. I think I got them all but I've been jumping all around the code, so there might be a straggler or two. 2011-08-09 12:04:55 -04:00
Eric Banks 7afb5c9f1c More updates to be consistent with the new rod syntax. 2011-08-09 10:11:37 -04:00
Eric Banks 1e490e0dec Bringing up to speed with new syntax 2011-08-09 09:26:06 -04:00
Eric Banks 70b3daf689 VariantsToVCF is up and running again; integration tests are reenabled (and added one for dbSNP). 2011-08-09 03:03:43 -04:00
David Roazen 2efa376619 Made the necessary changes to get SnpEff support working with the new rodbinding system. 2011-08-08 23:29:39 -04:00
David Roazen b180a1311a Merge branch 'snpEff' 2011-08-08 22:12:14 -04:00
David Roazen 28d8c8fcbc Modified the SnpEff integration test to run on a much smaller interval. 2011-08-08 21:51:16 -04:00
David Roazen a13bc7b929 Added an integration test for the SnpEff annotation support, as well as some extra safety checks and comments. 2011-08-08 20:01:24 -04:00
Mark DePristo 80924d24de Single positional arguments are now treated as names unless they actually match a tribble feature 2011-08-08 19:26:27 -04:00
Mark DePristo f8a56bc64b Merge branch 'master' into rodRefactor 2011-08-08 16:58:18 -04:00
Mark DePristo f8ad91b16f Reverting a bunch of bad -B type drops 2011-08-08 16:57:38 -04:00
David Roazen 5e288136e0 Added unit tests for the SnpEff codec, and made minor adjustments to the codec itself. 2011-08-08 16:51:43 -04:00
Eric Banks d7813db217 Combine Variants was actually outputting invalid VCFs in cases where it was combining Variant Contexts with different alternate alleles: if any of the genotypes had PLs they were no longer valid/correct. Added a check for such cases (the combined VC has more alleles than an original VC) and strip out the PLs when triggered; added integration test to cover it. I also added the check to Select Variants, although it currently doesn't remove unused alleles so it should never trigger. Is there any reason not to strip out unused alleles after a select? 2011-08-08 16:25:35 -04:00
Mark DePristo 4f8fc0f2f1 VCF3 now dynamically determined 2011-08-08 15:05:47 -04:00
Mark DePristo ba7353c561 Updated IntegrationTests to use the new type free format for VCF files 2011-08-08 15:04:38 -04:00
Mark DePristo 0810c42309 GATK now does dynamic type determination for VCF files
Added UnitTests covering all of the cases.
2011-08-08 14:45:46 -04:00
Mark DePristo e36994e36b Refactored a FeatureManager class from RMDTrackBuilder
New class handles (vastly more cleanly) the db of tribble codecs, features, and names for use throughout the GATK.
Added SelfScopingFeatureCodec interface that allows a FeatureCodec to examine a file and determine if the file can be parsed.  This is the first step towards allowing the GATK to dynamically determine the type of a RodBinding.
2011-08-08 14:04:46 -04:00
Mark DePristo e5fde0d16b Merge branch 'master' into rodRefactor 2011-08-08 10:08:43 -04:00
Mark DePristo 526b524c3c CombineVariants with new RodBinding. Bugfix
-- CombineVariants now uses the new RodBinding syntax, -V / --variants.  Passed all integration tests on first run
-- Exposed gaping bug in the List<RodBinding<T>> system, now fixed.  ParserEngine now has an addRodBinding() that is called by RodBindingArgumentTypeDescriptor when it encounters each RodBinding.  This allows the system to work with collection types that are recursively parsed by the system.
2011-08-07 20:16:51 -04:00
Ryan Poplin 6693407bd8 Merged bug fix from Stable into Unstable 2011-08-07 17:39:03 -04:00
Mark DePristo 1d8b1bae0a Need to rename the integration test argument -mask to -maskName 2011-08-07 13:32:26 -04:00
Mark DePristo ece8f0db5e Added b37dbSNP129, needed for Queue 2011-08-07 11:26:07 -04:00
Mark DePristo b0e91f85cf fix merge from Khalid's Queue fix 2011-08-07 10:33:20 -04:00
Mark DePristo 4d88e72958 Merge remote-tracking branch 'remotes/khalid/rodRefactor' into rodRefactor
Conflicts:
	public/java/src/org/broadinstitute/sting/gatk/walkers/variantutils/SelectVariants.java
	public/java/test/org/broadinstitute/sting/BaseTest.java
2011-08-07 10:32:27 -04:00
Khalid Shakir f049461120 Changed @Argument to @Input on input RodBindings.
Changed shortname collision with longname.
Restored scala builds.
Updated HSP to use new syntax.
2011-08-06 20:44:19 -04:00
Mark DePristo 573700d18d Adding missing import 2011-08-04 21:57:00 -04:00
Mark DePristo 14e43c3382 Final fix to RodBindingUnitTest to reset global counter variable 2011-08-04 21:52:39 -04:00
Mark DePristo 9308fbe3fb VariantEval Integration Test parameterized for new novelty stratification 2011-08-04 18:08:47 -04:00
Ryan Poplin 98a96f07c1 Updated standard deviation parameter in VQSR to our current recommended value 2011-08-04 14:06:26 -04:00
Mark DePristo 58a60d4901 Merge branch 'master' into rodRefactor 2011-08-04 12:48:56 -04:00
Mark DePristo d2078f09b2 Minor fixes to ITs 2011-08-04 12:47:55 -04:00
Eric Banks f10588420c Fixing path to dbSNP file as the other one was replaced 2011-08-04 12:36:24 -04:00
Mark DePristo f0d798d47c Bug fix: call RodBinding.resetNameCounter() in new ParsingEngine() so that we don't magically misnumber arguments in the integration tests where the GATK is only instantiated once. 2011-08-04 12:06:10 -04:00
Mark DePristo 490ca475fc Replacing hardcoded dbsnp129 with BaseTest variable 2011-08-03 22:15:22 -04:00
Eric Banks a831af1166 Another misprint when removing the references to -D 2011-08-03 21:29:21 -04:00
Mark DePristo d0279bb28c RodBinding names are now defaulting to the ArgumentTypeDescriptor fullname
Nearly all of the tools are passing integrationtests
2011-08-03 20:48:11 -04:00
Mark DePristo d8f1ebf8c6 Parameterized RecalibrationWalkers with clean unstable database 2011-08-03 20:06:00 -04:00
Mark DePristo 41b3840d26 Took latest VEIT and updated to use dbsnp132 vcf 2011-08-03 18:40:32 -04:00
Mark DePristo 0ef85647f7 A working version of a GATKReportDiffableReader for the diffEngine! 2011-08-03 18:21:18 -04:00
Mark DePristo acbd3d0922 Fixing up integration tests some more 2011-08-03 17:26:35 -04:00
Mark DePristo 8f696c7731 Continuing progress towards RodBinding 1.0
-- Cleaning up old interface to RMDT, docs and contracts added
-- Proper type checking for RodBinding for cases where the Tribble type isn't found or is the wrong type
2011-08-03 17:19:28 -04:00
Mark DePristo 800bb97f0b Removed getFeaturesAsGATKFeature and created createGenomeLoc(Feature) in genomeLocParser
Updated all walkers that used the now deleted methods.
2011-08-03 16:04:51 -04:00
Mark DePristo 79e4a8f6d3 Merge
Conflicts:
	private/java/src/org/broadinstitute/sting/gatk/walkers/qc/TestVariantContextWalker.java
	public/java/src/org/broadinstitute/sting/gatk/walkers/phasing/PhaseByTransmission.java
	public/java/src/org/broadinstitute/sting/gatk/walkers/variantrecalibration/VariantDataManager.java
	public/java/src/org/broadinstitute/sting/gatk/walkers/variantutils/SelectVariants.java
	public/java/src/org/broadinstitute/sting/gatk/walkers/variantutils/VariantValidationAssessor.java
	public/java/test/org/broadinstitute/sting/gatk/walkers/recalibration/RecalibrationWalkersIntegrationTest.java
	public/java/test/org/broadinstitute/sting/gatk/walkers/recalibration/RecalibrationWalkersPerformanceTest.java
	public/java/test/org/broadinstitute/sting/gatk/walkers/varianteval/VariantEvalIntegrationTest.java
	public/java/test/org/broadinstitute/sting/utils/variantcontext/VariantContextIntegrationTest.java
2011-08-03 15:09:47 -04:00
Mark DePristo b25140db83 Contracts and documentation for some of RefMetaDataTracker
Continuing to fix integration tests that don't pass / run
2011-08-03 13:34:20 -04:00
Eric Banks 3de10b1ef8 Fixing misprint from Ryan's commit 2011-08-03 12:37:50 -04:00
Eric Banks db2e0aaa1a Darn, forgot to update unit tests. 2011-08-03 12:31:08 -04:00
Eric Banks 020b2408a8 Adding integration test for left alignment of indels 2011-08-03 12:19:44 -04:00
Eric Banks 5dc324ff35 Dealing with merge conflict 2011-08-03 11:03:47 -04:00
Eric Banks 7c89fe01b3 Instead of having the padded reference base be some hackish attribute it is now an actual variable in the Variant Context class. More importantly, we now always require that it be present when padding is necessary - and validate as such upon construction of the VC. This cleans up the interface significantly because we no longer require that a reference base be passed in when writing a VC/VCF record. 2011-08-03 11:00:36 -04:00
Mark DePristo d9bc673ff2 Fixed bad constructor in RMDTUnitTest 2011-08-03 09:42:43 -04:00
Khalid Shakir 5dcac7b064 GATKReport v0.2:
- Floating point column widths are measured correctly
- Using fixed width columns instead of white space separated which allows spaces embedded in cell values
- Legacy support for parsing white space separated v0.1 tables where the columns may not be fixed width
- Enforcing that table descriptions do not contain newlines so that tables can be parsed correctly
Replaced GATKReportTableParser with existing functionality in GATKReport
2011-08-03 00:24:47 -04:00
Mark DePristo 2874835997 Bug fix for type checking RodBindings
Now compares the feature class not the codec class.
UnitTests improvements
integrationtests on their way to actually running
2011-08-02 22:25:41 -04:00
Mark DePristo b5e843f8f0 Approaching the end for the new RodBinding system
-- support for explicit naming of bindings (-X:name,type x)
-- support for automatic naming of bindings in lists (-X:vcf foo.vcf -X:vcf bar.vcf will generate internal names X and X2)
-- ParserEngineUnitTest expanded to cover all of the Rodbinding cases
-- RodBindingUnitTest tests all of the low-level accessors
-- Parsing engine throws UserExceptions when bad bindings are provided on the command line
2011-08-02 22:00:06 -04:00
Mark DePristo 83891271b5 --variants throughout integrationtests 2011-08-02 20:28:47 -04:00
Mark DePristo 3a27a25cfc Validates that the tribble binding provides the right object types at startup
Tests to ensure this remains working
2011-08-02 20:11:24 -04:00
Ryan Poplin b2cde87378 Removing --DBSNP syntax from BQSR integration tests 2011-08-02 15:34:38 -04:00
Mark DePristo e4a67f3df1 RefMetaDataTracker has complete set of get() functions for List<RodBinding<T>>
Including unit tests
2011-08-02 14:28:35 -04:00
Mark DePristo 03741fb640 Merge branch 'master' into rodRefactor
Conflicts:
	public/java/src/org/broadinstitute/sting/gatk/walkers/annotator/VariantAnnotatorEngine.java
	public/java/test/org/broadinstitute/sting/gatk/walkers/indels/IndelRealignerIntegrationTest.java
	public/java/test/org/broadinstitute/sting/gatk/walkers/indels/IndelRealignerPerformanceTest.java
	public/java/test/org/broadinstitute/sting/utils/variantcontext/VariantContextIntegrationTest.java
2011-08-02 14:21:58 -04:00
Mark DePristo a366f9a18d Updating tools to use the RodBinding<T> syntax 2011-08-02 14:05:51 -04:00
Eric Banks b9d0d2af22 Adding back temporarily removed integration test now that the file permissions have been fixed. 2011-08-02 12:39:11 -04:00
Eric Banks 1c387848de No more use of -D in the integration tests but instead stick with VCFs only. Since all of these tests were duplicated (one each for dbSNP format and for VCF), we don't actually lose coverage in the integration tests. 2011-08-02 10:39:50 -04:00
Eric Banks 2c5e526eb7 Don't use the mismatch fraction by default in the RealignerTargetCreator (since it's only useful when using SW in the indel realigner). Also, no more use of -D but instead move over to using VCFs. One integration test is temporarily commented out while I wait for a VCF file to get fixed. 2011-08-02 10:34:46 -04:00
Eric Banks 5626199bb6 The Unified Genotyper now does NOT emit SLOD/SB by default; to compute SB use --computeSLOD 2011-08-02 10:14:21 -04:00
Mark DePristo 8b1adb8c95 Removed getVariantContext() code 2011-08-01 13:41:09 -04:00
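
Note on commit b5e843f8f0: the automatic naming rule it describes (repeated -X:vcf arguments get the internal names X and X2, while -X:name,type keeps the explicit name) can be illustrated with a small sketch. This is not the GATK implementation. The class BindingNamer and its methods nameFor() and reset() are hypothetical names chosen for illustration, and the continuation to X3, X4, ... is an assumption extrapolated from the two names the commit mentions. The reset() method mirrors the idea behind commit f0d798d47c, where the name counter is reset whenever a new parsing engine is created.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not taken from the GATK source.
public class BindingNamer {
    // How many unnamed bindings have been seen so far for each argument name.
    private final Map<String, Integer> counts = new HashMap<>();

    // Returns the explicit name when one is given; otherwise argName for the
    // first unnamed binding, then argName2, argName3, ... (assumed pattern).
    public String nameFor(String argName, String explicitName) {
        if (explicitName != null) {
            return explicitName;
        }
        int n = counts.merge(argName, 1, Integer::sum);
        return n == 1 ? argName : argName + n;
    }

    // Clears the counters so that a fresh parse starts numbering from scratch.
    public void reset() {
        counts.clear();
    }

    public static void main(String[] args) {
        BindingNamer namer = new BindingNamer();
        List<String> names = new ArrayList<>();
        names.add(namer.nameFor("X", null));      // -X:vcf foo.vcf
        names.add(namer.nameFor("X", null));      // -X:vcf bar.vcf
        names.add(namer.nameFor("X", "mySites")); // -X:mySites,vcf baz.vcf
        System.out.println(names);                // prints [X, X2, mySites]
    }
}
```

Without a reset between runs, a second parse in the same JVM would continue numbering where the first stopped, which is the kind of misnumbering that commit f0d798d47c reports for integration tests.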