Commit Graph

430 Commits (f392d330c3ba13a0ed8bb83f7b99f4d2a2cde522)

Author SHA1 Message Date
Mark DePristo 7d09c0064b Bug fixes and code cleanup throughout
-- chromosomeCounts now takes builder as well, cleaning up a lot of code throughout the codebase.
2011-11-19 18:40:15 -05:00
Mark DePristo 707bd30b3f Should have been @BeforeMethod 2011-11-19 16:10:09 -05:00
Mark DePristo 8f7eebbaaf Bugfix for pError not being checked correctly in CommonInfo
-- UnitTests to ensure correct behavior
-- UnitTests to ensure correct behavior for pass filters vs. failed filters vs. unfiltered
2011-11-19 15:58:59 -05:00
Mark DePristo b7b57ef39a Updating MD5 to reflect canonical ordering of calculation
-- We should no longer have md5s changing because of hashmaps changing their sort order on us
-- Added GenotypeLikelihoodsUnitTests
-- Refactored ExactAFCaclculation to put the PL -> QUAL calculation in the GenotypeLikelihoods class to avoid the code copy.
2011-11-19 15:57:33 -05:00
Mark DePristo 73119c8e3c Merge with master
-- A few bug fixes
2011-11-19 09:56:06 -05:00
Mark DePristo f685fff79b Killing the final versions of old new VariantContext interface 2011-11-18 21:32:43 -05:00
Mark DePristo 6cf315e17b Change interface to getNegLog10PError to getLog10PError 2011-11-18 21:07:30 -05:00
Matt Hanna 8bb4d4dca3 First pass of the asynchronous block loader.
Block loads are only triggered on queue empty at this point.  Disabled by
default (enable with nt:io=?).
2011-11-18 15:02:59 -05:00
Mark DePristo f54afc19b4 VariantContextBuilder
-- New approach to making VariantContexts modeled on StringBuilder
-- No more modify routines -- use VariantContextBuilder
-- Renamed isPolymorphic to isPolymorphicInSamples.   Same for mono
-- getChromosomeCount -> getCalledChrCount
-- Walkers changed to use new VariantContext.  Some deprecated new VariantContext calls remain
-- VCFCodec now uses optimized cached information to create GenotypesContext.
2011-11-18 12:39:10 -05:00
Mark DePristo 7490dbb6eb First version of VariantContextBuilder 2011-11-18 11:06:15 -05:00
Mark DePristo fa454c88bb UnitTests for VariantContext for chrCount, getSampleNames, Order function
-- Major change to how chromosomeCounts is computed.  Now NO_CALL alleles are always excluded.  So ChromosomeCounts(A/.) is 1, the previous result would have been 2.
-- Naming changes for getSamplesNameInOrder()
2011-11-17 20:37:22 -05:00
Mark DePristo 02f22cc9f8 No more VC integration tests. All tests are now unit tests 2011-11-17 15:33:09 -05:00
Khalid Shakir c50274e02e During flanking interval creation merging overlapping flanks so that on scatter the list doesn't accidentally genotype the same site twice.
Moved flanking interval utilies to IntervalUtils with UnitTests.
2011-11-17 13:56:42 -05:00
Eric Banks bad19779b9 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-17 13:29:43 -05:00
Eric Banks 16a021992b Updated header description for the INFO and FORMAT DP fields to be more accurate. 2011-11-17 13:17:53 -05:00
Mark DePristo 7e66677769 Expanded UnitTests for VariantContext
Tests for
-- getGenotype and getGenotypes
-- subContextBySample
-- modify routines
2011-11-16 20:45:15 -05:00
Mauricio Carneiro 72f00e2883 Merging Roger's Unit tests for Reduce Reads from RR repository 2011-11-16 17:26:49 -05:00
Mark DePristo aa0610ea92 GenotypeCollection renamed to GenotypesContext 2011-11-16 16:24:05 -05:00
Mark DePristo 974daaca4d V13 version in archive. Can you pulled out wholesale for performance testing 2011-11-16 16:08:46 -05:00
Mark DePristo 101ffc4dfd Expanded, contrastive VariantContextBenchmark
-- Compares performance across a bunch of common operations with GATK 1.3 version of VariantContext and GATK 1.4
-- 1.3 VC and associated utilities copied wholesale into test directory under v13
2011-11-16 13:35:16 -05:00
Mark DePristo e56d52006a Continuing bugfixes to get new VC working 2011-11-16 10:39:17 -05:00
Eric Banks c2ebe58712 Merge remote-tracking branch 'Laurent/master' 2011-11-16 09:34:47 -05:00
David Roazen 0d163e3f52 SnpEff 2.0.4 support
-Modified the SnpEff parser to work with the SnpEff 2.0.4 VCF output format
-Assigning functional classes and effect impacts now handled directly
 by SnpEff rather than the GATK
-Removed support for SnpEff 2.0.2, as we no longer trust the output of that
 version since it doesn't exclude effects associated with certain nonsensical
 transcripts. These effects are excluded as of 2.0.4.
-Updated unit and integration tests

This support is based on a *release-candidate* of SnpEff 2.0.4, and so is subject
to change between now and the next GATK release.
2011-11-15 18:36:22 -05:00
Mark DePristo df415da4ab More bug fixes on the way to passing all tests 2011-11-15 17:38:12 -05:00
Laurent Francioli fb685f88ec Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-15 16:23:53 -05:00
Mark DePristo 460a51f473 ID field now stored in the VariantContext itself, not the attributes 2011-11-15 14:56:33 -05:00
Eric Banks 7fada320a9 The right fix for this test is just to delete it. 2011-11-15 14:53:27 -05:00
Mark DePristo 233e581828 Merging in Master 2011-11-15 09:28:24 -05:00
Mark DePristo 6e1a86bc3e Bug fixes to VariantContext and GenotypeCollection 2011-11-15 09:21:30 -05:00
Roger Zurawicki 284430d61d Added more basic UnitTests for ReadClipper
hardClipByReadCoordinatesWorks
hardClipLowQualTailsWorks
2011-11-15 00:13:52 -05:00
Roger Zurawicki 8e91e19229 Merge branch 'master' of ssh://nickel/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-15 00:13:37 -05:00
Mauricio Carneiro cde829899d compress Reduce Read counts bytes by offset
compressed the representation of the reduce reads counts by offset results in 17% average compression in final BAM file size.

Example compression -->

from : 10, 10, 11, 11, 12, 12, 12, 11, 10
to:      10, 0, 1, 1,2, 2, 2, 1, 0
2011-11-14 18:30:24 -05:00
Mark DePristo 4ff8225d78 GenotypeMap -> GenotypeCollection part 3
-- Test code actually builds
2011-11-14 17:51:41 -05:00
Mark DePristo f0234ab67f GenotypeMap -> GenotypeCollection part 2
-- Code actually builds
2011-11-14 17:42:55 -05:00
Mark DePristo 2e9d5363e7 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-14 15:32:06 -05:00
Mark DePristo 1fbdcb4f43 GenotypeMap -> GenotypeCollection 2011-11-14 15:32:03 -05:00
Eric Banks 7b2a7cfbe7 Transfer headers from the resource VCF when possible when using expressions. While there, VA was modified so that it didn't assume that the ID field was present in the VC's info map in preparation for Mark's upcoming changes. 2011-11-14 14:31:27 -05:00
Mark DePristo 9b5c79b49d Renamed InferredGeneticContext to CommonInfo
-- I have no idea why I named this InferredGeneticContext, a totally meaningless term
-- Renamed to CommonInfo.
-- Made package protected, as no one should use this outside of VariantContext and Genotype
-- UGEngine was using IGC constant, but it's now using the public one in VariantContext.
2011-11-14 14:28:52 -05:00
Mark DePristo 077397cb4b Deleted MutableVariantContext
-- All methods that used this capable now use VariantContext directly instead
2011-11-14 14:19:06 -05:00
Mark DePristo 79987d685c GenotypeMap contains a Map, not extends it
-- On path to replacing it with GenotypeCollection
2011-11-14 12:55:03 -05:00
Laurent Francioli 1347beef40 Merge branch 'PhaseByTransmission' 2011-11-14 11:31:28 +01:00
Laurent Francioli 6881d4800c Added Integration tests for Phasing by Transmission 2011-11-14 10:47:51 +01:00
Laurent Francioli 34acf8b978 Added Unit tests for new methods in GenotypeLikelihoods 2011-11-14 10:47:02 +01:00
Roger Zurawicki 1202a809cb Added Basic Unit Tests for ReadClipper
Tests some but not all functions
Some tests have been disabled because they are not working
2011-11-13 22:27:49 -05:00
Mark DePristo fee9b367e4 VariantContext genotypes are now stored as GenotypeMap objects
-- Enables further sophisticated optimizations, as this class can be smarter about storing the data and will directly support operations like subset to samples
-- All instances in the gatk that used Map<String, Genotype> now use GenotypeMap type.
-- Amazingly, there were many places where HashMap<String, Genotype> is used, so that the order of the genotypes is technically undefined and could be dangerous.  Now everything uses GenotypeMap with a specific ordering of samples (by name)
-- Integrationtests updated and all pass
2011-11-11 15:00:35 -05:00
Mark DePristo 4938569b3a More general handling of parameters for VariantContextBenchmark 2011-11-11 10:22:19 -05:00
Mark DePristo e216e85465 First working version of VariantContextBenchmark 2011-11-11 09:56:00 -05:00
Mark DePristo ee40791776 Attributes are now Map<String,Object> not Map<String,?>
-- Allows us to avoid an unnecessary copy when creating InferredGeneticContext (whose name really needs to change).
2011-11-11 09:55:42 -05:00
Mark DePristo 153e52ffed VariantEvalIntegrationTest for IntervalStratification 2011-11-10 14:10:39 -05:00
Mauricio Carneiro d00b2c6599 Adding a synthetic read for filtered data
* Generalized the concept of a synthetic read to cread both running consensus and a synthetic reads of filtered data.
* Synthetic reads can now have deletions (but not insertions)
* New reduced read tag for filtered data synthetic reads *(RF)*
* Sliding window header now keeps information of consensus and filtered data
* Synthetic reads are created simultaneously, new functionality is controlled internally by addToSyntheticReads
2011-11-09 20:16:22 -05:00