8117 Commits (e7d41d8d334221c12dc89a080a3ec55bb9cc4bfb)

Author SHA1 Message Date
Eric Banks e7d41d8d33 Minor cleanup 2011-11-17 12:00:28 -05:00
Eric Banks f250b47228 Someone broke this for SNPs when adding support for indels 2011-11-16 10:49:27 -05:00
Matt Hanna eb8e031f75 Merged bug fix from Stable into Unstable 2011-11-16 09:57:37 -05:00
Matt Hanna 6a5d5e7ac9 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/stable 2011-11-16 09:57:13 -05:00
Matt Hanna 7ac5cf8430 Getting rid of unsupported CountReadPairs walker in stable. Removal of
remainder of pairs processing framework to follow in unstable.
2011-11-16 09:53:59 -05:00
Eric Banks c2ebe58712 Merge remote-tracking branch 'Laurent/master' 2011-11-16 09:34:47 -05:00
Laurent Francioli 7d77fc51f5 Corrected bug causing PhaseByTransmission to crash in case of new Genotype.Type 2011-11-16 03:32:43 -05:00
David Roazen 0d163e3f52 SnpEff 2.0.4 support
-Modified the SnpEff parser to work with the SnpEff 2.0.4 VCF output format
-Assigning functional classes and effect impacts now handled directly
 by SnpEff rather than the GATK
-Removed support for SnpEff 2.0.2, as we no longer trust the output of that
 version since it doesn't exclude effects associated with certain nonsensical
 transcripts. These effects are excluded as of 2.0.4.
-Updated unit and integration tests

This support is based on a *release-candidate* of SnpEff 2.0.4, and so is subject
to change between now and the next GATK release.
2011-11-15 18:36:22 -05:00
Laurent Francioli fb685f88ec Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-15 16:23:53 -05:00
Eric Banks 7fada320a9 The right fix for this test is just to delete it. 2011-11-15 14:53:27 -05:00
Mauricio Carneiro 231b8e9f74 Do not output deletion only synthetic reads
If a synthetic read is composed exclusively of deletions, do not output it.
2011-11-15 13:24:43 -05:00
Eric Banks b45d10e6f1 The DP in the FORMAT field (per sample) must also use the representative count or else it's always 1 for reduced reads. 2011-11-15 10:23:59 -05:00
Eric Banks b66556f4a0 Update error message so that it's clear ReadPair Walkers are exceptions 2011-11-15 09:22:57 -05:00
Mauricio Carneiro cde829899d compress Reduce Read counts bytes by offset
Compressing the representation of the reduced read counts by offset results in 17% average compression in final BAM file size.

Example compression -->

from: 10, 10, 11, 11, 12, 12, 12, 11, 10
to:   10, 0, 1, 1, 2, 2, 2, 1, 0
2011-11-14 18:30:24 -05:00
Mauricio Carneiro a1ce3d8141 Not reporting counts to reduced deletions (temporary patch)
Deletions will not have counts represented in the reduced form. This may change in the future with a ReadBackedPileup refactor.
2011-11-14 18:30:24 -05:00
David Roazen ab0ee9b847 Perform only necessary validation in VariantContext modify methods 2011-11-14 16:49:59 -05:00
Guillermo del Angel 5c38a9cfd6 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-14 15:00:03 -05:00
Guillermo del Angel f1db31f072 Attempt to reduce memory footprint of ValidationSiteSelector (if this doesn't work then a radical rewrite of the walker to make it two-pass will be necessary): don't log any attributes of the original VCF; if we need chr counts later we can reannotate from the original inputs. As things stand, we can't select SNPs genome-wide due to memory usage. 2011-11-14 14:56:09 -05:00
Eric Banks 4dc9dbe890 One quick fix to previous commit 2011-11-14 14:42:12 -05:00
Eric Banks b3313e1445 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-14 14:31:38 -05:00
Eric Banks 7b2a7cfbe7 Transfer headers from the resource VCF when possible when using expressions. While there, VA was modified so that it didn't assume that the ID field was present in the VC's info map in preparation for Mark's upcoming changes. 2011-11-14 14:31:27 -05:00
Guillermo del Angel 509ecc62cc Another bug fix for when no samples are specified in ValidationSiteSelectionWalker 2011-11-14 13:02:51 -05:00
Eric Banks 7aee80cd3b Fix to deal with reduced reads containing a deletion 2011-11-14 12:23:46 -05:00
Eric Banks 3d2970453b Misc minor cleanup 2011-11-14 09:41:54 -05:00
Laurent Francioli 1347beef40 Merge branch 'PhaseByTransmission' 2011-11-14 11:31:28 +01:00
Laurent Francioli 6881d4800c Added Integration tests for Phasing by Transmission 2011-11-14 10:47:51 +01:00
Laurent Francioli 34acf8b978 Added Unit tests for new methods in GenotypeLikelihoods 2011-11-14 10:47:02 +01:00
Eric Banks b7c33116af Minor docs update 2011-11-12 23:21:07 -05:00
Eric Banks 76d357be40 Updating docs example to use -L since that's best practice 2011-11-12 23:20:05 -05:00
Guillermo del Angel af8e39c04d Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-12 08:42:24 -05:00
Guillermo del Angel c95f015d77 a) Bug fix in validation site selector, b) Initial qscript for selection of random snps and indels for validation experiment 2011-11-12 08:41:53 -05:00
Mauricio Carneiro 8cd077f009 Writing a GATKReport table as output.
just to standardize the output.
2011-11-11 18:52:58 -05:00
Guillermo del Angel cd3146f4cf Add hidden option to ValidationAmplicons to output slightly modified format to make file work with downstream SQNM tools more seamlessly at request of GAP: one line per record, keep probe identifier to 20 characters, no * in ref allele. 2011-11-11 14:07:07 -05:00
Ryan Poplin 40fbeafa37 VQSR will now detect if the negative model failed to converge properly because of having too few data points and automatically retry with more appropriate clustering parameters. 2011-11-11 11:52:30 -05:00
Eric Banks 59945a41e8 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-10 23:05:20 -05:00
Eric Banks 0c32281484 Adding a benchmarking class for parsing VCF files. Not complete. 2011-11-10 23:05:13 -05:00
Mauricio Carneiro 9c013374fd A walker to calculate the coverage of a target
In targeted sequencing projects, we pay a penalty to get to a minimum coverage in 80% of the targets. This walker will help us understand the ratio between the targeted site (usually in the middle of the interval) and the targeted region.
2011-11-10 17:16:51 -05:00
Mauricio Carneiro ffa6bc66ec Eliminating excessive debug tests 2011-11-10 17:16:51 -05:00
Mauricio Carneiro 5a1170078a Using centralized reduce read facilities 2011-11-10 17:16:51 -05:00
Mark DePristo dc9b351b5e Meaningful error message when an IntervalArg file fails to parse correctly 2011-11-10 17:10:26 -05:00
Mark DePristo bb7bf74aa8 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-10 16:05:43 -05:00
Mark DePristo 153e52ffed VariantEvalIntegrationTest for IntervalStratification 2011-11-10 14:10:39 -05:00
Mauricio Carneiro 060c7ce8ae It wouldn't harm integration tests if we had our logic right... :-) 2011-11-10 14:03:22 -05:00
Mauricio Carneiro bb4cd59475 Filtered and consensus reads will now use the same tag 2011-11-10 13:58:31 -05:00
Mauricio Carneiro 7a46273d75 Consensus reads had filtered data read names
Fixed.
2011-11-10 13:58:31 -05:00
Mauricio Carneiro c14b182501 Add reads in the recursive call
Was missing consensus reads that got added from the recursive call. This was a side effect of the filtered data implementation. Fixed.
2011-11-10 13:58:31 -05:00
Ryan Poplin 07dbf0bd40 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-10 13:39:24 -05:00
Ryan Poplin 26762d6c6f Folding recent HMM changes into Haplotype Caller. Misc bug fixes throughout HC. 2011-11-10 13:36:03 -05:00
Eric Banks 39678b6a20 Check for reads with missing read groups and throw a UserException when encountered. Mauricio said this wouldn't break integration tests. 2011-11-10 13:34:45 -05:00
Mark DePristo 18f829f76b Towards a full G1KPhaseI table creation script 2011-11-10 13:27:54 -05:00
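
Note on commit cde829899d (compress Reduce Read counts bytes by offset): the example in that commit message keeps the first count as-is and stores every later count as its difference from that first count. The following is a minimal sketch of that idea only. The function names are hypothetical, and it makes no assumption about how the GATK actually packs these values into bytes or BAM tags.

```python
from typing import List


def encode_by_offset(counts: List[int]) -> List[int]:
    """Keep the first count; store each later count as an offset from it.

    Hypothetical helper illustrating the commit message's example,
    not the GATK implementation.
    """
    if not counts:
        return []
    base = counts[0]
    return [base] + [c - base for c in counts[1:]]


def decode_by_offset(encoded: List[int]) -> List[int]:
    """Invert encode_by_offset: add the first value back to each offset."""
    if not encoded:
        return []
    base = encoded[0]
    return [base] + [base + d for d in encoded[1:]]


if __name__ == "__main__":
    original = [10, 10, 11, 11, 12, 12, 12, 11, 10]
    encoded = encode_by_offset(original)
    print(encoded)
    print(decode_by_offset(encoded) == original)
```

Because neighbouring counts in a reduced read tend to be close to each other, the offsets are mostly small numbers, which is presumably what makes the stored representation compress better.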