Commit Graph

8076 Commits (7b2a7cfbe763cfa8aeca37205dfe9b1ffedee094)

Author SHA1 Message Date
Eric Banks 7b2a7cfbe7 Transfer headers from the resource VCF when possible when using expressions. While there, VA was modified so that it didn't assume that the ID field was present in the VC's info map in preparation for Mark's upcoming changes. 2011-11-14 14:31:27 -05:00
Eric Banks 7aee80cd3b Fix to deal with reduced reads containing a deletion 2011-11-14 12:23:46 -05:00
Eric Banks 3d2970453b Misc minor cleanup 2011-11-14 09:41:54 -05:00
Eric Banks b7c33116af Minor docs update 2011-11-12 23:21:07 -05:00
Eric Banks 76d357be40 Updating docs example to use -L since that's best practice 2011-11-12 23:20:05 -05:00
Guillermo del Angel af8e39c04d Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-12 08:42:24 -05:00
Guillermo del Angel c95f015d77 a) Bug fix in validation site selector, b) Initial qscript for selection of random snps and indels for validation experiment 2011-11-12 08:41:53 -05:00
Mauricio Carneiro 8cd077f009 Writing a GATKReport table as output.
just to standardize the output.
2011-11-11 18:52:58 -05:00
Guillermo del Angel cd3146f4cf Add hidden option to ValidationAmplicons to output slightly modified format to make file work with downstream SQNM tools more seamlessly at request of GAP: one line per record, keep probe identifier to 20 characters, no * in ref allele. 2011-11-11 14:07:07 -05:00
Ryan Poplin 40fbeafa37 VQSR will now detect if the negative model failed to converge properly because of having too few data points and automatically retry with more appropriate clustering parameters. 2011-11-11 11:52:30 -05:00
Eric Banks 59945a41e8 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-10 23:05:20 -05:00
Eric Banks 0c32281484 Adding a benchmarking class for parsing VCF files. Not complete. 2011-11-10 23:05:13 -05:00
Mauricio Carneiro 9c013374fd A walker to calculate the coverage of a target
in targeted sequencing projects, we pay a penalty to get to a minimum coverage in 80% of the targets. This walker will help us understand what is the ratio between the targeted site (usually in the middle of the interval) and the targeted region.
2011-11-10 17:16:51 -05:00
Mauricio Carneiro ffa6bc66ec Eliminating excessive debug tests 2011-11-10 17:16:51 -05:00
Mauricio Carneiro 5a1170078a Using centralized reduce read facilities 2011-11-10 17:16:51 -05:00
Mark DePristo dc9b351b5e Meaningful error message when an IntervalArg file fails to parse correctly 2011-11-10 17:10:26 -05:00
Mark DePristo bb7bf74aa8 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-10 16:05:43 -05:00
Mark DePristo 153e52ffed VariantEvalIntegrationTest for IntervalStratification 2011-11-10 14:10:39 -05:00
Mauricio Carneiro 060c7ce8ae It wouldn't harm integrationtests if we had our logic right... :-) 2011-11-10 14:03:22 -05:00
Mauricio Carneiro bb4cd59475 Filtered and consensus reads will now use the same tag 2011-11-10 13:58:31 -05:00
Mauricio Carneiro 7a46273d75 Consensus reads had filtered data read names
fixed.
2011-11-10 13:58:31 -05:00
Mauricio Carneiro c14b182501 Add reads in the recursive call
was missing consensus reads that got added from the recursive call. This is was a side-effect of the filtered data implementation. Fixed.
2011-11-10 13:58:31 -05:00
Ryan Poplin 07dbf0bd40 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-10 13:39:24 -05:00
Ryan Poplin 26762d6c6f Folding recent HMM changes into Haplotype Caller. Misc bug fixes throughout HC. 2011-11-10 13:36:03 -05:00
Eric Banks 39678b6a20 Check for reads with missing read groups and throw a UserException when encountered. Mauricio said this wouldn't break integration tests. 2011-11-10 13:34:45 -05:00
Mark DePristo 18f829f76b Towards a full G1KPhaseI table creation script 2011-11-10 13:27:54 -05:00
Mark DePristo dd1810140f -stratIntervals is optional 2011-11-10 13:27:32 -05:00
Mark DePristo 67b022c34b Cleanup for new SampleUtils function
-- getVCFHeadersFromRods(rods) is now available so that you don't have getVCFHeadersFromRods(rods, null) throughout the codebase
2011-11-10 13:27:13 -05:00
Ryan Poplin 9490d71bc8 Folding recent HMM changes into Haplotype Caller. Misc bug fixes throughout HC. 2011-11-10 13:26:29 -05:00
Mark DePristo 35fe9c8a06 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-10 11:11:33 -05:00
Mark DePristo 714cac21c9 Testdata for IntervalStratification 2011-11-10 11:08:34 -05:00
Mark DePristo dc4932f93d VariantEval module to stratify the variants by whether they overlap an interval set
The primary use of this stratification is to provide a mechanism to divide asssessment of a call set up by whether a variant overlaps an interval or not.  I use this to differentiate between variants occurring in CCDS exons vs. those in non-coding regions, in the 1000G call set, using a command line that looks like:

-T VariantEval -R human_g1k_v37.fasta -eval 1000G.vcf -stratIntervals:BED ccds.bed -ST IntervalStratification

Note that the overlap algorithm properly handles symbolic alleles with an INFO field END value.  In order to safely use this module you should provide entire contigs worth of variants, and let the interval strat decide overlap, as opposed to using -L which will not properly work with symbolic variants.

Minor improvements to create() interval in GenomeLocParser.
2011-11-10 10:58:40 -05:00
Mauricio Carneiro 0d8983feee outputting the RG information
setReadGroup now sets the read group attribute for the GATKSAMRecord
2011-11-09 23:35:00 -05:00
Eric Banks 315ac68b0b Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-09 22:37:36 -05:00
Eric Banks 6313aae2c4 Adding checks for hasBasePileup() before calling getBasePileup() as per GS thread 2011-11-09 22:37:26 -05:00
Ryan Poplin 74a18d3de8 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-09 22:29:40 -05:00
Ryan Poplin 24712c0221 Merged bug fix from Stable into Unstable 2011-11-09 22:28:27 -05:00
Ryan Poplin 8942406aa2 Use MathUtils to compare doubles instead of testing for equality 2011-11-09 22:05:21 -05:00
Ryan Poplin 348f2db7fd Fix for HMM optimization. If the two penalty arrays match exactly the function should return the end of the array instead of 0. 2011-11-09 22:00:52 -05:00
Mauricio Carneiro 9a4486a9e6 BaseCounts now include N's
Fixing unit tests accordingly.
2011-11-09 21:29:33 -05:00
Eric Banks 82bf09edf3 Mark Standard Annotations with an asterisk 2011-11-09 20:42:31 -05:00
Eric Banks 04b122be29 Fix for bug reported on GetSatisfaction 2011-11-09 20:33:36 -05:00
Mauricio Carneiro d00b2c6599 Adding a synthetic read for filtered data
* Generalized the concept of a synthetic read to cread both running consensus and a synthetic reads of filtered data.
* Synthetic reads can now have deletions (but not insertions)
* New reduced read tag for filtered data synthetic reads *(RF)*
* Sliding window header now keeps information of consensus and filtered data
* Synthetic reads are created simultaneously, new functionality is controlled internally by addToSyntheticReads
2011-11-09 20:16:22 -05:00
Mauricio Carneiro 3afbd0e526 Sliding Window Header now includes filtered data information
This is a necessary framework for the filtered data consensus reads to be produced.
2011-11-09 20:16:22 -05:00
Mauricio Carneiro 6ee90ada14 Quick optimization to the SlidingWindow builder
from O(n^2) to O(n). Not bad.
2011-11-09 20:16:21 -05:00
Eric Banks 21bf43f3bb Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-09 15:34:40 -05:00
Eric Banks 02d5e3025e Added integration test for intervals from bed file 2011-11-09 15:34:19 -05:00
Christopher Hartl 85bffe1dca Merged bug fix from Stable into Unstable 2011-11-09 15:29:14 -05:00
Christopher Hartl d828eba7f4 Allow comments in a table-formatted file to precede the header line. 2011-11-09 15:27:38 -05:00
Eric Banks 8205efbb29 Merge branch 'master' into intervals 2011-11-09 15:27:15 -05:00