Commit Graph

1119 Commits (7b2a7cfbe763cfa8aeca37205dfe9b1ffedee094)

Author SHA1 Message Date
Eric Banks 7b2a7cfbe7 Transfer headers from the resource VCF when possible when using expressions. While there, VA was modified so that it didn't assume that the ID field was present in the VC's info map in preparation for Mark's upcoming changes. 2011-11-14 14:31:27 -05:00
Eric Banks 7aee80cd3b Fix to deal with reduced reads containing a deletion 2011-11-14 12:23:46 -05:00
Eric Banks 3d2970453b Misc minor cleanup 2011-11-14 09:41:54 -05:00
Eric Banks b7c33116af Minor docs update 2011-11-12 23:21:07 -05:00
Eric Banks 76d357be40 Updating docs example to use -L since that's best practice 2011-11-12 23:20:05 -05:00
Guillermo del Angel cd3146f4cf Add hidden option to ValidationAmplicons to output slightly modified format to make file work with downstream SQNM tools more seamlessly at request of GAP: one line per record, keep probe identifier to 20 characters, no * in ref allele. 2011-11-11 14:07:07 -05:00
Ryan Poplin 40fbeafa37 VQSR will now detect if the negative model failed to converge properly because of having too few data points and automatically retry with more appropriate clustering parameters. 2011-11-11 11:52:30 -05:00
Mark DePristo dc9b351b5e Meaningful error message when an IntervalArg file fails to parse correctly 2011-11-10 17:10:26 -05:00
Mark DePristo bb7bf74aa8 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-10 16:05:43 -05:00
Mark DePristo 153e52ffed VariantEvalIntegrationTest for IntervalStratification 2011-11-10 14:10:39 -05:00
Mauricio Carneiro 060c7ce8ae It wouldn't harm integrationtests if we had our logic right... :-) 2011-11-10 14:03:22 -05:00
Eric Banks 39678b6a20 Check for reads with missing read groups and throw a UserException when encountered. Mauricio said this wouldn't break integration tests. 2011-11-10 13:34:45 -05:00
Mark DePristo dd1810140f -stratIntervals is optional 2011-11-10 13:27:32 -05:00
Mark DePristo 67b022c34b Cleanup for new SampleUtils function
-- getVCFHeadersFromRods(rods) is now available so that you don't have getVCFHeadersFromRods(rods, null) throughout the codebase
2011-11-10 13:27:13 -05:00
Mark DePristo 35fe9c8a06 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-10 11:11:33 -05:00
Mark DePristo dc4932f93d VariantEval module to stratify the variants by whether they overlap an interval set
The primary use of this stratification is to provide a mechanism to divide asssessment of a call set up by whether a variant overlaps an interval or not.  I use this to differentiate between variants occurring in CCDS exons vs. those in non-coding regions, in the 1000G call set, using a command line that looks like:

-T VariantEval -R human_g1k_v37.fasta -eval 1000G.vcf -stratIntervals:BED ccds.bed -ST IntervalStratification

Note that the overlap algorithm properly handles symbolic alleles with an INFO field END value.  In order to safely use this module you should provide entire contigs worth of variants, and let the interval strat decide overlap, as opposed to using -L which will not properly work with symbolic variants.

Minor improvements to create() interval in GenomeLocParser.
2011-11-10 10:58:40 -05:00
Mauricio Carneiro 0d8983feee outputting the RG information
setReadGroup now sets the read group attribute for the GATKSAMRecord
2011-11-09 23:35:00 -05:00
Eric Banks 315ac68b0b Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-09 22:37:36 -05:00
Eric Banks 6313aae2c4 Adding checks for hasBasePileup() before calling getBasePileup() as per GS thread 2011-11-09 22:37:26 -05:00
Ryan Poplin 74a18d3de8 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-09 22:29:40 -05:00
Ryan Poplin 24712c0221 Merged bug fix from Stable into Unstable 2011-11-09 22:28:27 -05:00
Ryan Poplin 8942406aa2 Use MathUtils to compare doubles instead of testing for equality 2011-11-09 22:05:21 -05:00
Ryan Poplin 348f2db7fd Fix for HMM optimization. If the two penalty arrays match exactly the function should return the end of the array instead of 0. 2011-11-09 22:00:52 -05:00
Eric Banks 82bf09edf3 Mark Standard Annotations with an asterisk 2011-11-09 20:42:31 -05:00
Eric Banks 04b122be29 Fix for bug reported on GetSatisfaction 2011-11-09 20:33:36 -05:00
Mauricio Carneiro d00b2c6599 Adding a synthetic read for filtered data
* Generalized the concept of a synthetic read to cread both running consensus and a synthetic reads of filtered data.
* Synthetic reads can now have deletions (but not insertions)
* New reduced read tag for filtered data synthetic reads *(RF)*
* Sliding window header now keeps information of consensus and filtered data
* Synthetic reads are created simultaneously, new functionality is controlled internally by addToSyntheticReads
2011-11-09 20:16:22 -05:00
Eric Banks 21bf43f3bb Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-11-09 15:34:40 -05:00
Eric Banks 02d5e3025e Added integration test for intervals from bed file 2011-11-09 15:34:19 -05:00
Christopher Hartl 85bffe1dca Merged bug fix from Stable into Unstable 2011-11-09 15:29:14 -05:00
Christopher Hartl d828eba7f4 Allow comments in a table-formatted file to precede the header line. 2011-11-09 15:27:38 -05:00
Eric Banks 8205efbb29 Merge branch 'master' into intervals 2011-11-09 15:27:15 -05:00
Eric Banks d64f8a89a9 Instead of the SelfScopingFeatureCodec interface, pushed this functionality into Tribble itself. Now we can e.g. determine that a file can be parsed by the BedCodec on the fly. 2011-11-09 15:24:29 -05:00
Mauricio Carneiro f080f64f99 Preserve RG information on new GATKSAMRecord from SAMRecord 2011-11-09 14:39:20 -05:00
Mauricio Carneiro f9530e0768 Clean unnecessary attributes from the read
this gives on average 40% file size reduction.
2011-11-09 14:39:20 -05:00
Mauricio Carneiro 9427ada498 Fixing no cigar bug
empty GATKSAMRecords will have a null cigar. Treat them accordingly.
2011-11-09 14:39:20 -05:00
Mark DePristo e639f0798e mergeEvals allows you to treat -eval 1.vcf -eval 2.vcf as a single call set
-- A bit of code cleanup in VCFUtils
-- VariantEval table to create 1000G Phase I variant summary table
-- First version of 1000G Phase I summary table Qscript
2011-11-09 14:35:50 -05:00
Christopher Hartl 149b79eaad Merged bug fix from Stable into Unstable 2011-11-09 11:26:30 -05:00
Christopher Hartl 11abb4f9d1 Better error message. 2011-11-09 11:25:28 -05:00
Christopher Hartl d3a533b82e Revert "a"
This reverts commit 1175f50ddbf389f5da74d27dc725596582ae15af.
2011-11-09 11:22:26 -05:00
Christopher Hartl 5eaf800281 a 2011-11-09 11:22:20 -05:00
Christopher Hartl 5451fbc2b2 Merged bug fix from Stable into Unstable 2011-11-09 11:06:15 -05:00
Christopher Hartl 091229e4db MVLikelihoodRatio now checks if the family string is provided before attempting to instantiate. Also check that variant contexts have both genotypes and genotype likelihoods.
Table codec now yells at users for not providing a HEADER with the table - parsing tables without a header line was causing the first line of the file to be eaten.
Table feature now has a toString method.

These are minor bug fixes.
2011-11-09 11:03:29 -05:00
Mauricio Carneiro e1b4c3968f Fixing GATKSAMRecord bug
when constructing a GATKSAMRecord from scratch, we should set "mRestOfBinaryData" to null so the BAMRecord doesn't try to retrieve missing information from the non-existent bam file.
2011-11-08 16:50:36 -05:00
Ryan Poplin e973ca2010 fixing merge conflict. 2011-11-08 14:55:05 -05:00
Ryan Poplin b0e6afec48 Bug fix for HMM optimization. Need to also check the gap continuation penalty array for the index with the first discrepancy. 2011-11-08 14:51:25 -05:00
Ryan Poplin 94dc447a70 Merged bug fix from Stable into Unstable 2011-11-07 15:26:35 -05:00
Ryan Poplin 0b181be61f Bug fix in SelectVariants when using a discordance track but no sample specifications. Added integration test to test this. 2011-11-07 15:25:16 -05:00
Ryan Poplin 0534149708 Merged bug fix from Stable into Unstable 2011-11-07 14:07:08 -05:00
Ryan Poplin 2d1e385ca4 Adding note to VQSR docs about Rscript being needed in the environment PATH. 2011-11-07 14:04:13 -05:00
Eric Banks 759f4fe6b8 Moving unclaimed walker with bad integration test to archive 2011-11-07 13:16:38 -05:00