Ryan Poplin
2a8b8efd2f
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-15 16:26:35 -04:00
Ryan Poplin
2f58fdb369
Adding expected output doc to CountCovariates
2011-09-15 16:26:11 -04:00
Eric Banks
fd1831b4a5
Updating docs to include more details
2011-09-15 16:25:03 -04:00
Eric Banks
6d02a34bfb
Updating docs to include output
2011-09-15 16:17:54 -04:00
Eric Banks
4ef6a4598c
Updating docs to include output
2011-09-15 16:10:34 -04:00
Eric Banks
fe474b77f8
Updating docs so printing looks nicer
2011-09-15 16:05:39 -04:00
Eric Banks
f04e51c6c2
Adding docs from Andrey since his repo was all screwed up.
2011-09-15 15:38:56 -04:00
Eric Banks
d369d10593
Adding documentation before the release for GATK wiki page
2011-09-15 13:56:23 -04:00
Eric Banks
202405b1a1
Updating the FunctionalClass stratification in VariantEval to handle the snpEff annotations; this change really needs to be in before the release so that the pipeline can output semi-meaningful plots. This commit maintains backwards compatibility with the crappy Genomic Annotator output. However, I did clean up the code a bit so that we now use an Enum instead of hard-coded values (so it's now much easier to change things if we choose to do so in the future). I do not see this as the final commit on this topic - I think we need to make some changes to the snpEff annotator to preferentially choose certain annotations within effect classes; Mark, let's chat about this for a bit when you get back next week. Also, for the record, I should be blamed for David's temporary commit the other day because I gave him the green light (since when do you care about backwards compatibility anyway?). In any case, at least now we have something that works for both the old and new annotations.
2011-09-15 13:52:31 -04:00
David Roazen
1e682deb26
Minor html-formatting-related documentation fix to the SnpEff class.
2011-09-15 13:07:50 -04:00
David Roazen
3db457ed01
Revert "Modified VariantEval FunctionalClass stratification to remove hardcoded GenomicAnnotator keynames"
...
After discussing this with Mark, it seems clear that the old version of the
VariantEval FunctionalClass stratification is preferable to this version.
By reverting, we maintain backwards compatibility with legacy output files
from the old GenomicAnnotator, and can add SnpEff support later without
breaking that backwards compatibility.
This reverts commit b44acd1abd9ab6eec37111a19fa797f9e2ca3326.
2011-09-14 10:47:28 -04:00
David Roazen
e0c8c0ddcb
Modified VariantEval FunctionalClass stratification to remove hardcoded GenomicAnnotator keynames
...
This is a temporary and hopefully short-lived solution. I've modified
the FunctionalClass stratification to stratify by effect impact as
defined by SnpEff annotations (high, moderate, and low impact) rather
than by the silent/missense/nonsense categories.
If we want to bring back the silent/missense/nonsense stratification,
we should probably take the approach of asking the SnpEff author
to add it as a feature to SnpEff rather than coding it ourselves,
since the whole point of moving to SnpEff was to outsource genomic
annotation.
2011-09-14 07:09:47 -04:00
David Roazen
1213b2f8c6
SnpEff 2.0.2 support
...
-Rewrote SnpEff support in VariantAnnotator to support the latest SnpEff release (version 2.0.2)
-Removed support for SnpEff 1.9.6 (and associated tribble codec)
-Will refuse to parse SnpEff output files produced by unsupported versions (or without a version tag)
-Correctly matches ref/alt alleles before annotating a record, unlike the previous version
-Correctly handles indels (again, unlike the previous version)
2011-09-14 07:09:47 -04:00
Guillermo del Angel
5b1bf6e244
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-13 17:04:43 -04:00
Guillermo del Angel
c6672f2397
Intermediate (but necessary) fix for Beagle walkers: if a marker is absent in the Beagle output files, but present in the input vcf, there's no reason why it should be omitted in the output vcf. Rather, the vc is written as is from the input vcf
2011-09-13 16:57:37 -04:00
Matt Hanna
64707c33bb
Merged bug fix from Stable into Unstable
2011-09-12 21:54:11 -04:00
Matt Hanna
e63d9d8f8e
Mauricio pointed out to me that dynamically merging the unmapped regions of multiple BAMs ('-L unmapped' with a BAM list)
...
was completely broken. Sorry about this! Fixed.
2011-09-12 21:50:59 -04:00
Eric Banks
ec4b30de6d
Patch from Laurent: typo leads to bad error messages.
2011-09-12 14:45:53 -04:00
David Roazen
9d9d438bc4
New VariantAnnotatorEngine capability: an initialize() method for all annotation classes.
...
All VariantAnnotator annotation classes may now have an (optional) initialize() method
that gets called by the VariantAnnotatorEngine ONCE before annotation starts.
As an example of how this can be used, the SnpEff annotation class will use the initialize()
method to check whether the SnpEff version number stored in the vcf header is a supported
version, and also to verify that its required RodBinding is present.
2011-09-12 13:00:53 -04:00
Ryan Poplin
981b78ea50
Changing the VQSR command line syntax back to the parsed tags approach. This cleans up the code and makes sure we won't be parsing the same rod file multiple times. I've tried to update the appropriate qscripts.
2011-09-12 12:17:43 -04:00
Ryan Poplin
60ebe68aff
Fixing issue in VariantEval in which insertion and deletion events weren't treated symmetrically. Added new option to require strict allele matching.
2011-09-12 09:43:23 -04:00
Guillermo del Angel
9344938360
Uncomment code to add deleted bases covering an indel to per-sample genotype reporting, update integration tests accordingly
2011-09-10 19:41:01 -04:00
Guillermo del Angel
b399424a9c
Fix integration test affected by non-calling all-zero PL samples, and add a more complicated multi-sample integration test from a phase 1 case, GBR with mixed technologies and complex input alleles
2011-09-09 20:44:47 -04:00
Guillermo del Angel
e95d484757
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-09 18:31:14 -04:00
Guillermo del Angel
a807205fc3
a) Minor optimization to softMax() computation to avoid redundant operations, resulting in about a 5-10% increase in speed in indel calling.
...
b) Added (but left commented out since it may affect integration tests and to isolate commits) fix to per-sample DP reporting, so that deletions are included in count.
c) Bug fix to avoid having non-reference genotypes assigned to samples with PL=0,0,0. Correct behavior should be to no-call these samples, and to ignore these samples when computing AC distribution since their likelihoods are not informative.
2011-09-09 18:00:23 -04:00
Mauricio Carneiro
9e650dfc17
Fixing SelectVariants documentation
...
getting rid of messages telling users to go for the YAML file. The idea is to not support these anymore.
2011-09-09 16:25:31 -04:00
Ryan Poplin
1953edcd2d
updating Validate Variants deletion integration test
2011-09-09 13:39:08 -04:00
Ryan Poplin
9ada9b3ed4
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-09 13:15:36 -04:00
Ryan Poplin
354529bff3
adding Validate Variants integration test with a deletion
2011-09-09 13:15:24 -04:00
Ryan Poplin
91c949db74
Fixing ValidateVariants so that it validates deletion records. Fixing GATKdocs.
2011-09-09 12:57:14 -04:00
Eric Banks
51eb95d638
Missed these tests before
2011-09-09 11:46:37 -04:00
Eric Banks
6ad8943ca0
CompOverlap no longer keeps track of the number of comp sites since it wasn't keeping track of them correctly (and cannot).
2011-09-09 09:45:24 -04:00
Eric Banks
eaaba6eb51
Confirming that when stratifying by sample in VE the monomorphic sites for a given sample are not counted for the relevant metrics. Adding integration test to cover it.
2011-09-08 13:17:34 -04:00
Ryan Poplin
2636d216de
Adding indel vqsr integration test
2011-09-08 10:38:13 -04:00
Ryan Poplin
9cba1019c8
Another fix for genotype given alleles for indels. Expanding the indel integration tests to include multiallelics and indel records that overlap
2011-09-08 09:25:13 -04:00
Ryan Poplin
e0020b2b29
Fixing PrintRODs. Now has input and only prints out one copy of each record
2011-09-08 08:58:37 -04:00
Ryan Poplin
29c968ab60
clean up
2011-09-08 08:42:43 -04:00
Ryan Poplin
59841f8232
Fixing genotype given alleles for indels. Only take the records that start at this locus.
2011-09-08 08:41:16 -04:00
Guillermo del Angel
45d54f6258
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-07 16:49:49 -04:00
Guillermo del Angel
9604fb2ba3
Necessary but not sufficient step to fix GenotypeGivenAlleles mode in UG which is now busted
2011-09-07 16:49:16 -04:00
Mark DePristo
2ded027762
Removed dysfunctional tranches support from VariantEval
2011-09-07 16:09:24 -04:00
Eric Banks
aa9e32f2f1
Reverting Mark's previous commit as per the open discussion. Now the eval modules check isPolymorphic() before accruing stats when appropriate. Fixed the IndelLengthHistogram module not to error out if the indel isn't simple (that would have been bad). Only integration test that needed to be updated was the tranches one based on a separate commit from Mark.
2011-09-07 15:48:06 -04:00
Mark DePristo
d7e355b4b6
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-07 14:54:16 -04:00
Mark DePristo
9127849f5d
BugFix for unit test
2011-09-07 14:54:10 -04:00
Eric Banks
3a04955a30
We already had isPolymorphic and isMonomorphic in the VariantContext, but the implementation was incorrect for many edge cases (e.g. sites-only files, sites with samples who were no-called). Fixing. Moving on to VE now.
2011-09-07 14:01:42 -04:00
Guillermo del Angel
743bf7784c
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-07 13:21:26 -04:00
Guillermo del Angel
5f22ef9a8c
Added missing javadoc info to Beagle arguments
2011-09-07 13:21:11 -04:00
Mark DePristo
3bcbfa6e06
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-09-07 13:13:17 -04:00
Mark DePristo
430da23446
At least 2 minutes must pass before a status message is printed, further stabilizing time estimates
2011-09-07 13:13:07 -04:00
Mauricio Carneiro
6857d0324e
Merge branch 'master' into rr
2011-09-07 12:59:08 -04:00