gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Eric Banks	649dfe98f0	Add VCF header for any expressions that are requested	2011-10-28 10:22:19 -04:00
Eric Banks	19e27d4568	Removing all instances of -BTI (in tests and in GATKdocs) and replacing them with the appropriate alternative.	2011-10-27 23:55:11 -04:00
Mark DePristo	1b722c21cf	merge master	2011-10-25 16:08:39 -04:00
Mark DePristo	42bf9adede	Initial version of "fast" FragmentPileup code -- Uses mayOverlapRoutine in ReadUtils -- Attempts to be smart when doing overlap calculation, to avoid unnecessary allocations -- PileupElement now comparable (sorts on offset than on start) -- Caliper microbenchmark to assess performance	2011-10-22 21:36:37 -04:00
Guillermo del Angel	f4b409fa0d	CombineVariants bug fix: when merging records with disparate alleles we were leaving AC,AF fields intact. This had as a consequence that we could end up with a record with 3 alt alleles but only 2 values in AC,AF fields. Now, if alleles in combined vc are different from original, and if AC,AF fields can't be recomputed from genotypes, we remove attributes from vc map since they'll be invalid anyway. Integration test md5 changed since there were several badly merged records in result	2011-10-21 14:07:20 -04:00
David Roazen	4f01a742cb	Merged bug fix from Stable into Unstable	2011-10-13 21:39:52 -04:00
David Roazen	edfd6f8a06	Removing a public -> private dependency from the test suite. The public integration test VariantContextIntegrationTest was dependent on the private walker TestVariantContextWalker. Moved this walker to public/java/test (NOT public/java/src, since this walker is only used by the test suite) to avoid errors during public-only tests.	2011-10-13 21:32:52 -04:00
Mark DePristo	404ef741f1	Merged bug fix from Stable into Unstable	2011-10-13 18:02:06 -04:00
Mark DePristo	2ebdff074c	Update MD5s for SOLiD recalibration -- MD5 db had spelling error; fixed -- Bug in AlignmentUtils resulted in some bases not being color space corrected. The integration test caught the change, and it's clear that the new version is correct, as the prev. version was not considering the last the N qualities for reads with a ND operation.	2011-10-13 18:01:51 -04:00
Eric Banks	9aecd50473	Adding ability to exclude annotations from the VA and UG lists. As described in the docs, this argument trumps all others (including -all) so that we can get around the SnpEff issue brought up by Menachem. Added integration test for it.	2011-10-12 15:44:54 -04:00
David Roazen	cfd0ac8410	Merged bug fix from Stable into Unstable Conflicts: public/java/test/org/broadinstitute/sting/gatk/walkers/genotyper/UnifiedGenotyperIntegrationTest.java	2011-10-11 12:03:51 -04:00
David Roazen	24b72334b3	UnifiedGenotyper now correctly initializes the VariantAnnotator engine. This allows the annotation classes to perform any necessary initialization/validation. For example, it allows the SnpEff annotator to (among other things) validate its rod binding. This will prevent a NullPointerException when SnpEff annotation is requested but no rod binding is present. Added an integration test to cover this case so that it doesn't break again.	2011-10-11 12:02:05 -04:00
Mark DePristo	fb72bcf732	DiffObjects no longer prints out the file name in the status so MD5 are stable	2011-10-10 15:10:57 -04:00
Mark DePristo	e3ff4f4266	Failing MD5 because output now contains absolute path	2011-10-10 11:05:02 -04:00
Mark DePristo	3e6c16d961	CombineVariants preserves allele order	2011-10-10 11:04:38 -04:00
Mark DePristo	a4bb842958	RankSum tests have lightly different MD5 results based on allele order -- UG GENOTYPE_GIVEN_ALLELES now uses the order of alleles in the VCF, so this changes the MD5	2011-10-10 11:04:07 -04:00
Mark DePristo	46e7370128	this.allele, getAlleles(), and getAltAlleles() now return List not set -- Changes associated code throughout the codebase -- Updated necessary (but minimal) UnitTests to reflect new behavior -- Much better makealleles() function in VC.java that enforces a lot of key constraints in VC	2011-10-09 11:45:55 -07:00
Eric Banks	ca9cd9b688	Minor fix for merging intervals which hadn't been necessary when only merging from the left to right. Added integration tests to cover the parallelization of RTC.	2011-10-06 22:38:44 -04:00
Eric Banks	61a3dfae24	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-10-06 15:58:04 -04:00
Eric Banks	6eb87bf58a	RTC now caches all intervals as GenomeLocs (which is expected to take < 1Gb whole genome based on back of the envelope calculations with Matt) so that 1) we don't have to worry about emitting outside of the leaves in the hierarchical reductions and 2) we can emit the intervals in sorted order which is a big performance plus for the realigner. Integration tests change only because intervals whose start=stop are now printed as chr:start instead of chr:start-stop.	2011-10-06 15:57:49 -04:00
Mark DePristo	6d9c210460	Updating MD5s for updated BAM with read groups	2011-10-06 12:15:48 -07:00
Mark DePristo	ffdfdcde3f	Updating MD5s -- Interval test now uses RG containing BAM -- DoC sample name ordering has changed.	2011-10-04 15:54:45 -07:00
Mark DePristo	e1d6c7a50a	Updating MD5 that have changed due to sample ordering differences	2011-10-04 09:33:23 -07:00
Mark DePristo	343a7b6b2f	Updating UG integration tests for arbitrary impact of sample order changes on downsampling	2011-10-04 08:14:00 -07:00
Guillermo del Angel	3eef800889	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-09-24 21:20:11 -04:00
Guillermo del Angel	203517fbb7	a) Cleanups/bug fixes to previous commit to CombineVariants. b) Change md5 to reflect records that are now merged correctly. c) Change unit merge alleles test to reflect the fact that a null non-variant vc object is not valid and not supported because there's no way to codify such object in a vcf. The code correctly converts this to a non-variant single-base event with whatever the reference is at that location.	2011-09-24 19:08:00 -04:00
Guillermo del Angel	cd058dd10f	a) Fixed md5 for legit change in UG output that now also no-calls genotypes w/0,0,0 in PL's in SNP case. b) First reimplementation of new vc merger of different types. Previous version did it in two steps, first merging all vc's per type and then trying to see if resulting vc's would be merged if alleles of one type were a subset of another, but this won't work when uniquifying genotypes since sample names would be messed up and GT sample names wouldn't match VC sample names. Now, it's actually simpler: when splitting vc's by type before merging, we check for alleles of one vc being a subset of alleles of vc of another type and if so we put them together in same list.	2011-09-24 13:40:11 -04:00
Mark DePristo	8d9e136bba	Merge branch 'stable'	2011-09-24 09:26:28 -04:00
Mark DePristo	c0bb0cb465	Make DiploidGenotype enum private to walkers.genotyper	2011-09-24 08:48:33 -04:00
David Roazen	40202c85e0	Merged bug fix from Stable into Unstable	2011-09-23 16:35:55 -04:00
David Roazen	e1cb5f6459	SnpEff annotator now assigns a functional class to each effect and distinguishes between actual effects and mere modifiers. -We now assign a functional class (nonsense, missense, silent, or none) to each SnpEff effect, and add a SNPEFF_FUNCTIONAL_CLASS annotation to the INFO field of the output VCF. -Effects are now prioritized according to both biological impact and functional class, instead of impact only. -Many of SnpEff's "low-impact" effects are now classified as "modifiers" with lower priority than every other effect. This includes such "effects" as DOWNSTREAM, UPSTREAM, INTRON, GENE, EXON, and others that really describe the location of the variant rather than its biological effect. This code will be short-lived (likely 1.2-only), as the next version of SnpEff will include most of these features directly. Checking this change into Stable+Unstable instead of Unstable because the current functional class stratification in VariantEval is basically broken and urgently needs to be fixed for production purposes.	2011-09-23 16:06:52 -04:00
Eric Banks	a8e0fb26ea	Updating md5 because the file changed	2011-09-23 07:33:20 -04:00
Eric Banks	15a410b24b	Updating md5 for fixed file	2011-09-22 13:15:41 -04:00
Mark DePristo	3fdee2b9ed	Merge from stable into unstable	2011-09-22 11:19:43 -04:00
Mark DePristo	c514df6d18	Merge of stable into unstable	2011-09-22 10:34:27 -04:00
Mark DePristo	f81a41b889	Updating MD5s for CombineVariants -- Old version had broken RSIDs, new version is fixed. No longer see rs1234,. as it is now just rs1234	2011-09-22 10:30:25 -04:00
Eric Banks	b8ea9ceb68	Adding integration test that uses the -V:dbsnp binding to make sure it won't fail later on if someone messes with Tribble.	2011-09-21 22:43:31 -04:00
Mark DePristo	74f9ccf6dd	Merge	2011-09-21 11:30:11 -04:00
Mark DePristo	7d11f93b82	Final bugfix for CombineVariants -- Now handles multiple records at a site, so that you don't see records like set=dbsnp-dbsnp-dbsnp when combining something with dbsnp -- Proper handling of ids. If you are merging files with multiple ids for the same record, the ids are merged into a comma separated list	2011-09-21 10:58:32 -04:00
Mark DePristo	a91ac0c5db	Intermediate commit of bugfixes to CombineVariants	2011-09-21 10:15:05 -04:00
David Roazen	b04d8eab55	Merged bug fix from Stable into Unstable	2011-09-20 17:24:14 -04:00
David Roazen	d9ea764611	SnpEff annotator now adds OriginalSnpEffVersion and OriginalSnpEffCmd lines to the header of the VCF output file. This change is urgently required for production, which is why it's going into Stable+Unstable instead of just Unstable. The keys for the SnpEff version and command header lines in the VCF file output by VariantAnnotator (OriginalSnpEffVersion and OriginalSnpEffCmd) are intentionally different from the keys for those same lines in the SnpEff output file (SnpEffVersion and SnpEffCmd), so that output files from VariantAnnotator won't be confused with output files from SnpEff itself.	2011-09-20 16:30:55 -04:00
Guillermo del Angel	7fa1e237d9	Forgot to git stash pop new MD5's for CombineVariants integration test	2011-09-16 12:53:54 -04:00
David Roazen	d78e00e5b2	Renaming VariantAnnotator SnpEff keys This is to head off potential confusion with the output from the SnpEff tool itself, which also uses a key named EFF.	2011-09-15 17:42:15 -04:00
Eric Banks	9dc6354130	Oops didn't mean to touch this test before	2011-09-15 16:55:24 -04:00
Eric Banks	202405b1a1	Updating the FunctionalClass stratification in VariantEval to handle the snpEff annotations; this change really needs to be in before the release so that the pipeline can output semi-meaningful plots. This commit maintains backwards compatibility with the crappy Genomic Annotator output. However, I did clean up the code a bit so that we now use an Enum instead of hard-coded values (so it's now much easier to change things if we choose to do so in the future). I do not see this as the final commit on this topic - I think we need to make some changes to the snpEff annotator to preferentially choose certain annotations within effect classes; Mark, let's chat about this for a bit when you get back next week. Also, for the record, I should be blamed for David's temporary commit the other day because I gave him the green light (since when do you care about backwards compatibility anyways?). In any case, at least now we have something that works for both the old and new annotations.	2011-09-15 13:52:31 -04:00
David Roazen	3db457ed01	Revert "Modified VariantEval FunctionalClass stratification to remove hardcoded GenomicAnnotator keynames" After discussing this with Mark, it seems clear that the old version of the VariantEval FunctionalClass stratification is preferable to this version. By reverting, we maintain backwards compatibility with legacy output files from the old GenomicAnnotator, and can add SnpEff support later without breaking that backwards compatibility. This reverts commit b44acd1abd9ab6eec37111a19fa797f9e2ca3326.	2011-09-14 10:47:28 -04:00
David Roazen	e0c8c0ddcb	Modified VariantEval FunctionalClass stratification to remove hardcoded GenomicAnnotator keynames This is a temporary and hopefully short-lived solution. I've modified the FunctionalClass stratification to stratify by effect impact as defined by SnpEff annotations (high, moderate, and low impact) rather than by the silent/missense/nonsense categories. If we want to bring back the silent/missense/nonsense stratification, we should probably take the approach of asking the SnpEff author to add it as a feature to SnpEff rather than coding it ourselves, since the whole point of moving to SnpEff was to outsource genomic annotation.	2011-09-14 07:09:47 -04:00
David Roazen	1213b2f8c6	SnpEff 2.0.2 support -Rewrote SnpEff support in VariantAnnotator to support the latest SnpEff release (version 2.0.2) -Removed support for SnpEff 1.9.6 (and associated tribble codec) -Will refuse to parse SnpEff output files produced by unsupported versions (or without a version tag) -Correctly matches ref/alt alleles before annotating a record, unlike the previous version -Correctly handles indels (again, unlike the previous version	2011-09-14 07:09:47 -04:00
Guillermo del Angel	5b1bf6e244	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-09-13 17:04:43 -04:00

1 2 3 4

195 Commits (c1da8cd5e7376b70d3d2d8b8735f2140b999727c)