gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Eric Banks	43b0c98298	Fix docs	2011-09-27 20:34:46 -04:00
Eric Banks	232a6df11c	Add longhand form to the error message.	2011-09-27 20:29:31 -04:00
Eric Banks	1d6fcb6eb1	Revert "Add longhand form to the error message to prevent users from posting borderline dumb posts to GS." This reverts commit 75b2600527cfce05ae683cb394290ff2a80e8552.	2011-09-27 20:27:00 -04:00
Eric Banks	269b9826b6	Add longhand form to the error message to prevent users from posting borderline dumb posts to GS.	2011-09-27 20:26:36 -04:00
Mauricio Carneiro	3b6e43b7c4	Use reads that span multiple intervals * RR will now compress reads that span across multiple intervals correctly and output them in the correct order. * Fixed bug in getReadCoordinateForReferenceCoordinate where if the requested reference coordinate fell inside a deletion in the read the read would be clipped up to one element past the deletion.	2011-09-27 18:39:06 -04:00
Khalid Shakir	84bd355690	Merged bug fix from Stable into Unstable	2011-09-27 14:34:39 -04:00
Khalid Shakir	b090751f62	Fixed Ant / PluginManager issue where reflections was picking up all class files under current working directory due to "." in jar manifest classpaths. Updates to HybridSelectionPipeline: - Added annotations back via snpEff - Minor updates to VQSR paths and lowered memory	2011-09-27 14:33:57 -04:00
Eric Banks	26e71f6688	The Omni files have multiple records (with the same ALT) at a particular location, with one PASSing and the other(s) filtered. Chris, this is why using this file as both eval and comp leads to ref/no-call cells in the GenotypeConcordance table. However, this led to non-determinism in VE because the VCs were placed in a HashSet; we use a LinkedHashMap instead to bring back determinism.	2011-09-27 11:03:17 -04:00
Guillermo del Angel	ceffefa6a6	Intermediate version with banded pair HMM	2011-09-27 10:18:58 -04:00
Mark DePristo	e99ff3caae	Removed lots of old, and not to be used, HMM options -- resulted in massive code cleanup -- GdA will integrate his new banded algorithm here -- Removed: DO_CONTEXT_DEPENDENT_PENALTIES, GET_GAP_PENALTIES_FROM_DATA, INDEL_RECAL_FILE, dovit, GSA_PRODUCTION_ONLY	2011-09-27 10:08:40 -04:00
Mark DePristo	fa0efbc4ca	Refactoring of PairHMM to support reduced reads	2011-09-26 13:28:56 -04:00
Mark DePristo	a6b65d6347	Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-09-26 13:26:21 -04:00
Mark DePristo	4f09453470	Refactored reduced read utilities -- UnitTests for key functions on reduced reads -- PileupElement calls static functions in ReadUtils -- Simple routine that takes a reduced read and fills in its quals with its reduced qual	2011-09-26 12:58:31 -04:00
Eric Banks	234b74dd05	Merged bug fix from Stable into Unstable	2011-09-26 11:47:23 -05:00
Eric Banks	317b95fa57	Fixing some annotator docs	2011-09-26 11:46:45 -05:00
Mauricio Carneiro	b76dbc72f0	Fixed interval navigation bug. If a read was hard clipped away from the current interval, all subsequent reads within that interval (not hardclipped) would be filtered out. Fixed.	2011-09-26 08:13:44 -04:00
Guillermo del Angel	9afccd11b1	Minor refactoring: add ability to MathUtils.normalizeFromLog10 to not go to linear domain but just subtract max value from log values and return. Use this function in snp and indel GL computation.	2011-09-25 21:18:56 -04:00
Guillermo del Angel	3eef800889	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-09-24 21:20:11 -04:00
Guillermo del Angel	4707ab4a7d	Added unit tests to test genotype merges with PL's	2011-09-24 21:17:15 -04:00
Guillermo del Angel	203517fbb7	a) Cleanups/bug fixes to previous commit to CombineVariants. b) Change md5 to reflect records that are now merged correctly. c) Change unit merge alleles test to reflect the fact that a null non-variant vc object is not valid and not supported because there's no way to codify such object in a vcf. The code correctly converts this to a non-variant single-base event with whatever the reference is at that location.	2011-09-24 19:08:00 -04:00
Mauricio Carneiro	c31f4cb2f6	Cleaning leading insertions With the current implementation, a read cannot start with a deletion or an insertion. Maybe this will change in the future, but for now, chop the leading insertion off.	2011-09-24 14:33:32 -04:00
Guillermo del Angel	cd058dd10f	a) Fixed md5 for legit change in UG output that now also no-calls genotypes w/0,0,0 in PL's in SNP case. b) First reimplementation of new vc merger of different types. Previous version did it in two steps, first merging all vc's per type and then trying to see if resulting vc's would be merged if alleles of one type were a subset of another, but this won't work when uniquifying genotypes since sample names would be messed up and GT sample names wouldn't match VC sample names. Now, it's actually simpler: when splitting vc's by type before merging, we check for alleles of one vc being a subset of alleles of vc of another type and if so we put them together in same list.	2011-09-24 13:40:11 -04:00
Mark DePristo	bb11951255	Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-09-24 09:26:45 -04:00
Mark DePristo	8d9e136bba	Merge branch 'stable'	2011-09-24 09:26:28 -04:00
Mark DePristo	6804ab6d2f	Bug fix for NPE in very short GATK runs -- Was already in unstable, but not stable...	2011-09-24 09:25:29 -04:00
Mark DePristo	92acff46e5	Moved Haplotype into Utils root	2011-09-24 09:14:05 -04:00
Mark DePristo	f792353dcd	Framework for genotype unit test	2011-09-24 08:56:45 -04:00
Mark DePristo	c0bb0cb465	Make DiploidGenotype enum private to walkers.genotyper	2011-09-24 08:48:33 -04:00
Guillermo del Angel	3a4469a236	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-09-23 21:58:34 -04:00
Guillermo del Angel	0e74cc3c74	a) Treat SNP genotype likelihoods just as indels, in the sense that they're always normalized as PL's so one of them will always be zero. This creates minor numerical differences in Qual and annotations due to numerical approximations in AF computation. b) Intermediate CombineVariants fixes, not ready yet	2011-09-23 21:58:20 -04:00
Khalid Shakir	1803bd6ae2	Merged bug fix from Stable into Unstable	2011-09-23 21:05:00 -04:00
Khalid Shakir	8ceb93b8ac	Fixed an integration test which crashed on the out of date LSF DRMAA library when run against the obsolete LSF dotkit instead of .combined_LSF_SGE	2011-09-23 21:03:22 -04:00
Mauricio Carneiro	7cac75ae1d	Merged bug fix from Stable into Unstable	2011-09-23 19:00:43 -04:00
Mauricio Carneiro	fbe3c1e0b3	Adding warning on HardClipping Hard Clipping is still under heavy development and should not be used by anyone less prepared than MacGyver.	2011-09-23 19:00:19 -04:00
Mark DePristo	b66841f179	Static cache for binomial probability -- Very low level performance optimization	2011-09-23 17:29:34 -04:00
Mauricio Carneiro	1a45c331b2	bringing the latest bug fixes to Reduce Reads	2011-09-23 16:40:06 -04:00
Mauricio Carneiro	9ea40f2e41	Deletions/Insertions in hard clip and bug fixes * Deletions now count as hard clipped bases in order to recover the original alignment start of a clipped read. * Insertions do not count as hard clipped bases for the same reason. * This created a bug in the previous cigar cleaning function. Fixed.	2011-09-23 16:37:08 -04:00
David Roazen	40202c85e0	Merged bug fix from Stable into Unstable	2011-09-23 16:35:55 -04:00
David Roazen	e1cb5f6459	SnpEff annotator now assigns a functional class to each effect and distinguishes between actual effects and mere modifiers. -We now assign a functional class (nonsense, missense, silent, or none) to each SnpEff effect, and add a SNPEFF_FUNCTIONAL_CLASS annotation to the INFO field of the output VCF. -Effects are now prioritized according to both biological impact and functional class, instead of impact only. -Many of SnpEff's "low-impact" effects are now classified as "modifiers" with lower priority than every other effect. This includes such "effects" as DOWNSTREAM, UPSTREAM, INTRON, GENE, EXON, and others that really describe the location of the variant rather than its biological effect. This code will be short-lived (likely 1.2-only), as the next version of SnpEff will include most of these features directly. Checking this change into Stable+Unstable instead of Unstable because the current functional class stratification in VariantEval is basically broken and urgently needs to be fixed for production purposes.	2011-09-23 16:06:52 -04:00
Matt Hanna	e388c357ca	Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-09-23 14:53:28 -04:00
Matt Hanna	cc23b0b8a9	Fix for recent change modelling unmapped shards: don't invoke optimization to combine mapped and unmapped shards.	2011-09-23 14:52:31 -04:00
Mark DePristo	e3d4efb283	Remove N2 EXACT model code, which should never be used	2011-09-23 11:55:21 -04:00
Mark DePristo	27ce3c822e	Merge branch 'stable'	2011-09-23 09:04:52 -04:00
Mark DePristo	2bb77a7978	Docs for all VariantAnnotator annotations	2011-09-23 09:04:16 -04:00
Mark DePristo	dd65ba5bae	@Hidden for DocumentationTest and GATKDocsExample	2011-09-23 09:03:37 -04:00
Mark DePristo	dfce301beb	Looks for @Hidden annotation on all classes and excludes them from the docs	2011-09-23 09:03:04 -04:00
Mark DePristo	106a26c42d	Minor file cleanup	2011-09-23 08:25:20 -04:00
Mark DePristo	a9f073fa68	Genotype merging unit tests for simpleMerge -- Remaining TODOs are all for GdA	2011-09-23 08:24:49 -04:00
Mark DePristo	4397ce8653	Moved removePLs to VariantContextUtils	2011-09-23 08:24:20 -04:00
Eric Banks	a8e0fb26ea	Updating md5 because the file changed	2011-09-23 07:33:20 -04:00
Mark DePristo	c49cc623de	Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-09-22 17:26:21 -04:00
Mark DePristo	dab7232e9a	simpleMerge UnitTest for not annotating and annotating to different info key	2011-09-22 17:26:11 -04:00
Mark DePristo	30ab3af0c8	A few more simpleMerge UnitTest tests for filtered vcs	2011-09-22 17:14:59 -04:00
Mark DePristo	5cf82f9236	simpleMerge UnitTest tests filtered VC merging	2011-09-22 17:05:12 -04:00
Mark DePristo	46ca33dc04	TestDataProvider now can be named	2011-09-22 17:04:32 -04:00
Mauricio Carneiro	96c875399c	Merging many bug fixes to reduce reads	2011-09-22 17:04:11 -04:00
Mauricio Carneiro	39b54211d0	Fixed hard clipping soft clipped bases after hard clips if soft clipped bases were after a hard clipped section of the read, the hard clip was clipping the left soft clip tail as if it were a right tail. Mayhem.	2011-09-22 15:46:55 -04:00
Mark DePristo	68da555932	UnitTest for simpleMerge for alleles	2011-09-22 15:16:37 -04:00
Mauricio Carneiro	1acf7945c5	Fixed hard clipped cigar and alignment start * Hard clipped Cigar now includes all insertions that were hard clipped and not the deletions. * The alignment start is now recalculated according to the new hard clipped cigar representation	2011-09-22 14:51:14 -04:00
Eric Banks	80d7300de4	Unit test was passing in FORMAT as one of the sample names. There used to be a hack in the VCFHeader to check for this and remove it and I couldn't figure out why, but now I know. Hack was removed and now the unit test passes in only the sample names as per the contract.	2011-09-22 13:28:42 -04:00
Mauricio Carneiro	4e9020c9f7	Fixed alignment start for hard clipping insertions	2011-09-22 13:28:25 -04:00
Eric Banks	9c1728416c	Revert "Updating md5 for fixed file" because this was fixed properly in unstable (but will break SnpEff if put into Stable). This reverts commit 6b4182c6ab3e214da4c73bc6f3687ac6d1c0b72c.	2011-09-22 13:16:42 -04:00
Eric Banks	888d8697b1	Merged bug fix from Stable into Unstable	2011-09-22 13:16:31 -04:00
Eric Banks	15a410b24b	Updating md5 for fixed file	2011-09-22 13:15:41 -04:00
Mark DePristo	ba5f83fee2	start of VariantContextUtils UnitTest -- tests rsID merging	2011-09-22 12:10:39 -04:00
Mark DePristo	93dd1faa5f	Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-09-22 11:20:10 -04:00
Mark DePristo	a05c959e5a	Empty unit tests for VariantContextUtils -- will be expanded over the day	2011-09-22 11:20:07 -04:00
Mark DePristo	3fdee2b9ed	Merge from stable into unstable	2011-09-22 11:19:43 -04:00
Christopher Hartl	4f4a0fc38a	Merge branch 'master' of ssh://gsa2/humgen/gsa-scr1/chartl/dev/git	2011-09-22 11:01:58 -04:00
Christopher Hartl	982c47bfa7	Remove duplicate effort in ReadUtils (with apologies to Mauricio) Big (but not major) cleanup of code in ILG - mostly excising the old likelihood model Activated the early-abort check for ILG. I think it should be better this way.	2011-09-22 10:58:26 -04:00
Mark DePristo	c514df6d18	Merge of stable into unstable	2011-09-22 10:34:27 -04:00
Mark DePristo	f81a41b889	Updating MD5s for CombineVariants -- Old version had broken RSIDs, new version is fixed. No longer see rs1234,. as it is now just rs1234	2011-09-22 10:30:25 -04:00
Eric Banks	b8ea9ceb68	Adding integration test that uses the -V:dbsnp binding to make sure it won't fail later on if someone messes with Tribble.	2011-09-21 22:43:31 -04:00
Eric Banks	8f8b59a932	My interpretation of the VCF spec is that the FORMAT field should only be present if there is genotype/sample data. So the VCFCodec now throws an exception when it encounters such a case. I had to fix one of the integration test VCFs.	2011-09-21 22:23:28 -04:00
Christopher Hartl	dc96f6da79	Merge branch 'master' of ssh://chartl@gsa2/humgen/gsa-scr1/chartl/dev/git	2011-09-21 18:18:41 -04:00
Christopher Hartl	f9cdc119af	Added a method to ReadUtils that converts reads of the form 10S20M10S to 40M (just unclips the soft-clips). Be careful when using this - if you're writing a bam file it will be potentially written out of order (since the previous alignment start was at the M, not the S).	2011-09-21 18:16:42 -04:00
Christopher Hartl	faff6e4019	Failed to commit changes to the GATKReport required for more easy access when using the files as data sources (read: histograms) for walkers	2011-09-21 18:15:23 -04:00
Mauricio Carneiro	96768c8a18	Sending latest bug fixes to Reduce Reads to the main repository	2011-09-21 17:43:11 -04:00
Mauricio Carneiro	70335b2b0a	Hard clipping soft clipped reads to fix misalignments. Pre-softclipped reads (with high qual) are a complicated event to deal with in the Reduced Reads environment. I chose to hard clip them out for now and added a todo item to bring them back on in the future, perhaps as a variant region.	2011-09-21 17:12:01 -04:00
Christopher Hartl	ef05827c7b	Merge branch 'master' of ssh://chartl@tin.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-09-21 16:40:47 -04:00
Christopher Hartl	3b51d9106a	Adding in likelihood calculations for mendelian violations. Also fixing a minor and rare bug in SelectVariants when specifying family structure on the command line.	2011-09-21 16:40:29 -04:00
Mark DePristo	04968c88b3	Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-09-21 15:43:25 -04:00
Mark DePristo	6bcfce225f	Fix for dynamic type determination for bgzip files -- GZipInputStream handles bgzip files under linux, but not mac -- Added BlockCompressedInputStream test as well, which works properly on bgzip files	2011-09-21 15:39:19 -04:00
Mark DePristo	9f6f0c443c	Marginally cleaner isVCFStream() function -- cleanup trying to debug minor bug. Failed to fix the bug, but the code is nicer now	2011-09-21 15:25:01 -04:00
Ryan Poplin	5fef6dc5d0	Merged bug fix from Stable into Unstable	2011-09-21 15:23:06 -04:00
Ryan Poplin	2585fc3d6c	Updating Rscript path doc text for Broad users	2011-09-21 15:22:26 -04:00
Mark DePristo	74f9ccf6dd	Merge	2011-09-21 11:30:11 -04:00
Mark DePristo	6592972f82	Putative fix for BAQ array out of bounds -- Old code required qual to be <64, which isn't strictly necessary. Now uses the Picard SAMUtils.MAX_PHRED_SCORE constant -- Unittest to enforce this behavior	2011-09-21 11:25:08 -04:00
Eric Banks	174859fc68	Don't allow whitespace in the INFO field	2011-09-21 11:14:54 -04:00
Mark DePristo	ecc7f34774	Putative fix for BAQ problem.	2011-09-21 11:09:54 -04:00
Mark DePristo	7d11f93b82	Final bugfix for CombineVariants -- Now handles multiple records at a site, so that you don't see records like set=dbsnp-dbsnp-dbsnp when combining something with dbsnp -- Proper handling of ids. If you are merging files with multiple ids for the same record, the ids are merged into a comma separated list	2011-09-21 10:58:32 -04:00
Mark DePristo	a91ac0c5db	Intermediate commit of bugfixes to CombineVariants	2011-09-21 10:15:05 -04:00
David Roazen	b04d8eab55	Merged bug fix from Stable into Unstable	2011-09-20 17:24:14 -04:00
Mauricio Carneiro	758ecf2d43	Bringing latest updates of ReduceReads to the master repository	2011-09-20 16:35:09 -04:00
David Roazen	d9ea764611	SnpEff annotator now adds OriginalSnpEffVersion and OriginalSnpEffCmd lines to the header of the VCF output file. This change is urgently required for production, which is why it's going into Stable+Unstable instead of just Unstable. The keys for the SnpEff version and command header lines in the VCF file output by VariantAnnotator (OriginalSnpEffVersion and OriginalSnpEffCmd) are intentionally different from the keys for those same lines in the SnpEff output file (SnpEffVersion and SnpEffCmd), so that output files from VariantAnnotator won't be confused with output files from SnpEff itself.	2011-09-20 16:30:55 -04:00
Mark DePristo	bffd3cca6f	Bug fix for reduced read; only adds regular bases for calculation -- No longer passes on deletions for genotyping	2011-09-20 15:07:06 -04:00
Mark DePristo	a1b4cafe7a	Bug fix for NPE when timer wasn't initialized	2011-09-20 13:59:59 -04:00
Mark DePristo	b7511c5ff3	Fixed long-standing bug in tribble index creation -- Previously, on the fly indices didn't have dictionary set on the fly, so the GATK would read, add dictionary, and rewrite the index. This is now fixed, so that the on the fly index contains the reference dictionary when first written, avoiding the unnecessary read and write -- Added a GenomeAnalysisEngine and Walker function called getMasterSequenceDictionary() that fetches the reference sequence dictionary. This can be used conveniently everywhere, and is what's written into the Tribble index -- Refactored tribble index utilities from RMDTrackBuilder into IndexDictionaryUtils -- VCFWriter now requires the master sequence dictionary -- Updated walkers that create VCFWriters to provide the master sequence dictionary	2011-09-20 10:53:18 -04:00
Mark DePristo	230e16d7c0	Merge branch 'master' into rodrewrite	2011-09-20 06:54:18 -04:00
Mark DePristo	aa8afa3899	Merge	2011-09-19 21:16:47 -04:00

849 Commits (c1cf6bc45ac8dfed24c7ec13bbf0e843f6d7cf2e)