gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Mark DePristo	ab2efe3bd3	Reverting bad exact model changes	2011-11-21 16:14:40 -05:00
Eric Banks	44554b2bfd	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-11-21 15:01:45 -05:00
Eric Banks	022832bd74	Very bad use of the == operator with Strings was ensuring that validating GenomeLocs was very inefficient. This fix resulted in a significant speedup for a simple RodWalker.	2011-11-21 14:49:47 -05:00
Mark DePristo	1561af22af	Exact model code cleanup -- Fixed up code when fixing a bug detected by aggressive contracts in GenotypesContext.	2011-11-21 14:35:15 -05:00
Mark DePristo	2c501364b8	GenotypesContext no longer have immutability in constructor -- additional bug fixes throughout VariantContext and GenotypesContext objects	2011-11-21 14:34:31 -05:00
David Roazen	1296dd41be	Removing the legacy -L "interval1;interval2" syntax This syntax predates the ability to have multiple -L arguments, is inconsistent with the syntax of all other GATK arguments, requires quoting to avoid interpretation by the shell, and was causing problems in Queue. A UserException is now thrown if someone tries to use this syntax.	2011-11-21 13:18:53 -05:00
Mark DePristo	e467b8e1ae	More contracts on LazyGenotypesContext	2011-11-21 09:34:57 -05:00
Mark DePristo	2e9ecf639e	Generalized interface to LazyGenotypesContext -- Now you provide a LazyParsing object -- LazyGenotypesContext now knows nothing about the VCF parser itself. The parser holds all of the necessary data to parse the VCF genotypes when necessary, and the LGC only has a pointer to this object -- Using new interface added LazyGenotypesContext to unit tests with a simple lazy version -- Deleted VCFParser interface, as it was no longer necessary	2011-11-21 09:30:40 -05:00
Mark DePristo	bc44f6fd9e	Utility function Collection<Genotype> -> Collection<String>	2011-11-20 18:26:56 -05:00
Mark DePristo	9445326c6c	Genotype is Comparable via sampleName	2011-11-20 18:26:27 -05:00
Mark DePristo	f9e25081ab	Completed documented LazyGenotypesContext	2011-11-20 08:35:52 -05:00
Mark DePristo	9cb3fe3a59	Vastly better way of doing on-demand genotyping loading -- With our GenotypesContext class we can naturally create a LazyGenotypesContext subclass that does the on-demand loading. -- This new class has replaced all of the old, complex functionality -- Better still, there were many cases where the genotypes were being loaded unnecessarily, resulting in inefficiency. This was detected because some of the integration tests changed as the genotypes were no longer being parsed unnecessarily -- Misc. bug fixes throughout the system -- Bug fixes for PhaseByTransmission with new GenotypesContext	2011-11-20 08:23:09 -05:00
Mark DePristo	f392d330c3	Proper use of builder. Previous conversion attempt was flawed	2011-11-19 22:09:56 -05:00
Mark DePristo	7d09c0064b	Bug fixes and code cleanup throughout -- chromosomeCounts now takes builder as well, cleaning up a lot of code throughout the codebase.	2011-11-19 18:40:15 -05:00
Mark DePristo	8f7eebbaaf	Bugfix for pError not being checked correctly in CommonInfo -- UnitTests to ensure correct behavior -- UnitTests to ensure correct behavior for pass filters vs. failed filters vs. unfiltered	2011-11-19 15:58:59 -05:00
Mark DePristo	b7b57ef39a	Updating MD5 to reflect canonical ordering of calculation -- We should no longer have md5s changing because of hashmaps changing their sort order on us -- Added GenotypeLikelihoodsUnitTests -- Refactored ExactAFCaclculation to put the PL -> QUAL calculation in the GenotypeLikelihoods class to avoid the code copy.	2011-11-19 15:57:33 -05:00
Mark DePristo	73119c8e3c	Merge with master -- A few bug fixes	2011-11-19 09:56:06 -05:00
Mark DePristo	f685fff79b	Killing the final versions of old new VariantContext interface	2011-11-18 21:32:43 -05:00
Mark DePristo	6cf315e17b	Change interface to getNegLog10PError to getLog10PError	2011-11-18 21:07:30 -05:00
Mark DePristo	c7f2d5c7c7	Final minor fix to contract	2011-11-18 19:40:05 -05:00
Mauricio Carneiro	b5de182014	isEmpty now checks if mReadBases is null Since newly created reads have mReadBases == null. This is an effort to centralize the place to check for empty GATKSAMRecords.	2011-11-18 18:34:05 -05:00
Mauricio Carneiro	8ab3ee9c65	Merge remote-tracking branch 'unstable/master' into rr	2011-11-18 16:50:25 -05:00
Mauricio Carneiro	333e5de812	returning read instead of GATKSAMRecord Do not create new GATKSAMRecord when read has been fully clipped, because it is essentially the same as returning the currently fully clipped read.	2011-11-18 16:49:59 -05:00
Matt Hanna	8bb4d4dca3	First pass of the asynchronous block loader. Block loads are only triggered on queue empty at this point. Disabled by default (enable with nt:io=?).	2011-11-18 15:02:59 -05:00
Mark DePristo	a2e79fbe8a	Fixes to contracts	2011-11-18 14:18:53 -05:00
Mark DePristo	660d6009a2	Documentation and contracts for GenotypesContext and VariantContextBuilder	2011-11-18 13:59:30 -05:00
Mark DePristo	f54afc19b4	VariantContextBuilder -- New approach to making VariantContexts modeled on StringBuilder -- No more modify routines -- use VariantContextBuilder -- Renamed isPolymorphic to isPolymorphicInSamples. Same for mono -- getChromosomeCount -> getCalledChrCount -- Walkers changed to use new VariantContext. Some deprecated new VariantContext calls remain -- VCFCodec now uses optimized cached information to create GenotypesContext.	2011-11-18 12:39:10 -05:00
Eric Banks	6459784351	Merged bug fix from Stable into Unstable	2011-11-18 12:34:57 -05:00
Eric Banks	c62082ba1b	Making this class public again as per request from Cancer folks	2011-11-18 12:34:27 -05:00
Eric Banks	8710673a97	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-11-18 12:29:33 -05:00
Eric Banks	768b27322b	I figured out why we were getting tons of hom var genotype calls with Mauricio's low quality (synthetic) reduced reads: the RR implementation in the UG was not capping the base quality by the mapping quality, so all the low quality reads were used to generate GLs. Fixed.	2011-11-18 12:29:15 -05:00
Mark DePristo	7490dbb6eb	First version of VariantContextBuilder	2011-11-18 11:06:15 -05:00
Roger Zurawicki	f48d4cfa79	Bug fix: fully clipping GATKSAMRecords and flushing ops Reads that are emptied after clipping become new GATKSAMRecords. When applying ClippingOps, the ops are cleared after the clipping	2011-11-18 00:24:39 -05:00
Mark DePristo	fa454c88bb	UnitTests for VariantContext for chrCount, getSampleNames, Order function -- Major change to how chromosomeCounts is computed. Now NO_CALL alleles are always excluded. So ChromosomeCounts(A/.) is 1, the previous result would have been 2. -- Naming changes for getSamplesNameInOrder()	2011-11-17 20:37:22 -05:00
Mark DePristo	23359d1c6c	Bugfix for pruneVariantContext, which was dropping the ref base for padding	2011-11-17 15:32:52 -05:00
Mark DePristo	473b860312	Major determinism fix for UG and RankSumTest -- Now these routines all iterate in sample name order (genotypes.iterateInSampleNameOrder) so that the results of UG and the annotator do not depend on the particular order of samples we see for the exact model and the RankSumTest	2011-11-17 15:31:45 -05:00
Khalid Shakir	c50274e02e	During flanking interval creation, merge overlapping flanks so that on scatter the list doesn't accidentally genotype the same site twice. Moved flanking interval utilities to IntervalUtils with UnitTests.	2011-11-17 13:56:42 -05:00
Eric Banks	16a021992b	Updated header description for the INFO and FORMAT DP fields to be more accurate.	2011-11-17 13:17:53 -05:00
Eric Banks	e7d41d8d33	Minor cleanup	2011-11-17 12:00:28 -05:00
Mark DePristo	7e66677769	Expanded UnitTests for VariantContext Tests for -- getGenotype and getGenotypes -- subContextBySample -- modify routines	2011-11-16 20:45:15 -05:00
Mark DePristo	aa0610ea92	GenotypeCollection renamed to GenotypesContext	2011-11-16 16:24:05 -05:00
Mark DePristo	caf6080402	Better algorithm for merging genotypes in CombineVariants	2011-11-16 15:17:33 -05:00
Mark DePristo	e56d52006a	Continuing bugfixes to get new VC working	2011-11-16 10:39:17 -05:00
Matt Hanna	eb8e031f75	Merged bug fix from Stable into Unstable	2011-11-16 09:57:37 -05:00
Matt Hanna	6a5d5e7ac9	Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/stable	2011-11-16 09:57:13 -05:00
Matt Hanna	7ac5cf8430	Getting rid of unsupported CountReadPairs walker in stable. Removal of remainder of pairs processing framework to follow in unstable.	2011-11-16 09:53:59 -05:00
Eric Banks	c2ebe58712	Merge remote-tracking branch 'Laurent/master'	2011-11-16 09:34:47 -05:00
Laurent Francioli	0dc3d20d58	Corrected bug causing PhaseByTransmission to crash in case of new Genotype.Type	2011-11-16 09:33:13 +01:00
Laurent Francioli	7d77fc51f5	Corrected bug causing PhaseByTransmission to crash in case of new Genotype.Type	2011-11-16 03:32:43 -05:00
David Roazen	0d163e3f52	SnpEff 2.0.4 support -Modified the SnpEff parser to work with the SnpEff 2.0.4 VCF output format -Assigning functional classes and effect impacts now handled directly by SnpEff rather than the GATK -Removed support for SnpEff 2.0.2, as we no longer trust the output of that version since it doesn't exclude effects associated with certain nonsensical transcripts. These effects are excluded as of 2.0.4. -Updated unit and integration tests This support is based on a release-candidate of SnpEff 2.0.4, and so is subject to change between now and the next GATK release.	2011-11-15 18:36:22 -05:00
Mark DePristo	df415da4ab	More bug fixes on the way to passing all tests	2011-11-15 17:38:12 -05:00
Mark DePristo	0be23aae4e	Bugfixes on way to a working refactored VariantContext	2011-11-15 17:20:14 -05:00
Mark DePristo	231c47c039	Bugfixes on way to a working refactored VariantContext	2011-11-15 16:42:50 -05:00
Laurent Francioli	fb685f88ec	Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-11-15 16:23:53 -05:00
Mark DePristo	2b2514dad2	Moved many unused phasing walkers and utilities to archive	2011-11-15 16:14:50 -05:00
Mark DePristo	460a51f473	ID field now stored in the VariantContext itself, not the attributes	2011-11-15 14:56:33 -05:00
Eric Banks	b45d10e6f1	The DP in the FORMAT field (per sample) must also use the representative count or else it's always 1 for reduced reads.	2011-11-15 10:23:59 -05:00
Mark DePristo	233e581828	Merging in Master	2011-11-15 09:28:24 -05:00
Eric Banks	b66556f4a0	Update error message so that it's clear ReadPair Walkers are exceptions	2011-11-15 09:22:57 -05:00
Mark DePristo	6e1a86bc3e	Bug fixes to VariantContext and GenotypeCollection	2011-11-15 09:21:30 -05:00
Mauricio Carneiro	cde829899d	Compress Reduce Read counts bytes by offset. Compressing the representation of the reduce reads counts by offset results in 17% average compression in final BAM file size. Example compression --> from: 10, 10, 11, 11, 12, 12, 12, 11, 10 to: 10, 0, 1, 1, 2, 2, 2, 1, 0	2011-11-14 18:30:24 -05:00
Mark DePristo	f0234ab67f	GenotypeMap -> GenotypeCollection part 2 -- Code actually builds	2011-11-14 17:42:55 -05:00
David Roazen	ab0ee9b847	Perform only necessary validation in VariantContext modify methods	2011-11-14 16:49:59 -05:00
Mark DePristo	2e9d5363e7	Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-11-14 15:32:06 -05:00
Mark DePristo	1fbdcb4f43	GenotypeMap -> GenotypeCollection	2011-11-14 15:32:03 -05:00
Eric Banks	4dc9dbe890	One quick fix to previous commit	2011-11-14 14:42:12 -05:00
Eric Banks	7b2a7cfbe7	Transfer headers from the resource VCF when possible when using expressions. While there, VA was modified so that it didn't assume that the ID field was present in the VC's info map in preparation for Mark's upcoming changes.	2011-11-14 14:31:27 -05:00
Mark DePristo	9b5c79b49d	Renamed InferredGeneticContext to CommonInfo -- I have no idea why I named this InferredGeneticContext, a totally meaningless term -- Renamed to CommonInfo. -- Made package protected, as no one should use this outside of VariantContext and Genotype -- UGEngine was using IGC constant, but it's now using the public one in VariantContext.	2011-11-14 14:28:52 -05:00
Mark DePristo	077397cb4b	Deleted MutableVariantContext -- All methods that used this capability now use VariantContext directly instead	2011-11-14 14:19:06 -05:00
Mark DePristo	b11c535527	Deleted MutableGenotype -- This class wasn't really used anywhere, and so removed to control code bloat.	2011-11-14 13:16:36 -05:00
Mark DePristo	79987d685c	GenotypeMap contains a Map, not extends it -- On path to replacing it with GenotypeCollection	2011-11-14 12:55:03 -05:00
Eric Banks	7aee80cd3b	Fix to deal with reduced reads containing a deletion	2011-11-14 12:23:46 -05:00
Eric Banks	3d2970453b	Misc minor cleanup	2011-11-14 09:41:54 -05:00
Laurent Francioli	1347beef40	Merge branch 'PhaseByTransmission'	2011-11-14 11:31:28 +01:00
Eric Banks	b7c33116af	Minor docs update	2011-11-12 23:21:07 -05:00
Eric Banks	76d357be40	Updating docs example to use -L since that's best practice	2011-11-12 23:20:05 -05:00
Mark DePristo	fee9b367e4	VariantContext genotypes are now stored as GenotypeMap objects -- Enables further sophisticated optimizations, as this class can be smarter about storing the data and will directly support operations like subset to samples -- All instances in the gatk that used Map<String, Genotype> now use GenotypeMap type. -- Amazingly, there were many places where HashMap<String, Genotype> is used, so that the order of the genotypes is technically undefined and could be dangerous. Now everything uses GenotypeMap with a specific ordering of samples (by name) -- Integrationtests updated and all pass	2011-11-11 15:00:35 -05:00
Guillermo del Angel	cd3146f4cf	Add hidden option to ValidationAmplicons to output slightly modified format to make file work with downstream SQNM tools more seamlessly at request of GAP: one line per record, keep probe identifier to 20 characters, no * in ref allele.	2011-11-11 14:07:07 -05:00
Ryan Poplin	40fbeafa37	VQSR will now detect if the negative model failed to converge properly because of having too few data points and automatically retry with more appropriate clustering parameters.	2011-11-11 11:52:30 -05:00
Mark DePristo	ef9f8b5d46	Added subContextOfSamples to VariantContext -- This is a more convenient accessor than subContextOfGenotypes, represents nearly all of the use cases of the former function, and potentially can be implemented more efficiently.	2011-11-11 10:07:11 -05:00
Mark DePristo	ee40791776	Attributes are now Map<String,Object> not Map<String,?> -- Allows us to avoid an unnecessary copy when creating InferredGeneticContext (whose name really needs to change).	2011-11-11 09:55:42 -05:00
Mark DePristo	dc9b351b5e	Meaningful error message when an IntervalArg file fails to parse correctly	2011-11-10 17:10:26 -05:00
Mark DePristo	bb7bf74aa8	Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-11-10 16:05:43 -05:00
Mauricio Carneiro	060c7ce8ae	It wouldn't harm integrationtests if we had our logic right... :-)	2011-11-10 14:03:22 -05:00
Eric Banks	39678b6a20	Check for reads with missing read groups and throw a UserException when encountered. Mauricio said this wouldn't break integration tests.	2011-11-10 13:34:45 -05:00
Mark DePristo	dd1810140f	-stratIntervals is optional	2011-11-10 13:27:32 -05:00
Mark DePristo	67b022c34b	Cleanup for new SampleUtils function -- getVCFHeadersFromRods(rods) is now available so that you don't have getVCFHeadersFromRods(rods, null) throughout the codebase	2011-11-10 13:27:13 -05:00
Mark DePristo	35fe9c8a06	Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-11-10 11:11:33 -05:00
Mark DePristo	dc4932f93d	VariantEval module to stratify the variants by whether they overlap an interval set The primary use of this stratification is to provide a mechanism to divide assessment of a call set up by whether a variant overlaps an interval or not. I use this to differentiate between variants occurring in CCDS exons vs. those in non-coding regions, in the 1000G call set, using a command line that looks like: -T VariantEval -R human_g1k_v37.fasta -eval 1000G.vcf -stratIntervals:BED ccds.bed -ST IntervalStratification Note that the overlap algorithm properly handles symbolic alleles with an INFO field END value. In order to safely use this module you should provide entire contigs worth of variants, and let the interval strat decide overlap, as opposed to using -L which will not properly work with symbolic variants. Minor improvements to create() interval in GenomeLocParser.	2011-11-10 10:58:40 -05:00
Mauricio Carneiro	0d8983feee	outputting the RG information setReadGroup now sets the read group attribute for the GATKSAMRecord	2011-11-09 23:35:00 -05:00
Eric Banks	315ac68b0b	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-11-09 22:37:36 -05:00
Eric Banks	6313aae2c4	Adding checks for hasBasePileup() before calling getBasePileup() as per GS thread	2011-11-09 22:37:26 -05:00
Ryan Poplin	74a18d3de8	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-11-09 22:29:40 -05:00
Ryan Poplin	24712c0221	Merged bug fix from Stable into Unstable	2011-11-09 22:28:27 -05:00
Ryan Poplin	8942406aa2	Use MathUtils to compare doubles instead of testing for equality	2011-11-09 22:05:21 -05:00
Ryan Poplin	348f2db7fd	Fix for HMM optimization. If the two penalty arrays match exactly the function should return the end of the array instead of 0.	2011-11-09 22:00:52 -05:00
Eric Banks	82bf09edf3	Mark Standard Annotations with an asterisk	2011-11-09 20:42:31 -05:00
Eric Banks	04b122be29	Fix for bug reported on GetSatisfaction	2011-11-09 20:33:36 -05:00
Mauricio Carneiro	d00b2c6599	Adding a synthetic read for filtered data * Generalized the concept of a synthetic read to create both running consensus and synthetic reads of filtered data. * Synthetic reads can now have deletions (but not insertions) * New reduced read tag for filtered data synthetic reads (RF) * Sliding window header now keeps information of consensus and filtered data * Synthetic reads are created simultaneously, new functionality is controlled internally by addToSyntheticReads	2011-11-09 20:16:22 -05:00
Eric Banks	21bf43f3bb	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-11-09 15:34:40 -05:00
Christopher Hartl	85bffe1dca	Merged bug fix from Stable into Unstable	2011-11-09 15:29:14 -05:00
Christopher Hartl	d828eba7f4	Allow comments in a table-formatted file to precede the header line.	2011-11-09 15:27:38 -05:00
Eric Banks	8205efbb29	Merge branch 'master' into intervals	2011-11-09 15:27:15 -05:00
Eric Banks	d64f8a89a9	Instead of the SelfScopingFeatureCodec interface, pushed this functionality into Tribble itself. Now we can e.g. determine that a file can be parsed by the BedCodec on the fly.	2011-11-09 15:24:29 -05:00
Mauricio Carneiro	f080f64f99	Preserve RG information on new GATKSAMRecord from SAMRecord	2011-11-09 14:39:20 -05:00
Mauricio Carneiro	f9530e0768	Clean unnecessary attributes from the read this gives on average 40% file size reduction.	2011-11-09 14:39:20 -05:00
Mauricio Carneiro	9427ada498	Fixing no cigar bug empty GATKSAMRecords will have a null cigar. Treat them accordingly.	2011-11-09 14:39:20 -05:00
Mark DePristo	e639f0798e	mergeEvals allows you to treat -eval 1.vcf -eval 2.vcf as a single call set -- A bit of code cleanup in VCFUtils -- VariantEval table to create 1000G Phase I variant summary table -- First version of 1000G Phase I summary table Qscript	2011-11-09 14:35:50 -05:00
Christopher Hartl	149b79eaad	Merged bug fix from Stable into Unstable	2011-11-09 11:26:30 -05:00
Christopher Hartl	11abb4f9d1	Better error message.	2011-11-09 11:25:28 -05:00
Christopher Hartl	d3a533b82e	Revert "a" This reverts commit 1175f50ddbf389f5da74d27dc725596582ae15af.	2011-11-09 11:22:26 -05:00
Christopher Hartl	5eaf800281	a	2011-11-09 11:22:20 -05:00
Christopher Hartl	5451fbc2b2	Merged bug fix from Stable into Unstable	2011-11-09 11:06:15 -05:00
Christopher Hartl	091229e4db	MVLikelihoodRatio now checks if the family string is provided before attempting to instantiate. Also check that variant contexts have both genotypes and genotype likelihoods. Table codec now yells at users for not providing a HEADER with the table - parsing tables without a header line was causing the first line of the file to be eaten. Table feature now has a toString method. These are minor bug fixes.	2011-11-09 11:03:29 -05:00
Mauricio Carneiro	e1b4c3968f	Fixing GATKSAMRecord bug when constructing a GATKSAMRecord from scratch, we should set "mRestOfBinaryData" to null so the BAMRecord doesn't try to retrieve missing information from the non-existent bam file.	2011-11-08 16:50:36 -05:00
Ryan Poplin	e973ca2010	fixing merge conflict.	2011-11-08 14:55:05 -05:00
Ryan Poplin	b0e6afec48	Bug fix for HMM optimization. Need to also check the gap continuation penalty array for the index with the first discrepancy.	2011-11-08 14:51:25 -05:00
Laurent Francioli	571c724cfd	Added reporting of the number of genotypes updated.	2011-11-08 15:15:51 +01:00
Ryan Poplin	94dc447a70	Merged bug fix from Stable into Unstable	2011-11-07 15:26:35 -05:00
Ryan Poplin	0b181be61f	Bug fix in SelectVariants when using a discordance track but no sample specifications. Added integration test to test this.	2011-11-07 15:25:16 -05:00
Ryan Poplin	0534149708	Merged bug fix from Stable into Unstable	2011-11-07 14:07:08 -05:00
Ryan Poplin	2d1e385ca4	Adding note to VQSR docs about Rscript being needed in the environment PATH.	2011-11-07 14:04:13 -05:00
Eric Banks	759f4fe6b8	Moving unclaimed walker with bad integration test to archive	2011-11-07 13:16:38 -05:00
Eric Banks	c1986b6335	Add notes to the GATKdocs as to when a particular annotation can/cannot be calculated.	2011-11-07 11:06:19 -05:00
Eric Banks	724e3f3b0d	Merged bug fix from Stable into Unstable	2011-11-06 22:23:22 -05:00
Eric Banks	cdd40d1222	Removing contracts for the SimpleTimer	2011-11-06 22:22:49 -05:00
Ryan Poplin	5c565d28b9	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-11-06 10:26:19 -05:00
Eric Banks	1c4e429a1c	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-11-06 00:05:56 -04:00
Eric Banks	a12bc63e5c	Get rid of support for bams without sample information in the read groups. This hidden option wasn't being used anyways because it wasn't hooked up properly in the AlignmentContext.	2011-11-05 23:54:28 -04:00
Eric Banks	90a053ea93	Don't change the mapping quality of MQ=255 reads in IR	2011-11-05 22:40:45 -04:00
Ryan Poplin	611a395783	Now properly extending candidate haplotypes with bases from the reference context instead of filling with padding bases. Functionality in the private Haplotype class is no longer necessary so removing it. No need to have four different Haplotype classes in the GATK.	2011-11-05 12:18:56 -04:00
Mark DePristo	e99871f587	Bug fix for decode loc -- decodeLoc() wasn't skipping input header lines, so the system blew up when there was an = line being split.	2011-11-04 13:20:54 -04:00
Mark DePristo	a340a1aeac	Bug fix. decodeLoc() should update lineNo so you get meaningful line no when indexing due to malformed VCF files.	2011-11-04 11:44:24 -04:00
Mark DePristo	9f260c0dc1	Zero byte index bug fix for RandomlySplitVariants + cleanup -- vcfWriter2 was never being closed in onTraversalDone(), so the on the fly index file was being created but never actually properly written to the file. -- This bug is ultimately due to the inability of the GATK to allow multiple VCF output writers as @Output arguments, though -- Removed the unnecessary local variable iFraction, = 1000 * the input fraction argument. Now the system just uses a double random number and compares to the input fraction at all. Is there some subtle reason I don't appreciate for this programming construct?	2011-11-04 09:45:20 -04:00
Mauricio Carneiro	e89ff063fc	GATKSAMRecord refactor The GATK engine will now provide a GATKSAMRecord to all tools which incorporates the functionality used by the GATK to the bam file (ReadGroups, Reduced Reads, ...). * No tools should create SAMRecord anymore, use GATKSAMRecord instead *	2011-11-03 15:43:26 -04:00
Laurent Francioli	385a6abec1	Fixed a bug that wrongly swapped the mother and father genotypes in case the child genotype missing.	2011-11-03 13:04:53 +01:00
Laurent Francioli	893787de53	Functions getAsMap and getNegLog10GQ now handle missing genotype case.	2011-11-03 13:04:11 +01:00
Eric Banks	e8bceb1eaa	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-11-02 21:13:54 -04:00
Eric Banks	52b16bf739	Must check whether there's a normal vs. extended pileup before asking for it.	2011-11-02 20:45:24 -04:00
Eric Banks	e1edd6bd12	Removing the min mapping quality argument since it wasn't being used in the normal processing of the pileups in UG - only for indel pileups. Instead, we apply the min base quality to the reads in the pileup for indels and define it to be the min 'confidence' of the base. Docs are updated but I didn't rename the argument as I don't want people to complain.	2011-11-02 20:32:58 -04:00
Ryan Poplin	e94fcf537b	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-11-02 16:29:19 -04:00
Ryan Poplin	4d35272916	Bug fixes with Mauricio to functions in ReadUtils used by reduced reads and the haplotype caller.	2011-11-02 16:29:10 -04:00
Mark DePristo	8a2929c1dd	Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-11-02 16:21:00 -04:00
Laurent Francioli	19ad5b635a	- Calculation of parent/child pairs corrected - Separated the reporting of single and double mendelian violations in trios	2011-11-02 18:35:31 +01:00
Eric Banks	967ff647b8	Reduced reads shouldn't contribute to Fisher Strand calculations	2011-11-02 13:07:20 -04:00
Eric Banks	cf0e699226	QualByDepth was inefficiently iterating over the pileup 2 times for some reason. Removed non-useful annotation classes.	2011-11-02 12:58:38 -04:00
Eric Banks	4501dce58d	Fixing merge conflict	2011-11-02 12:50:32 -04:00
Eric Banks	54331b44e9	New way of looking at the size of a pileup: there's a physical number of elements in the data structure and there's a representative depth of coverage (since a reduced read represents depth >= 1). The size() method has been removed because its meaning is ambiguous. Updated several annotations and the UG engine to make use of the representative depths.	2011-11-02 12:47:30 -04:00
Mark DePristo	c2b97030a4	IntervalUtils for completely balanced locus-based scatter/gather -- scatterLocusIntervals master utility -- Moved around some general functionality from GenomeLocSortedSet to GenomeLoc -- Util function for reversing a list (List<T> -> List<T>, unlike Collections version) -- DoC is PartitionType.INTERVAL -- Significant unit tests on new functionality (all passing) -- Ready for real-world testing, as soon as I can get LocusScatterFunction.scala to actually work	2011-11-02 10:49:40 -04:00
Laurent Francioli	119ca7d742	Fixed a bug in parent/child pairs reporting causing a crash in case the -mvf option was used and mother was not provided	2011-11-02 08:22:33 +01:00
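
Illustration for commit cde829899d (offset compression of reduced-read counts). This is a minimal sketch, not the GATK implementation: it assumes, based only on the example in that commit message, that the first count is kept as-is and every later count is stored as its difference from the first. The function names `compress_counts` and `decompress_counts` are hypothetical, and the commit message does not describe the byte-level encoding that yields the reported size reduction.

```python
from typing import List


def compress_counts(counts: List[int]) -> List[int]:
    """Keep the first count and store every later count as an offset from it.

    Assumption: offsets are relative to the first element, as suggested by
    the example in the commit message (10 -> 0, 11 -> 1, 12 -> 2).
    """
    if not counts:
        return []
    base = counts[0]
    return [base] + [c - base for c in counts[1:]]


def decompress_counts(encoded: List[int]) -> List[int]:
    """Invert compress_counts by adding the first value back to each offset."""
    if not encoded:
        return []
    base = encoded[0]
    return [base] + [base + d for d in encoded[1:]]


original = [10, 10, 11, 11, 12, 12, 12, 11, 10]
encoded = compress_counts(original)
print(encoded)  # [10, 0, 1, 1, 2, 2, 2, 1, 0]
```

The offsets stay small when neighbouring counts are similar, which is presumably what makes the stored representation more compressible; `decompress_counts(encoded)` recovers the original list.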

1158 Commits (7c58d8e37d490c890baac6421b6c309b4297716d)