gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Mark DePristo	0f4acaae1b	Update MD5s with new FS score	2012-08-28 08:06:47 -04:00
Christopher Hartl	db2e88c7cb	Fix for badIndelLength() throwing NPE at non-indel sites. Added integration test.	2012-08-25 12:38:23 -07:00
Christopher Hartl	f1166d6d00	Spotted a potential bug where sample IDs passed in from the meta data were only checked against the sample IDs in the VCF header if the input file happened to be a meta data file rather than a fam file. Added a check for fam files as well, and added an integration test to cover each case.	2012-08-23 11:43:19 -07:00
Guillermo del Angel	e29469eeeb	Forgot to update 2 integration test md5's (in this case, changes are legit because of the code revamp of AD, it's simpler if AD is not output when a site is not variant, as genotype DP conveys the same information)	2012-08-22 15:53:33 -04:00
Guillermo del Angel	901f47d8af	Final step (for now) in VA refactoring: update MD5's because, a) since it's not guaranteed that we'll iterate through reads/pileups in the same order, the rank sum dithering will change annotations, b) FS uses new generic threshold to distinguish uninformative reads (it used to use ad-hoc thresholds), c) AD definition changed and throws away uninformative reads, d) shortened general ploidy integration tests for quicker debugging. May have missed some MD5's in the update so there may be lingering test failures still	2012-08-22 11:38:51 -04:00
Eric Banks	c7ce3e1cf5	Merged bug fix from Stable into Unstable	2012-08-22 00:24:40 -04:00
Eric Banks	03017855e4	WTF - why is support for whole-read insertions all messed up in LIBS? I've pushed a temporary patch for now (the right solution should certainly not be implemented in stable; LIBS needs to be better thought out). Added another unit test.	2012-08-22 00:24:01 -04:00
Christopher Hartl	ba8622ff0d	A number of stashed changes are lurking in here. In order of importance: - Fix for M_Trieb's error report on the forum, and addition of integration tests to cover the walker. - Addition of StructuralIndel as a class of variation within the VariantContext. These are for variants with a full alt allele that's >150bp in length. - Adaptation of the MVLikelihoodRatio to work for a set of trios (takes the max over the trios of the MVLR) - InsertSizeDistribution changed to use the new gatk report output (it was previously broken) - RetrogeneDiscovery changed to be compatible with the new gatk report - A maxIndelSize argument added to SelectVariants - ByTranscriptEvaluator rewritten for cleanliness - VariantRecalibrator modified to not exclude structural indels from recalibration if the mode is INDEL - Documentation added to DepthOfCoverageIntegrationTest (no, don't yell at chartl ;_; ) Also sorry for the long commit history behind this that is the result of fixing merge conflicts. Because this also fixes a conflict (from git stash apply), for some reason I can't rebase all of them away. I'm pretty sure some of the commit notes say "this note isn't important because I'm going to rebase it anyway".	2012-08-21 07:08:58 -04:00
Eric Banks	40d5efc804	Fix for Adam K's reported bug: we weren't handling reads that were entirely insertions properly in LIBS. Specifically, the event bases were off-by-one (which was disastrous in Adam's case with a 1bp read). Added a unit test to cover this case.	2012-08-20 23:12:41 -04:00
Mark DePristo	9121b98167	CombineVariants outputs the first non-MISSING qual, not the maximum -- When merging multiple VCF records at a site, the combined VCF record has the QUAL of the first VCF record with a non-MISSING QUAL value. The previous behavior was to take the max QUAL, which resulted in sometimes strange downstream confusion.	2012-08-19 10:29:38 -04:00
Mauricio Carneiro	d16cb68539	Updated and more thorough version of the BadCigar read filter * No reads with Hard/Soft clips in the middle of the cigar * No reads starting with deletions (with or without preceding clips) * No reads ending in deletions (with or without follow-up clips) * No reads that are fully hard or soft clipped * No reads that have consecutive indels in the cigar (II, DD, ID or DI) Also added systematic test for good cigars and iterative test for bad cigars.	2012-08-17 17:05:27 -04:00
Eric Banks	611d9b61e2	Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-08-16 13:05:36 -04:00
Eric Banks	2df04dc48a	Fix for performance problem in GGA mode related to previous --regenotype commit. Instead of trying to hack around the determination of the calculation model when it's not needed, just simply overload the calculateGenotypes() method to add one that does simple genotyping. Re-enabling the Pool Caller integration tests.	2012-08-16 13:05:17 -04:00
Mark DePristo	a9a1c499fd	Update md5 in VariantRecalibrationWalkers test for BCF2 -- only encoding differences	2012-08-16 13:03:13 -04:00
Mark DePristo	c0a31b2e5b	CombineVariants parallel integration tests -- All tests but one (using old bad VCF3 input) run unmodified with parallel code. -- Disabled UNSAFE_VCF_PROCESSING for all but that test, which changes md5s because the output files have fixed headers -- Minor optimizations to simpleMerge	2012-08-15 21:13:16 -04:00
Mark DePristo	ae4d4482ac	Parallel combine variants! -- CombineVariants is now TreeReducible! -- Integration tests running in parallel all pass except one (will fix) due to incorrect use of db=0 flag on input from old VCF format	2012-08-15 21:13:15 -04:00
Eric Banks	87e41c83c5	In AlleleCount stratification, check to make sure the AC (or MLEAC) is valid (i.e. not higher than number of chromosomes) and throw a User Error if it isn't. Added a test for bad AC.	2012-08-14 15:02:30 -04:00
Eric Banks	8e3774fb0e	Fixing behavior of the --regenotype argument in SelectVariants to properly run in GenotypeGivenAlleles mode. Added integration tests to cover recent SV changes.	2012-08-14 14:21:42 -04:00
Eric Banks	34b62fa092	Two changes to SelectVariants: 1) don't add DP INFO annotation if DP wasn't used in the input VCF (it was adding DP=0 previously). 2) If MLEAC or MLEAF is present in the original VCF and the number of samples decreases, remove those annotations from the VC.	2012-08-14 12:54:31 -04:00
Ami Levy Moonshine	6fefdaf428	"update integration tests in CombineVariantsIntegrationTest" Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-08-10 17:00:35 -04:00
Ami Levy Moonshine	4968daf0a5	update integration tests at CombineVariantsIntegrationTest	2012-08-10 16:58:05 -04:00
Eric Banks	eca9613356	Adding support of X and = CIGAR operators to the GATK	2012-08-10 14:54:07 -04:00
Mark DePristo	cda8d944b7	Bugfixes for BCF with VQSR -- Old version converted doubles directly from strings. New version uses VariantContext getAttributeAsDouble() that looks at the values directly to determine how to convert from Object to Double (via Double.valueOf, (Double), or (Double)(Integer)). -- getAttributeAsDouble() is now smart in converting integers to doubles as needed -- Removed unnecessary logging info in BCF2Codec -- Added integration tests to ensure that VQSR works end-to-end with BCF2 using sites version of the file khalid sent to me -- Added vqsr.bcf_test.snps.unfiltered.bcf file for this integration test	2012-08-07 17:22:39 -04:00
Ryan Poplin	15085bf03e	The UnifiedGenotyper now makes use of base insertion and base deletion quality scores if they exist in the reads.	2012-08-07 13:58:22 -04:00
Ryan Poplin	8817fc70d1	Merged bug fix from Stable into Unstable	2012-08-03 10:45:01 -04:00
Ryan Poplin	f40d0a0a28	Updating VQSR to work with the MNP and symbolic variants that are coming out of the HaplotypeCaller. Integration tests change because of the MNPs in dbSNP.	2012-08-03 10:44:36 -04:00
Mark DePristo	ccac77d888	Bugfix for incorrect allele counting in IndelSummary -- Previous version would count all alt alleles as present in a sample, even if only 1 were present, because of the way VariantEval subsetted VCs -- Updated code for subsetting VCs by sample to be clearer about how it handles rederiving alleles -- Update a few pieces of code to get previous correct behavior -- Updated a few MD5s as now ref calls at sites in dbSNP are counted as having a comp sites, and therefore show up in known sites when Novelty strat is on (which I think is correct) -- Walkers that used old subsetting function with true are now using clearer version that does rederive alleles by default	2012-08-01 15:45:12 -04:00
Joel Thibault	2b25df3d53	Add removeProgramRecords argument * Add unit test for the removeProgramRecords	2012-08-01 15:33:05 -04:00
Eric Banks	ab53d73459	Quick fix to user error catching	2012-07-31 15:50:32 -04:00
Eric Banks	10111450aa	Fixed AlignmentUtils bug for handling Ns in the CIGAR string. Added a UG integration test that calls a BAM with such reads (provided by a user on GetSatisfaction).	2012-07-31 15:37:22 -04:00
Mark DePristo	e00ed8bc5e	Cleanup BQSR classes -- Moved most of BQSR classes (which are used throughout the codebase) to utils.recalibration. It's better in my opinion to keep commonly used code in utils, and only specialized code in walkers. As code becomes embedded throughout GATK it should be refactored to live in utils -- Removed unnecessary imports of BQSR in VQSR v3 -- Now ready to refactor QualQuantizer and unit test into a subclass of RecalDatum, refactor unit tests into RecalDatum unit tests, and generalize into hierarchical recal datum that can be used in QualQuantizer and the analysis of adaptive context covariate -- Update PluginManager to sort the plugins and interfaces. This allows us to have a deterministic order in which the plugin classes come back, which caused BQSR integration tests to temporarily change because I moved my classes around a bit.	2012-07-31 08:11:03 -04:00
Guillermo del Angel	6c9d3ec155	Remerge after changes to allele construction code. More cleanups/fixes to artificial read pileup provider	2012-07-30 21:32:03 -04:00
Guillermo del Angel	5b9a1af7fe	Intermediate fix for pool GL unit test: a) Fix up artificial read pileup provider to give consistent data. b) Increase downsampling in pool integration tests with reference sample, and shorten MT tests so they don't last too long	2012-07-30 09:56:10 -04:00
Eric Banks	c4ae9c6cfb	With the new Allele representation we can finally handle complex events (because they aren't so complex anymore). One place this manifests itself is with the strict VCF validation (ValidateVariants used to skip these events but doesn't anymore) so I've added a new test with complex events to the VV integration test.	2012-07-29 19:22:02 -04:00
Eric Banks	99b15b2b3a	Final checkpoint: all tests pass. Note that there were bugs in the PoolGenotypeLikelihoodsUnitTest that needed fixing and eventually led to my needing to disable one of the tests (with a note for Guillermo to look into it). Also note that while I have moved over the GATK to use the new non-null representation of Alleles, I didn't remove all of the now-superfluous code throughout to do padding checking on merges; we'll need to do this on a subsequent push.	2012-07-29 01:07:59 -04:00
Eric Banks	beb7610195	Resolving merge conflicts	2012-07-27 15:52:02 -04:00
Eric Banks	27e7e11ec0	Allele refactoring checkpoint #3 : all integration tests except for PoolCaller are passing now. Fixed a couple of bugs from old code that popped up during md5 difference review. Added VariantContextUtils.requiresPaddingBase() method for tools that create alleles to use for determining whether or not to add the ref padding base. One of the HaplotypeCaller tests wasn't passing because of RankSumTest differences, so I added a TODO for Ryan to look into this.	2012-07-27 15:48:40 -04:00
Eric Banks	ef335b6213	Several more walkers have been brought up to use the new Allele representation.	2012-07-27 02:14:25 -04:00
Eric Banks	baf3e33730	Allele refactoring checkpoint 2: all code finally compiles, AD and STR annotations are fixed, and most of the UG integration tests pass.	2012-07-26 23:27:11 -04:00
Guillermo del Angel	2ae890155c	Improvements to indel calling in pool caller: a) Compute per-read likelihoods in reference sample to determine whether a read is informative or not. b) Fixed bugs in unit tests. c) Fixed padding-related bugs when computing matches/mismatches in ErrorModel, d) Added a couple of more integration tests to increase test coverage, including testing odd ploidy	2012-07-26 13:43:00 -04:00
Mark DePristo	8c418a15da	Sorting out HMS error handling (fingers crossed) -- Check if a traversal error occurred in the last shard -- Catch ExecutionException from the TreeReducer and throw as our HMS exception -- ShardTraverser just throws the exception as formatted by the HMS, rather than wrapping it as a RuntimeException itself -- EngineFeaturesIntegrationTests now uses public exampleFASTA (faster), and does 1000x iterations (slower)	2012-07-25 23:13:12 -04:00
Mark DePristo	9242f63a4d	On the way to really sorting out HMS error handling -- Better error message when a traversal error occurs (a real bug) -- EngineFeaturesIntegrationTest runs the multi-threaded error testing routines 50x times -- A bit of cleanup in WalkerTest	2012-07-25 22:11:10 -04:00
Mark DePristo	5671992db3	RMDTrackBuilderUnitTest now uses private/testdata file to avoid filesystem race conditions	2012-07-25 22:05:04 -04:00
Mark DePristo	16947e93f2	Integration test to ensure VariantFiltration makes . -> PASS/FAIL like VQSR Signed-off-by: Mark DePristo <depristo@broadinstitute.org>	2012-07-25 08:56:39 -04:00
Guillermo del Angel	39f45127f3	Fix md5's broken by recent changes to FisherStrand calculation	2012-07-21 14:41:38 -04:00
Mauricio Carneiro	65f4b67b86	Fixing walker unit test with the new naming convention	2012-07-20 17:50:29 -04:00
Mauricio Carneiro	116885a450	Removed the "Walker" suffix from all walkers that had it. * Did not touch archived walkers... those can be named whatever. * Kept abstract classes that end in Walker untouched (e.g. LocusWalker, ReadWalker, ...) * Renamed a few inner classes due to conflict when stripping off Walker from their outer classes: ContigStats, FlagStats and FastaStats.	2012-07-20 17:27:11 -04:00
Eric Banks	5f5edeca63	Reverting move of BQSR tests to public, as per DR's email	2012-07-19 10:02:05 -04:00
Eric Banks	d46ccec04e	Adding Unit Tests to cover the exception catching for Picard errors: because we are using String matching, we want to ensure that we know if/when the exception text changes underneath us.	2012-07-18 21:48:58 -04:00
Eric Banks	9c1ab1b0c0	Move BQSR integration test and its dependent files into public; previously there was a protected->private dependency.	2012-07-18 21:11:33 -04:00
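
The BadCigar rules listed in commit d16cb68539 can be illustrated with a minimal Python sketch. This is not the GATK implementation (which is Java and is not shown in this log); the function names `parse_cigar` and `is_bad_cigar` are hypothetical, and the checks follow only the five bullet points in that commit message.

```python
import re

# One CIGAR element: a length followed by a SAM operator letter.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")
CIGAR_STRING = re.compile(r"(?:\d+[MIDNSHP=X])+")

CLIPS = {"S", "H"}
INDELS = {"I", "D"}


def parse_cigar(cigar):
    """Return the list of operator letters in a CIGAR string such as '5S10M2D3M'."""
    if not CIGAR_STRING.fullmatch(cigar):
        raise ValueError(f"not a CIGAR string: {cigar!r}")
    return [op for _length, op in CIGAR_ELEMENT.findall(cigar)]


def is_bad_cigar(cigar):
    """Hypothetical check mirroring the rules in the commit message (an assumption,
    not GATK's actual code)."""
    ops = parse_cigar(cigar)

    # Peel off clips at both ends; what remains is the aligned core.
    start = 0
    while start < len(ops) and ops[start] in CLIPS:
        start += 1
    end = len(ops)
    while end > start and ops[end - 1] in CLIPS:
        end -= 1
    core = ops[start:end]

    # Rule: no reads that are fully hard or soft clipped.
    if not core:
        return True
    # Rule: no hard/soft clips in the middle of the cigar.
    if any(op in CLIPS for op in core):
        return True
    # Rules: no reads starting or ending with a deletion, with or without clips.
    if core[0] == "D" or core[-1] == "D":
        return True
    # Rule: no consecutive indels (II, DD, ID or DI).
    for left, right in zip(core, core[1:]):
        if left in INDELS and right in INDELS:
            return True
    return False
```

Under these assumptions, `is_bad_cigar("5S10M5H")` is `False`, while `is_bad_cigar("5S2D10M")` and `is_bad_cigar("5M2I3D5M")` are both `True`.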

608 Commits (e12ae65d33b3e6fd009fcd47eef3f90ed4e75a12)