gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Eric Banks	94540ccc27	Using the simple VCBuilder constructor and then subsequently trying to modify attributes was throwing a NPE. This is easily solved (without a performance hit) by initializing the attributes map to an immutable Collections.emptyMap(). Added unit test to cover this case.	2012-08-22 12:54:29 -04:00
Guillermo del Angel	901f47d8af	Final step (for now) in VA refactoring: update MD5's because, a) since it's not guaranteed that we'll iterate through reads/pileups in the same order, the rank sum dithering will change annotations, b) FS uses new generic threshold to distinguish uninformative reads (it used to use ad-hoc thresholds), c) AD definition changed and throws away uninformative reads, d) shortened general ploidy integration tests for quicker debugging. May have missed some MD5's in the update so there may be lingering test failures still	2012-08-22 11:38:51 -04:00
Guillermo del Angel	7df0abf49b	Merge branch 'master' of ssh://gsa4.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-08-22 11:36:41 -04:00
Eric Banks	9e76e8aa0b	Just noticed that the efficient conversion to uppercase method is redundant since it's already implemented efficiently in Picard; let's just have a single implementation.	2012-08-22 11:26:08 -04:00
Christopher Hartl	20601f034e	Updating the checkType() function to include the new StructuralIndel variant type. Fixes outstanding broken integration test.	2012-08-22 07:33:10 -07:00
Eric Banks	c7ce3e1cf5	Merged bug fix from Stable into Unstable	2012-08-22 00:24:40 -04:00
Eric Banks	03017855e4	WTF - why is support for whole-read insertions all messed up in LIBS? I've pushed a temporary patch for now (the right solution should certainly not be implemented in stable; LIBS needs to be better thought out). Added another unit test.	2012-08-22 00:24:01 -04:00
Mark DePristo	1acf18aa25	run_performance_tests use bsub and gsa -- Confirmed that running on gsa queue is fine with sufficient iterations (3)	2012-08-21 16:26:12 -04:00
Mark DePristo	cb9ba4f660	Expand intervals processed for many GATKPerformanceOverTime commands -- For the high NT tests the total runtime may be too short to really assess nt efficiency vs. start up costs. Reworked underlying test data and intervals so that most tests run in 10-20 hrs for -nt 1.	2012-08-21 16:25:31 -04:00
Mark DePristo	1d707e7b31	Linear and Quadratic fits for GATKPerformanceOverTime.R	2012-08-21 14:44:18 -04:00
Mark DePristo	e43ff31eab	GATKPerformance over time lives (mark II) -- Now uses new tagging capabilities so that 2.x runs will tag their logs as GATKPerformanceOverTime -- Update bamboo runs, reverting back to gsa4 (it's slower but the results are less variable -- you were right david!).	2012-08-21 14:44:18 -04:00
Mark DePristo	df36b4384c	Update analyzeRunReports.py to handle new GATK tag argument	2012-08-21 14:44:18 -04:00
Mark DePristo	6ce8016ae7	GSA-491: Add hidden tag to GATK that propagates to the GATK logs	2012-08-21 14:44:18 -04:00
Mark DePristo	9eec33ec3b	Complete GSA-497: Let Queue write out runInfo on the fly, after each job group finishes running -- Queue will now incrementally write out its jobReport.txt file whenever jobs finish running (FAIL or DONE) -- This makes it far easier to track what's going on, or to incrementally analyze performance results coming out of Queue -- Generally cleaned up the QJobsReporting code, creating a new clean class QJobsReporter that holds all of the information on what to log and where to put it, which was previously scattered in QCommandLine and QJobReport	2012-08-21 14:44:18 -04:00
Guillermo del Angel	6a8cf1c84a	Enable and adapt HaplotypeScore and MappingQualityZero as active region annotations now that we have per-read likelihoods passed in to annotations	2012-08-21 14:35:40 -04:00
Guillermo del Angel	d0644b3565	Merge branch 'master' of ssh://gsa4.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-08-21 10:35:23 -04:00
Ryan Poplin	94e7f677ad	Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-08-21 10:21:47 -04:00
Guillermo del Angel	418ace463a	More merge conflict resolution	2012-08-21 10:15:52 -04:00
Ryan Poplin	10961db3ce	Another round of FindBugs fixes. Object returns its internal reference to an externally mutable array. Very dangerous.	2012-08-21 09:35:55 -04:00
Ryan Poplin	605acaae9c	Another round of FindBugs fixes. Object internally stores a reference to an externally mutable array. Very dangerous.	2012-08-21 09:33:58 -04:00
Ryan Poplin	55b7949d68	Another round of FindBugs fixes. Comparator doesn't implement Serializable.	2012-08-21 09:20:55 -04:00
Christopher Hartl	ba8622ff0d	A number of stashed changes are lurking in here. In order of importance: - Fix for M_Trieb's error report on the forum, and addition of integration tests to cover the walker. - Addition of StructuralIndel as a class of variation within the VariantContext. These are for variants with a full alt allele that's >150bp in length. - Adaptation of the MVLikelihoodRatio to work for a set of trios (takes the max over the trios of the MVLR) - InsertSizeDistribution changed to use the new gatk report output (it was previously broken) - RetrogeneDiscovery changed to be compatible with the new gatk report - A maxIndelSize argument added to SelectVariants - ByTranscriptEvaluator rewritten for cleanliness - VariantRecalibrator modified to not exclude structural indels from recalibration if the mode is INDEL - Documentation added to DepthOfCoverageIntegrationTest (no, don't yell at chartl ;_; ) Also sorry for the long commit history behind this that is the result of fixing merge conflicts. Because this also fixes a conflict (from git stash apply), for some reason I can't rebase all of them away. I'm pretty sure some of the commit notes say "this note isn't important because I'm going to rebase it anyway".	2012-08-21 07:08:58 -04:00
Eric Banks	3dfe8df262	Merged bug fix from Stable into Unstable	2012-08-20 23:12:58 -04:00
Eric Banks	40d5efc804	Fix for Adam K's reported bug: we weren't handling reads that were entirely insertions properly in LIBS. Specifically, the event bases were off-by-one (which was disastrous in Adam's case with a 1bp read). Added a unit test to cover this case.	2012-08-20 23:12:41 -04:00
Khalid Shakir	3514fb6e66	Changed the default memory limit from none to 2GB upon suggestions from delangel, carneiro, and depristo.	2012-08-20 21:41:13 -04:00
Eric Banks	286b658fab	Re-enabling parallelism in the BaseRecalibrator now that the release is out.	2012-08-20 21:25:14 -04:00
Eric Banks	5b1781fdac	Merge remote-tracking branch 'unstable/master'	2012-08-20 21:18:54 -04:00
Guillermo del Angel	7bbd2a7a20	Fixing merge conflicts	2012-08-20 20:38:25 -04:00
Guillermo del Angel	2041cb853c	New implementation of AD - ignore now non-informative reads based on per-read likelihoods	2012-08-20 20:31:34 -04:00
Ryan Poplin	77fbaec044	Another round of FindBugs fixes. Class implements its own compareTo() but uses base Object.equals() which can lead to unpredictable behavior.	2012-08-20 16:55:00 -04:00
Ryan Poplin	5e28bca630	Another round of FindBugs fixes. Should be static inner class.	2012-08-20 16:15:48 -04:00
Ryan Poplin	a9472c1980	Another round of FindBugs fixes. Inefficient use of keySet iterator instead of entrySet iterator.	2012-08-20 16:11:45 -04:00
Ryan Poplin	5db3bd6fd2	Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-08-20 15:28:57 -04:00
Ryan Poplin	464d49509a	Pulling out common caller arguments into its own StandardCallerArgumentCollection base class so that every caller isn't exposed to the unused arguments from every other caller.	2012-08-20 15:28:39 -04:00
Eric Banks	4450d66c64	Fixing the docs for DP and AD	2012-08-20 15:10:24 -04:00
Ryan Poplin	c67d708c51	Bug fix in HaplotypeCaller for non-regular bases in the reference or reads. Those events don't get created any more. Bug fix for advanced GenotypeFullActiveRegion mode: custom variant annotations created by the HC don't make sense when in this mode so don't try to calculate them.	2012-08-20 13:41:08 -04:00
Guillermo del Angel	5b5fee56cf	Next iteration of new VA interface: extend changes to per-genotype annotations as well. Will allow to have AD correctly implemented at last (that change not done yet)	2012-08-20 12:52:15 -04:00
Eric Banks	154f65e0de	Temporarily disabling multi-threaded usage of BaseRecalibrator for performance reasons.	2012-08-20 12:43:17 -04:00
Menachem Fromer	37dd7209df	Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-08-20 12:31:34 -04:00
Guillermo del Angel	c384677917	Merge branch 'master' of ssh://gsa4.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-08-20 10:27:25 -04:00
Eric Banks	97b191f578	Thanks to Guillermo I was able to isolate an instance of where the MLEAC > AN. It turns out that this is valid, e.g. when PLs are all 0s for a sample we no-call it but it's allowed to factor into the MLE (since that's the contract with the exact model). Removing the check in UG and instead protecting for it in the AlleleCount stratification.	2012-08-20 01:16:23 -04:00
Guillermo del Angel	963ad03f8b	Second step of interface cleanup for variant annotator: several bug fixes, don't hash pileup elements to Maps because the hashCode() for a pileup element is not implemented and strange things can happen. Still several things to do, not done yet	2012-08-19 21:18:18 -04:00
Mark DePristo	7fa76f719b	Print "Parsing data stream with BCF version BCFx.y" in BCF2 codec as .debug not .info	2012-08-19 10:32:55 -04:00
Mark DePristo	9121b98167	CombineVariants outputs the first non-MISSING qual, not the maximum -- When merging multiple VCF records at a site, the combined VCF record has the QUAL of the first VCF record with a non-MISSING QUAL value. The previous behavior was to take the max QUAL, which sometimes resulted in strange downstream confusion.	2012-08-19 10:29:38 -04:00
Guillermo del Angel	d9641e3d57	Merge branch 'master' of ssh://gsa4.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-08-19 09:23:21 -04:00
David Roazen	342a5b68ed	Bring bamboo performance test runner script under version control	2012-08-18 21:08:29 -04:00
Mark DePristo	d3206e35e0	Cleanup and expansion of GATKPerformanceOfTime -- Does BQSR parallelism test -- Does CountLoci parallelism test -- Updated R script	2012-08-18 18:47:26 -04:00
Mauricio Carneiro	d16cb68539	Updated and more thorough version of the BadCigar read filter * No reads with Hard/Soft clips in the middle of the cigar * No reads starting with deletions (with or without preceding clips) * No reads ending in deletions (with or without follow-up clips) * No reads that are fully hard or soft clipped * No reads that have consecutive indels in the cigar (II, DD, ID or DI) Also added systematic test for good cigars and iterative test for bad cigars.	2012-08-17 17:05:27 -04:00
Mark DePristo	980685af16	Fix GSA-137: Having both DataSource.REFERENCE and DataSource.REFERENCE_BASES is confusing to end users. -- Removed REFERENCE_BASES option. You only have REFERENCE now. There's no efficiency savings for the REFERENCE_BASES option any longer, since the reference bases are loaded lazy so if you don't use them there's effectively no cost to making the RefContext that could load them.	2012-08-17 14:55:38 -04:00
Eric Banks	2676b7fc2e	Put in a sanity check that MLEAC <= AN	2012-08-17 11:49:53 -04:00
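
The five BadCigar rules listed in commit d16cb68539 above can be sketched as a standalone check on a CIGAR string. This is only an illustration of the rules as the commit message states them, not the GATK implementation: the class name `BadCigarSketch` and its methods are hypothetical, the code works on plain strings rather than on any GATK or Picard read type, and it does not validate operator lengths or treat the unmapped CIGAR `*` specially.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the BadCigar rules from commit d16cb68539; not GATK code.
public class BadCigarSketch {

    // Collect the operator letters of a CIGAR string such as "5S10M2D3M", ignoring lengths.
    static List<Character> operators(String cigar) {
        List<Character> ops = new ArrayList<>();
        for (char c : cigar.toCharArray()) {
            if (!Character.isDigit(c)) {
                ops.add(c);
            }
        }
        return ops;
    }

    static boolean isClip(char op) {
        return op == 'S' || op == 'H';
    }

    static boolean isIndel(char op) {
        return op == 'I' || op == 'D';
    }

    // Returns true if the CIGAR violates any of the five rules in the commit message.
    static boolean isBadCigar(String cigar) {
        List<Character> ops = operators(cigar);

        // Skip leading clips; if nothing else remains, the read is fully clipped (or empty).
        int first = 0;
        while (first < ops.size() && isClip(ops.get(first))) {
            first++;
        }
        if (first == ops.size()) {
            return true;
        }

        // Skip trailing clips; a non-clip operator is guaranteed to exist at this point.
        int last = ops.size() - 1;
        while (isClip(ops.get(last))) {
            last--;
        }

        // Starts or ends with a deletion, with or without surrounding clips.
        if (ops.get(first) == 'D' || ops.get(last) == 'D') {
            return true;
        }

        for (int i = first; i <= last; i++) {
            char op = ops.get(i);
            // Hard or soft clip in the middle of the CIGAR.
            if (isClip(op)) {
                return true;
            }
            // Consecutive indels: II, DD, ID or DI.
            if (i > first && isIndel(op) && isIndel(ops.get(i - 1))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] examples = {"5S10M5H", "10M2I5M3D10M", "10M5S10M", "5S2D10M", "10M2D", "10S", "10M2I3D10M"};
        for (String cigar : examples) {
            System.out.println(cigar + " bad=" + isBadCigar(cigar));
        }
    }
}
```

In this sketch the clips are trimmed from both ends first, so the "with or without preceding clips" and "with or without follow-up clips" cases fall out of the same two comparisons against `D`.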

10733 Commits (5a4e2a5fa4d7ee7c6d7773d261eebc8a3ff349f1)
