gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Ryan Poplin	10961db3ce	Another round of FindBugs fixes. Object returns its internal reference to an externally mutable array. Very dangerous.	2012-08-21 09:35:55 -04:00
Ryan Poplin	605acaae9c	Another round of FindBugs fixes. Object internally stores a reference to an externally mutable array. Very dangerous.	2012-08-21 09:33:58 -04:00
Ryan Poplin	55b7949d68	Another round of FindBugs fixes. Comparator doesn't implement Serializable.	2012-08-21 09:20:55 -04:00
Christopher Hartl	ba8622ff0d	A number of stashed changes are lurking in here. In order of importance: - Fix for M_Trieb's error report on the forum, and addition of integration tests to cover the walker. - Addition of StructuralIndel as a class of variation within the VariantContext. These are for variants with a full alt allele that's >150bp in length. - Adaptation of the MVLikelihoodRatio to work for a set of trios (takes the max over the trios of the MVLR) - InsertSizeDistribution changed to use the new gatk report output (it was previously broken) - RetrogeneDiscovery changed to be compatible with the new gatk report - A maxIndelSize argument added to SelectVariants - ByTranscriptEvaluator rewritten for cleanliness - VariantRecalibrator modified to not exclude structural indels from recalibration if the mode is INDEL - Documentation added to DepthOfCoverageIntegrationTest (no, don't yell at chartl ;_; ) Also sorry for the long commit history behind this that is the result of fixing merge conflicts. Because this also fixes a conflict (from git stash apply), for some reason I can't rebase all of them away. I'm pretty sure some of the commit notes say "this note isn't important because I'm going to rebase it anyway".	2012-08-21 07:08:58 -04:00
Eric Banks	3dfe8df262	Merged bug fix from Stable into Unstable	2012-08-20 23:12:58 -04:00
Eric Banks	40d5efc804	Fix for Adam K's reported bug: we weren't handling reads that were entirely insertions properly in LIBS. Specifically, the event bases were off-by-one (which was disastrous in Adam's case with a 1bp read). Added a unit test to cover this case.	2012-08-20 23:12:41 -04:00
Khalid Shakir	3514fb6e66	Changed the default memory limit from none to 2GB upon suggestions from delangel, carneiro, and depristo.	2012-08-20 21:41:13 -04:00
Eric Banks	286b658fab	Re-enabling parallelism in the BaseRecalibrator now that the release is out.	2012-08-20 21:25:14 -04:00
Eric Banks	5b1781fdac	Merge remote-tracking branch 'unstable/master'	2012-08-20 21:18:54 -04:00
Guillermo del Angel	7bbd2a7a20	Fixing merge conflicts	2012-08-20 20:38:25 -04:00
Guillermo del Angel	2041cb853c	New implementation of AD - ignore now non-informative reads based on per-read likelihoods	2012-08-20 20:31:34 -04:00
Ryan Poplin	77fbaec044	Another round of FindBugs fixes. Class implements its own compareTo() but uses base Object.equals() which can lead to unpredictable behavior.	2012-08-20 16:55:00 -04:00
Ryan Poplin	5e28bca630	Another round of FindBugs fixes. Should be static inner class.	2012-08-20 16:15:48 -04:00
Ryan Poplin	a9472c1980	Another round of FindBugs fixes. Inefficient use of keySet iterator instead of entrySet iterator.	2012-08-20 16:11:45 -04:00
Ryan Poplin	5db3bd6fd2	Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-08-20 15:28:57 -04:00
Ryan Poplin	464d49509a	Pulling out common caller arguments into its own StandardCallerArgumentCollection base class so that every caller isn't exposed to the unused arguments from every other caller.	2012-08-20 15:28:39 -04:00
Eric Banks	4450d66c64	Fixing the docs for DP and AD	2012-08-20 15:10:24 -04:00
Ryan Poplin	c67d708c51	Bug fix in HaplotypeCaller for non-regular bases in the reference or reads. Those events don't get created any more. Bug fix for advanced GenotypeFullActiveRegion mode: custom variant annotations created by the HC don't make sense when in this mode so don't try to calculate them.	2012-08-20 13:41:08 -04:00
Guillermo del Angel	5b5fee56cf	Next iteration of new VA interface: extend changes to per-genotype annotations as well. Will allow to have AD correctly implemented at last (that change not done yet)	2012-08-20 12:52:15 -04:00
Eric Banks	154f65e0de	Temporarily disabling multi-threaded usage of BaseRecalibrator for performance reasons.	2012-08-20 12:43:17 -04:00
Menachem Fromer	37dd7209df	Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-08-20 12:31:34 -04:00
Guillermo del Angel	c384677917	Merge branch 'master' of ssh://gsa4.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-08-20 10:27:25 -04:00
Eric Banks	97b191f578	Thanks to Guillermo I was able to isolate an instance of where the MLEAC > AN. It turns out that this is valid, e.g. when PLs are all 0s for a sample we no-call it but it's allowed to factor into the MLE (since that's the contract with the exact model). Removing the check in UG and instead protecting for it in the AlleleCount stratification.	2012-08-20 01:16:23 -04:00
Guillermo del Angel	963ad03f8b	Second step of interface cleanup for variant annotator: several bug fixes, don't hash pileup elements to Maps because the hashCode() for a pileup element is not implemented and strange things can happen. Still several things to do, not done yet	2012-08-19 21:18:18 -04:00
Mark DePristo	7fa76f719b	Print "Parsing data stream with BCF version BCFx.y" in BCF2 codec as .debug not .info	2012-08-19 10:32:55 -04:00
Mark DePristo	9121b98167	CombineVariants outputs the first non-MISSING qual, not the maximum -- When merging multiple VCF records at a site, the combined VCF record has the QUAL of the first VCF record with a non-MISSING QUAL value. The previous behavior was to take the max QUAL, which sometimes resulted in strange downstream confusion.	2012-08-19 10:29:38 -04:00
Guillermo del Angel	d9641e3d57	Merge branch 'master' of ssh://gsa4.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-08-19 09:23:21 -04:00
David Roazen	342a5b68ed	Bring bamboo performance test runner script under version control	2012-08-18 21:08:29 -04:00
Mark DePristo	d3206e35e0	Cleanup and expansion of GATKPerformanceOfTime -- Does BQSR parallelism test -- Does CountLoci parallelism test -- Updated R script	2012-08-18 18:47:26 -04:00
Mauricio Carneiro	d16cb68539	Updated and more thorough version of the BadCigar read filter * No reads with Hard/Soft clips in the middle of the cigar * No reads starting with deletions (with or without preceding clips) * No reads ending in deletions (with or without follow-up clips) * No reads that are fully hard or soft clipped * No reads that have consecutive indels in the cigar (II, DD, ID or DI) Also added systematic test for good cigars and iterative test for bad cigars.	2012-08-17 17:05:27 -04:00
Mark DePristo	980685af16	Fix GSA-137: Having both DataSource.REFERENCE and DataSource.REFERENCE_BASES is confusing to end users. -- Removed REFERENCE_BASES option. You only have REFERENCE now. There's no efficiency savings for the REFERENCE_BASES option any longer, since the reference bases are loaded lazy so if you don't use them there's effectively no cost to making the RefContext that could load them.	2012-08-17 14:55:38 -04:00
Eric Banks	2676b7fc2e	Put in a sanity check that MLEAC <= AN	2012-08-17 11:49:53 -04:00
Mark DePristo	0a706c9105	Add support for CombineVariants nt option in GATKPerformanceOverTime -- Also includes some nicer PDF formatting	2012-08-17 11:49:02 -04:00
Mark DePristo	bf6c0aaa57	Fix for missing formatter in R 2.15 -- VariantCallQC now works on newest ESP call set	2012-08-17 11:49:02 -04:00
Mark DePristo	daa26cc64e	Print to logger not to System.out in CachingIndexFastaSequenceFile when profiling cache performance	2012-08-17 11:49:02 -04:00
Mark DePristo	be0f8beebb	Fixed GSA-434: GATK should generate error when gzipped FASTA is passed in. -- The GATK sort of handles this now, but only if you have the exactly correct sequence dictionary and FAI files associated with the reference. If you do, the file can be .gz. If not, the GATK will fail on creating the FAI and DICT files. Added an error message that handles this case and clearly says what to do.	2012-08-17 11:49:02 -04:00
Mark DePristo	a3d2764d11	Fixed: GSA-392 @arguments with just a short name get the wrong argument bindings -- Now blows up if an argument begins with -. Implementation isn't pretty, as it actually blows up during Queue extension creation with a somewhat obscure error message but at least it's something.	2012-08-17 11:49:01 -04:00
Mark DePristo	4c0f198d48	Potential fix for GSA-484: Incomplete writing of temp BCF when running CombineVariants in parallel -- Keep reading from BCF2 input stream when read(byte[]) returns < number of needed bytes -- It's possible (I think) that the failure in GSA-484 is due to multi-threading writing/reading of BCF2 records where the underlying stream is not yet flushed so read(byte[]) returns a partial result. Now loops until we get all of the needed bytes or EOF is encountered	2012-08-17 11:49:01 -04:00
Mark DePristo	de3be45806	Proper function call in BCF2Decoder to validateReadBytes	2012-08-17 11:49:01 -04:00
Mark DePristo	67ebd65512	Bugfix for potential SEGFAULT with JNA getting execution hosts for LSF with multiple hosts	2012-08-17 11:49:01 -04:00
Mark DePristo	54e7302daf	Improvements to GATKPerformanceOverTime -- CombineVariants parallelism test -- Easy way to ask for specific runs with enum argument -- Update for R to handle new outputs	2012-08-17 11:49:01 -04:00
Eric Banks	53383e82ec	Hmm, not good. Fixing the math in PBT resulted in changed MD5s for integration tests that look like significant changes. I am reverting and will report this to Laurent.	2012-08-16 21:41:18 -04:00
Eric Banks	65c594afff	Better error message for reads that begin/end with a deletion in LIBS	2012-08-16 21:27:07 -04:00
Guillermo del Angel	b61ecc7c19	Fix merge conflicts	2012-08-16 20:45:52 -04:00
Guillermo del Angel	d26183e0ec	First preliminary big refactoring of UG annotation engine. Goals: a) Remove gigantic hack that cached per-read haplotype likelihoods in a static array so that annotations would go back and retrieve them, b) unify interface for annotations between HaplotypeCaller and UnifiedGenotyper, c) as a consequence, removed and cleaned duplicated code. As a bonus, annotations have now more relevant info to help them compute values. Major idea is that per-read haplotype likelihoods are now stored in a single unified object of class PerReadAlleleLikelihoodMap. Class implementation in theory hides internal storage details from outside work (still may need work cleaning up interface), and this object(or rather, a Map from Sample->perReadAlleleLikelihoodMap) is produced by UGCalcLikelihoods. The genotype calculation is also able to potentially use this info if needed. All InfoFieldAnnotations now get an extra argument with this map. Currently, this map is only produced for indels in UG, or for all variants within HaplotypeCaller. If this map is absent (SNPs in UG), the old Pileup interface is used, but it's avoided whenever possible. FORMAT annotations are not yet changed but will be focus of second step. Major benefit will be that annotations will be able to very easily discard non-informative reads for certain events. HaplotypeCaller also uses this new class, and no longer hard-codes the mapping of allele ->list(reads) but instead uses the same objects and interfaces as the rest of the modules. Code still needs further testing/cleaning/reviewing/debugging	2012-08-16 20:36:53 -04:00
Mark DePristo	6a2862e8bc	GSA-483: Bug in GATKdocs for Enums -- Fixed to no longer show constants in enums as constant values in the gatkdocs	2012-08-16 16:24:17 -04:00
Eric Banks	3253fc216b	FindBugs 'Maintainability' fixes	2012-08-16 15:53:06 -04:00
Eric Banks	05cbf1c8c0	FindBugs 'Efficiency' fixes	2012-08-16 15:40:52 -04:00
Mark DePristo	d8071c66ed	Removing SlowGenotype object from GATK	2012-08-16 15:23:06 -04:00
Eric Banks	a22e7a5358	Should've run 'ant clean' instead of just 'ant'. In any event, these are 2 cases where we are setting a class's internal static variable directly. Very dangerous.	2012-08-16 15:07:32 -04:00
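
Commit d16cb68539 above lists five rules for the updated BadCigar read filter. The following is a minimal sketch of those rules applied to a plain CIGAR string. It is not the GATK implementation: the class name `BadCigarSketch` and its methods are hypothetical, it works on raw strings rather than GATK or htsjdk read objects, and it assumes the input is a syntactically well-formed CIGAR.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a hypothetical stand-in, not the GATK BadCigar filter. */
public class BadCigarSketch {

    /** Extracts the operator letters from a CIGAR string, e.g. "5S10M2D" gives [S, M, D]. */
    static List<Character> operators(String cigar) {
        List<Character> ops = new ArrayList<>();
        for (char c : cigar.toCharArray()) {
            if (!Character.isDigit(c)) {
                ops.add(c);
            }
        }
        return ops;
    }

    static boolean isClip(char op) {
        return op == 'H' || op == 'S';
    }

    static boolean isIndel(char op) {
        return op == 'I' || op == 'D';
    }

    /** Returns true if the CIGAR violates any of the five rules in the commit message. */
    static boolean isBad(String cigar) {
        List<Character> ops = operators(cigar);

        // Skip leading and trailing clips to find the aligned core of the read.
        int first = 0;
        while (first < ops.size() && isClip(ops.get(first))) {
            first++;
        }
        int last = ops.size() - 1;
        while (last >= 0 && isClip(ops.get(last))) {
            last--;
        }

        // Rule: no reads that are fully hard or soft clipped (empty input also lands here).
        if (first > last) {
            return true;
        }
        // Rules: no reads starting or ending with a deletion, with or without clips around it.
        if (ops.get(first) == 'D' || ops.get(last) == 'D') {
            return true;
        }
        for (int i = first; i <= last; i++) {
            char op = ops.get(i);
            // Rule: no hard/soft clips in the middle of the cigar.
            if (isClip(op)) {
                return true;
            }
            // Rule: no consecutive indels (II, DD, ID or DI).
            if (i > first && isIndel(op) && isIndel(ops.get(i - 1))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (String cigar : new String[] {"5S10M", "5M2S5M", "2D10M", "5M2I3D5M"}) {
            System.out.println(cigar + " bad=" + isBad(cigar));
        }
    }
}
```

Stripping the outer clips first lets the start/end deletion checks and the mid-cigar clip check share one pass over the remaining operators.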


10515 Commits (2d4b00833b3d0e26b2bf9d8f016f4001bc86fcce) All Branches