gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Ryan Poplin	2d5bbecd9e	Merged bug fix from Stable into Unstable	2011-08-16 14:19:04 -04:00
Andrey Sivachenko	9f3328db53	fixing read group name collision: before writing the read into respective stream in nway-out mode we now retrieve the original rg, not the merged/modified one	2011-08-16 13:45:40 -04:00
Eric Banks	125ad0bcfa	Added docs to RTC	2011-08-16 12:46:48 -04:00
Mauricio Carneiro	8a51732049	Fixes to ReadClipper and added Reference Coordinate clipping. * Added reference coordinate based hard clipping functions. This allows you to set a hard cut on where you need the read to be trimmed despite indels. * soft clipping was messing up cigar string if there was already a hard clip at the beginning of the read. Fixed. * hard clipping now works with previously hard clipped reads.	2011-08-14 14:54:33 -04:00
Mauricio Carneiro	291d8c7596	Fixed HardClipping and Interval containment * Hard clipping was wrongfully hard clipping unmapped reads while soft clipping then hard clipping mapped reads. Now we throw exception if we try to hard/soft clip unmapped reads and use the soft->hard clip procedure for every mapped read. * Interval containment needed a <= and >= to make sure it caught the borders right.	2011-08-14 14:54:33 -04:00
Mauricio Carneiro	0be1dacddb	Refactored interval clipping utility reads are clipped in map() and now we cover almost all cases. Left behind the case where the read stretches through two intervals. This will need special treatment later.	2011-08-14 14:54:33 -04:00
Mauricio Carneiro	10e873d9c6	Merge branch 'repval'	2011-08-12 15:24:31 -04:00
Menachem Fromer	9121b8ed65	Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-08-12 12:24:19 -04:00
Menachem Fromer	7ed120361d	Fixed bug that required symbolic alleles to be padded with reference base and added integration test to test parsing and output of symbolic alleles	2011-08-12 12:23:44 -04:00
Eric Banks	7ea9196321	Better error message for name/type clashes.	2011-08-12 11:18:14 -04:00
Eric Banks	27f0748b33	Renaming the HapMap codec and feature to RawHapMap so that we don't get esoteric errors when trying to bind a rod with the name 'hapmap' (since it was also a feature).	2011-08-12 11:11:56 -04:00
Mark DePristo	2007d2fcad	Better documentation for default value fields -- DocString function for types that create default outputs "stdout" -- RodBinding now creates a makeUnbound default value automatically for you if your RodBinding isn't required -- Removed warning about sparse help from TextFormattingUtils	2011-08-10 22:16:22 -04:00
Mauricio Carneiro	bb557266ca	Merge branches to get new RodBinding framework Conflicts: private/java/src/org/broadinstitute/sting/gatk/walkers/replication_validation/ReplicationValidationWalker.java	2011-08-10 18:23:01 -04:00
David Roazen	0497170bc9	SnpEffCodec now implements SelfScopingFeatureCodec so that we no longer have to specify the codec name on the command line for SnpEff files.	2011-08-10 13:12:09 -04:00
Eric Banks	70b3daf689	VariantsToVCF is up and running again; integration tests are reenabled (and added one for dbSNP).	2011-08-09 03:03:43 -04:00
David Roazen	b180a1311a	Merge branch 'snpEff'	2011-08-08 22:12:14 -04:00
David Roazen	a13bc7b929	Added an integration test for the SnpEff annotation support, as well as some extra safety checks and comments.	2011-08-08 20:01:24 -04:00
Mark DePristo	f8a56bc64b	Merge branch 'master' into rodRefactor	2011-08-08 16:58:18 -04:00
David Roazen	5e288136e0	Added unit tests for the SnpEff codec, and made minor adjustments to the codec itself.	2011-08-08 16:51:43 -04:00
Eric Banks	d7813db217	Combine Variants was actually outputting invalid VCFs in cases where it was combining Variant Contexts with different alternate alleles: if any of the genotypes had PLs they were no longer valid/correct. Added a check for such cases (the combined VC has more alleles than an original VC) and strip out the PLs when triggered; added integration test to cover it. I also added the check to Select Variants, although it currently doesn't remove unused alleles so it should never trigger. Is there any reason not to strip out unused alleles after a select?	2011-08-08 16:25:35 -04:00
Mark DePristo	383bb6f0e0	Merge branch 'master' into rodRefactor	2011-08-08 15:25:55 -04:00
Mark DePristo	e36994e36b	Refactored a FeatureManager class from RMDTrackBuilder New class handles (vastly more cleanly) the db of tribble codecs, features, and names for use throughout the GATK. Added SelfScopingFeatureCodec interface that allows a FeatureCodec to examine a file and determine if the file can be parsed. This is the first step towards allowing the GATK to dynamically determine the type of a RodBinding.	2011-08-08 14:04:46 -04:00
Eric Banks	197169e47b	Submitting patch from Larry Singh to make MathUtils compatible with java 1.7	2011-08-08 13:34:04 -04:00
David Roazen	dd974040af	When finding the highest-impact effect at a locus, all effects that are not within a non-coding gene are now considered higher impact than all effects that are within a non-coding gene.	2011-08-08 13:29:54 -04:00
David Roazen	c1061e994c	Initial support for adding genomic annotations through VariantAnnotator using the output from the SnpEff tool, which replaces the old Genomic Annotator.	2011-08-08 13:29:53 -04:00
Mark DePristo	0db79207e8	Refactored dependency from CommandLineGATK from javadocs This allows us to run the GATK again in environments without Javadoc loading by default in the classpath	2011-08-08 12:27:13 -04:00
Mark DePristo	e5fde0d16b	Merge branch 'master' into rodRefactor	2011-08-08 10:08:43 -04:00
Mark DePristo	5f8bc3aa8a	Documenting classes, and name cleanup	2011-08-07 15:17:50 -04:00
Mark DePristo	1c63d43176	Help now points to GATKDocs instead of spitting out full, garbled description	2011-08-07 15:02:46 -04:00
Mark DePristo	75632abf88	Merge branch 'master' into rodRefactor Conflicts: public/java/src/org/broadinstitute/sting/gatk/walkers/variantutils/VariantsToVCF.java public/java/test/org/broadinstitute/sting/gatk/walkers/indels/RealignerTargetCreatorIntegrationTest.java public/java/test/org/broadinstitute/sting/gatk/walkers/recalibration/RecalibrationWalkersIntegrationTest.java	2011-08-04 18:44:14 -04:00
Mauricio Carneiro	b22a3d6508	Functional VCF output. It is outputting a VCF with the 'second best guess' for the alternate allele correctly. Annotations are added at the pool level, but may get overwritten at the lane and site level. Still need to implement the merging of the annotations at higher levels.	2011-08-04 17:49:08 -04:00
Eric Banks	e48492f3c3	Validate that the reference padding base for indels is correct.	2011-08-04 12:48:56 -04:00
Mark DePristo	8f696c7731	Continuing progress towards RodBinding 1.0 -- Cleaning up old interface to RMDT, docs and contracts added -- Proper type checking for RodBinding for cases where the Tribble type isn't found or is the wrong type	2011-08-03 17:19:28 -04:00
Mark DePristo	800bb97f0b	Removed getFeaturesAsGATKFeature and created createGenomeLoc(Feature) in genomeLocParser Updated all walkers that used the now deleted methods.	2011-08-03 16:04:51 -04:00
Mark DePristo	79e4a8f6d3	Merge Conflicts: private/java/src/org/broadinstitute/sting/gatk/walkers/qc/TestVariantContextWalker.java public/java/src/org/broadinstitute/sting/gatk/walkers/phasing/PhaseByTransmission.java public/java/src/org/broadinstitute/sting/gatk/walkers/variantrecalibration/VariantDataManager.java public/java/src/org/broadinstitute/sting/gatk/walkers/variantutils/SelectVariants.java public/java/src/org/broadinstitute/sting/gatk/walkers/variantutils/VariantValidationAssessor.java public/java/test/org/broadinstitute/sting/gatk/walkers/recalibration/RecalibrationWalkersIntegrationTest.java public/java/test/org/broadinstitute/sting/gatk/walkers/recalibration/RecalibrationWalkersPerformanceTest.java public/java/test/org/broadinstitute/sting/gatk/walkers/varianteval/VariantEvalIntegrationTest.java public/java/test/org/broadinstitute/sting/utils/variantcontext/VariantContextIntegrationTest.java	2011-08-03 15:09:47 -04:00
Eric Banks	f62f47d476	Not sure why this didn't fail before, but bringing VE up to date with previous changes	2011-08-03 14:27:07 -04:00
Eric Banks	5dc324ff35	Dealing with merge conflict	2011-08-03 11:03:47 -04:00
Eric Banks	7c89fe01b3	Instead of having the padded reference base be some hackish attribute it is now an actual variable in the Variant Context class. More importantly, we now always require that it be present when padding is necessary - and validate as such upon construction of the VC. This cleans up the interface significantly because we no longer require that a reference base be passed in when writing a VC/VCF record.	2011-08-03 11:00:36 -04:00
Khalid Shakir	5dcac7b064	GATKReport v0.2: - Floating point column widths are measured correctly - Using fixed width columns instead of white space separated which allows spaces embedded in cell values - Legacy support for parsing white space separated v0.1 tables where the columns may not be fixed width - Enforcing that table descriptions do not contain newlines so that tables can be parsed correctly Replaced GATKReportTableParser with existing functionality in GATKReport	2011-08-03 00:24:47 -04:00
Mark DePristo	2874835997	Bug fix for type checking RodBindings Now compares the feature class not the codec class. UnitTests improvements integrationtests on their way to actually running	2011-08-02 22:25:41 -04:00
Mark DePristo	b5e843f8f0	Approaching the end for the new RodBinding system -- support for explicit naming of bindings (-X:name,type x) -- support for automatic naming of bindings in lists (-X:vcf foo.vcf -X:vcf bar.vcf will generate internal names X and X2) -- ParserEngineUnitTest expanded to cover all of the Rodbinding cases -- RodBindingUnitTest tests all of the low-level accessors -- Parsing engine throws UserExceptions when bad bindings are provided on the command line	2011-08-02 22:00:06 -04:00
David Roazen	d3437e62da	Added a simple utility method Utils.optimumHashSize() to calculate the optimum initial size for a Java hash table (HashMap, HashSet, etc.) given an expected maximum number of elements. The optimum size is the smallest size that's guaranteed not to result in any rehash / table-resize operations. Example Usage: Map<String, Object> hash = new HashMap<String, Object>(Utils.optimumHashSize(expectedMaxElements)); I think we're paying way too heavy a price in unnecessary rehash operations across the GATK. If you don't specify an initial size, you get a table of size 16 that gets completely rehashed and doubles in size every time it becomes 75% full. This means you do at least twice as much work as you need to in order to populate your table: (n + n/2 + n/4 + ... + 16) ~= (1 + 1/2 + 1/4 + ...) * n ~= 2 * n	2011-08-02 21:59:06 -04:00
Mark DePristo	3a27a25cfc	Validates that the tribble binding provides the right object types at startup Tests to ensure this remains working	2011-08-02 20:11:24 -04:00
Mark DePristo	03741fb640	Merge branch 'master' into rodRefactor Conflicts: public/java/src/org/broadinstitute/sting/gatk/walkers/annotator/VariantAnnotatorEngine.java public/java/test/org/broadinstitute/sting/gatk/walkers/indels/IndelRealignerIntegrationTest.java public/java/test/org/broadinstitute/sting/gatk/walkers/indels/IndelRealignerPerformanceTest.java public/java/test/org/broadinstitute/sting/utils/variantcontext/VariantContextIntegrationTest.java	2011-08-02 14:21:58 -04:00
Mark DePristo	a366f9a18d	Updating tools to use the RodBinding<T> syntax	2011-08-02 14:05:51 -04:00
Mauricio Carneiro	a58ddab93b	minQual and minPower filters added. VCF output added. Calls are now made based on the likelihood AC model. Two filters are applied: minQual and minPower. Output is now a VCF file with the variant context. It's now called the gatk's PoolCaller, no longer Replication Validation framework. Lots of testing ensue....	2011-07-28 18:58:36 -04:00
Eric Banks	ff31fa7990	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-07-27 16:15:23 -04:00
Mark DePristo	15be383d5b	Merge branch 'master' into rodRefactor	2011-07-27 15:36:49 -04:00
Mark DePristo	097828a466	ParsingEngine now maintains the list of rodBindings No longer try to reparse objects to find the right fields Direct support in RodBinding for getTags()	2011-07-27 11:36:53 -04:00
Mauricio Carneiro	321afac4e8	Updates to the help layout. New style.css, new template for the walker auto-generated html. Short description is no longer repeated in the long description of the walker. Updated DiffObjectsWalker and ContigStatsWalker as "reference" documented walkers.	2011-07-26 19:29:25 -04:00
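
Note on commit d3437e62da (Utils.optimumHashSize): the commit message describes the idea but does not show the method body. The following is a minimal sketch of one way such a helper could be written, not the GATK implementation. It assumes the default java.util.HashMap load factor of 0.75; the class name HashSizing and the constant name are hypothetical, introduced only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Utils class named in the commit message.
final class HashSizing {
    // Assumption: the table uses HashMap's default load factor.
    private static final double DEFAULT_LOAD_FACTOR = 0.75;

    private HashSizing() {}

    /**
     * Returns the smallest initial capacity c with
     * expectedMaxElements <= c * 0.75, so a HashMap created with that
     * capacity should not need to resize while it holds at most
     * expectedMaxElements entries.
     */
    static int optimumHashSize(int expectedMaxElements) {
        if (expectedMaxElements < 0) {
            throw new IllegalArgumentException("expectedMaxElements must be >= 0");
        }
        // The double-to-int cast saturates at Integer.MAX_VALUE for huge inputs.
        return (int) Math.ceil(expectedMaxElements / DEFAULT_LOAD_FACTOR);
    }

    public static void main(String[] args) {
        int expectedMaxElements = 1000;
        // Same usage pattern as the example in the commit message.
        Map<String, Object> hash =
                new HashMap<String, Object>(optimumHashSize(expectedMaxElements));
        for (int i = 0; i < expectedMaxElements; i++) {
            hash.put("key" + i, i);
        }
        System.out.println("requested capacity: " + optimumHashSize(expectedMaxElements));
        System.out.println("entries stored: " + hash.size());
    }
}
```

HashMap rounds the requested capacity up to a power of two internally, so the table actually allocated may be larger than the value returned here; the sketch only aims to avoid resizes during population.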

113 Commits (a21e193a9e49646d7f9b147e79ddba525f90a092)