Eric Banks
c405a75f54
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-17 13:28:25 -04:00
Eric Banks
575303ae6b
Renaming for consistency and bringing up to speed with new rod system
2011-08-17 13:28:19 -04:00
Eric Banks
6d629c176c
Adding docs
2011-08-17 13:27:36 -04:00
Eric Banks
a21e193a9e
Adding docs to 3 more walkers
2011-08-17 12:35:08 -04:00
Menachem Fromer
98acb546a9
Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-17 12:22:29 -04:00
Menachem Fromer
d1bb302d12
Added GatkDocs documentation
2011-08-17 12:21:37 -04:00
Mark DePristo
3da71a9bb6
Clean up summary
2011-08-17 12:04:45 -04:00
Mark DePristo
c6fb215faf
GATKDocs for VariantsToTable
...
-- Made a previously required argument optional, as this was a long-standing bug
2011-08-17 12:02:41 -04:00
Mark DePristo
5f794d16a7
Fixed bad character in documentation
2011-08-17 12:01:08 -04:00
Mark DePristo
9d1d5bd27a
Revert "Fixed bad character in documentation"
...
This reverts commit a1f50c82d3cb25e5e83d36e9054d74cdee957d87.
2011-08-17 11:57:31 -04:00
Mark DePristo
78deb3f195
Fixed bad character in documentation
2011-08-17 11:57:00 -04:00
Mark DePristo
79dcfca25f
Fixed bad character in documentation
2011-08-17 11:56:51 -04:00
Eric Banks
b3b5d608ca
Adding docs to yet more walkers
2011-08-17 09:57:19 -04:00
Eric Banks
fadcbf68fd
Adding docs to QC walkers
2011-08-17 09:39:33 -04:00
Mauricio Carneiro
5d6a6fab98
Renamed softUnclipped functions to refCoord*
...
These functions return reference coordinates, so they should be named accordingly.
2011-08-16 18:56:28 -04:00
Mauricio Carneiro
ed8f769dce
Fixed index for getSoftUnclippedEnd()
...
Unclipped end can be calculated simply by looking at the last cigar element and adding its length in case it's a soft clip.
2011-08-16 18:54:28 -04:00
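The calculation described in the getSoftUnclippedEnd() commit above can be sketched as follows. This is only an illustration of the idea, not the GATK implementation: the CigarOp type, the softUnclippedEnd name, and the inclusive alignmentEnd parameter are all assumptions made for the example.

```java
final class SoftUnclipped {
    /** Hypothetical minimal CIGAR element: an operator character (e.g. 'M', 'S') and its length. */
    static final class CigarOp {
        final char op;
        final int length;

        CigarOp(char op, int length) {
            this.op = op;
            this.length = length;
        }
    }

    /**
     * Returns the reference coordinate of the read's end as if trailing soft-clipped bases
     * were aligned: the alignment end plus the length of the last CIGAR element when that
     * element is a soft clip, otherwise the alignment end unchanged.
     */
    static int softUnclippedEnd(int alignmentEnd, CigarOp[] cigar) {
        if (cigar.length == 0) {
            return alignmentEnd;
        }
        CigarOp last = cigar[cigar.length - 1];
        return last.op == 'S' ? alignmentEnd + last.length : alignmentEnd;
    }
}
```

Like the commit message, the sketch inspects only the last element, so a read whose CIGAR ends in a hard clip after a soft clip is returned unchanged.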
Eric Banks
5f3f46aad1
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-16 16:26:33 -04:00
Eric Banks
946f5c53fe
Adding docs to more walkers
2011-08-16 16:26:26 -04:00
Mark DePristo
6e828260a0
Removed -B support. Now explodes with error if -B provided.
2011-08-16 16:13:47 -04:00
Ryan Poplin
2d5bbecd9e
Merged bug fix from Stable into Unstable
2011-08-16 14:19:04 -04:00
Mauricio Carneiro
07c1e113cd
Fixed interval traversal for previously hard clipped reads.
...
If a read was hard clipped for being low quality and no longer overlaps the interval, it will now be discarded instead of being treated as an error by the GATK traversal engine.
2011-08-16 14:18:05 -04:00
Ryan Poplin
9d4add3268
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable
2011-08-16 14:18:03 -04:00
Ryan Poplin
170d1ff7b6
Fix in UG for trying to call indels at IUPAC code bases when in EMIT_ALL_SITES mode
2011-08-16 14:17:46 -04:00
Mauricio Carneiro
b135565183
Added low quality clipping
...
Clips both tails of a read if the tails are below a given quality threshold (default Q2).
* Added special treatment for reads that get completely clipped.
2011-08-16 13:51:25 -04:00
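A minimal sketch of the tail clipping described in the commit above, assuming a base counts as low quality when its score is at or below the threshold (the comparison direction is an assumption, as are the class and method names). It returns the range of bases that survive, and an empty array for the special case of a completely clipped read.

```java
final class LowQualityTails {
    /**
     * Returns {first, last}, the inclusive indices of the bases kept after trimming
     * low-quality bases from both tails, or an empty array when every base is low quality.
     */
    static int[] keptRange(byte[] quals, int lowQual) {
        int first = 0;
        // Walk in from the left tail past every low-quality base.
        while (first < quals.length && quals[first] <= lowQual) {
            first++;
        }
        if (first == quals.length) {
            return new int[0]; // the read is completely clipped
        }
        int last = quals.length - 1;
        // Walk in from the right tail; stops at the latest on the base found above.
        while (quals[last] <= lowQual) {
            last--;
        }
        return new int[] {first, last};
    }
}
```

Low-quality bases in the middle of the read are left alone; only the two tails are trimmed.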
Andrey Sivachenko
9f3328db53
fixing read group name collision: before writing the read into respective stream in nway-out mode we now retrieve the original rg, not the merged/modified one
2011-08-16 13:45:40 -04:00
Eric Banks
ab0b56ed11
Minor doc fixes
2011-08-16 12:55:45 -04:00
Eric Banks
125ad0bcfa
Added docs to RTC
2011-08-16 12:46:48 -04:00
Eric Banks
ef9216011e
Added docs to IR
2011-08-16 12:24:53 -04:00
Eric Banks
ab1e3d6a98
Use the right set of sample names
2011-08-16 01:03:05 -04:00
Eric Banks
36c7f83208
Refactoring VE stratifications so that they don't pass around bulky data; instead just pull needed data from the VE parent. This allows us to stop using deprecated features of the rod system.
2011-08-15 16:31:57 -04:00
Eric Banks
1246b89049
Forgot to initialize variants on the merge
2011-08-15 16:00:43 -04:00
Mauricio Carneiro
993ecb85da
Added Hard Clipping Tail Ends
...
Added functionality to hard clip the low quality tail ends of reads (lowQual <= 2)
2011-08-15 15:22:54 -04:00
Eric Banks
045e8a045e
Updating random walkers to new rod system; removing unused GenotypeAndValidateWalker
2011-08-15 14:05:23 -04:00
Eric Banks
fc2c21433b
Updating random walkers to new rod system
2011-08-15 13:29:31 -04:00
Eric Banks
3d56bbf087
Resolving merge conflicts
2011-08-15 12:28:05 -04:00
Eric Banks
9ddbfdcb9f
Check filtered status before applying to alt reference
2011-08-15 12:25:23 -04:00
Mauricio Carneiro
0d976d6211
Fixed second time clipping
...
When a read is clipped once, and then in the second operation, because of indels, it doesn't reach the coordinate initially set for hard clipping, the indices were wrong. This should fix it.
2011-08-15 12:04:53 -04:00
Mauricio Carneiro
489c15b99d
Fixed indexing issue in coordinate conversion
...
When a read had been previously soft clipped, the UnclippedEnd could not be used directly as Reference Coordinate for clipping, because the read does not go that far.
2011-08-15 01:42:34 -04:00
Mauricio Carneiro
c7b69a4574
Fixed integration tests
2011-08-14 16:38:20 -04:00
Mauricio Carneiro
6ae3f9e322
Wrapped clipping op information
...
The clipping op extra information being kept by this walker was specific to the walker, not to the read clipper. Created a wrapper ReadClipperWithData class that keeps the extra information and leaves the ReadClipper slim.
(this is a quick commit to unbreak the build, performing integration tests and will make further commits if necessary)
2011-08-14 15:44:48 -04:00
Mauricio Carneiro
8a51732049
Fixes to ReadClipper and added Reference Coordinate clipping.
...
* Added reference coordinate based hard clipping functions. This allows you to set a hard cut on where you need the read to be trimmed despite indels.
* soft clipping was messing up cigar string if there was already a hard clip at the beginning of the read. Fixed.
* hard clipping now works with previously hard clipped reads.
2011-08-14 14:54:33 -04:00
Mauricio Carneiro
291d8c7596
Fixed HardClipping and Interval containment
...
* Hard clipping was wrongly hard clipping unmapped reads while soft clipping then hard clipping mapped reads. Now we throw an exception if we try to hard/soft clip unmapped reads and use the soft->hard clip procedure for every mapped read.
* Interval containment needed a <= and >= to make sure it caught the borders right.
2011-08-14 14:54:33 -04:00
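The border fix mentioned in the commit above comes down to using inclusive comparisons on both ends. A tiny sketch, assuming 1-based inclusive coordinates and a hypothetical contains() helper (not the GATK method):

```java
final class IntervalContainment {
    /**
     * True when the read [readStart, readEnd] lies entirely inside the interval
     * [start, stop]. Both bounds are inclusive, so a read that touches a border
     * exactly still counts as contained.
     */
    static boolean contains(int start, int stop, int readStart, int readEnd) {
        return readStart >= start && readEnd <= stop;
    }
}
```

With strict comparisons (> and <) a read aligned exactly to the interval borders would be reported as not contained, which is the kind of border case the commit describes.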
Mauricio Carneiro
0be1dacddb
Refactored interval clipping utility
...
reads are clipped in map() and now we cover almost all cases. Left behind the case where the read stretches through two intervals. This will need special treatment later.
2011-08-14 14:54:33 -04:00
David Roazen
9d2cda3d41
Removed a public -> private dependency in our test suite.
2011-08-12 17:29:10 -04:00
David Roazen
bb4ced3201
SnpEff-related fixes.
...
-To correctly handle indels and MNPs, only consider features that start at the current locus,
rather than features that span the current locus, when selecting the most significant effect.
-Throw a UserException when a SnpEff rodbinding is not provided instead of simply not adding
any annotations and silently returning.
2011-08-12 15:26:24 -04:00
Mauricio Carneiro
10e873d9c6
Merge branch 'repval'
2011-08-12 15:24:31 -04:00
Guillermo del Angel
31dc831531
Merged bug fix from Stable into Unstable
2011-08-12 13:26:41 -04:00
Menachem Fromer
9121b8ed65
Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-12 12:24:19 -04:00
Menachem Fromer
7ed120361d
Fixed bug that required symbolic alleles to be padded with reference base and added integration test to test parsing and output of symbolic alleles
2011-08-12 12:23:44 -04:00
Eric Banks
7ea9196321
Better error message for name/type clashes.
2011-08-12 11:18:14 -04:00
Eric Banks
27f0748b33
Renaming the HapMap codec and feature to RawHapMap so that we don't get esoteric errors when trying to bind a rod with the name 'hapmap' (since it was also a feature).
2011-08-12 11:11:56 -04:00
Eric Banks
005bd71be3
Working too quickly earlier. Fixing syntax.
2011-08-12 10:29:36 -04:00
Menachem Fromer
c7ca33cbff
Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-12 10:12:09 -04:00
Eric Banks
639a01f382
Updating integration test now that VE has been updated
2011-08-12 07:15:08 -04:00
Eric Banks
41f3da75d7
Implementation in VE was confusing 'variant' status vs. 'polymorphic' status. This led to issues because we now match types of eval and comp; specifically, subsetting a VC to a monomorphic sample can't change the 'variant' status of the VC (it's still a variant site or otherwise we'll never match the comps, which breaks GenotypeConcordance). CountVariants really got this wrong. Fixed. VE now passes all integration tests.
2011-08-12 02:22:44 -04:00
Eric Banks
45f973ab1f
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-12 00:40:18 -04:00
Eric Banks
eba316621d
Finish moving VE over to new rod system and fixing up the type inconsistency between eval and comp rods. Now the novel count is always 0 under the known stratification. :)
2011-08-12 00:40:08 -04:00
Menachem Fromer
9de06560df
Update to new RodBinding system
2011-08-11 17:54:16 -04:00
Ryan Poplin
f1d1252be2
Fixing syntax of BQSR and UG performance tests.
2011-08-11 17:04:09 -04:00
Ryan Poplin
902eb0c61e
Adding dbsnp annotation back into the UG integration tests
2011-08-11 13:55:03 -04:00
Eric Banks
90771b74b4
When matching eval to comps, try to choose the one with the same alt allele.
2011-08-11 13:55:01 -04:00
Eric Banks
200f73b008
No reason to warn the user anymore because it's no longer possible for them to specify a dbsnp file on the command-line.
2011-08-11 13:44:07 -04:00
Eric Banks
e93538cdf7
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-11 13:39:36 -04:00
Eric Banks
265c3d744b
Fixing VariantEval logic and having it use the new rod system.
2011-08-11 13:39:34 -04:00
Ryan Poplin
b705d9cf15
Oops, these VariantAnnotator input bindings aren't needed during the UG
2011-08-11 13:17:16 -04:00
Ryan Poplin
7fade88070
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-11 11:02:47 -04:00
Ryan Poplin
c7b9a9ef0a
Updating UnifiedGenotyper to use the new rod binding system.
2011-08-11 11:02:11 -04:00
Mark DePristo
418a4d541f
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-11 11:01:38 -04:00
Mark DePristo
e71255d3c2
GATKDocsExample walker
...
-- Shows the best practice for documenting a walker with the GATKdocs
-- See http://www.broadinstitute.org/gsa/wiki/index.php/GATKdocs#Writing_GATKdocs_for_your_walkers for a brief discussion
2011-08-11 11:01:21 -04:00
Ryan Poplin
79c86e211f
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-11 09:59:20 -04:00
Ryan Poplin
ea42ee4a95
Updating BQSR for the new rod binding system.
2011-08-11 09:58:42 -04:00
Mark DePristo
8cdc0cbd9c
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-11 08:58:49 -04:00
Mark DePristo
40e06f9afb
Fixed broken RodBinding defaults.
...
-- Verified now to be correct at runtime
-- UnitTest covers this
-- createTypeDefault now takes a Type, not a Class, so that parameterized classes can have their parameter fetched in the defaults.
2011-08-11 08:58:30 -04:00
Ryan Poplin
dd5fe8291d
Fixing up some comments in the BQSR
2011-08-11 08:36:00 -04:00
Eric Banks
f1b09db39e
Fixes for rod bindings
2011-08-10 23:08:47 -04:00
Eric Banks
75985c2fa0
Resolving merge conflicts
2011-08-10 22:45:11 -04:00
Eric Banks
bdb1da30fd
Better interface for getting RodBindings to the VariantAnnotatorEngine and its annotations: pass around an AnnotatorCompatibleWalker (interface) object. Updating VA to use the new rod system.
2011-08-10 22:43:08 -04:00
Mark DePristo
0086e27741
makeUnbound now package protected
...
-- Removed references to it in the codebase
-- Fixed documentation I saw that had the summary + body style
2011-08-10 22:29:32 -04:00
Mark DePristo
cb6cf25bb0
Updating SelectVariants documentation to reflect best practice
2011-08-10 22:24:18 -04:00
Mark DePristo
00b4d6ec57
Updated the best practice on documenting a field
...
-- Best practice is now to skip the summary, as this is the @annotation doc value.
2011-08-10 22:21:12 -04:00
Mark DePristo
2007d2fcad
Better documentation for default value fields
...
-- DocString function for types that create default outputs "stdout"
-- RodBinding now creates a makeUnbound default value automatically for you if your RodBinding isn't required
-- Removed warning about sparse help from TextFormattingUtils
2011-08-10 22:16:22 -04:00
Mauricio Carneiro
bb557266ca
Merge branches to get new RodBinding framework
...
Conflicts:
private/java/src/org/broadinstitute/sting/gatk/walkers/replication_validation/ReplicationValidationWalker.java
2011-08-10 18:23:01 -04:00
Guillermo del Angel
8325cb8c26
Fixing up apparent source control/merge snafu: fix to correctly output PL ordering in multi-allelic sites by UG was only half-committed and hence not working. This completes the fix.
2011-08-10 15:31:49 -04:00
Eric Banks
07ad8c78a9
More tools moved over. Fixed the VariantContextIntegrationTest which was not useful because the md5s were all removed. In the future, instead of removing md5s (putting it in 'parameterization' mode), you should instead use @Test(enabled=false) since it's easier to track.
2011-08-10 14:24:40 -04:00
Eric Banks
8d14d32a62
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-10 13:42:37 -04:00
Eric Banks
749c8bfbcd
Moving more tools over to the new rod system
2011-08-10 13:42:35 -04:00
David Roazen
0497170bc9
SnpEffCodec now implements SelfScopingFeatureCodec so that we no longer have to specify the codec name on the command line for SnpEff files.
2011-08-10 13:12:09 -04:00
David Roazen
577f861f69
Pass the rodBindings into the VariantAnnotator engine, and from there to the
...
annotation classes themselves.
2011-08-10 13:11:57 -04:00
David Roazen
480e7a7984
Correctly initialize the optional SnpEff rod binding in VariantAnnotator using
...
RodBinding.makeUnbound()
2011-08-10 12:25:26 -04:00
Eric Banks
a42f90db11
Moving more tools over to use the standard VC arg collection. Also, while I'm in there, I removed all of the empty references to @Requires given that it's no longer relevant.
2011-08-10 12:20:18 -04:00
Eric Banks
c884b6bf1f
Fixed comment
2011-08-10 12:07:43 -04:00
Eric Banks
06cdc4d5f9
Added a StandardVariantContextInputArgumentCollection that is now used for consistency by many of the core tools.
2011-08-10 12:00:56 -04:00
Ryan Poplin
bc125f104a
TrainingSets class is obsolete now.
2011-08-10 10:23:33 -04:00
Ryan Poplin
c60cf52f73
Updating VQSR for new RodBinding syntax. Cleaning up indel specific parts of VQSR.
2011-08-10 10:20:37 -04:00
Eric Banks
1ea5ec276b
Minor cleanup
2011-08-09 23:28:59 -04:00
Eric Banks
bc2d4f554d
Bringing Indel Realigner up to speed with the new rod binding syntax; now use -known to specify the known indels track.
2011-08-09 23:21:17 -04:00
Eric Banks
b8f572b571
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-09 23:19:51 -04:00
Eric Banks
08631546c8
Partial commit for David so he can see what I want to do with the VariantAnnotator. Added a DbsnpArgumentCollection that people can use in their walkers to ensure that we have a standard syntax whenever allowing dbsnp rods. Added it to UG, but didn't hook it up. Maybe we should do the same for the 'variant' rod?
2011-08-09 23:19:40 -04:00
Mark DePristo
86afe878a7
ReducedRead optimization: single pass likelihood calculation
...
-- Low level add() now takes a nObs argument and rather than += likelihood now does += nObs * likelihood
2011-08-09 20:55:15 -04:00
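The single-pass idea in the ReducedRead commit above can be illustrated with a small accumulator. This is a sketch under assumptions: the class name, the log10 array layout, and the method signature are invented for the example, and only the "+= nObs * likelihood" update comes from the commit message.

```java
final class LikelihoodAccumulator {
    private final double[] log10Likelihoods;

    LikelihoodAccumulator(int nGenotypes) {
        this.log10Likelihoods = new double[nGenotypes];
    }

    /**
     * Adds the per-genotype log10 likelihoods of one observation, weighted by nObs.
     * Because log likelihoods of independent observations add, nObs identical
     * observations contribute nObs * likelihood in a single pass.
     */
    void add(double[] perObservationLog10, int nObs) {
        for (int i = 0; i < log10Likelihoods.length; i++) {
            log10Likelihoods[i] += nObs * perObservationLog10[i];
        }
    }

    double get(int genotype) {
        return log10Likelihoods[genotype];
    }
}
```

Calling add(x, 3) once gives the same totals as calling add(x, 1) three times, without looping over the individual observations.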
Eric Banks
489e5cffc1
Missed a few 'variants'
2011-08-09 14:29:15 -04:00
Eric Banks
b20c4d5286
Thanks to Mark for agreeing to transition from 'variants' back to 'variant'. I think I got them all but I've been jumping all around the code, so there might be a straggler or two.
2011-08-09 12:04:55 -04:00
Eric Banks
78aa6db076
added the 'reference' header line too. We are now header-compliant for vcf4.1.
2011-08-09 11:45:54 -04:00
Eric Banks
ec76bf6d4a
VCF headers now include 'contig' lines describing the name, length, and assembly (when easily parsable) for each contig in the reference.
2011-08-09 11:24:48 -04:00
Eric Banks
7afb5c9f1c
More updates to be consistent with the new rod syntax.
2011-08-09 10:11:37 -04:00
Eric Banks
1e490e0dec
Bringing up to speed with new syntax
2011-08-09 09:26:06 -04:00
Eric Banks
70b3daf689
VariantsToVCF is up and running again; integration tests are reenabled (and added one for dbSNP).
2011-08-09 03:03:43 -04:00
Mauricio Carneiro
d15852be0a
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-09 00:04:59 -04:00
Mauricio Carneiro
2db6225c53
A read filter that sets all mapping qualities to a given value
...
Pacbio has decided to assign 255 to the MQ of all their reads since they claim their aligner does not produce a number equivalent to a mapping quality. Despite much back and forth, they are dead set on not using this field, so if we want to use their bams, we will need to override that. This filter does just that. Replacing all values with a given one. Default is 60.
2011-08-09 00:04:42 -04:00
David Roazen
2efa376619
Made the necessary changes to get SnpEff support working with the new rodbinding system.
2011-08-08 23:29:39 -04:00
David Roazen
b180a1311a
Merge branch 'snpEff'
2011-08-08 22:12:14 -04:00
David Roazen
28d8c8fcbc
Modified the SnpEff integration test to run on a much smaller interval.
2011-08-08 21:51:16 -04:00
David Roazen
a13bc7b929
Added an integration test for the SnpEff annotation support, as well as some extra safety checks and comments.
2011-08-08 20:01:24 -04:00
Mark DePristo
80924d24de
Single positional arguments are now treated as names unless they actually match a tribble feature
2011-08-08 19:26:27 -04:00
Mark DePristo
f8a56bc64b
Merge branch 'master' into rodRefactor
2011-08-08 16:58:18 -04:00
Mark DePristo
f8ad91b16f
Reverting a bunch of bad -B type drops
2011-08-08 16:57:38 -04:00
David Roazen
5e288136e0
Added unit tests for the SnpEff codec, and made minor adjustments to the codec itself.
2011-08-08 16:51:43 -04:00
Eric Banks
d7813db217
Combine Variants was actually outputting invalid VCFs in cases where it was combining Variant Contexts with different alternate alleles: if any of the genotypes had PLs they were no longer valid/correct. Added a check for such cases (the combined VC has more alleles than an original VC) and strip out the PLs when triggered; added integration test to cover it. I also added the check to Select Variants, although it currently doesn't remove unused alleles so it should never trigger. Is there any reason not to strip out unused alleles after a select?
2011-08-08 16:25:35 -04:00
Mark DePristo
383bb6f0e0
Merge branch 'master' into rodRefactor
2011-08-08 15:25:55 -04:00
Mark DePristo
4f8fc0f2f1
VCF3 now dynamically determined
2011-08-08 15:05:47 -04:00
Mark DePristo
ba7353c561
Updated IntegrationTests to use the new type free format for VCF files
2011-08-08 15:04:38 -04:00
Mark DePristo
0810c42309
GATK now does dynamic type determination for VCF files
...
Added UnitTests covering all of the cases.
2011-08-08 14:45:46 -04:00
Mark DePristo
e36994e36b
Refactored a FeatureManager class from RMDTrackBuilder
...
New class handles (vastly more cleanly) the db of tribble codecs, features, and names for use throughout the GATK.
Added SelfScopingFeatureCodec interface that allows a FeatureCodec to examine a file and determine if the file can be parsed. This is the first step towards allowing the GATK to dynamically determine the type of a RodBinding.
2011-08-08 14:04:46 -04:00
Eric Banks
197169e47b
Submitting patch from Larry Singh to make MathUtils compatible with java 1.7
2011-08-08 13:34:04 -04:00
David Roazen
dd974040af
When finding the highest-impact effect at a locus, all effects that are not within a
...
non-coding gene are now considered higher impact than all effects that are within a
non-coding gene.
2011-08-08 13:29:54 -04:00
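The ordering rule from the commit above can be sketched as a two-level comparison. The Impact enum, the Effect type, and the isHigherImpact() helper are hypothetical stand-ins for illustration; only the rule that non-coding-gene effects rank below all other effects comes from the commit message.

```java
final class EffectRanking {
    /** Hypothetical impact scale, ordered from least to most severe. */
    enum Impact { LOW, MODERATE, HIGH }

    /** Hypothetical effect record: its impact and whether it falls within a non-coding gene. */
    static final class Effect {
        final Impact impact;
        final boolean inNonCodingGene;

        Effect(Impact impact, boolean inNonCodingGene) {
            this.impact = impact;
            this.inNonCodingGene = inNonCodingGene;
        }
    }

    /** True when effect a should be preferred over effect b as the higher-impact effect. */
    static boolean isHigherImpact(Effect a, Effect b) {
        // Effects outside non-coding genes always outrank effects inside them.
        if (a.inNonCodingGene != b.inNonCodingGene) {
            return !a.inNonCodingGene;
        }
        // Within the same group, fall back to the impact scale.
        return a.impact.compareTo(b.impact) > 0;
    }
}
```

Under this rule a LOW effect outside a non-coding gene outranks a HIGH effect inside one.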
David Roazen
c1061e994c
Initial support for adding genomic annotations through VariantAnnotator using
...
the output from the SnpEff tool, which replaces the old Genomic Annotator.
2011-08-08 13:29:53 -04:00
Mark DePristo
0db79207e8
Refactored out CommandLineGATK's dependency on javadocs
...
This allows us to run the GATK again in environments without Javadoc loading by default in the classpath
2011-08-08 12:27:13 -04:00
Mark DePristo
e5fde0d16b
Merge branch 'master' into rodRefactor
2011-08-08 10:08:43 -04:00
Mark DePristo
526b524c3c
CombineVariants with new RodBinding. Bugfix
...
-- CombineVariants now uses the new RodBinding syntax, -V / --variants. Passed all integration tests on first run
-- Exposed gaping bug in the List<RodBinding<T>> system now fixed. ParserEngine now has an addRodBinding() that is called by RodBindingArgumentTypeDescriptor when it encounters each RodBinding. This allows the system to work with collection types that are recursively parsed by the system.
2011-08-07 20:16:51 -04:00
Ryan Poplin
6693407bd8
Merged bug fix from Stable into Unstable
2011-08-07 17:39:03 -04:00
Mark DePristo
5f8bc3aa8a
Documenting classes, and name cleanup
2011-08-07 15:17:50 -04:00
Mark DePristo
1c63d43176
Help now points to GATKDocs instead of spitting out full, garbled description
2011-08-07 15:02:46 -04:00
Mark DePristo
1d8b1bae0a
Need to rename the integration test argument -mask to -maskName
2011-08-07 13:32:26 -04:00
Mark DePristo
ece8f0db5e
Added b37dbSNP129, needed for Queue
2011-08-07 11:26:07 -04:00
Mark DePristo
b0e91f85cf
fix merge from Khalid's Queue fix
2011-08-07 10:33:20 -04:00
Mark DePristo
4d88e72958
Merge remote-tracking branch 'remotes/khalid/rodRefactor' into rodRefactor
...
Conflicts:
public/java/src/org/broadinstitute/sting/gatk/walkers/variantutils/SelectVariants.java
public/java/test/org/broadinstitute/sting/BaseTest.java
2011-08-07 10:32:27 -04:00
Khalid Shakir
f049461120
Changed @Argument to @Input on input RodBindings.
...
Changed shortname collision with longname.
Restored scala builds.
Updated HSP to use new syntax.
2011-08-06 20:44:19 -04:00
Mark DePristo
573700d18d
Adding missing import
2011-08-04 21:57:00 -04:00
Mark DePristo
14e43c3382
Final fix to RodBindingUnitTest to reset global counter variable
2011-08-04 21:52:39 -04:00
Mark DePristo
d7f98e5c2a
Fixed merge conflict deleting a {
2011-08-04 18:48:34 -04:00
Mark DePristo
75632abf88
Merge branch 'master' into rodRefactor
...
Conflicts:
public/java/src/org/broadinstitute/sting/gatk/walkers/variantutils/VariantsToVCF.java
public/java/test/org/broadinstitute/sting/gatk/walkers/indels/RealignerTargetCreatorIntegrationTest.java
public/java/test/org/broadinstitute/sting/gatk/walkers/recalibration/RecalibrationWalkersIntegrationTest.java
2011-08-04 18:44:14 -04:00
Mark DePristo
f21f7f6335
SelectVariants fully documented, now the shining example of the new RodBinding system.
2011-08-04 18:28:59 -04:00
Mark DePristo
9308fbe3fb
VariantEval Integration Test parameterized for new novelty stratification
2011-08-04 18:08:47 -04:00
Mark DePristo
9be1ee59cc
TODO comments for Eric
2011-08-04 18:07:50 -04:00
Mauricio Carneiro
b22a3d6508
Functional VCF output.
...
It is outputting a VCF with the 'second best guess' for the alternate allele correctly. Annotations are added at the pool level, but may get overwritten at the lane and site level. Still need to implement the merging of the annotations at higher levels.
2011-08-04 17:49:08 -04:00
Guillermo del Angel
a8eb8c27f0
a) Minor changes to indel consensus scripts to better reflect good default values, b) Fixed up Mills/Devine codec so it always produces correct ref padded bases, and added option to VariantsToVCF to fix reference base
2011-08-04 15:34:49 -04:00
Ryan Poplin
98a96f07c1
Updated standard deviation parameter in VQSR to our current recommended value
2011-08-04 14:06:26 -04:00
Mark DePristo
58a60d4901
Merge branch 'master' into rodRefactor
2011-08-04 12:48:56 -04:00
Eric Banks
e48492f3c3
Validate that the reference padding base for indels is correct.
2011-08-04 12:48:56 -04:00
Mark DePristo
d2078f09b2
Minor fixes to ITs
2011-08-04 12:47:55 -04:00
Eric Banks
f10588420c
Fixing path to dbSNP file as the other one was replaced
2011-08-04 12:36:24 -04:00
Mark DePristo
f0d798d47c
Bug fix: call RodBinding.resetNameCounter() in new ParsingEngine() so that we don't magically misnumber arguments in the integration tests where the GATK is only instantiated once.
2011-08-04 12:06:10 -04:00
Mark DePristo
490ca475fc
Replacing hardcoded dbsnp129 with BaseTest variable
2011-08-03 22:15:22 -04:00
Eric Banks
a831af1166
Another misprint when removing the references to -D
2011-08-03 21:29:21 -04:00
Mark DePristo
d0279bb28c
RodBinding names are now defaulting to the ArgumentTypeDescriptor fullname
...
Nearly all of the tools are passing integrationtests
2011-08-03 20:48:11 -04:00
Mark DePristo
d8f1ebf8c6
Parameterized RecalibrationWalkers with clean unstable database
2011-08-03 20:06:00 -04:00
Mark DePristo
41b3840d26
Took latest VEIT and updated to use dbsnp132 vcf
2011-08-03 18:40:32 -04:00
Mark DePristo
0ef85647f7
A working version of a GATKReportDiffableReader for the diffEngine!
2011-08-03 18:21:18 -04:00
Mark DePristo
acbd3d0922
Fixing up integration tests some more
2011-08-03 17:26:35 -04:00
Mark DePristo
8f696c7731
Continuing progress towards RodBinding 1.0
...
-- Cleaning up old interface to RMDT, docs and contracts added
-- Proper type checking for RodBinding for cases where the Tribble type isn't found or is the wrong type
2011-08-03 17:19:28 -04:00
Mark DePristo
800bb97f0b
Removed getFeaturesAsGATKFeature and created createGenomeLoc(Feature) in genomeLocParser
...
Updated all walkers that used the now deleted methods.
2011-08-03 16:04:51 -04:00
Mark DePristo
f6563c0f9f
Removed support for RMD in @Requires and @Allows
...
Merge as well
Conflicts:
private/java/src/org/broadinstitute/sting/gatk/walkers/qc/TestVariantContextWalker.java
public/java/src/org/broadinstitute/sting/gatk/walkers/phasing/PhaseByTransmission.java
public/java/src/org/broadinstitute/sting/gatk/walkers/variantrecalibration/VariantDataManager.java
public/java/src/org/broadinstitute/sting/gatk/walkers/variantutils/SelectVariants.java
public/java/src/org/broadinstitute/sting/gatk/walkers/variantutils/VariantValidationAssessor.java
public/java/test/org/broadinstitute/sting/gatk/walkers/recalibration/RecalibrationWalkersIntegrationTest.java
public/java/test/org/broadinstitute/sting/gatk/walkers/recalibration/RecalibrationWalkersPerformanceTest.java
public/java/test/org/broadinstitute/sting/gatk/walkers/varianteval/VariantEvalIntegrationTest.java
public/java/test/org/broadinstitute/sting/utils/variantcontext/VariantContextIntegrationTest.java
2011-08-03 15:36:55 -04:00
Mark DePristo
79e4a8f6d3
Merge
...
Conflicts:
private/java/src/org/broadinstitute/sting/gatk/walkers/qc/TestVariantContextWalker.java
public/java/src/org/broadinstitute/sting/gatk/walkers/phasing/PhaseByTransmission.java
public/java/src/org/broadinstitute/sting/gatk/walkers/variantrecalibration/VariantDataManager.java
public/java/src/org/broadinstitute/sting/gatk/walkers/variantutils/SelectVariants.java
public/java/src/org/broadinstitute/sting/gatk/walkers/variantutils/VariantValidationAssessor.java
public/java/test/org/broadinstitute/sting/gatk/walkers/recalibration/RecalibrationWalkersIntegrationTest.java
public/java/test/org/broadinstitute/sting/gatk/walkers/recalibration/RecalibrationWalkersPerformanceTest.java
public/java/test/org/broadinstitute/sting/gatk/walkers/varianteval/VariantEvalIntegrationTest.java
public/java/test/org/broadinstitute/sting/utils/variantcontext/VariantContextIntegrationTest.java
2011-08-03 15:09:47 -04:00
Mark DePristo
38efd3066c
Bug fix for mask RodBinding
2011-08-03 14:58:18 -04:00
Eric Banks
f62f47d476
Not sure why this didn't fail before, but bringing VE up to date with previous changes
2011-08-03 14:27:07 -04:00
Mark DePristo
b25140db83
Contracts and documentation for some of RefMetaDataTracker
...
Continuing to fix integration tests that don't pass / run
2011-08-03 13:34:20 -04:00
Eric Banks
3de10b1ef8
Fixing misprint from Ryan's commit
2011-08-03 12:37:50 -04:00
Eric Banks
db2e0aaa1a
Darn, forgot to update unit tests.
2011-08-03 12:31:08 -04:00
Eric Banks
020b2408a8
Adding integration test for left alignment of indels
2011-08-03 12:19:44 -04:00
Eric Banks
f6648e0144
Don't left-align complex indels because it's too complicated.
2011-08-03 12:03:50 -04:00
Mark DePristo
85c67e9891
Contracts and documentation for Rodbinding
2011-08-03 11:16:06 -04:00
Eric Banks
5dc324ff35
Dealing with merge conflict
2011-08-03 11:03:47 -04:00
Eric Banks
7c89fe01b3
Instead of having the padded reference base be some hackish attribute it is now an actual variable in the Variant Context class. More importantly, we now always require that it be present when padding is necessary - and validate as such upon construction of the VC. This cleans up the interface significantly because we no longer require that a reference base be passed in when writing a VC/VCF record.
2011-08-03 11:00:36 -04:00
Mark DePristo
d9bc673ff2
Fixed bad constructor in RMDTUnitTest
2011-08-03 09:42:43 -04:00
Khalid Shakir
5dcac7b064
GATKReport v0.2:
...
- Floating point column widths are measured correctly
- Using fixed width columns instead of white space separated which allows spaces embedded in cell values
- Legacy support for parsing white space separated v0.1 tables where the columns may not be fixed width
- Enforcing that table descriptions do not contain newlines so that tables can be parsed correctly
Replaced GATKReportTableParser with existing functionality in GATKReport
2011-08-03 00:24:47 -04:00
Mark DePristo
2874835997
Bug fix for type checking RodBindings
...
Now compares the feature class not the codec class.
UnitTests improvements
integrationtests on their way to actually running
2011-08-02 22:25:41 -04:00
Mark DePristo
b5e843f8f0
Approaching the end for the new RodBinding system
...
-- support for explicit naming of bindings (-X:name,type x)
-- support for automatic naming of bindings in lists (-X:vcf foo.vcf -X:vcf bar.vcf will generate internal names X and X2)
-- ParserEngineUnitTest expanded to cover all of the Rodbinding cases
-- RodBindingUnitTest tests all of the low-level accessors
-- Parsing engine throws UserExceptions when bad bindings are provided on the command line
2011-08-02 22:00:06 -04:00
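The automatic naming of list bindings described in the commit above (X, then X2) can be sketched with a per-argument counter. The BindingNamer class and its methods are hypothetical; the continuation to X3 and beyond is an assumption, since the commit message only shows the first two names.

```java
import java.util.HashMap;
import java.util.Map;

final class BindingNamer {
    /** How many bindings have been seen so far for each argument name. */
    private final Map<String, Integer> counts = new HashMap<>();

    /** Returns the bare argument name for the first binding, then name2, name3, and so on. */
    String nextName(String argumentName) {
        int n = counts.merge(argumentName, 1, Integer::sum);
        return n == 1 ? argumentName : argumentName + n;
    }

    /** Forgets all counts, so that a fresh parse starts numbering from the bare name again. */
    void reset() {
        counts.clear();
    }
}
```

A counter like this holds state between parses, so it has to be reset whenever a new parse begins inside the same JVM; otherwise later bindings would be numbered as if the earlier ones were still present.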
David Roazen
d3437e62da
Added a simple utility method Utils.optimumHashSize() to calculate the optimum
...
initial size for a Java hash table (HashMap, HashSet, etc.) given an expected
maximum number of elements. The optimum size is the smallest size that's
guaranteed not to result in any rehash / table-resize operations.
Example Usage:
Map<String, Object> hash = new HashMap<String, Object>(Utils.optimumHashSize(expectedMaxElements));
I think we're paying way too heavy a price in unnecessary rehash operations across
the GATK. If you don't specify an initial size, you get a table of size 16 that gets
completely rehashed and doubles in size every time it becomes 75% full. This means you
do at least twice as much work as you need to in order to populate your table:
(n + n/2 + n/4 + ... + 16) ~= (1 + 1/2 + 1/4 + ...) * n ~= 2 * n
2011-08-02 21:59:06 -04:00
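A minimal sketch of the sizing idea from the Utils.optimumHashSize() commit above. It assumes the default load factor of 0.75 used by java.util.HashMap and HashSet; the HashSizing class name and the exact formula are illustrative assumptions and may differ from the actual GATK method.

```java
final class HashSizing {
    /** Default load factor of java.util.HashMap and java.util.HashSet. */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
     * Hypothetical stand-in for Utils.optimumHashSize(): an initial capacity large enough
     * that inserting maxElements entries never exceeds capacity * load factor, which is
     * the point at which the table would be resized and rehashed.
     */
    static int optimumHashSize(int maxElements) {
        if (maxElements < 0) {
            throw new IllegalArgumentException("maxElements must be non-negative");
        }
        // Divide by the load factor, then add a small margin to absorb truncation.
        return (int) (maxElements / DEFAULT_LOAD_FACTOR) + 2;
    }
}
```

For an expected maximum of 12 elements this yields a capacity of 18, whose resize threshold (18 * 0.75 = 13.5) is above 12, so the table is never rehashed while it is being filled.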
Mark DePristo
83891271b5
--variants throughout integrationtests
2011-08-02 20:28:47 -04:00
Mark DePristo
3a27a25cfc
Validates that the tribble binding provides the right object types at startup
...
Tests to ensure this remains working
2011-08-02 20:11:24 -04:00
Guillermo del Angel
df37716857
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-02 18:27:13 -04:00
Ryan Poplin
b2cde87378
Removing --DBSNP syntax from BQSR integration tests
2011-08-02 15:34:38 -04:00
Mark DePristo
e4a67f3df1
RefMetaDataTracker has complete set of get() functions for List<RodBinding<T>>
...
Including unit tests
2011-08-02 14:28:35 -04:00
Mark DePristo
03741fb640
Merge branch 'master' into rodRefactor
...
Conflicts:
public/java/src/org/broadinstitute/sting/gatk/walkers/annotator/VariantAnnotatorEngine.java
public/java/test/org/broadinstitute/sting/gatk/walkers/indels/IndelRealignerIntegrationTest.java
public/java/test/org/broadinstitute/sting/gatk/walkers/indels/IndelRealignerPerformanceTest.java
public/java/test/org/broadinstitute/sting/utils/variantcontext/VariantContextIntegrationTest.java
2011-08-02 14:21:58 -04:00
Mark DePristo
a366f9a18d
Updating tools to use the RodBinding<T> syntax
2011-08-02 14:05:51 -04:00
Ryan Poplin
c0653514b3
minor update to comment in UG
2011-08-02 13:34:48 -04:00
Ryan Poplin
2ba57bb502
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-02 13:30:46 -04:00
Ryan Poplin
38e4ae4176
minor update to comment in UG
2011-08-02 13:30:38 -04:00
Guillermo del Angel
821bbfa9e0
Bug fixes and enhancements to run whole-genome indel VQSR, removed old chr20-only code and cleanup
2011-08-02 13:17:20 -04:00
Eric Banks
65c5d55b72
Not sure how I missed these. These lines are now superfluous.
2011-08-02 12:48:36 -04:00
Eric Banks
b9d0d2af22
Adding back temporarily removed integration test now that the file permissions have been fixed.
2011-08-02 12:39:11 -04:00
Eric Banks
1c387848de
No more use of -D in the integration tests but instead stick with VCFs only. Since all of these tests were duplicated (one each for dbSNP format and for VCF), we don't actually lose coverage in the integration tests.
2011-08-02 10:39:50 -04:00
Eric Banks
2c5e526eb7
Don't use the mismatch fraction by default in the RealignerTargetCreator (since it's only useful when using SW in the indel realigner). Also, no more use of -D but instead move over to using VCFs. One integration test is temporarily commented out while I wait for a VCF file to get fixed.
2011-08-02 10:34:46 -04:00
Eric Banks
5626199bb6
The Unified Genotyper now does NOT emit SLOD/SB by default; to compute SB use --computeSLOD
2011-08-02 10:14:21 -04:00
Mark DePristo
184030dd56
RefMetaDataTracker no longer automagically converts inputs to VariantContexts
...
This was no longer working properly given that DBSNP indels needed to be moved around. The adaptor system is being refactored and you will need to convert files from X -> VCF for many tools to work.
2011-08-01 15:21:16 -04:00
Mark DePristo
8b1adb8c95
Removed getVariantContext() code
2011-08-01 13:41:09 -04:00
Mark DePristo
f69bff5dd6
Commented out, because these fail the now removed dbSNP conversion.
2011-08-01 13:34:25 -04:00
Eric Banks
3a9b6eacdf
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-01 11:23:18 -04:00
Mark DePristo
7b07c4e04e
RefMetaDataTracker now has get() methods accepting RodBindings
...
RodBinding no longer duplicates the get() methods in RMDT. This is just an object now that connects the command line system to the RMDT.
Updated programs to use new style
Added UnitTests for the RodBinding accessors.
2011-07-30 15:34:11 -04:00
Mark DePristo
a6691ab2fd
List<RodBinding<T>> now working (sort of).
...
At least the argument parsing system tolerates it.
2011-07-29 16:11:22 -04:00
Mark DePristo
6acb4aad3b
RodBinding<T> are properly generic now.
...
VariantContextRodBinding removed, as RodBinding<VariantContext> is the right style now.
2011-07-29 14:37:12 -04:00