Eric Banks
687df2341d
Merged bug fix from Stable into Unstable
2012-08-01 10:27:15 -04:00
Eric Banks
05bf6e3726
Updating md5s in pipeline tests so that they finally pass
2012-08-01 10:27:00 -04:00
Eric Banks
38e5419b11
Merged bug fix from Stable into Unstable
2012-08-01 09:50:31 -04:00
Eric Banks
56f8afab97
Requested by Geraldine: adding a utility to register deprecated walkers (and the major version of the first release since they were removed) so that the User Error printed out for e.g. CountCovariates now states: Walker CountCovariates is no longer available in the GATK; it has been deprecated since version 2.0.
2012-08-01 09:50:00 -04:00
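The deprecated-walker registry described in commit 56f8afab97 could be sketched roughly as below. This is only an illustration under assumptions: the class `DeprecatedWalkerRegistry` and its methods are hypothetical names, not the GATK's actual utility; only the wording of the message is taken from the commit text.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry; the names here are illustrative, not from the GATK source.
final class DeprecatedWalkerRegistry {
    // Maps a removed walker's name to the major version of the first release without it.
    private static final Map<String, String> DEPRECATED = new HashMap<>();

    static {
        DEPRECATED.put("CountCovariates", "2.0");
    }

    static boolean isDeprecated(String walkerName) {
        return DEPRECATED.containsKey(walkerName);
    }

    // Builds the user-facing message, or returns null if the walker was never registered.
    static String errorMessage(String walkerName) {
        String version = DEPRECATED.get(walkerName);
        if (version == null) {
            return null;
        }
        return "Walker " + walkerName + " is no longer available in the GATK; "
                + "it has been deprecated since version " + version;
    }
}
```

A caller that fails to resolve a walker name could check `isDeprecated(name)` first and report `errorMessage(name)` instead of a generic "walker not found" error.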
Eric Banks
7cf4b63d76
Disabling indel quals in BaseRecalibrator as it should be, not PrintReads.
2012-08-01 09:23:04 -04:00
Guillermo del Angel
0528337467
Merge branch 'master' of ssh://gsa4.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-07-31 18:17:50 -04:00
Guillermo del Angel
4a23f3cd11
Simple cleanup of pool caller code - since usage is much more general than just calling pools, AF calculation models and GL calculation models are renamed from Pool -> GeneralPloidy. Also, don't have users specify special arguments for -glm and -pnrm. Instead, when running UG with sample ploidy != 2, the correct general ploidy modules are automatically detected and loaded. -glm now reverts to old [SNP|INDEL|BOTH] usage
2012-07-31 16:34:20 -04:00
Eric Banks
6cb10cef96
Fixed older GS reported bug. Actually, the problem really lies in Picard (can't set max records in RAM without it throwing an exception, reported on their JIRA) so I just masked out the problem by removing this never-used argument from this rarely-used tool.
2012-07-31 16:00:36 -04:00
Eric Banks
ab53d73459
Quick fix to user error catching
2012-07-31 15:50:32 -04:00
Eric Banks
10111450aa
Fixed AlignmentUtils bug for handling Ns in the CIGAR string. Added a UG integration test that calls a BAM with such reads (provided by a user on GetSatisfaction).
2012-07-31 15:37:22 -04:00
Eric Banks
fff78ab462
Archiving VQSRv3
2012-07-31 14:34:42 -04:00
Ryan Poplin
4f10386bd4
Del/Ins ratio should really be Ins/Del ratio on the summary page of the variant QC report.
2012-07-31 14:23:36 -04:00
Mark DePristo
f7133ffc31
Cleanup syntax errors from BQSR reorganization
2012-07-31 08:11:05 -04:00
Mark DePristo
762a3d9b50
Move BQSR.R to utils/recalibration in R
2012-07-31 08:11:04 -04:00
Mark DePristo
dad9bb1192
Changes order of writing BaseRecalibrator results so that if R blows up you still get a meaningful tree
2012-07-31 08:11:04 -04:00
Mark DePristo
0c4e729e13
Working version of adaptive context calculations
-- Uses chi2 test for independence to determine if subcontext is worth representing. Gives excellent visual results
-- Writes out analysis output file producing excellent results in R
-- Trivial reformatting of MathUtils
2012-07-31 08:11:04 -04:00
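The chi2 independence test mentioned in commit 0c4e729e13 could look roughly like the sketch below. The commit does not say how the contingency table is built, so the 2x2 layout (errors and non-errors in a subcontext versus the rest of its parent context), the 0.05 significance level, and the names `ContextIndependence`, `chiSquared`, and `worthRepresenting` are all assumptions for illustration.

```java
// Hypothetical helper; the 2x2 layout and the threshold are assumptions, not the commit's code.
final class ContextIndependence {
    // Critical value of the chi-squared distribution with 1 degree of freedom at p = 0.05.
    static final double CRITICAL_95_1DF = 3.841;

    // Pearson chi-squared statistic for the 2x2 table [[a, b], [c, d]].
    static double chiSquared(long a, long b, long c, long d) {
        double n = a + b + c + d;
        double diff = (double) a * d - (double) b * c;
        double denom = (double) (a + b) * (c + d) * (a + c) * (b + d);
        // An empty row or column carries no evidence of dependence.
        return denom == 0.0 ? 0.0 : n * diff * diff / denom;
    }

    // True when the subcontext's error rate differs significantly from the rest.
    static boolean worthRepresenting(long subErrors, long subOk, long restErrors, long restOk) {
        return chiSquared(subErrors, subOk, restErrors, restOk) > CRITICAL_95_1DF;
    }
}
```

For example, `chiSquared(10, 90, 30, 70)` evaluates to 12.5, which exceeds the critical value, whereas two rows with identical error rates give 0 and the subcontext would be merged into its parent.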
Mark DePristo
93640b382e
Preliminary version of adaptive context covariate algorithm
-- Works according to visual inspection of output tree
2012-07-31 08:11:04 -04:00
Mark DePristo
315d25409f
Improvement to RecalDatum and VisualizeContextTree
-- Reorganize functions in RecalDatum so that error rate can be computed independently. Added unit tests. Removed equals() method, which is buggy without its associated implementation of hashCode()
-- New class RecalDatumTree based on QualIntervals that inherits from RecalDatum but includes the concept of sub data
-- VisualizeContextTree now uses RecalDatumTree and can trivially compute the penalty function for merging nodes, which it displays in the graph
2012-07-31 08:11:04 -04:00
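Commit 315d25409f removes an equals() that had no matching hashCode(). The alternative fix, keeping the two methods consistent, can be sketched as follows. The class `Datum` and its fields are hypothetical and are not RecalDatum's actual members.

```java
import java.util.Objects;

// Hypothetical value class; the fields are illustrative, not RecalDatum's actual ones.
final class Datum {
    private final long numObservations;
    private final long numMismatches;

    Datum(long numObservations, long numMismatches) {
        this.numObservations = numObservations;
        this.numMismatches = numMismatches;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Datum)) {
            return false;
        }
        Datum that = (Datum) other;
        return numObservations == that.numObservations
                && numMismatches == that.numMismatches;
    }

    // Without this override, equal Datum objects would inherit identity-based hash
    // codes, and hash-based collections would usually fail to find them.
    @Override
    public int hashCode() {
        return Objects.hash(numObservations, numMismatches);
    }
}
```

With both overrides in place, a `HashSet<Datum>` that contains `new Datum(100, 3)` also reports that it contains a second, separately constructed `new Datum(100, 3)`.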
Mark DePristo
57b45bfb1e
Extensive unit tests, contracts, and documentation for RecalDatum
2012-07-31 08:11:03 -04:00
Mark DePristo
e00ed8bc5e
Cleanup BQSR classes
-- Moved most of the BQSR classes (which are used throughout the codebase) to utils.recalibration. It's better in my opinion to keep commonly used code in utils, and only specialized code in walkers. As code becomes embedded throughout GATK it should be refactored to live in utils
-- Removed unnecessary imports of BQSR in VQSR v3
-- Now ready to refactor QualQuantizer and unit test into a subclass of RecalDatum, refactor unit tests into RecalDatum unit tests, and generalize into hierarchical recal datum that can be used in QualQuantizer and the analysis of adaptive context covariate
-- Update PluginManager to sort the plugins and interfaces. This allows us to have a deterministic order in which the plugin classes come back, which caused BQSR integration tests to temporarily change because I moved my classes around a bit.
2012-07-31 08:11:03 -04:00
Mark DePristo
191294eedc
Initial cleanup of RecalDatum for move and further refactoring
-- Moved Datum, the now unnecessary superclass, into RecalDatum
-- Fixed some obviously dangerous synchronization errors in RecalDatum, though these may not have caused problems because they may not have been called in parallel mode
2012-07-31 08:11:03 -04:00
Mark DePristo
0670316288
Be clearer that dcov 50 is good for 4x, should use 200 for >30x
2012-07-31 08:11:02 -04:00
Mark DePristo
8db4e787b1
V1 of tool to visualize the quality score information in the context covariates
-- Upgraded jgrapht to latest version (0.8.3)
2012-07-31 08:11:02 -04:00
Mark DePristo
874dbf5b58
Maximum wait for GATK run report upload reduced to 10 seconds
2012-07-31 08:11:02 -04:00
Guillermo del Angel
e6b326c189
Merge branch 'master' of ssh://gsa4.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-07-30 21:32:19 -04:00
Guillermo del Angel
6c9d3ec155
Remerge after changes to allele construction code. More cleanups/fixes to artificial read pileup provider
2012-07-30 21:32:03 -04:00
Ryan Poplin
3dabb90eb0
Updating example active region walker integration test.
2012-07-30 21:26:16 -04:00
Ryan Poplin
c2b57ee444
updating HC integration tests after these changes.
2012-07-30 12:41:40 -04:00
Ryan Poplin
7ed06ee7b9
Updating FindCoveredIntervals to use the changes to the ActiveRegionWalker.
2012-07-30 12:16:27 -04:00
Ryan Poplin
13591b169f
Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-07-30 12:13:24 -04:00
Ryan Poplin
48b9495460
Fixes to the likelihood based LD calculation for deciding when to combine consecutive events.
2012-07-30 12:12:56 -04:00
Ryan Poplin
9002758ede
Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-07-30 12:10:00 -04:00
Ryan Poplin
7a73042cd3
Bug fix for the case of merging two VCs when a deletion deletes the padding base for a consecutive indel. Added unit test to cover this case.
2012-07-30 12:09:23 -04:00
Eric Banks
0b30588d67
Catch yet another class of User Errors
2012-07-30 11:59:56 -04:00
Eric Banks
5743694196
Merged bug fix from Stable into Unstable
2012-07-30 11:35:28 -04:00
Eric Banks
79195b97a3
Adding categories for the remaining uncategorized walkers
2012-07-30 11:35:08 -04:00
Guillermo del Angel
5b9a1af7fe
Intermediate fix for pool GL unit test: a) Fix up artificial read pileup provider to give consistent data. b) Increase downsampling in pool integration tests with reference sample, and shorten MT tests so they don't last too long
2012-07-30 09:56:10 -04:00
Eric Banks
7630c929a7
Re-enabling the unit tests for reverse allele clipping
2012-07-29 22:24:56 -04:00
Eric Banks
b07bf1950b
Adding an integration test for another feature that I snuck in during a previous commit: we now allow lower-case bases in the REF/ALT alleles of a VCF and upper-case them (this had been turned off because the previous version used Strings to do the uppercasing whereas we stick with byte operations now).
2012-07-29 22:19:49 -04:00
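The byte-level upper-casing that commit b07bf1950b refers to can be illustrated with the sketch below. It is an assumption-laden example, not the GATK's code: the class `BaseCase` and the method `toUpperCase` are hypothetical names, and the input is assumed to be ASCII bases.

```java
// Hypothetical helper; not the GATK's actual method.
final class BaseCase {
    // Returns a copy with ASCII lower-case letters upper-cased, using only byte
    // arithmetic (no intermediate String objects).
    static byte[] toUpperCase(byte[] bases) {
        byte[] out = new byte[bases.length];
        for (int i = 0; i < bases.length; i++) {
            byte b = bases[i];
            // 'a' - 'A' is the fixed offset between the two ASCII letter cases.
            out[i] = (b >= 'a' && b <= 'z') ? (byte) (b - ('a' - 'A')) : b;
        }
        return out;
    }
}
```

Bytes outside the range a to z, such as an already upper-case N, pass through unchanged.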
Eric Banks
c4ae9c6cfb
With the new Allele representation we can finally handle complex events (because they aren't so complex anymore). One place this manifests itself is with the strict VCF validation (ValidateVariants used to skip these events but doesn't anymore) so I've added a new test with complex events to the VV integration test.
2012-07-29 19:22:02 -04:00
Eric Banks
99b15b2b3a
Final checkpoint: all tests pass. Note that there were bugs in the PoolGenotypeLikelihoodsUnitTest that needed fixing and eventually led to my needing to disable one of the tests (with a note for Guillermo to look into it). Also note that while I have moved over the GATK to use the new non-null representation of Alleles, I didn't remove all of the now-superfluous code throughout to do padding checking on merges; we'll need to do this on a subsequent push.
2012-07-29 01:07:59 -04:00
Eric Banks
2b1b00ade5
All integration tests and VC/Allele unit tests are passing
2012-07-27 17:03:49 -04:00
Eric Banks
beb7610195
Resolving merge conflicts
2012-07-27 15:52:02 -04:00
Eric Banks
27e7e11ec0
Allele refactoring checkpoint #3 : all integration tests except for PoolCaller are passing now. Fixed a couple of bugs from old code that popped up during md5 difference review. Added VariantContextUtils.requiresPaddingBase() method for tools that create alleles to use for determining whether or not to add the ref padding base. One of the HaplotypeCaller tests wasn't passing because of RankSumTest differences, so I added a TODO for Ryan to look into this.
2012-07-27 15:48:40 -04:00
Ryan Poplin
22bb4804f0
HaplotypeCaller now uses an excessive number of high quality soft clips as a triggering signal in order to capture both end points of a large deletion in a single active region.
2012-07-27 12:44:02 -04:00
Ryan Poplin
a0890126a8
ActiveRegionWalker's isActive function returns a results object now instead of just a double.
2012-07-27 11:01:39 -04:00
Eric Banks
ef335b6213
Several more walkers have been brought up to use the new Allele representation.
2012-07-27 02:14:25 -04:00
Eric Banks
9e2209694a
Re-enable reverse trimming of alleles in UG engine when sub-selecting alleles after genotyping. UG integration tests now pass.
2012-07-27 00:47:15 -04:00
Eric Banks
baf3e33730
Allele refactoring checkpoint 2: all code finally compiles, AD and STR annotations are fixed, and most of the UG integration tests pass.
2012-07-26 23:27:11 -04:00
Ryan Poplin
35e803e110
Merged bug fix from Stable into Unstable
2012-07-26 14:00:04 -04:00