Mauricio Carneiro
2a4ccfe6fd
Updated all JAVA file licenses accordingly
...
GSATDG-5
2013-01-10 17:06:41 -05:00
Mauricio Carneiro
dd177b1714
Removing fully commented out varianteval evaluators
...
- Files were completely commented out, and were screwing up my license script. Don't like them. Removed them.
GSATDG-5
2013-01-10 17:06:12 -05:00
Chris Hartl
80dec72c53
Merge branch 'master' of gsa2:/humgen/gsa-scr1/chartl/dev/unstable
2013-01-10 14:35:59 -05:00
Chris Hartl
31a5f88c4f
Expanded unit tests to cover the Concordance Metrics class fairly uniformly.
2013-01-10 14:33:47 -05:00
Ryan Poplin
1a18947abf
Adding new command line argument requested on the forum to control the maximum number of haplotypes that are sent forward for genotyping. In the presence of a large degree of heterozygosity the current algorithm breaks down and so this argument would need to be increased.
2013-01-09 15:54:02 -05:00
Ryan Poplin
487fb2afb4
Bug fix for the case of overlapping assembled and partially-assembled events created by the HC. Unfortunately the symbolic allele can't be combined with the indel allele because the reference basis will change.
2013-01-09 15:30:46 -05:00
Chris Hartl
6787f86803
Eliminate the import of DiploidGenotype, which switched public/private underneath me but for some reason didn't stop me from compiling...
2013-01-09 13:23:24 -05:00
Chris Hartl
c1de92b511
Add in some todo items
2013-01-09 13:16:06 -05:00
Chris Hartl
8d126161e2
Merge branch 'master' of gsa2:/humgen/gsa-scr1/chartl/dev/unstable
2013-01-09 13:15:04 -05:00
Eric Banks
3a0dd4b175
Oops, I broke the build. NOW we shouldn't have any more public->protected dependencies.
2013-01-09 11:12:28 -05:00
Eric Banks
a921b06e02
Merge branch 'master' of github.com:broadinstitute/gsa-unstable
2013-01-09 11:06:17 -05:00
Eric Banks
4fa439d89e
Move some classes back to public because they are used in the engine. Move some test classes to protected. We should have no more public->protected dependencies now.
2013-01-09 11:06:10 -05:00
Ryan Poplin
396bce1f28
Reverting this change until we can figure out the right thing to do here.
2013-01-09 10:51:30 -05:00
Eric Banks
676e79542a
Bring CombineVariants back to public since it's used for SG. I needed to break ChromosomeCountConstants out of ChromosomeCounts to make this work.
2013-01-09 10:39:48 -05:00
Ryan Poplin
c87ad8c0ef
Bug fixes related to HC's GGA mode. Tracking just the artificial allele isn't sufficient when there are multiple GGA records that change the reference basis. Also, duplicated records screw up the tracking of merged alleles.
2013-01-09 10:00:46 -05:00
Chris Hartl
ad7c2a08d4
Normalize by the event type counts, not the total genotype counts: more useful normalization.
2013-01-09 09:12:41 -05:00
Chris Hartl
b56754606b
Initial break-out of GenotypeConcordance as a standalone walker. Some basic functionality testing. Currently performs only a pairwise comparison, but is very careful about proper tabulation through the GenotypeType enum.
2013-01-09 00:34:07 -05:00
Eric Banks
264cc9e78d
Resolve protected->public dependencies for BQSR by wrapping the BQSR-specific arguments in a new class.
...
Instead of the GATK Engine creating a new BaseRecalibrator (not clean), it just keeps track of the arguments (clean).
There are still some dependency issues, but it looks like they are related to Ami's code. Need to look into it further.
2013-01-08 16:23:29 -05:00
Eric Banks
ee7d85c6e6
Move around the DiploidGenotype classes (so they can be used by the GATKPaperGenotyper)
2013-01-08 15:53:11 -05:00
Eric Banks
0e2e672521
Merge branch 'master' of github.com:broadinstitute/gsa-unstable
2013-01-08 15:46:39 -05:00
Eric Banks
f0bd1b5ae5
Okay, all public->protected dependencies are gone except for the BQSR arguments. I'll need to think through this but should be able to make that work too.
2013-01-08 15:46:32 -05:00
Tad Jordan
9cbb2b868f
ErrorRatePerCycleIntegrationTest fix
...
-- sorting by row is required
2013-01-08 14:53:07 -05:00
Eric Banks
b099e2b4ae
Moving integration tests to protected
2013-01-08 09:34:08 -05:00
Eric Banks
dfe4cf1301
When merging the PerReadAlleleLikelihoodMap classes, I forgot to initialize the underlying objects. This was causing the LargeScaleTests to fail.
2013-01-08 09:24:12 -05:00
Eric Banks
9e6c2afb28
Not sure why IntelliJ didn't add this for commit like the other dirs
2013-01-07 18:11:07 -05:00
Ami Levy-Moonshine
3787ee6de7
Merge branch 'master' of github.com:broadinstitute/gsa-unstable
2013-01-07 17:07:29 -05:00
Eric Banks
47d030a52d
Oops, move the covariates over too
2013-01-07 15:47:25 -05:00
Eric Banks
35699a8376
Move bqsr utils to protected
2013-01-07 15:41:21 -05:00
Eric Banks
a0219acfaa
Collapse the PerReadAlleleLikelihoodMap classes into 1 now that Lite is gone
2013-01-07 14:55:21 -05:00
Eric Banks
35d9bd377c
Moved (nearly) all Walkers from public to protected and removed GATKLite utils
2013-01-07 14:42:40 -05:00
Ryan Poplin
4f95f850b3
Bug fix in the HC's allele mapping for multi-allelic events. Using the allele alone as a key isn't sufficient because alleles change when the reference allele changes during VariantContextUtils.simpleMerge for multi-allelic events.
2013-01-07 11:05:44 -05:00
Ami Levy-Moonshine
d3c2c97fb2
Merge branch 'master' of github.com:broadinstitute/gsa-unstable
2013-01-06 23:35:47 -05:00
Ami Levy-Moonshine
81eef3aa37
Merge development branches of log-less HMM and FastGatherer to master
2013-01-06 23:01:58 -05:00
Eric Banks
52067f0549
Handle merge conflicts
2013-01-06 12:29:12 -05:00
Chris Hartl
41bc416b65
Remove AAL and update MD5s.
2013-01-04 16:46:14 -05:00
Eric Banks
bce6fce58d
Resolving merge conflicts after Mark's latest push
2013-01-04 14:46:39 -05:00
Eric Banks
dd7f5e2be7
Hooking up the Bayesian estimate code for calculating Qemp in BQSR; various fixes after adding unit tests.
2013-01-04 14:43:11 -05:00
Mark DePristo
bbdf9ee91b
BQSR cleanup: merge Advanced and Standard recalibration engine into just the RecalibrationEngine
...
-- As we are no longer maintaining a public/protected system we need only have one RecalibrationEngine.
-- Misc. code cleanup and docs along the way
2013-01-04 11:39:24 -05:00
Mark DePristo
7df47418d8
BQSR optimization: make RecalibrationTables thread-local, and merge results in onTraversalDone
...
-- With the newer, faster BQSR, scaling was limited by the NestedIntegerArray. The solution to this is to make the entire table thread-local, so that each nct thread has its own data and doesn't have any collisions.
-- Removed the previous partial solution of having a thread-local quality score table
-- Added a new argument -lowMemory
2013-01-04 11:39:24 -05:00
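The thread-local table idea in commit 7df47418d8 can be illustrated with a minimal Java sketch. This is not the BQSR code: `ThreadLocalTables`, `record`, and `merge` are hypothetical names, a flat `long[]` of counts stands in for the real RecalibrationTables, and `merge` plays the role that the commit assigns to onTraversalDone. Each worker thread updates only its own table, so updates never contend, and the tables are summed once after all workers have finished.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of per-thread count tables merged at the end of a run. */
final class ThreadLocalTables {
    // Every table ever handed to a thread, kept so they can be merged later.
    private final List<long[]> allTables = Collections.synchronizedList(new ArrayList<>());
    private final ThreadLocal<long[]> local;
    private final int size;

    ThreadLocalTables(int size) {
        this.size = size;
        // The first call to local.get() on a thread creates and registers that thread's table.
        this.local = ThreadLocal.withInitial(() -> {
            long[] table = new long[size];
            allTables.add(table);
            return table;
        });
    }

    /** Increment a counter in the calling thread's own table; no locking on this path. */
    void record(int key) {
        local.get()[key]++;
    }

    /** Sum all per-thread tables. Call only after the worker threads have finished. */
    long[] merge() {
        long[] merged = new long[size];
        synchronized (allTables) {
            for (long[] table : allTables) {
                for (int i = 0; i < size; i++) {
                    merged[i] += table[i];
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadLocalTables tables = new ThreadLocalTables(4);
        Thread[] workers = new Thread[3];
        for (int w = 0; w < workers.length; w++) {
            workers[w] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    tables.record(i % 4);
                }
            });
            workers[w].start();
        }
        for (Thread worker : workers) {
            worker.join(); // join() makes each worker's writes visible to this thread
        }
        System.out.println(Arrays.toString(tables.merge()));
    }
}
```

The trade-off is memory: one full table per thread instead of one shared table.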
Chris Hartl
3753209584
One md5sum slipped past in the HC integration test.
2013-01-02 15:09:28 -05:00
Chris Hartl
e1d09ab0db
QD is now divided by the average length of the alternate allele (weighted by the allele count). The average length is stored in a related annotation, "AAL", which can be used to re-compute the "old" QD by simple multiplication. Integration tests *should* all pass.
2013-01-02 14:41:29 -05:00
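The arithmetic described in commit e1d09ab0db can be sketched as follows. This is an illustration, not the GATK annotation code: the class and method names are hypothetical, and allele lengths are taken as given inputs because the commit message does not say how length is measured. The sketch computes the allele-count-weighted average alternate allele length (the "AAL" value), divides QD by it, and recovers the old QD by multiplying back.

```java
/** Hypothetical sketch of length-normalized QD; names are not from GATK. */
final class LengthNormalizedQd {

    /** Average alternate allele length, weighted by each allele's count. */
    static double averageAltLength(int[] altLengths, int[] alleleCounts) {
        if (altLengths.length != alleleCounts.length) {
            throw new IllegalArgumentException("one allele count is needed per alternate allele");
        }
        long weightedSum = 0;
        long totalCount = 0;
        for (int i = 0; i < altLengths.length; i++) {
            weightedSum += (long) altLengths[i] * alleleCounts[i];
            totalCount += alleleCounts[i];
        }
        if (totalCount == 0) {
            throw new IllegalArgumentException("no alternate allele observations");
        }
        return (double) weightedSum / totalCount;
    }

    /** New QD: the old QD divided by the average alternate allele length. */
    static double normalizedQd(double oldQd, double averageAltLength) {
        return oldQd / averageAltLength;
    }

    /** Old QD, recovered from the new QD by multiplying by the stored average length. */
    static double recoverOldQd(double newQd, double averageAltLength) {
        return newQd * averageAltLength;
    }

    public static void main(String[] args) {
        int[] lengths = {1, 4}; // two alternate alleles of length 1 and 4
        int[] counts = {3, 1};  // seen 3 times and 1 time
        double aal = averageAltLength(lengths, counts); // (1*3 + 4*1) / 4 = 1.75
        double newQd = normalizedQd(21.0, aal);         // 21.0 / 1.75 = 12.0
        System.out.println(aal + " " + newQd + " " + recoverOldQd(newQd, aal));
    }
}
```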
Eric Banks
275575462f
Protect against non-standard ref bases. Ryan, please review.
2012-12-26 15:46:21 -05:00
Mark DePristo
7bf1f67273
BQSR optimization: read group x quality score calibration table is thread-local
...
-- AdvancedRecalibrationEngine now uses a thread-local table for the quality score table, and in finalizeData merges these thread-local tables into the final table. Radically reduces the contention for RecalDatum in this very highly used table
-- Refactored the utility function to combine two tables into RecalUtils, and created UnitTests for this function, as well as all of RecalibrationTables. Updated combine in RecalibrationReport to use this table combiner function
-- Made several core functions in RecalDatum into final methods for performance
-- Added RecalibrationTestUtils, a home for recalibration testing utilities
2012-12-24 13:35:58 -05:00
Mark DePristo
0f0188ddb1
Optimization of BQSR
...
-- Created a ReadRecalibrationInfo class that holds all of the information (read, base quality vectors, error vectors) for a read for the call to updateDataForRead in RecalibrationEngine. This object has a restrictive interface to just get information about specific qual and error values at offset and for event type. This restriction allows us to avoid creating a vector of byte 45 for each read to represent BI and BD values not in the reads. Shaves 5% off the runtime of the entire code.
-- Cleaned up code and added lots more docs
-- With this commit we no longer have much in the way of low-hanging fruit left in the optimization of BQSR. 95% of the runtime is spent in BAQing the read, and updating the RecalData in the NestedIntegerArrays.
2012-12-24 13:35:09 -05:00
Ami Levy-Moonshine
6590039bc3
Add fast gather to UG; change UG to work with log-less HMM (work in progress)
2012-12-20 14:58:57 -05:00
Ryan Poplin
c8cd6ac465
Merge branch 'master' of github.com:broadinstitute/gsa-unstable
2012-12-20 14:58:04 -05:00
Ryan Poplin
a098888f4d
Updating missed UG md5
2012-12-20 14:57:53 -05:00
Tad Jordan
b491c177ff
Added functionality of outputting sorted GATKReport Tables
...
- Added an optional argument to BaseRecalibrator to produce sorted GATKReport Tables
- Modified BQSR Integration Tests to include the optional argument. Tests now produce sorted tables
2012-12-20 14:02:21 -05:00
Ryan Poplin
54e5c84018
Merge branch 'master' of github.com:broadinstitute/gsa-unstable
2012-12-19 11:31:40 -05:00
Ryan Poplin
aa39037be8
updating UG integration tests.
2012-12-19 11:31:35 -05:00