Commit Graph

319 Commits (35543b9cbab7ff489b2eb8fb3bdbf518f9e3cf0d)

Author SHA1 Message Date
Eric Banks 67fafbb625 Forgot an include 2013-01-11 09:19:46 -05:00
Eric Banks 6bf0cc32f9 When reducing multiple samples it is possible to try to close a region that for a given sample has no reads. Currently we'd NPE. Fixed. 2013-01-11 09:16:19 -05:00
Eric Banks e7906713d9 Moving some random walkers back to public as requested by Mark. Mauricio will the licenses get updated automatically? 2013-01-11 02:03:43 -05:00
Eric Banks 3a51823c2a Clean up imports 2013-01-10 23:35:01 -05:00
Eric Banks e4b7b1955c Forgot to add the note about length normalization to the QD docs 2013-01-10 23:34:06 -05:00
Eric Banks ff5ac986d8 Fix docs for QD 2013-01-10 23:31:46 -05:00
Mauricio Carneiro 2a4ccfe6fd Updated all JAVA file licenses accordingly
GSATDG-5
2013-01-10 17:06:41 -05:00
Mauricio Carneiro dd177b1714 Removing fully commented out varianteval evaluators
- Files were completely commmented out, and were screwing up my license script. Dont like them. Removed them.

GSATDG-5
2013-01-10 17:06:12 -05:00
Chris Hartl 80dec72c53 Merge branch 'master' of gsa2:/humgen/gsa-scr1/chartl/dev/unstable 2013-01-10 14:35:59 -05:00
Chris Hartl 31a5f88c4f Expanded unit tests to cover the Concordance Metrics class fairly uniformly. 2013-01-10 14:33:47 -05:00
Ryan Poplin 1a18947abf Adding new command line argument requested on the forum to control the maximum number of haplotypes that are sent forward for genotyping. In the presence of a large degree of heterozygosity the current algorithm breaks down and so this argument would need to be increased. 2013-01-09 15:54:02 -05:00
Ryan Poplin 487fb2afb4 Bug fix for the case of overlapping assembled and partially-assembled events created by the HC. Unfortunately the symbolic allele can't be combined with the indel allele because the reference basis will change. 2013-01-09 15:30:46 -05:00
Chris Hartl 6787f86803 Eliminate the import of DiploidGenotype, which switched public/private underneath me but for some reason didn't stop me from compiling... 2013-01-09 13:23:24 -05:00
Chris Hartl c1de92b511 Add in some todo items 2013-01-09 13:16:06 -05:00
Chris Hartl 8d126161e2 Merge branch 'master' of gsa2:/humgen/gsa-scr1/chartl/dev/unstable 2013-01-09 13:15:04 -05:00
Eric Banks 3a0dd4b175 Oops, I broke the build. NOW we shouldn't have any more public->protected dependancies. 2013-01-09 11:12:28 -05:00
Eric Banks a921b06e02 Merge branch 'master' of github.com:broadinstitute/gsa-unstable 2013-01-09 11:06:17 -05:00
Eric Banks 4fa439d89e Move some classes back to public because they are used in the engine. Move some test classes to protected. We should have no more public->protected dependancies now 2013-01-09 11:06:10 -05:00
Ryan Poplin 396bce1f28 Reverting this change until we can figure out the right thing to do here. 2013-01-09 10:51:30 -05:00
Eric Banks 676e79542a Bring CombineVariants back to public since it's used for SG. I needed to break ChromosomeCountConstants out of ChromosomeCounts to make this work. 2013-01-09 10:39:48 -05:00
Ryan Poplin c87ad8c0ef Bug fixes related to HC's GGA mode. Tracking just the artificial allele isn't sufficient when there are multiple GGA records that change the reference basis. Also, duplicated records screw up the tracking of merged alleles. 2013-01-09 10:00:46 -05:00
Chris Hartl ad7c2a08d4 Normalize by the event type counts, not the total genotype counts: more useful normalization. 2013-01-09 09:12:41 -05:00
Chris Hartl b56754606b Initial break-out of GenotypeConcordance as a standalone walker. Some basic functionality testing. Currently performs only a pairwise comparison, but is very careful about proper tabulation through the GenotypeType enum. 2013-01-09 00:34:07 -05:00
Eric Banks 264cc9e78d Resolve protected->public dependencies for BQSR by wrapping the BQSR-specific arguments in a new class.
Instead of the GATK Engine creating a new BaseRecalibrator (not clean), it just keeps track of the arguments (clean).

There are still some dependency issues, but it looks like they are related to Ami's code.  Need to look into it further.
2013-01-08 16:23:29 -05:00
Eric Banks ee7d85c6e6 Move around the DiploidGenotype classes (so it can be used by the GATKPaperGenotyper) 2013-01-08 15:53:11 -05:00
Eric Banks 0e2e672521 Merge branch 'master' of github.com:broadinstitute/gsa-unstable 2013-01-08 15:46:39 -05:00
Eric Banks f0bd1b5ae5 Okay, all public->protected dependencies are gone except for the BQSR arguments. I'll need to think through this but should be able to make that work too. 2013-01-08 15:46:32 -05:00
Tad Jordan 9cbb2b868f ErrorRatePerCycleIntegrationTest fix
-- sorting by row is required
2013-01-08 14:53:07 -05:00
Eric Banks dfe4cf1301 When merging the PerReadAlleleLikelihoodMap classes, I forgot to initialize the underlying objects. This was causing the LargeScaleTests to fail. 2013-01-08 09:24:12 -05:00
Eric Banks 9e6c2afb28 Not sure why IntelliJ didn't add this for commit like the other dirs 2013-01-07 18:11:07 -05:00
Ami Levy-Moonshine 3787ee6de7 Merge branch 'master' of github.com:broadinstitute/gsa-unstable 2013-01-07 17:07:29 -05:00
Eric Banks 47d030a52d Oops, move the covariates over too 2013-01-07 15:47:25 -05:00
Eric Banks 35699a8376 Move bqsr utils to protected 2013-01-07 15:41:21 -05:00
Eric Banks a0219acfaa Collapse the PerReadAlleleLikelihoodMap classes into 1 now that Lite is gone 2013-01-07 14:55:21 -05:00
Eric Banks 35d9bd377c Moved (nearly) all Walkers from public to protected and removed GATKLite utils 2013-01-07 14:42:40 -05:00
Ryan Poplin 4f95f850b3 Bug fix in the HC's allele mapping for multi-allelic events. Using the allele alone as a key isn't sufficient because alleles change when the reference allele changes during VariantContextUtils.simpleMerge for multi-allelic events. 2013-01-07 11:05:44 -05:00
Ami Levy-Moonshine d3c2c97fb2 Merge branch 'master' of github.com:broadinstitute/gsa-unstable 2013-01-06 23:35:47 -05:00
Ami Levy-Moonshine 81eef3aa37 merge development branchs of log-less HMM and FastGatherer to master 2013-01-06 23:01:58 -05:00
Mark DePristo bbdf9ee91b BQSR cleanup: merge Advanced and Standard recalibration engine into just the RecalibrationEngine
-- As we are no longer maintaining a public/protected system we need only have one RecalibrationEngine.
-- Misc. code cleanup and docs along the way
2013-01-04 11:39:24 -05:00
Mark DePristo 7df47418d8 BQSR optimization: make RecalibrationTables thread-local, and merge results in onTraversalDone
-- With the newer, faster BQSR, scaling was limited by the NestedIntegerArray.  The solution to this is to make the entire table thread-local, so that each nct thread has its own data and doesn't have any collisions.
-- Removed the previous partial solution of having a thread-local quality score table
-- Added a new argument -lowMemory
2013-01-04 11:39:24 -05:00
Eric Banks 275575462f Protect against non-standard ref bases. Ryan, please review. 2012-12-26 15:46:21 -05:00
Mark DePristo 7bf1f67273 BQSR optimization: read group x quality score calibration table is thread-local
-- AdvancedRecalibrationEngine now uses a thread-local table for the quality score table, and in finalizeData merges these thread-local tables into the final table.  Radically reduces the contention for RecalDatum in this very highly used table
-- Refactored the utility function to combine two tables into RecalUtils, and created UnitTests for this function, as well as all of RecalibrationTables.  Updated combine in RecalibrationReport to use this table combiner function
-- Made several core functions in RecalDatum into final methods for performance
-- Added RecalibrationTestUtils, a home for recalibration testing utilities
2012-12-24 13:35:58 -05:00
Mark DePristo 0f0188ddb1 Optimization of BQSR
-- Created a ReadRecalibrationInfo class that holds all of the information (read, base quality vectors, error vectors) for a read for the call to updateDataForRead in RecalibrationEngine.  This object has a restrictive interface to just get information about specific qual and error values at offset and for event type.  This restrict allows us to avoid creating an vector of byte 45 for each read to represent BI and BD values not in the reads.  Shaves 5% of the runtime off the entire code.
-- Cleaned up code and added lots more docs
-- With this commit we no longer have much in the way of low-hanging fruit left in the optimization of BQSR.  95% of the runtime is spent in BAQing the read, and updating the RecalData in the NestedIntegerArrays.
2012-12-24 13:35:09 -05:00
Ami Levy-Moonshine 6590039bc3 add fast gather to UG; change UG to work with log-lessHMM (work in prograss) 2012-12-20 14:58:57 -05:00
Eric Banks 4a7e0427a3 Pushing the RR bug fix that I puished into unstable into stable, as requested by Tim 2012-12-19 11:47:16 -05:00
Ryan Poplin 54e5c84018 Merge branch 'master' of github.com:broadinstitute/gsa-unstable 2012-12-19 11:31:40 -05:00
Eric Banks 70479cb71d RR bug fix: we were failing when a read started with an insertion just at the edge of the consensus region.
The weird part is that the comments claimed it was doing what it was supposed to, but it didn't actually do it.
Now we maintain the last header element of the consensus (but without bases and quals) if it adjoins an element with an insertion.

Added the user's test file as an integration test.
2012-12-19 10:59:07 -05:00
David Roazen 07b369ca7e Move VCF/BCF2/VariantContext to new standalone org.broadinstitute.variant package
This is an intermediate commit so that there is a record of these changes in our
commit history. Next step is to isolate the test classes as well, and then move
the entire package to the Picard repository and replace it with a jar in our repo.

-Removed all dependencies on org.broadinstitute.sting (still need to do the test classes,
though)

-Had to split some of the utility classes into "GATK-specific" vs generic methods
(eg., GATKVCFUtils vs. VCFUtils)

-Placement of some methods and choice of exception classes to replace the StingExceptions
and UserExceptions may need to be tweaked until everyone is happy, but this can be
done after the move.
2012-12-19 10:25:22 -05:00
Ryan Poplin 902ca7ea70 Merge branch 'master' of github.com:broadinstitute/gsa-unstable 2012-12-18 15:45:33 -05:00
Ryan Poplin b5d590ba92 Based on NA12878 knowledge base experiments updating HC to allow for a much smaller minimum kmer length in the assembly graph. 2012-12-18 15:43:56 -05:00
Mark DePristo a481d006f0 Optimizations for applying BQSR table with PrintReads
-- Cleaned up code in updateDataForRead so that constant values where not computed in inner loops
-- BaseRecalibrator doesn't create it's own fasta index reader, it just piggy backs on the GATK one
-- ReadCovariates <init> now uses a thread local cache for it's int[][][] keys member variable.  This stops us from recreating an expensive array over and over.  In order to make this really work had to update recordValues in ContextCovariate so it writes 0s over base values its skipping because of low quality base clipping.  Previously the values in the ReadCovariates keys were 0 because they were never modified by ContextCovariates. Now these values are actually zero'd out explicitly by the covariates.
2012-12-17 16:47:27 -05:00
Mark DePristo 5ec25797b3 Optimizations for BaseRecalibrator
-- No longer computes at each update the overall read group table.  Now computes this derived table only at the end of the computation, using the ByQual table as input.  Reduces BQSR runtime by 1/3 in my test
2012-12-17 16:47:27 -05:00
Ryan Poplin 98f18b5f9e Changing the HC over to using the non-contamination-downsampled read maps for the purposes of annotations. This behavior now matches the UG. There is a new command line option to go back to the older behavior to explore the differences. 2012-12-17 11:27:44 -05:00
Mauricio Carneiro 5f1afb4136 Fixing an off-by-one clipping error in ReduceReads for reads off the contig
Reads that are soft-clipped off the contig (before the beginning of the contig) were being soft-clipped to position 0 instead of 1 because of an off-by-one issue. Fixed and included in the integration test.
2012-12-13 22:10:11 -05:00
Ryan Poplin 211a6e78ea Further related bug fixes to GGA mode in the HC: some variants (especially MNPs) were causing problems because they don't have to start at the current location to match the allele being genotyped. Fixed. 2012-12-12 14:53:02 -05:00
Mauricio Carneiro 8a115edbaf ReduceReads is now scattered by contig
It's no longer safe to scatter/gather by interval because now we don't hard-clip to the intervals anymore.
2012-12-10 15:25:27 -05:00
Eric Banks bdda63d973 Related bug fixes to GGA mode in the HC: some variants (especially MNPs) were causing problems because they don't have to start at the current location to match the allele being genotyped. Fixed. 2012-12-10 14:47:04 -05:00
Eric Banks 574d5b467f Bug fix for indel HMM: protect against situation where long reads (e.g. Sanger) in a pileup can lead to a read starting after the haplotype end for a given haplotype. 2012-12-09 02:09:34 -05:00
Eric Banks 406adb8d44 The allele biased downsampling should not abort if there's a reduced read. Rather it should always keep the RR and downsample only original reads in the pileup. 2012-12-05 23:15:36 -05:00
Ryan Poplin 156d6a5e0b misc minor bug fixes to GenotypingEngine. 2012-12-03 12:47:35 -05:00
Ryan Poplin 18b002c99c Merge branch 'master' of github.com:broadinstitute/gsa-unstable 2012-12-03 10:08:56 -05:00
Ryan Poplin 1bdf17ef53 Reworking of how the likelihood calculation is organized in the HaplotypeCaller to facilitate the inclusion of per allele downsampling. We now use the downsampling for both the GL calculations and the annotation calculations. 2012-12-02 11:58:32 -05:00
Joel Thibault 198923b597 Add ActiveRegionReadState handling 2012-11-28 13:59:57 -05:00
Mauricio Carneiro 97fd5de260 Merging latest CMI updates with UNSTABLE 2012-11-27 09:08:00 -05:00
Ryan Poplin c3b7dd1374 Misc cleanup in the HaplotypeCaller. Cleaning up unused arguments after recent changes to HC-GenotypingEngine 2012-11-26 12:19:11 -05:00
Eric Banks 937ac7290f Lots more GGA fixes for the HC now that I understand what's going on internally. Integration tests pass except for the GGA test which I believe now produces better results. 2012-11-20 16:13:29 -05:00
Eric Banks f0b8a0228f Quick fix for HC refactoring: when copying over Haplotype objects, make sure to copy over the artificial allele used to create it too. 2012-11-19 09:57:55 -05:00
Eric Banks ff180a8e02 Significant refactoring of the Haplotype Caller to handle problems with GGA. The main fix is that we now maintain a mapping from 'original' allele to 'Smith-Waterman-based' allele so that we no longer need to do a (buggy) matching throughout the calling process. 2012-11-19 09:09:57 -05:00
Mauricio Carneiro 8b749673bc centralize header element removal in reduce reads 2012-11-14 13:59:34 -05:00
Mauricio Carneiro e35fd1c717 Merging CMI-0.5.0 and GATK-2.2 together. 2012-11-14 10:42:03 -05:00
Mauricio Carneiro a079d8d0d1 Breaking the utility to write @PG tags for SAMFileWriters and StingSAMFileWriters 2012-11-14 10:33:22 -05:00
Mauricio Carneiro dba31018f4 Implementation of BySampleSAMFileWriter
ReduceReads now works with the n-way-out capability, splitting by sample.
DEV-27 #resolve #time 3m
2012-11-14 10:33:22 -05:00
Mauricio Carneiro a17cd54b68 Co-Reduction implementation in ReduceReads
ReduceReads now co-reduces bams if they're passed in toghether with multiple -I. Co-reduction forces every variant region in one sample to be a variant region in all samples.
Also:
  * Added integrationtest for co-reduction
  * Fixed bug with new no-recalculation implementation of the marksites object where the last object wasn't being removed after finalizing a variant region (updated MD5's accordingly)

DEV-200 #resolve #time 8m
2012-11-14 10:33:21 -05:00
Eric Banks 0a2dded093 Fixes for bugs uncovered by unit tests 2012-11-06 16:07:40 -08:00
Eric Banks b07106b3a7 Reimplement the allele biased downsampling to be smarter. Now we don't blindly pull n% of reads off of each allele. Instead, we try all possible genotype conformations for the contaminating sample and choose the one that provides the best genotype for the target sample (based heuristically on allele balance). This method allows us to save some of the reads that belong to the target sample, which should make Daniel M happy. Added unit tests to test the biased downsampling functionality. 2012-11-06 14:39:58 -08:00
Mark DePristo 1444cd753b Bugfix for GSA-647 HaplotypeCaller misses good variant because the active region doesn't trigger for an exome
-- The logic for determining active regions was a bit broken in the HC when intervals were used in the system
-- TraverseActiveRegions now uses the AllLocus view, since we always want to see all reference sites, not just those covered.  Simplifies logic of TAR
-- Non-overlapping intervals are always treated as separate objects for determing active / inactive state.  This means that each exon will stand on its own when deciding if it should be active or inactive
-- Misc. cleanup, docs of some TAR infrastructure to make it safer and easier to debug in the future.
-- Committing the SingleExomeCalling script that I used to find this problem, and will continue to use in evaluating calling of a single exome with the HC
-- Make sure to get all of the reads into the set of potentially active reads, even for genomic locations that themselves don't overlap the engine intervals but may have reads that overlap the regions
-- Remove excessively expensive calls to check bases are upper cased in ReferenceContext
-- Update md5s after a lot of manual review and discussion with Ryan
2012-11-01 15:34:04 -04:00
Guillermo del Angel 51a9ce28e1 Merge remote-tracking branch 'unstable/master' into develop 2012-10-31 10:29:48 -04:00
Ryan Poplin 4e661847b2 DelocalizedBaseRecalibrator becomes the BaseRecalibrator. 2012-10-29 12:53:39 -04:00
Eric Banks ed11b7dab2 Fix UG parallelization test 2012-10-26 12:10:44 -04:00
Eric Banks 7a706ed345 Fix some of the broken integration tests 2012-10-26 11:23:44 -04:00
Eric Banks a53e03d525 Do not let reduced reads get removed in the contamination down-sampling 2012-10-26 02:13:04 -04:00
Eric Banks df9e0b7045 Merge branch 'master' of ssh://gsa2/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-10-25 02:49:54 -04:00
Eric Banks 72714ee43e Minor patches to get the contamination down-sampling working for indels. Adding @Hidden logging output for easy debugging. 2012-10-25 02:47:42 -04:00
Eric Banks c6b57fffda Added allele biased down-sampling capabilities to the PerReadAlleleLikelihoodMap object, which means that both the UG and HC can use this functionality. Note that it's only available in protected, so GATK-lite users won't be allowed to enable it. Needs more testing. 2012-10-24 22:52:25 -04:00
Eric Banks 9da7bbf689 Refactoring the PerReadAlleleLikelihoodMap in preparation for adding contntamination downsampling into protected only. 2012-10-24 15:49:07 -04:00
David Roazen 991658acf4 BQSR: use more granular locking for concurrency control
-With this change, BQSR performance scales properly by thread rather
 than gaining nothing from additional threads.
-Benefits are seen when using either -nt (HierarchicalMicroScheduler) or -nct
 (NanoScheduler)
-Removes high-level locks in the recalibration engines and NestedIntegerArray
 in favor of maximally-granular locks on and around manipulation of the leaf
 nodes of the NestedIntegerArray.
-NestedIntegerArray now creates all interior nodes upfront rather than on
 the fly to avoid the need for locking during tree traversals. This uses
 more memory in the initial part of BQSR runs, but the BQSR would eventually
 converge to use this memory anyway over the course of a typical run.

IMPORTANT NOTE: This does not mean it's safe to run the old BaseRecalibrator
walker with multiple threads. The BaseRecalibrator walker is and will never be
thread-safe, as it's a LocusWalker that uses read attributes to track
state information. ONLY the newer DelocalizedBaseRecalibrator can be made
thread-safe (and will hopefully be made so in my subsequent commits). This
commit addresses performance, not correctness.
2012-10-24 15:22:50 -04:00
Ryan Poplin 094db7bf24 We now require at least 10 samples to merge variants into complex events in the HC. Added a new population based bam for the complex event integration test. 2012-10-24 14:07:36 -04:00
Mauricio Carneiro c210b7cde4 Merge GATK repo into CMI-GATK
Bringing in the following relevant changes:
	* Fixes the indel realigner N-Way out null pointer exception DEV-10
	* Optimizations to ReduceReads that bring the run time to 1/3rd.

Conflicts:
	protected/java/src/org/broadinstitute/sting/gatk/walkers/compression/reducereads/SlidingWindow.java

DEV-10 #resolve #time 2m
2012-10-23 10:59:11 -04:00
Mark DePristo 90f59803fd MaxAltAlleles now defaults to 6, no more MaxAltAllelesForIndels
-- Updated StandardCallerArgumentCollection to remove MaxAltAllelesForIndels. Previous argument is deprecated with meaningful doc message for people to use maxAltAlleles
-- All constructores, factory methods, and test builders and their users updated to provide just a single argument
-- Updating MD5s for integration tests that change due to genotyping more alleles
-- Adding more alleles to genotyping results in slight changes in the QUAL value for multi-allelic loci where one or more alleles aren't polymorphic.  That's simply due to the way that alternative hypotheses contribute as reference evidence against each true allele.  The effect can be large (new qual = old qual / 2 in one case here).
-- If we want more precision in our estimates we could decide (Eric, should we discuss?) to actually separately do a discovery phase in the genotyping, eliminate all variants not considered polymorphic, and then do a final round of calling to get the exact QUAL value for only those that are segregating.  This would have the value of having the QUAL stay constant as more alleles are genotyped, at the cost of some code complexity increase and runtime.  Might be worth it through
2012-10-22 13:47:56 -04:00
Eric Banks ccae6a5b92 Fixed the RR bug I (knowingly) introduced last week: turns out we can't trust a context size's worth of data from the previous marking. I think Mauricio warned me about this but I forgot. 2012-10-22 11:48:34 -04:00
Mark DePristo 0fcd358ace Original EXACT model implementation lives, providing another reference (bi-allelic only) EXACT model
-- Potentially a very fast implementation (it's very clean) but restricted to the biallelic case
-- A starting point for future bi-allelic only optimized (logless) or generalized (bi-allelic general ploidy) implementations
-- Added systematic unit tests covering this implementation, and comparing it to others
-- Uncovered a nasty normalization bug in StateTracker that was capping our likelihoods at 0, even after summing up multiple likelihoods, which is just not safe to do and was causing us to lose likelihood in some cases
-- Removed the restriction that a likelihood be <= 0 in StateTracker, and the protection for these cases in GeneralPloidyExactAFCalc which just wasn't right
2012-10-21 12:42:31 -04:00
Mark DePristo 326f429270 Bugfixes to make new AFCalc system pass integrationtests
-- GeneralPloidyExactAFCalc turns -Infinity values into -Double.MAX_VALUE, so our calculations pass unit tests
-- Bugfix for GeneralPloidyGenotypeLikelihoodsCalculationModel, return a null VC when the only allele we get from our final alleles to use method is the reference base
-- Fix calculation of reference posteriors when P(AF == 0) = 0.0 and P(AF == 0) = X for some meaningful value of X.  Added unit test to ensure this behavior is correct
-- Fix horrible sorting bug in IndependentAllelesDiploidExactAFCalc that applied the theta^N priors in the wrong order.  Add contract to ensure this doesn't ever happen again
-- Bugfix in GLBasedSampleSelector, where VCs without any polymorphic alleles were being sent to the exact model
--
2012-10-21 12:42:31 -04:00
Mark DePristo 695cf83675 More docs and contracts for classes in genotyper.afcalc
-- Future protection of the output of GeneralPloidyExactAFCalc, which produces in some cases bad likelihoods (positive values)
2012-10-21 12:42:31 -04:00
Mark DePristo 99c9031cb4 Merge AFCalcResultTracker into StateTracker, cleanup
-- These two classes were really the same, and now they are actually the same!
-- Cleanuped the interfaces, removed duplicate data
-- Added lots of contracts, some of which found numerical issues with GeneralPloidyExactAFCalc (which have been patched over but not fixed)
-- Moved goodProbability and goodProbabilityVector utilities to MathUtils.  Very useful for contracts!
2012-10-21 12:42:31 -04:00
Eric Banks 0616b98551 Not sure why we were setting the UAC variables instead of the simpleUAC ones when that's what we wanted. 2012-10-21 08:26:26 -04:00
Eric Banks 2c624f76c8 Refactoring the Unified (and Standard) Argument Collections because it was really ugly that the subclass had to do all the cloning for the super class. The clone() method is really not recommended best practice in Java anyways, so I changed it so that we use standard overloaded constructors. Confirmed that the Haplotype Caller --help docs do not include UG-specific arguments. 2012-10-20 20:35:54 -04:00
Ryan Poplin a647f1e076 Refactoring the PairHMM util class to allow for multiple implementations which can be specified by the callers via an enum argument. Adding an optimized PairHMM implementation which caches per-read calculations as well as a logless implementation which drastically reduces the runtime of the HMM while also increasing the precision of the result. In the HaplotypeCaller we now lexicographically sort the haplotypes to take maximal benefit of the haplotype offset optimization which only recalculates the HMM matrices after the first differing base in the haplotype. Many thanks to Mauricio for all the initial groundwork for these optimizations. The change to the one HC integration test is in the fourth decimal of HaplotypeScore. 2012-10-20 16:38:18 -04:00
Eric Banks 4622896312 Oops, killed contracts 2012-10-19 13:04:05 -04:00
Eric Banks f7bd4998fc No need for dummy GLs 2012-10-19 12:13:59 -04:00
Eric Banks deca564aef Merge branch 'master' of ssh://gsa2/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-10-19 12:01:49 -04:00