gatk-3.8/protected/java/test/org/broadinstitute/sting/gatk/walkers
Mauricio Carneiro 285ab2ac62 Better caching for the HaplotypeCaller
Problem
-------
The caching strategy is incompatible with the current sorting of the haplotypes, which renders the cache nearly useless.

Before the PairHMM updates, we found that a lexicographically sorted list of haplotypes optimized the use of the cache. That held only until we added the initial condition to the first row of the deletion matrix, which depends on the length of the haplotype. Because of that, the cache has to be wiped every time the haplotypes differ in length. Lexicographic sorting interleaves haplotypes of different lengths, which forces frequent cache wipes and wastes *tons* of recomputation.

Solution
--------
Very simple: sort the haplotypes by LENGTH first, and then in lexicographic order.
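
The ordering described above can be sketched as follows. This is an illustration only, not the code from this commit: haplotypes are modeled as plain strings, and the names `HaplotypeOrderSketch`, `sortForCache`, and `countLengthChanges` are hypothetical. The sketch assumes, per the description above, that each change of length between consecutive haplotypes costs one cache wipe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class HaplotypeOrderSketch {
    // Hypothetical comparator: shorter haplotypes first, ties broken lexicographically.
    static final Comparator<String> LENGTH_THEN_LEXICOGRAPHIC =
            Comparator.comparingInt(String::length)
                      .thenComparing(Comparator.<String>naturalOrder());

    // Returns a sorted copy; the input list is left untouched.
    static List<String> sortForCache(List<String> haplotypes) {
        List<String> sorted = new ArrayList<>(haplotypes);
        sorted.sort(LENGTH_THEN_LEXICOGRAPHIC);
        return sorted;
    }

    // Counts how often the length changes between consecutive haplotypes.
    // Under the assumption stated above, each change means one cache wipe.
    static int countLengthChanges(List<String> ordered) {
        int changes = 0;
        for (int i = 1; i < ordered.size(); i++) {
            if (ordered.get(i).length() != ordered.get(i - 1).length()) {
                changes++;
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        List<String> haplotypes = Arrays.asList("AAGTT", "AAC", "AACGT", "AAG");

        // Purely lexicographic order interleaves the two lengths.
        List<String> lexicographic = new ArrayList<>(haplotypes);
        lexicographic.sort(Comparator.<String>naturalOrder());

        List<String> byLength = sortForCache(haplotypes);

        System.out.println(lexicographic + " length changes: " + countLengthChanges(lexicographic));
        System.out.println(byLength + " length changes: " + countLengthChanges(byLength));
    }
}
```

With these four toy haplotypes, lexicographic order alternates between lengths 3 and 5, while the length-first order groups equal lengths together and changes length only once.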
2013-08-02 01:27:29 -04:00
annotator moved SnpEffUtilUnitTest to public tree 2013-07-30 17:51:40 -04:00
beagle Simpler FILTER and info field encoding for BeagleOutputToVCF 2013-06-14 15:56:13 -04:00
bqsr Removed plots generation from the BaseRecalibration software 2013-06-19 14:47:56 -04:00
compression/reducereads Two reduce reads updates/fixes: 2013-08-01 14:34:59 -04:00
diagnostics Update MD5s and the Diagnose Target scala script 2013-05-13 12:06:17 -04:00
diffengine Fixed issues raised by Appistry QA (mostly small fixes, corrections & clarifications to GATKDocs) 2013-03-12 10:57:14 -04:00
fasta Updated all JAVA file licenses accordingly 2013-01-10 17:06:41 -05:00
filters Don't allow users to specify keys and IDs that contain angle brackets or equals signs (not allowed in VCF spec). 2013-04-05 00:52:32 -04:00
genotyper Last feature request from Reich/Paavo labs: the allSitePLs feature in UG worked but not quite filled requirements. What's needed is the ability to have all 10 PLs for EVERY site, regardless of whether they are variant or not. Previous version only emitted the 10 PLs in reference sites. Problem is that, if all PLs are emitted in all sites and every single site is quad-allelic (only way to have the PLs printed out in a valid way) then the ability to filter variants and to use the INFO fields may be compromised. 2013-07-18 12:54:52 -04:00
haplotypecaller Better caching for the HaplotypeCaller 2013-08-02 01:27:29 -04:00
indels Another fix for the Indel Realigner that arises because of secondary alignments. 2013-06-21 16:59:22 -04:00
phasing Updated all JAVA file licenses accordingly 2013-01-10 17:06:41 -05:00
validation MathUtils.randomSubset() now uses Collections.shuffle() (indirectly, through the other methods 2013-03-29 14:52:10 -04:00
varianteval Move some VCF/VariantContext methods back to the GATK based on feedback 2013-01-29 16:56:55 -05:00
variantrecalibration Automatically order the annotation dimensions in the VQSR by their standard deviation instead of the order they were specified on the command line. 2013-07-26 10:22:43 -04:00
variantutils CombineVariants no longer adds PASS to unfiltered records 2013-05-20 16:53:51 -04:00