gatk-3.8/public/java/test/org/broadinstitute/sting/utils
Mark DePristo 9c81f45c9f Phase I commit to get shadowBCFs passing tests
-- The GATK VCFWriter now enforces by default that all INFO, FILTER, and FORMAT fields be properly defined in the header.  This helps avoid some of the low-level errors I saw in SelectVariants.  This behavior can be disabled in the engine with the --allowMissingVCFHeaders argument
-- Fixed broken annotations in TandemRepeat, which were overwriting AD instead of defining RPA
-- Optimizations to VariantEval, removing some obvious low-hanging fruit, all in the subsetting of variants by sample
-- SelectVariants header fixes -- DP for the INFO field was being defined as a FORMAT field, as were the original AC, AF, and AN
-- Performance optimizations in BCF2 codec and writer
    -- using arrays not lists for intermediate data structures
    -- Create once and reuse an array of GenotypeBuilders for the codec, avoiding reallocating this data structure over and over
-- VCFHeader (which needs a complete rewrite, FYI Eric)
    -- Warn about, and fix on the fly, flag values with counts > 0
    -- GenotypeSampleNames are now stored as a List as they are ordered, and the set iteration was slow.  Duplicates are detected once at header creation.
    -- Explicitly track FILTER fields for efficient lookup in their own hashmap
    -- Automatically add PL field when we see a GL field and no PL field
    -- Added get and has methods for INFO, FILTER, and FORMAT fields
-- No longer add AC and AF values to the INFO field when there's no ALT allele
-- Memory efficient comparison of VCF and BCF files for shadow BCF testing.  Now there's no (memory) constraint on the size of the files we can compare
-- Because of VCF's limited floating point resolution we can only use 1 sig digit for comparing doubles between BCF and VCF
2012-06-21 15:16:26 -04:00
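The last point of the commit message (comparing doubles between BCF and VCF at 1 significant digit) can be illustrated with a minimal sketch. This is not the GATK test code: the class name SigDigitCompare and both method names are hypothetical, and rounding both values before comparing is only one possible reading of "1 sig digit".

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

/** Hypothetical helper, not part of GATK: compares doubles after rounding to N significant digits. */
public final class SigDigitCompare {
    private SigDigitCompare() {}

    /** Rounds x to the requested number of significant digits (half-up on the decimal form). */
    static double roundToSigDigits(double x, int digits) {
        // BigDecimal cannot represent NaN or infinities, and zero has no leading digit to round at.
        if (x == 0.0 || Double.isNaN(x) || Double.isInfinite(x)) {
            return x;
        }
        // valueOf goes through Double.toString, so 0.15 is treated as the decimal 0.15.
        return BigDecimal.valueOf(x)
                .round(new MathContext(digits, RoundingMode.HALF_UP))
                .doubleValue();
    }

    /** True when both values agree after rounding; two NaNs are treated as equal. */
    static boolean equalToSigDigits(double a, double b, int digits) {
        if (Double.isNaN(a) || Double.isNaN(b)) {
            return Double.isNaN(a) && Double.isNaN(b);
        }
        return roundToSigDigits(a, digits) == roundToSigDigits(b, digits);
    }
}
```

Round-then-compare is sensitive at rounding boundaries: 0.149 and 0.151 round to 0.1 and 0.2 and therefore compare as different. A relative-tolerance check would avoid that, at the cost of no longer matching the "significant digits" wording.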
..
R Refactoring/fixing up UG HMM code: a) Make code use PairHMM class instead of having duplicated code. That way UG and HaplotypeCaller now use same core code. Changes to be able to do this: 1. Compute context-dependent GOP as a function of read, not of haplotype, b) Extracted code to initialize HMM arrays into separate method, c) Move PairHMM class and unit test to public, d) Reenable banded code in PairHMM, inverted sense of flag (true=enable feature) but leave off in HaplotypeCaller. 2012-04-17 14:22:48 -04:00
activeregion Refactoring/fixing up UG HMM code: a) Make code use PairHMM class instead of having duplicated code. That way UG and HaplotypeCaller now use same core code. Changes to be able to do this: 1. Compute context-dependent GOP as a function of read, not of haplotype, b) Extracted code to initialize HMM arrays into separate method, c) Move PairHMM class and unit test to public, d) Reenable banded code in PairHMM, inverted sense of flag (true=enable feature) but leave off in HaplotypeCaller. 2012-04-17 14:22:48 -04:00
baq Putative fix for BAQ array out of bounds 2011-09-21 11:25:08 -04:00
clipping Refactor on how RR treats soft clips 2012-06-21 14:02:03 -04:00
codecs Optimizations for VCF and BCF2 2012-06-14 16:42:39 -04:00
collections Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
crypt Do not fail tests that require the GATK private key if the user does not have permission to read it 2012-03-06 15:57:02 -05:00
fasta Reduced the number of combinations being tested here, which was overkill 2011-09-01 10:42:43 -04:00
fragments GATKSAMRecord refactor 2011-11-03 15:43:26 -04:00
interval Feature request from Tim that could be useful to all: there's now an --interval_padding argument that specifies how many basepairs to add to each end of the intervals provided with -L. This is particularly useful when you want to run over the exome plus flanks without having to pre-compute the flanks (just use e.g. --interval_padding 50). Added integration test to cover this feature. 2012-06-18 21:36:27 -04:00
io Public-key authorization scheme to restrict use of NO_ET 2012-03-06 00:09:43 -05:00
pileup GATKSAMRecord refactor 2011-11-03 15:43:26 -04:00
recalibration The next round of BQSR optimizations: no more Long[] array creation 2012-06-14 00:05:42 -04:00
report Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
runtime No more hunting down R "resources". As a tradeoff Rscript cannot be specified on the commandline and will be found in the environment path. 2011-10-27 14:17:07 -04:00
sam Bug fix for: https://getsatisfaction.com/gsa/topics/problem_with_indelrealigner_and_l_unmapped 2012-04-27 09:58:38 -04:00
text Refactoring/fixing up UG HMM code: a) Make code use PairHMM class instead of having duplicated code. That way UG and HaplotypeCaller now use same core code. Changes to be able to do this: 1. Compute context-dependent GOP as a function of read, not of haplotype, b) Extracted code to initialize HMM arrays into separate method, c) Move PairHMM class and unit test to public, d) Reenable banded code in PairHMM, inverted sense of flag (true=enable feature) but leave off in HaplotypeCaller. 2012-04-17 14:22:48 -04:00
threading Removed GATK use of distributed parallelism framework. 2011-07-20 16:26:09 -04:00
variantcontext Phase I commit to get shadowBCFs passing tests 2012-06-21 15:16:26 -04:00
BaseUtilsUnitTest.java Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
BitSetUtilsUnitTest.java Oops, forgot to push the unit tests 2012-06-12 11:38:30 -04:00
GenomeLocParserUnitTest.java During flanking interval creation merging overlapping flanks so that on scatter the list doesn't accidentally genotype the same site twice. 2011-11-17 13:56:42 -05:00
GenomeLocSortedSetUnitTest.java Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
GenomeLocUnitTest.java Support for list of known CNVs in VariantEval 2011-11-30 17:05:16 -05:00
HaplotypeUnitTest.java Adding genotype given alleles mode to the HaplotypeCaller. 2012-05-30 15:07:01 -04:00
MWUnitTest.java Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
MathUtilsUnitTest.java minor misc optimizations to PairHMM 2012-04-18 15:02:26 -04:00
MedianUnitTest.java Final updates to integration tests for BCF2 2012-05-24 10:58:59 -04:00
PairHMMUnitTest.java Resolve merge conflicts 2012-04-18 16:25:03 -04:00
PathUtilsUnitTest.java Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
QualityUtilsUnitTest.java Caching log calculations cut the non-Map runtime of HaplotypeCaller in half. Moved the qual log cache used in HC and PairHMM into a common place and added unit tests. 2012-03-21 08:45:42 -04:00
ReservoirDownsamplerUnitTest.java Moving reduced read functionality into GATKSAMRecord 2011-10-21 13:28:05 -04:00
SimpleTimerUnitTest.java The right fix for this test is just to delete it. 2011-11-15 14:53:27 -05:00
UtilsUnitTest.java Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
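Two entries in the listing above describe interval arithmetic: the interval row (padding each -L interval on both ends via --interval_padding) and the GenomeLocParserUnitTest.java row (merging overlapping flanks so the same site is not processed twice). A minimal sketch of that idea follows. It does not use GATK's own classes: Interval, IntervalPadding, and padAndMerge are hypothetical names. It also simplifies in two labeled ways: contigs are sorted lexicographically rather than by sequence dictionary order, and the padded stop is not clamped to the contig length.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration, not GATK code: pad intervals, then merge any that overlap. */
public final class IntervalPadding {
    private IntervalPadding() {}

    /** Hypothetical 1-based closed interval on one contig (a stand-in, not GATK's GenomeLoc). */
    static final class Interval {
        final String contig;
        final int start;
        final int stop;

        Interval(String contig, int start, int stop) {
            this.contig = contig;
            this.start = start;
            this.stop = stop;
        }

        @Override
        public String toString() {
            return contig + ":" + start + "-" + stop;
        }
    }

    /** Adds `padding` bases to both ends of every interval and merges overlapping results. */
    static List<Interval> padAndMerge(List<Interval> intervals, int padding) {
        List<Interval> padded = new ArrayList<>();
        for (Interval i : intervals) {
            // Clamp the start at 1; the stop is left unclamped in this sketch.
            padded.add(new Interval(i.contig, Math.max(1, i.start - padding), i.stop + padding));
        }
        // Assumption: lexicographic contig order is good enough for the illustration.
        padded.sort(Comparator.comparing((Interval i) -> i.contig).thenComparingInt(i -> i.start));

        List<Interval> merged = new ArrayList<>();
        for (Interval cur : padded) {
            Interval last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && last.contig.equals(cur.contig) && cur.start <= last.stop) {
                // Overlap on the same contig: extend the previous interval instead of adding a new one.
                merged.set(merged.size() - 1,
                        new Interval(last.contig, last.start, Math.max(last.stop, cur.stop)));
            } else {
                merged.add(cur);
            }
        }
        return merged;
    }
}
```

With a padding of 50, the hypothetical inputs chr1:100-200 and chr1:250-300 become chr1:50-250 and chr1:200-350, which overlap and collapse into a single chr1:50-350 interval.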