gatk-3.8/public/java/test/org/broadinstitute/sting/utils
David Roazen 46edab6d6a Use the new downsampling implementation by default
-Switch back to the old implementation, if needed, with --use_legacy_downsampler

-LocusIteratorByStateExperimental becomes the new LocusIteratorByState, and
the original LocusIteratorByState becomes LegacyLocusIteratorByState

-Similarly, the ExperimentalReadShardBalancer becomes the new ReadShardBalancer,
with the old one renamed to LegacyReadShardBalancer

-Performance improvements: locus traversals used to be 20% slower in the new
downsampling implementation; they now run at roughly the same speed as the old one.

-Tests show a very high level of concordance with UG calls from the previous
implementation, with some new calls and edge cases that still require more examination.

-With the new implementation, -dcov can now be used with ReadWalkers to set a limit
on the maximum number of reads per alignment start position per sample. An appropriate
value for ReadWalker dcov may be in the single digits for some tools, but this too
requires more investigation.
2012-12-10 09:44:50 -05:00
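The per-position read cap described in the commit message above can be illustrated with reservoir sampling, which keeps a bounded, uniformly random subset of a stream of unknown length. The following is a minimal sketch under stated assumptions, not the GATK implementation: the class name `ReservoirSketch` and the method `downsample` are hypothetical, and the commit message does not say which sampling scheme the new downsampler uses internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; this is not a GATK class.
class ReservoirSketch {

    /**
     * Keeps at most {@code cap} items from {@code items}, giving every item
     * the same chance of being retained (classic reservoir sampling).
     * In the setting above, {@code items} would stand for the reads of one
     * sample that share an alignment start position (an assumption).
     */
    static <T> List<T> downsample(List<T> items, int cap, Random rng) {
        if (cap < 0) {
            throw new IllegalArgumentException("cap must be non-negative");
        }
        List<T> reservoir = new ArrayList<>(Math.min(cap, items.size()));
        int seen = 0;
        for (T item : items) {
            seen++;
            if (reservoir.size() < cap) {
                // Fill the reservoir with the first `cap` items.
                reservoir.add(item);
            } else if (cap > 0) {
                // Replace a random slot with probability cap / seen.
                int slot = rng.nextInt(seen); // uniform in [0, seen)
                if (slot < cap) {
                    reservoir.set(slot, item);
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        List<Integer> reads = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            reads.add(i);
        }
        List<Integer> kept = downsample(reads, 5, new Random(42));
        System.out.println(kept.size()); // prints 5
    }
}
```

A single pass with fixed memory is the reason to prefer this scheme when the number of reads at a position is not known in advance; inputs at or below the cap pass through unchanged and in order.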
R Refactoring/fixing up UG HMM code: a) Make code use PairHMM class instead of having duplicated code. That way UG and HaplotypeCaller now use same core code. Changes to be able to do this: 1. Compute context-dependent GOP as a function of read, not of haplotype, b) Extracted code to initialize HMM arrays into separate method, c) Move PairHMM class and unit test to public, d) Reenable banded code in PairHMM, inverted sense of flag (true=enable feature) but leave off in HaplotypeCaller. 2012-04-17 14:22:48 -04:00
activeregion Bugfix for GSA-647 HaplotypeCaller misses good variant because the active region doesn't trigger for an exome 2012-11-01 15:34:04 -04:00
baq Added checking in the GATK for mis-encoded quality scores. 2012-12-03 11:18:41 -05:00
clipping Updated and more thorough version of the BadCigar read filter 2012-08-17 17:05:27 -04:00
codecs I have pulled out all of the documentation URLs and put them into the HelpUtils class as static variables; this way, Appistry can change links as needed to point commercial users to their own internal forum without having to muck things up all over our source. Added some TODOs for Geraldine to update links in the GATK docs that still point to the old wiki. Sorry that I am pushing into stable, but that's what Appistry is pulling from for their release next week (and unstable has been failing forever). 2012-11-27 10:26:17 -05:00
collections Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
crypt Cleanup of VCF header lines and constants, BCF2 bugfixes 2012-06-21 15:16:31 -04:00
fasta Work on GSA-508 / CachingIndexedFastaReader should internally upper case bases loading data 2012-11-01 15:34:03 -04:00
fragments GATKSAMRecord refactor 2011-11-03 15:43:26 -04:00
interval Refactored parsing of Rod/IntervalBinding. Queue S/G now uses all interval arguments passed to CommandLineGATK QFunctions including support for BED/tribble types, XL, ISR, and padding. 2012-06-27 01:15:22 -04:00
io Public-key authorization scheme to restrict use of NO_ET 2012-03-06 00:09:43 -05:00
nanoScheduler Major performance improvement to the GATK engine 2012-12-05 14:49:22 -05:00
pileup GATKSAMRecord refactor 2011-11-03 15:43:26 -04:00
recalibration The user can now set the maximum allowable cycle on the command-line with --maximum_cycle_value. This value is (now) enforced in the Cycle covariate and a User Error is thrown if the maximum value is passed (with a helpful error message). Added unit tests to cover this new functionality. 2012-11-20 22:41:57 -05:00
report Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
runtime No more hunting down R "resources". As a tradeoff Rscript cannot be specified on the commandline and will be found in the environment path. 2011-10-27 14:17:07 -04:00
sam Quick fix: base qual array in the GATKSAMRecord stores the actual phred values (-33) and not the original bytes (duh). 2012-12-03 12:18:20 -05:00
text Refactoring/fixing up UG HMM code: a) Make code use PairHMM class instead of having duplicated code. That way UG and HaplotypeCaller now use same core code. Changes to be able to do this: 1. Compute context-dependent GOP as a function of read, not of haplotype, b) Extracted code to initialize HMM arrays into separate method, c) Move PairHMM class and unit test to public, d) Reenable banded code in PairHMM, inverted sense of flag (true=enable feature) but leave off in HaplotypeCaller. 2012-04-17 14:22:48 -04:00
threading Disable EfficiencyMonitoringThreadFactoryUnitTest 2012-10-21 12:43:46 -04:00
variantcontext Fix failing unit tests for VariantContextUtilsUnitTest 2012-11-27 14:26:23 -05:00
BaseUtilsUnitTest.java Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
BitSetUtilsUnitTest.java NestedHashMap-based implementation of BQSRv2 along with a few minor optimizations. Not a huge runtime upgrade over the long bitset approach, but it allows us to implement further optimizations going forward. Integration test change because the original version had a bug in the quantized qual table creation. 2012-06-27 16:55:49 -04:00
GenomeLocParserUnitTest.java Fixing parsing of genomelocs that contain colons in the contig names (which is allowed by the spec) as reported on the forum. Added unit test for this case. 2012-11-27 11:00:33 -05:00
GenomeLocSortedSetUnitTest.java Fix for GSA-649: GenomeLocSortedSet.overlaps is crazy slow. Also improved GenomeLocSortedSet.sizeBeforeLoc. 2012-11-27 01:07:00 -05:00
GenomeLocUnitTest.java Bugfix to compareTo and equals in GenomeLoc 2012-08-30 19:41:49 -04:00
HaplotypeUnitTest.java Lots more GGA fixes for the HC now that I understand what's going on internally. Integration tests pass except for the GGA test which I believe now produces better results. 2012-11-20 16:13:29 -05:00
LegacyReservoirDownsamplerUnitTest.java Use the new downsampling implementation by default 2012-12-10 09:44:50 -05:00
MWUnitTest.java A couple of minor things. 2012-09-20 12:48:13 -04:00
MathUtilsUnitTest.java Increasing the precision of MathUtils.approximateLog10SumLog10 from 1E-3 to 1E-4. Genotyper integration tests change as a result. Expanding the unit tests of MathUtils.log10sumLog10. 2012-10-15 13:24:32 -04:00
MedianUnitTest.java Final updates to integration tests for BCF2 2012-05-24 10:58:59 -04:00
PathUtilsUnitTest.java Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
QualityUtilsUnitTest.java Caching log calculations cut the non-Map runtime of HaplotypeCaller in half. Moved the qual log cache used in HC and PairHMM into a common place and added unit tests. 2012-03-21 08:45:42 -04:00
SimpleTimerUnitTest.java Done GSA-539: SimpleTimer should use System.nanoTime for nanoSecond resolution 2012-09-05 15:45:23 -04:00
UtilsUnitTest.java Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00