gatk-3.8/public/java/test/org/broadinstitute/sting/utils
Mark DePristo 465694078e Major performance improvement to the GATK engine
-- The NanoScheduler timing code (in NSRuntimeProfile) was crazy expensive, but never showed up in the profilers. Removed all of the timing code from the NanoScheduler, the NSRuntimeProfile itself, and updated the unit tests.
-- For tools that largely pass through data quickly, this change reduces runtimes by as much as 10x.  For the RealignerTargetCreator example, the runtime before this commit was 3 hours, and after is 30 minutes (6x improvement).
-- Took this opportunity to improve the GATK ProgressMeter.  NotifyOfProgress now just keeps track of the maximum position seen, and a separate daemon thread ProgressMeterDaemon periodically wakes up and prints the current progress.  This removes all inner loop calls to the GATK timers.
-- The history of the bug started here: http://gatkforums.broadinstitute.org/discussion/comment/2402#Comment_2402
2012-12-05 14:49:22 -05:00
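
The ProgressMeter change described in the commit message above can be illustrated with a minimal sketch. This is not the GATK code: the class name `ProgressMeterSketch`, the polling interval parameter, and the printed message are all hypothetical, chosen only to show the pattern of a cheap inner-loop update paired with a daemon thread that reports periodically.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of the pattern; not the GATK ProgressMeter implementation. */
class ProgressMeterSketch {
    // Largest position reported so far; -1 means nothing has been reported yet.
    private final AtomicLong maxPosition = new AtomicLong(-1);
    private volatile boolean done = false;
    private final Thread daemon;

    ProgressMeterSketch(long pollMillis) {
        // The reporting thread is the only place that sleeps and prints.
        daemon = new Thread(() -> {
            while (!done) {
                try {
                    Thread.sleep(pollMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                System.out.println("progress: max position seen = " + maxPosition.get());
            }
        }, "progress-meter-daemon");
        // A daemon thread does not keep the JVM alive once the main work finishes.
        daemon.setDaemon(true);
    }

    void start() {
        daemon.start();
    }

    /** Inner-loop call: a single atomic max update, with no timer or I/O. */
    void notifyOfProgress(long position) {
        maxPosition.accumulateAndGet(position, Math::max);
    }

    long maxPositionSeen() {
        return maxPosition.get();
    }

    void stop() {
        done = true;
        daemon.interrupt();
    }
}
```

In this sketch the hot path touches only an `AtomicLong`, so the cost of reading clocks and formatting output is paid once per polling interval rather than once per processed record.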
..
R Refactoring/fixing up UG HMM code: a) Make code use PairHMM class instead of having duplicated code. That way UG and HaplotypeCaller now use same core code. Changes to be able to do this: 1. Compute context-dependent GOP as a function of read, not of haplotype, b) Extracted code to initialize HMM arrays into separate method, c) Move PairHMM class and unit test to public, d) Reenable banded code in PairHMM, inverted sense of flag (true=enable feature) but leave off in HaplotypeCaller. 2012-04-17 14:22:48 -04:00
activeregion Bugfix for GSA-647 HaplotypeCaller misses good variant because the active region doesn't trigger for an exome 2012-11-01 15:34:04 -04:00
baq Added checking in the GATK for mis-encoded quality scores. 2012-12-03 11:18:41 -05:00
clipping Updated and more thorough version of the BadCigar read filter 2012-08-17 17:05:27 -04:00
codecs I have pulled out all of the documentation URLs and put them into the HelpUtils class as static variables; this way, Appistry can change links as needed to point commercial users to their own internal forum without having to muck things up all over our source. Added some TODOs for Geraldine to update links in the GATK docs that still point to the old wiki. Sorry that I am pushing into stable, but that's what Appistry is pulling from for their release next week (and unstable has been failing forever). 2012-11-27 10:26:17 -05:00
collections Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
crypt Cleanup of VCF header lines and constants, BCF2 bugfixes 2012-06-21 15:16:31 -04:00
fasta Work on GSA-508 / CachingIndexedFastaReader should internally upper case bases loading data 2012-11-01 15:34:03 -04:00
fragments GATKSAMRecord refactor 2011-11-03 15:43:26 -04:00
interval Refactored parsing of Rod/IntervalBinding. Queue S/G now uses all interval arguments passed to CommandLineGATK QFunctions including support for BED/tribble types, XL, ISR, and padding. 2012-06-27 01:15:22 -04:00
io Public-key authorization scheme to restrict use of NO_ET 2012-03-06 00:09:43 -05:00
nanoScheduler Major performance improvement to the GATK engine 2012-12-05 14:49:22 -05:00
pileup GATKSAMRecord refactor 2011-11-03 15:43:26 -04:00
recalibration The user can now set the maximum allowable cycle on the command-line with --maximum_cycle_value. This value is (now) enforced in the Cycle covariate and a User Error is thrown if the maximum value is passed (with a helpful error message). Added unit tests to cover this new functionality. 2012-11-20 22:41:57 -05:00
report Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
runtime No more hunting down R "resources". As a tradeoff Rscript cannot be specified on the commandline and will be found in the environment path. 2011-10-27 14:17:07 -04:00
sam Quick fix: base qual array in the GATKSAMRecord stores the actual phred values (-33) and not the original bytes (duh). 2012-12-03 12:18:20 -05:00
text Refactoring/fixing up UG HMM code: a) Make code use PairHMM class instead of having duplicated code. That way UG and HaplotypeCaller now use same core code. Changes to be able to do this: 1. Compute context-dependent GOP as a function of read, not of haplotype, b) Extracted code to initialize HMM arrays into separate method, c) Move PairHMM class and unit test to public, d) Reenable banded code in PairHMM, inverted sense of flag (true=enable feature) but leave off in HaplotypeCaller. 2012-04-17 14:22:48 -04:00
threading Disable EfficiencyMonitoringThreadFactoryUnitTest 2012-10-21 12:43:46 -04:00
variantcontext Fix failing unit tests for VariantContextUtilsUnitTest 2012-11-27 14:26:23 -05:00
BaseUtilsUnitTest.java Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
BitSetUtilsUnitTest.java NestedHashMap-based implementation of BQSRv2 along with a few minor optimizations. Not a huge runtime upgrade over the long bitset approach, but it allows us to implement further optimizations going forward. Integration test change because the original version had a bug in the quantized qual table creation. 2012-06-27 16:55:49 -04:00
GenomeLocParserUnitTest.java Fixing parsing of genomelocs that contain colons in the contig names (which is allowed by the spec) as reported on the forum. Added unit test for this case. 2012-11-27 11:00:33 -05:00
GenomeLocSortedSetUnitTest.java Fix for GSA-649: GenomeLocSortedSet.overlaps is crazy slow. Also improved GenomeLocSortedSet.sizeBeforeLoc. 2012-11-27 01:07:00 -05:00
GenomeLocUnitTest.java Bugfix to compareTo and equals in GenomeLoc 2012-08-30 19:41:49 -04:00
HaplotypeUnitTest.java Lots more GGA fixes for the HC now that I understand what's going on internally. Integration tests pass except for the GGA test which I believe now produces better results. 2012-11-20 16:13:29 -05:00
LegacyReservoirDownsamplerUnitTest.java Revert "Separated out the DoC calculations from the XHMM pipeline, so that CalcDepthOfCoverage can be used for calculating joint coverage on a per-base accounting over multiple samples (e.g., family samples)" 2012-09-10 15:52:39 -04:00
MWUnitTest.java A couple of minor things. 2012-09-20 12:48:13 -04:00
MathUtilsUnitTest.java Increasing the precision of MathUtils.approximateLog10SumLog10 from 1E-3 to 1E-4. Genotyper integration tests change as a result. Expanding the unit tests of MathUtils.log10sumLog10. 2012-10-15 13:24:32 -04:00
MedianUnitTest.java Final updates to integration tests for BCF2 2012-05-24 10:58:59 -04:00
PathUtilsUnitTest.java Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
QualityUtilsUnitTest.java Caching log calculations cut the non-Map runtime of HaplotypeCaller in half. Moved the qual log cache used in HC and PairHMM into a common place and added unit tests. 2012-03-21 08:45:42 -04:00
SimpleTimerUnitTest.java Done GSA-539: SimpleTimer should use System.nanoTime for nanoSecond resolution 2012-09-05 15:45:23 -04:00
UtilsUnitTest.java Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00