gatk-3.8/public/java/test/org/broadinstitute/sting/gatk
Mark DePristo 39e4396de0 New ActiveRegionShardBalancer allows efficient NanoScheduling
-- Previously we used the LocusShardBalancer for the haplotype caller, which meant that TraverseActiveRegions (TAR) saw its shards grouped in 16kb chunks of the genome.  These locus shards are useful when you want to use the HierarchicalMicroScheduler, as they provide fine-grained access to the underlying BAM, but they have two major drawbacks: (1) we have to reset our state in TAR fairly frequently to handle moving across shard boundaries, and (2) with the nano-scheduled TAR we end up blocking at the end of each shard while all of our threads finish processing.
-- This commit changes the system over to using an ActiveRegionShardBalancer, which combines all of the shard data for a single contig into a single combined shard.  This ensures that TAR, and by extension the HaplotypeCaller, gets all of the data on a single contig together, so that the NanoScheduler runs efficiently instead of blocking over and over at shard boundaries.  This simple change allows us to scale efficiently to around 8 threads in the nano scheduler:
  -- See https://www.dropbox.com/s/k7f280pd2zt0lyh/hc_nano_linear_scale.pdf
  -- See https://www.dropbox.com/s/fflpnan802m2906/hc_nano_log_scale.pdf
-- Misc. changes throughout the codebase so we use the ActiveRegionShardBalancer where appropriate.
-- Added unit tests for ActiveRegionShardBalancer to confirm it does the merging as expected.
-- Fix bad toString in FilePointer
2013-05-13 11:09:02 -04:00
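The per-contig merging described in the commit message can be illustrated with a minimal sketch. It is not the actual ActiveRegionShardBalancer code: the class `ContigShardMergeSketch`, the `Span` type, and the `mergeByContig` method are hypothetical names invented for illustration. It also assumes a shard can be reduced to a contig name plus a start/stop interval, whereas the real balancer works on file pointers into the BAM.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ContigShardMergeSketch {
    /** Hypothetical stand-in for one locus shard: a contig name plus a closed interval. */
    static final class Span {
        final String contig;
        final int start;
        final int stop;

        Span(String contig, int start, int stop) {
            this.contig = contig;
            this.start = start;
            this.stop = stop;
        }

        @Override
        public String toString() {
            return contig + ":" + start + "-" + stop;
        }
    }

    /**
     * Collapse many small spans into one combined span per contig.
     * Contigs are emitted in the order in which they were first seen.
     */
    static List<Span> mergeByContig(List<Span> spans) {
        Map<String, Span> merged = new LinkedHashMap<>();
        for (Span s : spans) {
            Span prev = merged.get(s.contig);
            if (prev == null) {
                merged.put(s.contig, s);
            } else {
                // Widen the existing span so it also covers the new one.
                merged.put(s.contig, new Span(s.contig,
                        Math.min(prev.start, s.start),
                        Math.max(prev.stop, s.stop)));
            }
        }
        return new ArrayList<>(merged.values());
    }

    public static void main(String[] args) {
        List<Span> shards = new ArrayList<>();
        shards.add(new Span("chr1", 1, 16000));
        shards.add(new Span("chr1", 16001, 32000));
        shards.add(new Span("chr2", 1, 16000));
        // Two chr1 shards collapse into one; chr2 stays separate.
        System.out.println(mergeByContig(shards));
    }
}
```

With one combined shard per contig, a traversal in this simplified model crosses a shard boundary only when the contig changes. That is the property the commit message credits for the reduced blocking in the nano scheduler.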
..
datasources New ActiveRegionShardBalancer allows efficient NanoScheduling 2013-05-13 11:09:02 -04:00
downsampling Setting the reduce reads count tag was all wrong in a previous commit; fixing. 2013-04-30 13:45:42 -04:00
executive Updated all JAVA file licenses accordingly 2013-01-10 17:06:41 -05:00
filters Added check in the MalformedReadFilter for reads without stored bases (i.e. that use '*'). 2013-03-14 17:17:26 -04:00
iterators Refactor LIBS into utils.locusiterator before refactoring 2013-01-11 15:17:16 -05:00
refdata Detect stuck lock-acquisition calls, and disable file locking for tests 2013-04-24 22:49:02 -04:00
report Updated all JAVA file licenses accordingly 2013-01-10 17:06:41 -05:00
samples Updating TestNG to the latest version 2013-02-22 09:40:23 -05:00
traversals New ActiveRegionShardBalancer allows efficient NanoScheduling 2013-05-13 11:09:02 -04:00
walkers e# This is a combination of 2 commits. 2013-05-03 11:19:14 -04:00
CommandLineGATKUnitTest.java Updated all JAVA file licenses accordingly 2013-01-10 17:06:41 -05:00
EngineFeaturesIntegrationTest.java Further tweaking of test timeouts 2013-03-15 14:49:21 -04:00
GenomeAnalysisEngineUnitTest.java Added the functionality to impose a relative ordering on ReadTransformers in the GATK engine. 2013-03-06 12:38:59 -05:00
MaxRuntimeIntegrationTest.java Bump timeout for MaxRuntimeIntegrationTest 2013-03-17 16:17:29 -04:00
WalkerManagerUnitTest.java Updated all JAVA file licenses accordingly 2013-01-10 17:06:41 -05:00