-- Intermediate commit on the way to archiving SomaticIndelDetector and other tools.
-- SomaticIndelDetector, PairMaker and RemapAlignments tools have been refactored into the private andrey package. All utility classes refactored into here as well. At this point, the SomaticIndelDetector builds in this version of the GATK.
-- Subsequent commit will put this code into the archive so it no longer builds in the GATK
-- Created a separate, limited interface MapResultsQueue object that previously was set to the PriorityBlockingQueue.
-- The MapResultsQueue is now backed by a synchronized ExpandingArrayList, since job ids are integers incrementing from 0 to N. This means we avoid the n log n sort in the priority queue which was generating a lot of cost in the reduce step
-- Had to update ReducerUnitTest because the test itself was brittle, and broken when I changed the underlying code.
-- A few bits of minor code cleanup through the system (removing unused constructors, local variables, etc)
-- ExpandingArrayList called ensureCapacity so that we increase the size of the arraylist once to accommodate the upcoming size needs
-- Refactor calculation so that upfront constant values are pre-computed, and cached, and their values just looked up during application
-- Trivial comment on how we might use BAQ better in BaseRecalibrator
-- No longer computes at each update the overall read group table. Now computes this derived table only at the end of the computation, using the ByQual table as input. Reduces BQSR runtime by 1/3 in my test
It turns out that pre-allocating the entire tree was too expensive in
terms of memory when using large values for the -mcs and -ics parameters.
Pre-allocating the first two dimensions prevents us from ever locking the
root node during a put(). Contention between threads over lower levels
of the tree should be minimal given that puts are rare compared to gets.
Also output dimensions and pre-allocation info at startup. If pre-allocation
takes longer than usual this gives the user a sense of what is causing the
delay.
-With this change, BQSR performance scales properly by thread rather
than gaining nothing from additional threads.
-Benefits are seen when using either -nt (HierarchicalMicroScheduler) or -nct
(NanoScheduler)
-Removes high-level locks in the recalibration engines and NestedIntegerArray
in favor of maximally-granular locks on and around manipulation of the leaf
nodes of the NestedIntegerArray.
-NestedIntegerArray now creates all interior nodes upfront rather than on
the fly to avoid the need for locking during tree traversals. This uses
more memory in the initial part of BQSR runs, but the BQSR would eventually
converge to use this memory anyway over the course of a typical run.
IMPORTANT NOTE: This does not mean it's safe to run the old BaseRecalibrator
walker with multiple threads. The BaseRecalibrator walker is and will never be
thread-safe, as it's a LocusWalker that uses read attributes to track
state information. ONLY the newer DelocalizedBaseRecalibrator can be made
thread-safe (and will hopefully be made so in my subsequent commits). This
commit addresses performance, not correctness.