BaseRecalibration: don't cache instances of ReadCovariates across reads

Caching and reusing ReadCovariates instances across reads sounds good in theory, but:

- it doesn't work unless you zero out the internal arrays before each read
- the internal arrays must be sized proportionally to the maximum POSSIBLE
  recalibrated read length (5000!!!), instead of the ACTUAL read lengths
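
A minimal sketch of the reuse problem, not the actual ReadCovariates code. ReusedKeyBuffer, fill(), keyAt() and the one-key-per-position layout are hypothetical; only the 5000 cap comes from the message above. It assumes that something downstream reads a position the current read did not write, which is one possible way stale data could leak between reads:

```java
import java.util.Arrays;

/** Hypothetical stand-in for a cached ReadCovariates: one key per read position. */
final class ReusedKeyBuffer {
    // Assumed cap, mirroring the 5000 figure mentioned above.
    static final int MAX_READ_LENGTH = 5000;

    private final int[] keys = new int[MAX_READ_LENGTH];

    /** Writes one key per base; positions at or past readLength are left untouched. */
    void fill(int readLength, int key) {
        for (int i = 0; i < readLength; i++) {
            keys[i] = key;
        }
    }

    /** Zeroes the whole buffer: the cost scales with MAX_READ_LENGTH, not with the read. */
    void clear() {
        Arrays.fill(keys, 0);
    }

    int keyAt(int position) {
        return keys[position];
    }
}
```

With this sketch, calling fill(100, 7) for a long read and then fill(10, 3) for a shorter one leaves keyAt(50) at 7, the previous read's value, until clear() is called.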

By contrast, creating a new instance per read is basically equivalent to doing an
efficient low-level memset-style clear on a much smaller array (since we use the actual
rather than the maximum read length to create it). So this should be faster than caching
instances and calling clear() but slower than caching instances and not calling clear().
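
The per-read alternative can be sketched as follows. PerReadKeys and its methods are hypothetical names, and the [position][covariate] layout is an assumption rather than the real ReadCovariates internals. The point is only that a freshly allocated Java array is already zeroed and is no larger than the read requires:

```java
/** Hypothetical stand-in for ReadCovariates, sized to the actual read. */
final class PerReadKeys {
    private final int[][] keys; // indexed as [position][covariate]

    PerReadKeys(int readLength, int numCovariates) {
        // Java zero-initializes new arrays, so a fresh instance needs no clear().
        this.keys = new int[readLength][numCovariates];
    }

    void set(int position, int covariate, int key) {
        keys[position][covariate] = key;
    }

    int get(int position, int covariate) {
        return keys[position][covariate];
    }

    int length() {
        return keys.length;
    }
}
```

For example, new PerReadKeys(76, 4) allocates and zeroes 76 x 4 ints, whereas clearing a cached instance would touch 5000 x 4 regardless of the read.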

Credit to Ryan for proposing this approach.
David Roazen 2012-10-25 16:16:00 -04:00
parent dde3060bb8
commit 422e16c62e
1 changed files with 1 additions and 21 deletions

@@ -61,17 +61,6 @@ public class BaseRecalibration {
 // qualityScoreByFullCovariateKey[i] = new NestedHashMap();
 // }
-/**
- * Thread local cache to allow multi-threaded use of this class
- */
-private ThreadLocal<ReadCovariates> readCovariatesCache;
-{
-    readCovariatesCache = new ThreadLocal<ReadCovariates> () {
-        @Override protected ReadCovariates initialValue() {
-            return new ReadCovariates(MAXIMUM_RECALIBRATED_READ_LENGTH, requestedCovariates.length);
-        }
-    };
-}
 /**
  * Constructor using a GATK Report file
@@ -113,16 +102,7 @@ public class BaseRecalibration {
 }
 }
 // Compute all covariates for the read
-// TODO -- the need to clear here suggests there's an error in the indexing / assumption code
-// TODO -- for BI and DI. Perhaps due to the indel buffer size on the ends of the reads?
-// TODO -- the output varies depending on whether we clear or not
-//final ReadCovariates readCovariates = readCovariatesCache.get().clear();
-// the original code -- doesn't do any clearing
-final ReadCovariates readCovariates = readCovariatesCache.get();
-RecalUtils.computeCovariates(read, requestedCovariates, readCovariates);
+final ReadCovariates readCovariates = RecalUtils.computeCovariates(read, requestedCovariates);
 for (final EventType errorModel : EventType.values()) { // recalibrate all three quality strings
 if (disableIndelQuals && errorModel != EventType.BASE_SUBSTITUTION) {