Problem
-------

The caching strategy is incompatible with the current sorting of the haplotypes, rendering the cache nearly useless. Before the PairHMM updates, a lexicographically sorted list of haplotypes did optimize cache use. That stopped being true once we added the initial condition to the first row of the deletion matrix, which depends on the length of the haplotype. Because of that, every time two consecutive haplotypes differ in length, the cache has to be wiped. Lexicographic sorting interleaves haplotypes of different lengths, therefore wasting *tons* of re-compute.

Solution
--------

Very simple: sort the haplotypes by length first, and lexicographically within each length.
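A minimal sketch of such an ordering, assuming haplotype bases are held as `byte[]` (the class and comparator names here are hypothetical, not the actual GATK implementation). Sorting by length first keeps equal-length haplotypes adjacent, so the length-dependent deletion-matrix initial condition stays valid across consecutive entries and the cache is only wiped at length boundaries:

```java
import java.util.Arrays;
import java.util.Comparator;

public class HaplotypeSortSketch {
    // Hypothetical comparator: order haplotypes by length first, then
    // lexicographically by base, so haplotypes of equal length are clustered
    // and the PairHMM cache survives across them.
    static final Comparator<byte[]> BY_LENGTH_THEN_LEX = (a, b) -> {
        if (a.length != b.length) {
            return Integer.compare(a.length, b.length);
        }
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) {
                return Byte.compare(a[i], b[i]);
            }
        }
        return 0;
    };

    public static void main(String[] args) {
        byte[][] haplotypes = {
            "ACGT".getBytes(), "AC".getBytes(), "ACGA".getBytes(), "TG".getBytes()
        };
        Arrays.sort(haplotypes, BY_LENGTH_THEN_LEX);
        for (final byte[] h : haplotypes) {
            System.out.println(new String(h));
        }
        // Equal-length haplotypes end up adjacent: AC, TG, ACGA, ACGT
    }
}
```

With this order, the number of cache wipes drops from once per length change in the list to at most (number of distinct lengths - 1) for the whole batch.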