-- Removed versions getAttribriteAsX(key) that except on not having the value.
-- Removed version that getAttributeAsXNoException(key)
-- The only available assessors are now getAttributeAsX(key, default).
-- This single accessors properly handle their argument types, so if the value is a double it is returned directly for getAttributeAsDouble(), or if it's a string it's converted to a double. If the key isn't found, default is returned.
-- Don't create an empty LinkedHashSet() for PASS fields. Just return Collections.emptySet() instead.
-- For filter fields with actual values, returns an unmodifiableSet instead of one that can be changed
Found this neat little walker Kiran wrote stashed in the private tree. Very useful. Generalized it a bit, added GATKDocs and moved it to public. I might include it as a QC step on the pacbio processing pipeline.
* generalize it so it works with non pair ended reads.
* generalize it to work with no read group information
* getRefCoordSoftUnclippedEnd was not resetting the shift when hitting insertions. Fixed.
* getReadCoordinateForReferenceCoordinateBeforeAlignmentEnd was returning the wrong read coordinate position. Fixed.
The running consensus accepts deletions as long as they are homozygous and (with the current parameters) there is only one read in the pileup. It was generating the cigar string correctly but adding the "D" base to the read bases. Fixed.
The clipper could leave an insertion or deletion as the start or end of a read after hardclipping a read if the element adjacent to the clipping point was an indel. Fixed.
This makes it more in line with the insertion treatment and avoids the following problems:
* consensus reads starting with deletions
* variant regions ending abruptly and turning into one consensus with a long insertion, making it difficult to call the deletion in that location.
Read clipper now identifies and clips even if the requested coordinate is outside the alignment but the read contains soft clipped bases in that region.
* When hard clipping a read that had insertions in it, the insertion was being added to the cigar string's hard clip element. This way, the old UnclippedStart() was being modified and so was the calculation of the new AlignmentStart(). Fixed it by subtracting the number of insertions clipped from the total number of hard clipped bases.
* Walker was sending read instead of filtered read when deleting a read that contains only Q2 bases
* Sliding the window was causing reads that started on the new start position to be entirely clipped.
-- ProfileRodSystem now has a just load index mode, allowing us to optimize the profiler
-- assessFarmNodes R script for making nice plots of performance of jobs on the farm
-- Rev. tribble to use new, optimized index loading (performance win when loading many many indices)