The running consensus accepts deletions as long as they are homozygous and (with the current parameters) there is only one read in the pileup. It was generating the cigar string correctly but was also adding the "D" base to the read bases. Fixed.
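A minimal sketch of the intended behaviour, not the actual consensus code: the function name `build_consensus` and the per-position list of calls are assumptions made for illustration. A deleted position contributes a "D" element to the cigar string but no base to the read.

```python
from itertools import groupby


def build_consensus(calls):
    """Build (cigar, bases) from one consensus call per reference position.

    `calls` is a hypothetical input: a list with one symbol per position,
    where "D" marks a deleted position and anything else is a base.
    """
    # Each position is either a deletion (D) or an aligned base (M).
    ops = ["D" if call == "D" else "M" for call in calls]
    # Run-length encode the operators into a cigar string.
    cigar = "".join(f"{len(list(group))}{op}" for op, group in groupby(ops))
    # Deleted positions appear in the cigar only, never in the read bases.
    bases = "".join(call for call in calls if call != "D")
    return cigar, bases
```

For example, `build_consensus(list("ACDDGT"))` yields the cigar `2M2D2M` with the four bases `ACGT`.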
The clipper could leave an insertion or deletion at the start or end of a read after hard clipping if the element adjacent to the clipping point was an indel. Fixed.
This makes it more in line with the insertion treatment and avoids the following problems:
* consensus reads starting with deletions
* variant regions ending abruptly and turning into one consensus with a long insertion, making it difficult to call the deletion in that location.
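The deletion case can be sketched as follows. This is an illustration under stated assumptions, not the clipper's code: `strip_edge_deletions` is a hypothetical helper, the cigar is modelled as a list of `(length, operator)` tuples, and insertions at the read edges are deliberately left out.

```python
CLIPS = ("H", "S")


def strip_edge_deletions(cigar, alignment_start):
    """Drop deletions that sit at either end of the aligned part of a read.

    Returns the cleaned cigar and the adjusted alignment start. A leading
    deletion consumes reference bases, so removing it moves the start forward.
    """
    elems = list(cigar)
    # Locate the first and last elements that are not clipping operators.
    first = 0
    while first < len(elems) and elems[first][1] in CLIPS:
        first += 1
    last = len(elems) - 1
    while last >= first and elems[last][1] in CLIPS:
        last -= 1
    # Remove leading deletions and advance the alignment start accordingly.
    while first <= last and elems[first][1] == "D":
        alignment_start += elems[first][0]
        del elems[first]
        last -= 1
    # Remove trailing deletions; the alignment start is unaffected.
    while last >= first and elems[last][1] == "D":
        del elems[last]
        last -= 1
    return elems, alignment_start
```

With a clipped read whose cigar is `5H3D10M2D4H` starting at position 100, the sketch returns `5H10M4H` starting at position 103.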
Read clipper now identifies and clips even if the requested coordinate is outside the alignment but the read contains soft clipped bases in that region.
* When hard clipping a read that had insertions in it, the insertion was being added to the cigar string's hard clip element. As a result, the old UnclippedStart() was being modified, and so was the calculation of the new AlignmentStart(). Fixed it by subtracting the number of insertions clipped from the total number of hard clipped bases.
* Walker was sending read instead of filtered read when deleting a read that contains only Q2 bases
* Sliding the window was causing reads that started on the new start position to be entirely clipped.
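The hard clipping arithmetic above can be illustrated with a small sketch. The helper `left_clip_reference_shift` and the tuple-based cigar are assumptions for illustration, not the clipper's API. It computes how far the alignment start moves when a number of read bases is hard clipped from the left: inserted bases are clipped from the read but do not move the start.

```python
def left_clip_reference_shift(cigar, n_read_bases):
    """Return how many reference positions the alignment start advances
    when `n_read_bases` read bases are clipped from the start of the read.

    `cigar` is a list of (length, operator) tuples for the aligned part.
    """
    remaining = n_read_bases
    shift = 0
    for length, op in cigar:
        if remaining == 0:
            break
        if op in ("M", "=", "X"):
            # Aligned bases consume both read and reference.
            used = min(length, remaining)
            remaining -= used
            shift += used
        elif op == "I":
            # Inserted bases consume read only: clipped, but no shift.
            remaining -= min(length, remaining)
        elif op in ("D", "N"):
            # Deleted or skipped reference spanned by the clip shifts the start.
            shift += length
    return shift
```

For a read with cigar `3M2I5M`, clipping 6 read bases from the left removes 2 inserted bases, so the start advances by 6 - 2 = 4 positions.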
-- ProfileRodSystem now has a just load index mode, allowing us to optimize the profiler
-- assessFarmNodes R script for making nice plots of performance of jobs on the farm
-- Rev. tribble to use new, optimized index loading (performance win when loading many indices)
* Sliding window now operates over the cigar string to handle indels correctly
* window slides first, then adds the read.
* fixed consensus generation at the end of a variable region
* Sliding reads no longer keep duplicate information of bases and qualities
* consensus read is now a valid sam record including RG information
* BaseIndex is no longer a private class of BaseCounts as it's a useful utility for other tools in the reduced reads
* Some optimizations to the code in general.
To include documentation for hidden features in the generated GATKDocs,
run with -Dgatkdocs.include.hidden=true
I will enable this flag when bamboo generates GATKDocs for unstable.
Contracts remain disabled for non-test build targets. To enable for
non-test targets, run with -Duse.contracts=true. To disable for test
targets, run with -Duse.contracts=false.
Adding base counts now iterates over the read's cigar string instead of its bases, so that insertions and deletions are handled correctly.
* This commit broke other functionality that was relying on the incorrectly formed base counts.
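A rough sketch of counting over the cigar string rather than over the bases. The function `add_base_counts`, the dictionary of counters, and the choice to skip inserted bases entirely are all assumptions for illustration; the real BaseCounts class may treat insertions differently.

```python
from collections import Counter, defaultdict


def add_base_counts(counts, bases, cigar, alignment_start):
    """Add one read's bases to per-position counts by walking its cigar.

    `counts` maps reference position -> Counter of symbols, `cigar` is a
    list of (length, operator) tuples, and deletions are counted as "D".
    """
    ref_pos = alignment_start
    read_idx = 0
    for length, op in cigar:
        if op in ("M", "=", "X"):
            # Aligned bases: one read base per reference position.
            for i in range(length):
                counts[ref_pos + i][bases[read_idx + i]] += 1
            ref_pos += length
            read_idx += length
        elif op == "D":
            # Deletion: reference positions with no read base.
            for i in range(length):
                counts[ref_pos + i]["D"] += 1
            ref_pos += length
        elif op in ("I", "S"):
            # Insertions and soft clips consume read bases only (skipped here).
            read_idx += length
        elif op == "N":
            ref_pos += length
        # Hard clips and padding consume neither read nor reference.
    return counts


counts = add_base_counts(defaultdict(Counter), "ACGTT",
                         [(2, "M"), (1, "I"), (2, "D"), (2, "M")], 10)
```

Iterating over the bases alone would have placed the inserted G at position 12 and shifted every later base; walking the cigar keeps positions 12 and 13 as deletions and puts the final two bases at 14 and 15.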