gatk-3.8/public
David Roazen 95b5f99feb Exclude reduced reads from elimination during downsampling
Problem:
-Downsamplers were treating reduced reads the same as normal reads,
 with occasionally catastrophic results on variant calling when an
 entire reduced read happened to get eliminated.

Solution:
-Since reduced reads lack the information we need to do position-based
 downsampling on them, best available option for now is to simply
 exempt all reduced reads from elimination during downsampling.

Details:
-Add generic capability of exempting items from elimination to
 the Downsampler interface via new doNotDiscardItem() method.
 Default inherited version of this method exempts all reduced reads
 (or objects encapsulating reduced reads) from elimination.

-Switch from interfaces to abstract classes to facilitate this change,
 and do some minor refactoring of the Downsampler interface (push
 implementation of some methods into the abstract classes, improve
 names of the confusing clear() and reset() methods).

-Rewrite TAROrderedReadCache. This class was incorrectly relying
 on the ReservoirDownsampler to preserve the relative ordering of
 items in some circumstances, which was behavior not guaranteed by
 the API and only happened to work due to implementation details
 which no longer apply. Restructured this class around the assumption
 that the ReservoirDownsampler will not preserve relative ordering
 at all.

-Add disclaimer to description of -dcov argument explaining that
 coverage targets are approximate goals that will not always be
 precisely met.

-Unit tests for all individual downsamplers to verify that reduced
 reads are exempted from elimination
2013-06-11 16:16:26 -04:00
..
R Updating gsalib for R-3.0 compatibility 2013-05-18 12:43:38 -04:00
c At chartl's request, add the bwa aln -N and bwa aln -m parameters to the bindings. 2012-01-17 14:47:53 -05:00
chainFiles
doc Fixed issues raised by Appistry QA (mostly small fixes, corrections & clarifications to GATKDocs) 2013-03-12 10:57:14 -04:00
java Exclude reduced reads from elimination during downsampling 2013-06-11 16:16:26 -04:00
keys Public-key authorization scheme to restrict use of NO_ET 2012-03-06 00:09:43 -05:00
packages ValidatingPileup was renamed to CheckPileup 2013-02-15 11:56:19 -05:00
perl Split out contig names from Reference .fai file on white space (to support the GATK resource bundle's file human_g1k_v37.fasta.fai.gz, which does not use tab delimiters) 2012-06-07 16:56:32 -04:00
scala Enable convenient display of diff engine output in Bamboo, plus misc. minor test-related improvements 2013-05-10 19:00:33 -04:00
testdata Somehow the index of exampleDBSNP.vcf was missing 2013-05-28 15:29:43 -04:00