gatk-3.8/java
kiran a679bdde18 FindContaminatingReadGroupsWalker lists read groups in a single-sample BAM file that appear to be contaminants by searching for evidence of systematic underperformance at likely homozygous-variant sites.
Procedure:
1. Sites that are likely homozygous-variant but are called as heterozygous are identified.
2. For each site and read group, we compute the proportion of bases in the pileup supporting an alternate allele.
3. A one-sample, left-tailed t-test is performed with the null hypothesis that the alternate-allele fraction has a mean of 0.95 and the alternative hypothesis that the true mean is lower. A read group is reported when the null is rejected at pValue < 1e-9.
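
Steps 2 and 3 can be sketched roughly as follows. This is a minimal illustration, not the walker's actual code: the class name ContaminationSketch, the methods altFraction and tStatistic, and the choice to treat every non-reference base as "supporting an alternate allele" are all assumptions made for the example. The sketch stops at the t statistic; turning it into a left-tailed p-value requires a Student's t CDF with n - 1 degrees of freedom, which would normally come from a statistics library.

```java
final class ContaminationSketch {

    /**
     * Step 2 (assumed form): fraction of bases in one read group's pileup
     * at one site that differ from the reference base.
     */
    static double altFraction(String pileupBases, char refBase) {
        if (pileupBases.isEmpty()) {
            throw new IllegalArgumentException("empty pileup");
        }
        int alt = 0;
        for (char b : pileupBases.toCharArray()) {
            if (Character.toUpperCase(b) != Character.toUpperCase(refBase)) {
                alt++;
            }
        }
        return (double) alt / pileupBases.length();
    }

    /**
     * Step 3, first half: one-sample t statistic for the hypothesized mean mu0,
     * t = (mean - mu0) / (s / sqrt(n)), with s the sample standard deviation.
     * A strongly negative t is evidence that the true mean is below mu0.
     * If all values are identical, s is 0 and the result is infinite or NaN;
     * a real implementation would need to handle that case.
     */
    static double tStatistic(double[] values, double mu0) {
        int n = values.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two sites");
        }
        double mean = 0.0;
        for (double v : values) {
            mean += v;
        }
        mean /= n;

        double sumSq = 0.0;
        for (double v : values) {
            sumSq += (v - mean) * (v - mean);
        }
        double sd = Math.sqrt(sumSq / (n - 1)); // n - 1: unbiased sample variance
        return (mean - mu0) / (sd / Math.sqrt(n));
    }

    public static void main(String[] args) {
        // Hypothetical pileups for one read group at three sites (reference base G).
        double[] fractions = {
            altFraction("AAAAG", 'G'),
            altFraction("AAGGG", 'G'),
            altFraction("AAAGG", 'G')
        };
        double t = tStatistic(fractions, 0.95);
        // The left-tailed p-value would be CDF_t(t; n - 1), compared against 1e-9.
        System.out.println("t = " + t + ", degrees of freedom = " + (fractions.length - 1));
    }
}
```

With a library such as Apache Commons Math on the classpath, the p-value could be obtained from its TDistribution class via cumulativeProbability(t). That dependency is an assumption of this note, not something the commit states.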



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1989 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-08 16:36:39 +00:00
..
config Provide a default logger, some config settings, and some doc updates. 2009-04-29 02:06:05 +00:00
src/org/broadinstitute/sting FindContaminatingReadGroupsWalker lists read groups in a single-sample BAM file that appear to be contaminants by searching for evidence of systematic underperformance at likely homozygous-variant sites. 2009-11-08 16:36:39 +00:00
test/org/broadinstitute/sting Added optional field DP to VCF output for Mark. 2009-11-06 20:03:22 +00:00