gatk-3.8/public
Chris Hartl 1f777c4898 Introducing the latest-and-greatest in genotyping: CalculatePosteriors.
CalculatePosteriors enables the user to calculate genotype posteriors (and set genotypes accordingly) given one or more panels containing allele counts (for instance, calculating NA12878 genotypes based on 1000G EUR frequencies). The uncertainty in allele frequency is modeled by a Dirichlet distribution whose parameters are the observed counts of each allele, and the genotype state is modeled by assuming independent draws of alleles (Hardy-Weinberg Equilibrium). Together these give the Dirichlet-Multinomial distribution over genotypes.
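
As an illustration of the prior described above, here is a minimal Python sketch of the diploid Dirichlet-Multinomial genotype probabilities. It is not the GATK implementation: the function name `diploid_genotype_priors` is hypothetical, and it assumes the raw panel counts are used directly as Dirichlet parameters with no pseudocount added.

```python
def diploid_genotype_priors(allele_counts):
    """Dirichlet-Multinomial probability of each unordered diploid genotype.

    allele_counts: panel allele counts, one per allele (index 0 = reference).
    Assumption of this sketch: counts are used as-is as Dirichlet parameters.
    Returns a dict {(i, j): probability} with i <= j.
    """
    if any(c < 0 for c in allele_counts):
        raise ValueError("allele counts must be non-negative")
    total = float(sum(allele_counts))
    if total <= 0:
        raise ValueError("at least one allele count must be positive")

    # Two draws from a Dirichlet-Multinomial with parameters a_1..a_k, A = sum(a):
    #   P(i, i) = a_i * (a_i + 1) / (A * (A + 1))
    #   P(i, j) = 2 * a_i * a_j   / (A * (A + 1))   for i != j
    denom = total * (total + 1.0)
    priors = {}
    n = len(allele_counts)
    for j in range(n):
        for i in range(j + 1):
            if i == j:
                priors[(i, j)] = allele_counts[i] * (allele_counts[i] + 1.0) / denom
            else:
                priors[(i, j)] = 2.0 * allele_counts[i] * allele_counts[j] / denom
    return priors


if __name__ == "__main__":
    # Panel with 3 reference and 1 alternate allele observed.
    print(diploid_genotype_priors([3, 1]))
```

The probabilities sum to one because the numerators add up to A^2 + A. Compared with plain Hardy-Weinberg proportions at the observed frequency, homozygous genotypes receive slightly more weight, which reflects the uncertainty in the allele frequency.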

Currently this is implemented only for ploidy=2, though it should be straightforward to generalize. In addition, there's an "EM" parameter that currently does nothing but throw an exception. Another extension of this method is to run an EM over the Maximum A-Posteriori (MAP) allele count in the input samples as follows:
 while not converged:
  * AC = [external AC] + [sample AC]
  * Prior = DirichletMultinomial[AC]
  * Posteriors = [sample GL + Prior]
  * sample AC = MLEAC(Posteriors)
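
The loop above could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the (unimplemented) GATK code: the names `log10_genotype_priors` and `em_posteriors` are hypothetical, genotype likelihoods are assumed to be log10-scaled and listed in VCF genotype order, "sample AC" is taken to be the allele count of each sample's most probable genotype, and convergence is taken to mean that this count stops changing.

```python
import math


def log10_genotype_priors(allele_counts):
    """log10 Dirichlet-Multinomial priors for diploid genotypes, in VCF order."""
    total = float(sum(allele_counts))
    if total <= 0:
        raise ValueError("at least one allele count must be positive")
    denom = total * (total + 1.0)
    priors = []
    for j in range(len(allele_counts)):
        for i in range(j + 1):
            if i == j:
                p = allele_counts[i] * (allele_counts[i] + 1.0) / denom
            else:
                p = 2.0 * allele_counts[i] * allele_counts[j] / denom
            priors.append(math.log10(p) if p > 0 else float("-inf"))
    return priors


def em_posteriors(sample_log10_gls, external_ac, max_iter=50):
    """Iterate prior -> posteriors -> sample AC until the sample AC is stable.

    sample_log10_gls: one list of log10 genotype likelihoods per sample.
    external_ac: panel allele counts, one per allele.
    Returns (posteriors on a linear scale, final sample allele counts).
    """
    n_alleles = len(external_ac)
    # Genotypes in VCF order: 0/0, 0/1, 1/1, 0/2, 1/2, 2/2, ...
    genotypes = [(i, j) for j in range(n_alleles) for i in range(j + 1)]
    sample_ac = [0] * n_alleles
    posteriors = []
    for _ in range(max_iter):
        # AC = [external AC] + [sample AC]
        ac = [e + s for e, s in zip(external_ac, sample_ac)]
        # Prior = DirichletMultinomial[AC]
        prior = log10_genotype_priors(ac)
        posteriors = []
        new_ac = [0] * n_alleles
        for gls in sample_log10_gls:
            # Posteriors = [sample GL + Prior], added in log10 space, then normalized
            unnorm = [gl + p for gl, p in zip(gls, prior)]
            peak = max(unnorm)
            linear = [10.0 ** (u - peak) for u in unnorm]
            norm = sum(linear)
            post = [x / norm for x in linear]
            posteriors.append(post)
            # sample AC: count the alleles of the most probable genotype
            best = genotypes[max(range(len(post)), key=post.__getitem__)]
            for allele in best:
                new_ac[allele] += 1
        if new_ac == sample_ac:
            break
        sample_ac = new_ac
    return posteriors, sample_ac


if __name__ == "__main__":
    # One sample with uninformative likelihoods and a panel of 3 ref / 1 alt alleles.
    print(em_posteriors([[0.0, 0.0, 0.0]], [3, 1]))
```

The `max_iter` cap is a safeguard chosen for this sketch, since hard genotype assignments can in principle oscillate rather than settle.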

This is more useful for large callsets with small panels than for small callsets with large panels, the latter being the more common use case.

Fully unit tested.

Reviewer (Eric) jumped in to address many of his own comments plus removed public->protected dependencies.
2013-11-27 13:00:45 -05:00
R Something changed with the ggtitle syntax in the latest version of ggplot2. 2013-08-14 14:40:03 -04:00
c At chartl's request, add the bwa aln -N and bwa aln -m parameters to the bindings. 2012-01-17 14:47:53 -05:00
chainFiles
doc Fixed issues raised by Appistry QA (mostly small fixes, corrections & clarifications to GATKDocs) 2013-03-12 10:57:14 -04:00
java Introducing the latest-and-greatest in genotyping: CalculatePosteriors. 2013-11-27 13:00:45 -05:00
keys Public-key authorization scheme to restrict use of NO_ET 2012-03-06 00:09:43 -05:00
packages ValidatingPileup was renamed to CheckPileup 2013-02-15 11:56:19 -05:00
perl Fixing the liftover script to not require strict VCF header validation. 2013-11-07 09:02:17 -05:00
scala Patched Queue extensions lacking a main class definition 2013-11-22 14:57:09 -05:00
testdata Walker to create a fastq file from an interval list 2013-06-29 11:24:16 -04:00