gatk-3.8/scala/qscript/oneoffs/depristo
depristo abc7d1aef9 BeagleOutputToVCF now accepts an option to keep monomorphic sites. This is useful to genotype a single sample, where having AC=0 just means that the sample is hom-ref at the site.
ProduceBeagleInputWalker can optionally emit a beagle markers file, necessary to use the beagled reference panel for imputation.  Also supports the VQSR calibration curve idea that a site can be flagged as a certain FP, based on the VQSLOD field.  This allows us to have both continuous quality in the refinement of sites as well as hard filtering at some threshold so we don't end up with lots of sites with all 1/3 1/3 1/3 likelihoods for all samples (i.e., a definite FP site where we don't know anything about the samples). 
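
A minimal sketch of the idea described above, not the walker's actual code. It assumes VQSLOD can be treated as a log10 odds score, and that a site's confidence is used to blend each sample's genotype likelihoods toward the flat 1/3 1/3 1/3 vector; both the blending formula and the names `beagle_site_likelihoods` and `certain_fp_cutoff` are hypothetical. Sites below the cutoff are dropped outright rather than emitted with flat likelihoods.

```python
def beagle_site_likelihoods(gls, vqslod, certain_fp_cutoff):
    """Return per-sample genotype likelihoods for one site, or None to drop it.

    gls: list of (hom-ref, het, hom-var) likelihood tuples, one per sample.
    vqslod: the site's VQSLOD value (treated here as log10 odds; an assumption).
    certain_fp_cutoff: hypothetical threshold below which the site is a certain FP.
    """
    # Hard filter: a certain false positive carries no information about
    # any sample, so skip it instead of writing all-1/3 likelihoods.
    if vqslod < certain_fp_cutoff:
        return None

    # Continuous part: convert the odds score to a probability that the
    # site is real, then shrink each likelihood toward the flat value.
    p_true = 1.0 / (1.0 + 10.0 ** (-vqslod))
    flat = 1.0 / 3.0
    return [
        tuple(p_true * g + (1.0 - p_true) * flat for g in sample)
        for sample in gls
    ]
```

With a cutoff of -2.0, a site at VQSLOD 0.0 is kept with its likelihoods pulled halfway toward flat, while a site at VQSLOD -5.0 is removed from the Beagle input entirely.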

Added a new VariantsToBeagleUnphased walker that writes out a marker-driven, hard-call unphased genotypes file suitable for imputing missing genotypes with a reference panel with beagle.  Can optionally keep back a fraction of sites, marked as missing in the genotypes file, for assessment of imputation accuracy and power.  The bootstrap sites can be written to a separate VCF for assessment as well.
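
The hold-back step can be sketched as follows. This is an illustration under stated assumptions, not the walker's implementation: the function name `hold_out_sites`, the `(site_id, genotypes)` record shape, the seeded random selection, and the `?` missing-genotype code are all hypothetical.

```python
import random

MISSING = "?"  # hypothetical missing-genotype code for the unphased file


def hold_out_sites(sites, fraction, seed=0):
    """Mask a random fraction of sites for later imputation assessment.

    sites: list of (site_id, genotypes) pairs with hard-call genotypes.
    fraction: probability that any given site is held back.
    Returns (masked, held_out): `masked` is what would go into the
    genotypes file, `held_out` keeps the true calls for comparison.
    """
    rng = random.Random(seed)  # seeded so the split is reproducible
    masked, held_out = [], []
    for site_id, genotypes in sites:
        if rng.random() < fraction:
            # Keep the truth aside and write the site as all-missing.
            held_out.append((site_id, genotypes))
            masked.append((site_id, [MISSING] * len(genotypes)))
        else:
            masked.append((site_id, genotypes))
    return masked, held_out
```

The `held_out` list plays the role of the separate bootstrap VCF: after imputation, the imputed genotypes at those sites can be compared against the original calls.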

Finally, my general Queue script for creating and evaluating reference panels from VCF files.  Supports explicitly genotyping a BAM file at each panel SNP site, for assessment of imputation accuracy of a reference panel.  Lots of options for exploring the impact of the VQS likelihoods, multiple VCFs for constructing the reference panel, as well as the fraction of sites left out in assessing the panel's power.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5467 348d0f76-0448-11de-a6fe-93d51630548a
2011-03-18 03:08:38 +00:00
..
1kg_table1.scala Refactoring the qscript directory; oneoffs, playground, and core 2011-01-19 15:23:40 +00:00
CleaningTest.scala Removing unused class 2011-02-04 22:22:28 +00:00
RefineGenotypesWithBeagle.q BeagleOutputToVCF now accepts an option to keep monomorphic sites. This is useful to genotype a single sample, where having AC=0 just means that the sample is hom-ref at the site. 2011-03-18 03:08:38 +00:00
VQSRCutByNRS.scala Removing unused class 2011-02-04 22:22:28 +00:00
manySampleUGPerformance.scala Generic, easy-to-use variant evaluation Queue script that tests indel and SNP call sets against standard evaluation data sets for sensitivity and specificity 2011-03-07 18:03:29 +00:00
resequencingSamples1KG.scala Class name to reflect actual file name. manySampleUGPerformance now operates on 1000 samples! 2011-02-26 23:36:04 +00:00