gatk-3.8/scala/qscript/oneoffs
depristo abc7d1aef9 BeagleOutputToVCF now accepts an option to keep monomorphic sites. This is useful to genotype a single sample, where having AC=0 just means that the sample is hom-ref at the site.
ProduceBeagleInputWalker can optionally emit a Beagle markers file, necessary to use the beagled reference panel for imputation. It also supports the VQSR calibration curve idea that a site can be flagged as a certain FP based on its VQSLOD field. This gives us both continuous quality in the refinement of sites and hard filtering at some threshold, so we don't end up with lots of sites with flat 1/3, 1/3, 1/3 likelihoods for all samples (i.e., a definite FP site where we don't know anything about the samples).

Added a new VariantsToBeagleUnphased walker that writes out a marker-driven, hard-call, unphased genotypes file suitable for imputing missing genotypes against a reference panel with Beagle. Can optionally hold back a fraction of sites, marked as missing in the genotypes file, for assessment of imputation accuracy and power. The bootstrap sites can be written to a separate VCF for assessment as well.

Finally, added my general Queue script for creating and evaluating reference panels from VCF files. Supports explicitly genotyping a BAM file at each panel SNP site, for assessment of the imputation accuracy of a reference panel. Lots of options for exploring the impact of the VQS likelihoods, multiple VCFs for constructing the reference panel, as well as the fraction of sites left out in assessing the panel's power.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5467 348d0f76-0448-11de-a6fe-93d51630548a
2011-03-18 03:08:38 +00:00
..
carneiro This is a one-off script to clean the Papuans and test TargetCreator and IndelRealigner with scatter-gather. 2011-03-17 17:09:53 +00:00
chartl Added unit tests for the statistics calculated by the test runner, plus bug fixes to the calculations, so we have some assurance that the statistics coming out of the back end are correct. 2011-03-06 16:54:02 +00:00
delangel Script for calling indels in all phase 1 samples - VQSR part still needs work but raw calling is done 2011-01-22 14:07:10 +00:00
depristo BeagleOutputToVCF now accepts an option to keep monomorphic sites. This is useful to genotype a single sample, where having AC=0 just means that the sample is hom-ref at the site. 2011-03-18 03:08:38 +00:00
fromer Some minor updates to fully utilize the functionality of reduceByInterval 2011-03-09 20:38:08 +00:00
hanna Refactoring the qscript directory; oneoffs, playground, and core 2011-01-19 15:23:40 +00:00
kshakir Removed deprecated getDbsnpFile. 2011-02-08 21:12:15 +00:00
rpoplin not useful 2011-03-15 22:47:55 +00:00
QTools.q Generalized association is now working. Output is in a horrific format. Implemented T-testing. Improvements to make: look for classes dynamically (a la VariantEval/VariantAnnotator), beautify output, and do optimizations where they exist. 2011-03-01 01:23:37 +00:00