gatk-3.8/R
kiran bd27287fe7 An R module that takes in a Variant Recalibration cluster file (file with '@!CLUSTER' lines in it), a tabularized VCF, and optionally a set of loci that should be examined more carefully, and emits a tremendous number of plots. For every annotation used in clustering, the distributions and pair-wise comparison (with ellipses denoting the 2-sigma cluster boundaries) are shown. Each cluster is shaded with a color proportional to its mixture coefficient.
To use this module, first convert your VCF into an R-readable table with the following command:

python /path/to/Sting/trunk/python/vcf2table.py -f CHROM,POS,ID,AC,AF,AN,DB,DP,HRun,MQ,MQ0,MyHaplotypeScore,QD,SB my.vcf > my.vcf.table
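
For readers who want to see what "R-readable table" means here, the following is a minimal sketch of this kind of conversion. It is not the actual vcf2table.py, whose output format may differ. The function name vcf_to_table, the tab-separated layout with a header row, the "NA" placeholder for missing annotations, and the "1" written for valueless INFO flags such as DB are all assumptions made for illustration.

```python
def vcf_to_table(vcf_lines, fields):
    """Yield tab-separated rows holding the requested fields of each VCF record.

    Hypothetical helper, not the real vcf2table.py. Fixed VCF columns are
    looked up by name; every other field is looked up in the INFO column.
    """
    fixed = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER"]
    yield "\t".join(fields)  # header row (assumed format)
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta lines, and the column header
        cols = line.split("\t")
        record = dict(zip(fixed, cols[:7]))
        info = {}
        info_column = cols[7] if len(cols) > 7 else ""
        for item in info_column.split(";"):
            key, sep, value = item.partition("=")
            # Flags such as DB carry no value; "1" is an assumed encoding.
            info[key] = value if sep else "1"
        # Missing annotations become "NA" (assumed), which R reads as missing.
        yield "\t".join(record.get(f, info.get(f, "NA")) for f in fields)
```

A table of this shape could then be loaded in R with read.table(..., header=TRUE), assuming the real script emits a header row as well.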

Then invoke the module with the following command (the suspicious-loci file in brackets is optional):

Rscript /path/to/Sting/trunk/R/VariantRecalibratorReport/VariantRecalibratorReport.R /path/to/output/prefix /path/to/my/my.clusters /path/to/my.vcf.table [/path/to/my.suspicious.loci]

This creates a set of plots, each named with the prefix "/path/to/output/prefix". For instance, if you used the QD, SB, HRun, and MyHaplotypeScore annotations during clustering, you should see output like this:

    /path/to/output/prefix.anndist.HRun.pdf
    /path/to/output/prefix.anndist.MyHaplotypeScore.pdf
    /path/to/output/prefix.anndist.QD.pdf
    /path/to/output/prefix.anndist.SB.pdf
    /path/to/output/prefix.cluster.HRun_vs_MyHaplotypeScore.pdf
    /path/to/output/prefix.cluster.HRun_vs_QD.pdf
    /path/to/output/prefix.cluster.HRun_vs_SB.pdf
    /path/to/output/prefix.cluster.MyHaplotypeScore_vs_HRun.pdf
    /path/to/output/prefix.cluster.MyHaplotypeScore_vs_QD.pdf
    /path/to/output/prefix.cluster.MyHaplotypeScore_vs_SB.pdf
    /path/to/output/prefix.cluster.QD_vs_HRun.pdf
    /path/to/output/prefix.cluster.QD_vs_MyHaplotypeScore.pdf
    /path/to/output/prefix.cluster.QD_vs_SB.pdf
    /path/to/output/prefix.cluster.SB_vs_HRun.pdf
    /path/to/output/prefix.cluster.SB_vs_MyHaplotypeScore.pdf
    /path/to/output/prefix.cluster.SB_vs_QD.pdf
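
The naming pattern in the listing above can be sketched in a few lines of Python, which may help when checking that a run produced every expected file. The pattern is inferred only from the example listing (one "anndist" plot per annotation, one "cluster" plot per ordered pair of distinct annotations, names in sorted order), not from the R script itself, and the function name expected_plot_files is hypothetical.

```python
from itertools import permutations


def expected_plot_files(prefix, annotations):
    """Return the plot file names implied by the example listing.

    Assumption: the pattern is inferred from the listing, not read from
    VariantRecalibratorReport.R, so the real script may behave differently.
    """
    names = sorted(annotations)
    # One distribution plot per annotation.
    dist = [f"{prefix}.anndist.{a}.pdf" for a in names]
    # One pairwise plot per ordered pair (A_vs_B and B_vs_A both appear).
    pairs = [f"{prefix}.cluster.{a}_vs_{b}.pdf"
             for a, b in permutations(names, 2)]
    return dist + pairs


if __name__ == "__main__":
    for path in expected_plot_files("/path/to/output/prefix",
                                    ["QD", "SB", "HRun", "MyHaplotypeScore"]):
        print(path)
```

Under this assumed pattern, n annotations yield n distribution plots and n*(n-1) pairwise plots, so the plot count grows quadratically with the number of annotations used in clustering.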



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3936 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-04 18:35:14 +00:00
VariantRecalibratorReport An R module that takes in a Variant Recalibration cluster file (file with '@!CLUSTER' lines in it), a tabularized VCF, and optionally a set of loci that should be examined more carefully, and emits a tremendous number of plots. For every annotation used in clustering, the distributions and pair-wise comparison (with ellipses denoting the 2-sigma cluster boundaries) are shown. Each cluster is shaded with a color proportional to its mixture coefficient. 2010-08-04 18:35:14 +00:00
VariantReport A very nice way of automatically plotting the results of a VariantEval run. All of the hard work is actually in the common R repository, gsacommons.R, including methods for creating a Venn diagram. It also provides a mechanism for the output of a VariantEval run to be loaded into a single list object. 2010-07-19 12:38:26 +00:00
analyzeConcordance Using bitmap() instead of png() since the former doesn't rely on X11. 2010-02-23 05:31:51 +00:00
Data.Processing.Report.r This replaces tearsheet.r, neatens up graphics, and allows the script to be used in R's interactive environment 2010-06-24 01:02:58 +00:00
PlotDepthOfCoverage.R Moved CoverageStatistics to core. This will be (soon) renamed DepthOfCoverage; so please use CoverageStatistics 2010-03-29 13:32:00 +00:00
generateBySamplePlot.R R script for graphing depth of coverage by sample name, and generating a loess curve for each sample's data. 2009-12-10 21:58:01 +00:00
gsacommons.R A very nice way of automatically plotting the results of a VariantEval run. All of the hard work is actually in the common R repository, gsacommons.R, including methods for creating a Venn diagram. It also provides a mechanism for the output of a VariantEval run to be loaded into a single list object. 2010-07-19 12:38:26 +00:00
plot_Annotations_BinnedTruthMetrics.R Can run R scripts on the command line 2010-07-09 00:13:18 +00:00
plot_ClusterReport.R Can run R scripts on the command line 2010-07-09 00:13:18 +00:00
plot_OptimizationCurve.R Can run R scripts on the command line 2010-07-09 00:13:18 +00:00
plot_residualError_OtherCovariate.R Can run R scripts on the command line 2010-07-09 00:13:18 +00:00
plot_residualError_QualityScoreCovariate.R Can run R scripts on the command line 2010-07-09 00:13:18 +00:00
plot_variantROCCurve.R Can run R scripts on the command line 2010-07-09 00:13:18 +00:00
plotting_library.R Can run R scripts on the command line 2010-07-09 00:13:18 +00:00
tearsheet.r This script produces tearsheet and data processing report figures and tables when given Squid and Firehose produced data 2010-06-18 21:36:29 +00:00
titvFPEst.R Can run R scripts on the command line 2010-07-09 00:13:18 +00:00
whole_exome_bait_selection.R R script for selecting a variety of baits (using %GC content and normalized coverage) for Nanostring assessment from those used in the Agilent whole exome hybrid selection design. 2009-09-22 18:10:14 +00:00