1) Build the module with the following command:
$ ant gsalib
2) Add the module path to your ~/.Rprofile file:
.libPaths("/path/to/Sting/trunk/R/")
3) At the top of each R script that will use the library, include the line:
library(gsalib)
You can now use the package like any other R package. To get high-level documentation, supply the following command to R:
help(gsalib)
The methods contained herein are:
getargs : A method to easily provide arguments to interactive and non-interactive scripts.
    Prints a help message describing how the script should be run when no arguments
    or "-h" is supplied. Very helpful when you write an R script piecemeal in
    interactive mode and then want to turn it into a command-line program.
plot.venn : Plots a two-way or three-way proportional Venn diagram.
read.eval : Reads VariantEval output that's formatted in R style.
read.gatkreport : Reads GATKReport output.
gsa.message : Emits a message with the prefix "[gsalib]" to stdout.
gsa.warn : Emits a warning message with the prefix "[gsalib] Warning:" to stdout.
gsa.error : Emits an error message with the prefix "[gsalib] Error:" to stdout, calls traceback(),
    and halts execution.
Documentation on each of these methods can be obtained by typing "help(method_name)" at the R prompt.
* Retired GATKReport.R, as that functionality has now been moved to gsalib.
* Retired gsacommons, as that functionality has been split between gsalib and VariantReport.R.
* Modified VariantReport.R to make use of gsalib. The script now uses the getargs() method to provide the user with some information as to the proper way to run the script. Documentation on how to prepare output is given at http://www.broadinstitute.org/gsa/wiki/index.php/VariantEval .
* Added 'gsalib' target to build.xml file. Running "ant gsalib" will compile this module and place the R-ready package in R/gsalib .
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4416 348d0f76-0448-11de-a6fe-93d51630548a
This object is designed to serve both as the structure that holds data during the execution of the walker and as the object that properly formats and emits the data so that it can be easily loaded into R. In the end, you get a table that looks like this:
##:GATKReport.v0.1 ErrorRatePerCycle : The error rate per sequenced position in the reads
cycle errorrate.61PA8.7 qualavg.61PA8.7
0 0.007451835696110506 25.474613284804366
1 0.002362777171937477 29.844949954504095
2 9.087604507451836E-4 32.87590975254731
3 5.452562704471102E-4 34.498999090081895
4 9.087604507451836E-4 35.14831665150137
5 5.452562704471102E-4 36.07223435225619
6 5.452562704471102E-4 36.1217248908297
7 5.452562704471102E-4 36.1910480349345
8 5.452562704471102E-4 36.00345705967977
...
A GATKReport object can hold multiple tables, and the write() method will emit all tables in succession. Each table starts with its own ##:GATKReport.v0.1 table header, so each table can stand alone. This allows for tables to be mixed and matched in a single file, or for the output from different walkers to be combined into a single file with no ill effect.
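Because every table carries its own header line, a reader can split a combined file simply by watching for that header. The following is a minimal Python sketch of that idea, not the gsalib reader itself. It assumes whitespace-separated fields and a header of the form "##:GATKReport.v0.1 <name> : <description>", as in the example table; the function name parse_gatk_report is hypothetical.

```python
def parse_gatk_report(text):
    """Split report text into tables keyed by table name (illustrative only)."""
    tables = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("##:GATKReport."):
            # Assumed header layout: "##:GATKReport.v0.1 <name> : <description>"
            parts = line.split(None, 1)
            rest = parts[1] if len(parts) > 1 else ""
            name, _, description = rest.partition(" : ")
            current = {"description": description.strip(),
                       "columns": None,
                       "rows": []}
            tables[name.strip()] = current
        elif current is not None:
            fields = line.split()
            if current["columns"] is None:
                # The first line after the header holds the column names.
                current["columns"] = fields
            else:
                current["rows"].append(fields)
    return tables
```

Values are kept as strings here; a real reader would convert the numeric columns.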
The display property of individual columns can be turned off. This is useful when a column stores intermediate results that are necessary for computing some later value but are not required in the final output file.
Finally, the GATKReportTable supports some simple element-wise and column-wise mathematical operations. For instance, one whole column can be divided by another, with the results stored in a third column. This mimics the most basic of R operations, where whole vectors can be added, subtracted, multiplied, or divided without requiring the developer to explicitly write a loop.
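To illustrate how hidden columns and column-wise division fit together, here is a small Python sketch. It is not the GATKReportTable API: the class ReportTable and its methods add_column, divide_columns, and write are hypothetical names chosen for this example, and the header line is omitted.

```python
class ReportTable:
    """Toy column store with per-column display flags (illustrative only)."""

    def __init__(self):
        self.columns = {}  # column name -> list of values
        self.display = {}  # column name -> whether write() emits the column

    def add_column(self, name, values, display=True):
        self.columns[name] = list(values)
        self.display[name] = display

    def divide_columns(self, result, numerator, denominator, display=True):
        # Element-wise division of two whole columns, stored in a third.
        num = self.columns[numerator]
        den = self.columns[denominator]
        quotient = [n / d if d != 0 else float("nan") for n, d in zip(num, den)]
        self.add_column(result, quotient, display)

    def write(self):
        # Only columns whose display flag is on reach the output.
        names = [n for n in self.columns if self.display[n]]
        lines = [" ".join(names)]
        for row in zip(*(self.columns[n] for n in names)):
            lines.append(" ".join(str(v) for v in row))
        return "\n".join(lines)


table = ReportTable()
table.add_column("cycle", [0, 1])
table.add_column("errors", [2, 1], display=False)  # intermediate, hidden
table.add_column("bases", [4, 4], display=False)   # intermediate, hidden
table.divide_columns("errorrate", "errors", "bases")
print(table.write())
```

The intermediate errors and bases columns are available for the computation but do not appear in the emitted table.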
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4159 348d0f76-0448-11de-a6fe-93d51630548a
To use this module, you'll first have to take your VCF and create an R-readable table out of it with the following command:
python /path/to/Sting/trunk/python/vcf2table.py -f CHROM,POS,ID,AC,AF,AN,DB,DP,HRun,MQ,MQ0,MyHaplotypeScore,QD,SB my.vcf > my.vcf.table
Then, simply invoke this module with the command:
Rscript /path/to/Sting/trunk/R/VariantRecalibratorReport/VariantRecalibratorReport.R /path/to/output/prefix /path/to/my/my.clusters /path/to/my.vcf.table [/path/to/my.suspicious.loci]
This will create a number of plots all with the prefix "/path/to/output/prefix". For instance, if you used QD, SB, HRun, and MyHaplotypeScore annotations during clustering, you should see output like this:
/path/to/output/prefix.anndist.HRun.pdf
/path/to/output/prefix.anndist.MyHaplotypeScore.pdf
/path/to/output/prefix.anndist.QD.pdf
/path/to/output/prefix.anndist.SB.pdf
/path/to/output/prefix.cluster.HRun_vs_MyHaplotypeScore.pdf
/path/to/output/prefix.cluster.HRun_vs_QD.pdf
/path/to/output/prefix.cluster.HRun_vs_SB.pdf
/path/to/output/prefix.cluster.MyHaplotypeScore_vs_HRun.pdf
/path/to/output/prefix.cluster.MyHaplotypeScore_vs_QD.pdf
/path/to/output/prefix.cluster.MyHaplotypeScore_vs_SB.pdf
/path/to/output/prefix.cluster.QD_vs_HRun.pdf
/path/to/output/prefix.cluster.QD_vs_MyHaplotypeScore.pdf
/path/to/output/prefix.cluster.QD_vs_SB.pdf
/path/to/output/prefix.cluster.SB_vs_HRun.pdf
/path/to/output/prefix.cluster.SB_vs_MyHaplotypeScore.pdf
/path/to/output/prefix.cluster.SB_vs_QD.pdf
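The listing above follows a regular pattern: one "anndist" plot per annotation and one "cluster" plot per ordered pair of distinct annotations. The Python sketch below reproduces that pattern so you can check that a run produced every expected file. The naming scheme is inferred from the listing only, and the function name expected_plot_paths is hypothetical; the script itself may name files differently in other cases.

```python
from itertools import permutations


def expected_plot_paths(prefix, annotations):
    """Return the plot paths implied by the listing's naming pattern."""
    names = sorted(annotations)
    # One annotation-distribution plot per annotation.
    paths = [f"{prefix}.anndist.{a}.pdf" for a in names]
    # One cluster plot per ordered pair of distinct annotations.
    paths += [f"{prefix}.cluster.{a}_vs_{b}.pdf"
              for a, b in permutations(names, 2)]
    return paths


for path in expected_plot_paths("/path/to/output/prefix",
                                ["QD", "SB", "HRun", "MyHaplotypeScore"]):
    print(path)
```

For n annotations this yields n + n(n-1) paths, which is 16 for the four annotations used here.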
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3936 348d0f76-0448-11de-a6fe-93d51630548a