gatk3的最后一个经典版本3.8
 
 
 
 
Go to file
kiran fd19c63aaf A data structure that allows data to be collected over the course of a walker's computation, then have that data written to a PrintStream such that it's human-readable, AWK-able, and R-friendly (given that you load it using the GATKReport loader module).
This object designed to be both the structure that holds data during the execution of the walker, as well as the object that properly formats and emits the data so that it can be easily loaded into R.  In the end, you get a table that looks like this:

##:GATKReport.v0.1 ErrorRatePerCycle : The error rate per sequenced position in the reads
cycle  errorrate.61PA8.7         qualavg.61PA8.7
0      0.007451835696110506      25.474613284804366
1      0.002362777171937477      29.844949954504095
2      9.087604507451836E-4      32.87590975254731
3      5.452562704471102E-4      34.498999090081895
4      9.087604507451836E-4      35.14831665150137
5      5.452562704471102E-4      36.07223435225619
6      5.452562704471102E-4      36.1217248908297
7      5.452562704471102E-4      36.1910480349345
8      5.452562704471102E-4      36.00345705967977
...

A GATKReport object can hold multiple tables, and the write() method will emit all tables in succession.  Each table starts with its own ##:GATKReport.v0.1 table header, so each table can stand alone.  This allows for tables to be mixed and matched in a single file, or for the output from different walkers to be combined into a single file with no ill effect.

The display property of individual columns can be turned off.  This is useful when a column is used to store intermediate results, necesary for the computation of some later value, but the contents of the intermediate column itself are not required in the final output file.

Finally, the GATKReportTable allows for some simple, mathematical, element-wise and column-wise operations.  For instance, two whole columns can be divided, the results of the operation being stored in a third column.  This mimics the most basic of R operations, where whole vectors can be added, subtracted, multiplied or divided without requiring the developer to explicitly write a loop.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4159 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-29 05:39:24 +00:00
R A data structure that allows data to be collected over the course of a walker's computation, then have that data written to a PrintStream such that it's human-readable, AWK-able, and R-friendly (given that you load it using the GATKReport loader module). 2010-08-29 05:39:24 +00:00
archive Cleaning up playground utils and tests 2010-08-27 01:25:47 +00:00
c Reduce file handle usage. 2010-01-05 18:03:01 +00:00
doc removing the custom reflections library from the libs, and adding a release version. Hopefully this will fix the problem Menachem has been seeing with random JVM crashes. Also 2010-08-19 00:42:37 +00:00
java A data structure that allows data to be collected over the course of a walker's computation, then have that data written to a PrintStream such that it's human-readable, AWK-able, and R-friendly (given that you load it using the GATKReport loader module). 2010-08-29 05:39:24 +00:00
matlab Another matlab script -- this time for making power and coverage plots over a specific gene region. Lots of fun file reading, string manipulation, and exploration of the set() function 2009-11-30 20:02:25 +00:00
packages Enabling support for any scala used in QScripts. 2010-08-17 22:05:56 +00:00
perl update to use new rod syntax 2010-08-25 20:21:53 +00:00
python Changes from James Pirruccello: now can handle differences between UCSC and NCBI tables, properly sorting despite the contig prefix differences (presence or absence of 'chr'), and converts NCBI format to UCSC format for use by the GenomicAnnotator. 2010-08-26 19:02:29 +00:00
ruby accidentally commited an old tool 2010-08-25 15:42:02 +00:00
scala Ugh! varout --> out 2010-08-29 02:34:41 +00:00
settings removing the custom reflections library from the libs, and adding a release version. Hopefully this will fix the problem Menachem has been seeing with random JVM crashes. Also 2010-08-19 00:42:37 +00:00
shell Useful script to see the status of gsa computing resources. Crontab'd and will be arriving as email at 8 am 2010-08-07 12:36:28 +00:00
testdata and add changes to the vcf used in testing 2010-06-25 02:56:02 +00:00
LICENSE Adding a license to the root directory in case BOSC checks for one. Has the 2010-04-20 16:04:29 +00:00
build.xml up the test mem. from 2g to 4g; we're currently hitting the 2g in aggregate across some of the larger tests 2010-08-27 01:39:05 +00:00
ivy.xml Okay, finally done with VCF compression. Now: 2010-08-24 16:36:54 +00:00