gatk-3.8/public/java/test/org/broadinstitute/sting/gatk/walkers
Roger Zurawicki 7887a06703 GATKReport v1.0
GATKReport format changes:

 - All non-data header lines are preceded by a single pound sign followed by a colon (#:)
 - Every report now has a report header containing the version number and number of tables
 - Every table has two lines of table header: The first explains the size of the table and the data types of each column, the second contains the table name and description.
 - This new format will allow reports in the future to be gatherable.
 - Changed the header format to include an end-of-line string ":;"
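
The two markers described above (the "#:" prefix and the ":;" end-of-line string) are enough to tell header lines from data lines. The following is a minimal sketch of that check, not the GATK parser: the class `ReportLineSketch`, its method names, the colon-separated field layout, and the sample line are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of the GATK API. */
final class ReportLineSketch {
    static final String HEADER_PREFIX = "#:"; // prefix named in the commit message
    static final String HEADER_END = ":;";    // end-of-line string named in the commit message

    /** Returns true if the line is a non-data header line. */
    static boolean isHeader(String line) {
        return line.startsWith(HEADER_PREFIX);
    }

    /**
     * Drops the prefix and the optional end-of-line string, then splits the rest.
     * Splitting on ':' is an assumption; a field that itself contains a colon
     * would need different handling.
     */
    static List<String> headerFields(String line) {
        if (!isHeader(line)) {
            throw new IllegalArgumentException("not a header line: " + line);
        }
        String body = line.substring(HEADER_PREFIX.length());
        if (body.endsWith(HEADER_END)) {
            body = body.substring(0, body.length() - HEADER_END.length());
        }
        List<String> fields = new ArrayList<>();
        for (String field : body.split(":")) {
            fields.add(field);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Made-up header line, used only to exercise the sketch.
        System.out.println(headerFields("#:ExampleTable:3:2:;"));
        System.out.println(isHeader("1  2  3"));
    }
}
```

Because every header line carries the same prefix, a reader can skip or collect headers with one string test before it looks at any table data.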

Added features:

 - Simplified GATK Reports:

	Simplified GATK reports are designed for reports that do not need the advanced functionality of a full GATK Report.

	A simple GATK Report consists of:
		- A single table
		- No primary key (it is hidden)
	    Optional:
		- Only untyped columns. As long as the data is an Object, it will be accepted.
		- Default column values being empty strings.
	Limitations:
		- A simple GATK report cannot contain multiple tables.
		- It cannot contain typed columns, which prevents arithmetic gathering.

       - Added a constructor to generate simplified GATK reports.
       - Added a method to easily add data to simple GATK reports.
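
The properties listed for a simple report (one table, a hidden key, untyped columns, empty-string defaults) can be sketched as a small container. This is an illustration under stated assumptions rather than the GATKReport class: `SimpleReportSketch`, `addRow`, `get`, and the sample values are hypothetical names and data chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a simplified report; not the GATK class. */
final class SimpleReportSketch {
    private final String tableName;                        // the single table
    private final List<String> columns;                    // untyped columns
    private final List<Object[]> rows = new ArrayList<>(); // list index acts as the hidden key

    SimpleReportSketch(String tableName, String... columns) {
        this.tableName = tableName;
        this.columns = Arrays.asList(columns.clone());
    }

    String name() {
        return tableName;
    }

    /** Adds one row of arbitrary Objects; missing trailing values default to "". */
    void addRow(Object... values) {
        if (values.length > columns.size()) {
            throw new IllegalArgumentException(
                "too many values for " + columns.size() + " columns");
        }
        Object[] row = new Object[columns.size()];
        Arrays.fill(row, "");
        System.arraycopy(values, 0, row, 0, values.length);
        rows.add(row);
    }

    int rowCount() {
        return rows.size();
    }

    Object get(int row, String column) {
        int index = columns.indexOf(column);
        if (index < 0) {
            throw new IllegalArgumentException("unknown column: " + column);
        }
        return rows.get(row)[index];
    }

    public static void main(String[] args) {
        SimpleReportSketch report = new SimpleReportSketch("Example", "sample", "depth");
        report.addRow("s1", 30); // any Object is accepted
        report.addRow("s2");     // "depth" falls back to the empty string
        System.out.println(report.name() + " has " + report.rowCount() + " rows");
    }
}
```

Since the columns carry no type, nothing in this sketch could sum or average values across reports, which mirrors the limitation on arithmetic gathering noted above.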

 - Upgraded the input parser to take advantage of the new file format (v1).
 - Added the GATKReportGatherer; more usability is coming in the next version of GATK Report. Currently, it can only add rows from one table to another. Added private methods in GATKReport to combine Tables and Reports. The gatherer is very conservative and will only gather if the table columns, as well as everything else, match. At the column level, it uses the (redundant) row ids to add new rows. It will throw an exception if it would overwrite data.
 - Made some GATKReport methods public, and added more setters and getters.
 - Added method that compares formats of two GATKReports, and added an equals method to verify all data inside.
 - The gsalib for R now supports reading GATKReport v1 files in addition to legacy formats (v0.*)
 - Added a GATKReportDataType enum to give a column a certain data type. This must be specified when making a gatherable report. This enum contains several methods, including a reverse lookup map.
 - Added a data type field in GATKColumn. When a type is not specified, the unknown type is used. Unknown types should not be gathered.
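
The data type enum and the conservative gathering rule described above can be sketched together. This is a rough illustration, not the GATKReportGatherer or GATKReportDataType code: the names `ColumnTypeSketch` and `TableSketch`, the one-letter type codes, and the choice to reject unknown-typed columns with an exception are assumptions.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical column types; the values and codes in GATK may differ. */
enum ColumnTypeSketch {
    UNKNOWN("?"), STRING("s"), INTEGER("i"), DECIMAL("d");

    private static final Map<String, ColumnTypeSketch> BY_CODE = new HashMap<>();
    static {
        for (ColumnTypeSketch type : values()) {
            BY_CODE.put(type.code, type); // build the reverse lookup map once
        }
    }

    private final String code;

    ColumnTypeSketch(String code) {
        this.code = code;
    }

    /** Reverse lookup; unrecognised codes fall back to UNKNOWN. */
    static ColumnTypeSketch fromCode(String code) {
        return BY_CODE.getOrDefault(code, UNKNOWN);
    }
}

/** Hypothetical table keyed by row id; not the GATK table class. */
final class TableSketch {
    final List<String> columnNames;
    final List<ColumnTypeSketch> columnTypes;
    final Map<String, List<Object>> rows = new LinkedHashMap<>();

    TableSketch(List<String> columnNames, List<ColumnTypeSketch> columnTypes) {
        this.columnNames = columnNames;
        this.columnTypes = columnTypes;
    }

    /** Adds the rows of other, refusing on any mismatch or overwrite. */
    void gather(TableSketch other) {
        if (!columnNames.equals(other.columnNames)
                || !columnTypes.equals(other.columnTypes)) {
            throw new IllegalArgumentException("table formats differ");
        }
        if (columnTypes.contains(ColumnTypeSketch.UNKNOWN)) {
            throw new IllegalArgumentException("unknown-typed columns are not gathered");
        }
        // Check every row id before changing anything, so a failure leaves this table intact.
        for (String rowId : other.rows.keySet()) {
            if (rows.containsKey(rowId)) {
                throw new IllegalStateException("row " + rowId + " would be overwritten");
            }
        }
        rows.putAll(other.rows);
    }
}
```

Validating everything before the first `putAll` keeps the merge all-or-nothing, which is one way to honour the "very conservative" behaviour the commit message describes.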

Test changes:

 - Updated Unit Tests for GATK Report v1. Added a test for the gatherer. Left one test disabled while we transition from v0 to v1.
 - Updated the MD5 hashes in integration tests throughout the GATK.

Other changes:

 - Added the gatherer functions to CoverageByRG
 - Also added the scatterCount parameter in the Interval Coverage script
 - Dropped support for reading in legacy GATKReport formats (v0.*)
 - Updated VariantEvalWalker to work with GATK Report v1 and added a format String to all applicable DataPoints.
 - Rewrote the read file method for GATK report files.
 - Optimized the equals methods within GATKReport. The protected functions should only be called by the GATKReport methods.

Signed-off-by: Mauricio Carneiro <carneiro@broadinstitute.org>
2012-03-12 23:09:19 -04:00
CNV Updating md5 because the file changed 2011-09-23 07:33:20 -04:00
annotator Updating integration tests now that standard annotations support multiple alleles 2012-02-27 11:32:26 -05:00
beagle Minor tweaks to T2D-related qscripts. Replacing old md5s from the BeagleIntegrationTest. All differences boiled down either to the accounting of genotypes changed (./. --> 0/0 is no longer a "changed" genotype, and original genotypes that were ./. are represented as OG=. rather than OG=./. .) 2012-01-23 08:25:34 -05:00
bqsr Unit tests for the context covariate 2012-03-01 17:56:45 -05:00
coverage DoC now properly handles reference N bases + misc. additional cleanups 2012-02-25 11:32:50 -05:00
diagnostics GATKReport v1.0 2012-03-12 23:09:19 -04:00
diffengine GATKReport v1.0 2012-03-12 23:09:19 -04:00
fasta Removing the legacy -L "interval1;interval2" syntax 2011-11-21 13:18:53 -05:00
filters Again, fixing the add call when we really mean replace 2011-11-21 19:15:56 -05:00
genotyper Getting rid of redundant methods in MathUtils. Adding unit tests for approximateLog10SumLog10 and normalizeFromLog10. Increasing the precision of the Jacobian approximation used by approximateLog10SumLog which changes the UG+HC integration tests ever so slightly. 2012-03-05 12:28:32 -05:00
indels Rename *PerformanceTest test classes to *LargeScaleTest 2011-12-22 10:38:49 -05:00
phasing Merge with master 2011-11-19 09:56:06 -05:00
qc First version of VariantContextBuilder 2011-11-18 11:06:15 -05:00
recalibration Patching special case in the adaptor clipping 2012-01-11 17:47:44 -05:00
validation Removing all instances of -BTI (in tests and in GATKdocs) and replacing them with the appropriate alternative. 2011-10-27 23:55:11 -04:00
varianteval GATKReport v1.0 2012-03-12 23:09:19 -04:00
variantrecalibration Changing the VQSR command line syntax back to the parsed tags approach. This cleans up the code and makes sure we won't be parsing the same rod file multiple times. I've tried to update the appropriate qscripts. 2011-09-12 12:17:43 -04:00
variantutils GATKReport v1.0 2012-03-12 23:09:19 -04:00
BAQIntegrationTest.java merge master 2011-10-25 16:08:39 -04:00
ClipReadsWalkersIntegrationTest.java Updating MD5s for updated BAM with read groups 2011-10-06 12:15:48 -07:00
PileupWalkerIntegrationTest.java Updated integration tests for the new adaptor clipping fix. 2011-12-30 18:47:14 -05:00
PrintReadsIntegrationTest.java Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
PrintReadsUnitTest.java Better location for the downsampling of reads in PrintReads 2012-01-14 14:06:09 -05:00