Added a rudimentary GATKReportParser for parsing VE3 results.
Re-enabled the FCPTest using VE3, the GATKRP, and the PicardAggregationUtils.
The tag type for .rod files is DBSNP, not ROD.
More explicit return types on implicit methods.
Added null checks for implicit string to/from file conversions.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5668 348d0f76-0448-11de-a6fe-93d51630548a
See https://www.broadinstitute.org/gsa/wiki/index.php/GATK_resource_bundle
The bundle files live locally in /humgen/gsa-hpprojects/GATK/bundle/current
Use the following command to create the bundle:
java -Djava.io.tmpdir=/broad/shptmp/depristo/tmp -jar dist/Queue.jar -S scala/qscript/core/GATKResourcesBundle.scala --gatkjarfile dist/GenomeAnalysisTK.jar -bsub -jobQueue gsa -svn 5660 $*
Annoyingly, it must be run in the trunk directory, and it requires an explicit svn version number to create the directory. It must also be run manually in two stages: first the local bundle is created; then, with the -phase2 argument, all of the files in the local bundle are compressed and pushed to the FTP server. I'm likely going to shift most of my processes over to using this location for data file access, especially for b37 data sets.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5665 348d0f76-0448-11de-a6fe-93d51630548a
a progress message, then aggregate metrics. Makes the overhead of
printProgress in RealignerTargetCreator go from >20% to ~3%.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5663 348d0f76-0448-11de-a6fe-93d51630548a
- Fixed a bug in the single-ended BWA option of the data processing pipeline.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5662 348d0f76-0448-11de-a6fe-93d51630548a
interface between SAMDataSource and IntervalSharder that needs to stay around
until the original BAM sharder is retired. Will add a JIRA to fix design
flaw.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5661 348d0f76-0448-11de-a6fe-93d51630548a
After viewing results on real case/control data from RAW, it's really working quite well. ReadIndels, however, needs to use a T-test rather than a U-test, especially in deep coverage: at indel sites, the reads with indels will mostly have the same number of CIGAR indel elements (one), which doesn't play nicely with the U-test when sample sets are large. Modified ReadsLargeInsertSize to be a two-way test (i.e. ReadsLarge and ReadsSmall). BaseQualityScore suffers from the same issue as ReadIndels, so it is switching over to a T-test as well.
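For reference, a minimal sketch of a two-sample T statistic of the kind described above. The message does not say which T-test variant is used; this assumes Welch's unequal-variance form, and the class and method names are hypothetical, not taken from the walker code.

```java
// Hypothetical helper: Welch's two-sample t statistic (assumed variant).
final class WelchTStatistic {
    // Arithmetic mean of the observations.
    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) sum += x;
        return sum / xs.length;
    }

    // Unbiased sample variance (divides by n - 1).
    static double sampleVariance(double[] xs, double mean) {
        double ss = 0.0;
        for (double x : xs) ss += (x - mean) * (x - mean);
        return ss / (xs.length - 1);
    }

    // t = (mean(a) - mean(b)) / sqrt(var(a)/n_a + var(b)/n_b).
    // Each sample needs at least two observations; if both samples have
    // zero variance the result is NaN or infinite, so callers must check.
    static double statistic(double[] a, double[] b) {
        double ma = mean(a), mb = mean(b);
        double va = sampleVariance(a, ma), vb = sampleVariance(b, mb);
        return (ma - mb) / Math.sqrt(va / a.length + vb / b.length);
    }
}
```

Unlike a rank-based U statistic, this depends only on the sample means and variances, so many reads sharing the same value do not produce large blocks of tied ranks.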
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5653 348d0f76-0448-11de-a6fe-93d51630548a
Scala type inference for the implicit return types on implicit methods was a little too much for poor IntelliJ IDEA to handle, and it was breaking things like copy/paste, auto-complete, etc.
Also updated the Queue package to include all Sting utils.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5646 348d0f76-0448-11de-a6fe-93d51630548a
+ UG no longer cares whether it's given SNPs or indels to genotype; it will do the right thing, so the option to specify which GM the user wants has been removed
+ Max mismatches argument removed
integration test will follow
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5638 348d0f76-0448-11de-a6fe-93d51630548a
Switched YAML parser to new Broad parser which will additionally update picard cleaned bams to the latest version if the project and sample are specified.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5634 348d0f76-0448-11de-a6fe-93d51630548a
read metrics are actually a clone, which they can do with as they wish.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5626 348d0f76-0448-11de-a6fe-93d51630548a
Also fixed an "issue" with InsertSizeDistribution -- apparently for mate pairs, the first mate (karyotypically) will have a POSITIVE insert size, and the second a NEGATIVE insert size -- thus the insert size distribution was being conflated with enrichment/depletion of first-in-pair or second-in-pair reads. Gah.
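A minimal sketch of one way to keep the sign out of the distribution. The message does not say how the fix was made; this assumes the signed insert size is simply reduced to its magnitude before aggregation, and the class and method names are hypothetical.

```java
// Hypothetical helper: aggregate insert sizes by magnitude so that the
// sign (first vs. second mate) does not leak into the distribution.
final class InsertSizeMagnitude {
    // Strip the mate-dependent sign from a signed insert size.
    static int magnitude(int signedInsertSize) {
        return Math.abs(signedInsertSize);
    }

    // Mean of the magnitudes; with signed values the two mates of a pair
    // would cancel each other out.
    static double meanMagnitude(int[] signedInsertSizes) {
        long sum = 0;
        for (int size : signedInsertSizes) sum += magnitude(size);
        return (double) sum / signedInsertSizes.length;
    }
}
```

With both mates of two pairs present (300, -300, 200, -200), the signed mean is 0 while the mean magnitude is 250.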
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5623 348d0f76-0448-11de-a6fe-93d51630548a