Commit Graph

35 Commits (31a2575c7bc1485d18ccaa0bc07d608be4d54865)

Author SHA1 Message Date
kshakir 15ce375787 While generating YAML, now warning about and skipping TSV rows that don't have all values.
Fixed log message typo in PipelineTest.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5320 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-25 20:50:03 +00:00
kshakir ad1e4f47b1 Fixed fatal typo in TSV to YAML converter.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5316 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-25 17:18:54 +00:00
kshakir 24ef2be02d Updated firehose pulldown shell scripts:
- a LOT more error reporting to stderr and exit codes
- split the firehose pulldown into TSV generators and a TSV to YAML converter
- YAML converter is compatible with the TSVs generated by the front end website and will grab only the appropriate columns
- deprecated getFirehosePipelineYaml.sh mode with a single Sample_Set name which uses the Firehose test harness
- new getFirehosePipelineYaml.sh mode that uses the web services API and requires an additional parameter, a password config file containing "-u <user>:<pass>"; this mode has been tested on problematic Sample_Sets



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5313 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-25 00:23:05 +00:00
depristo 87e5c448cd Forgot to enable printing
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5278 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-19 14:51:33 +00:00
depristo b1e4e1afb6 Slightly better output now -- no longer emitting pdfs by default. Emails will go to gsamembers now
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5224 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-10 13:02:24 +00:00
kshakir 8040998c15 Renamed the pipeline yaml dbsnpFile to genotypeDbsnp, and added an evalDbsnp.
Added a genotypeDbsnpType and evalDbsnpType to check the extensions for .vcf or .rod.
Moved renaming of "recalibrated" bams to "cleaned" from sed to yaml generation template (see diff for more info).
Renamed fCP.q to FCP.q.
Though it's still disabled until VariantEval is updated, added changes above to the FCPTest.
Removed refseq table from the queue.sh wrapper script. Only specified in the yaml.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5213 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-07 22:01:09 +00:00
corin ce2866122d Calls the bams pulled down from firehose "cleaned" by default
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5188 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-03 18:34:07 +00:00
corin a22ea53665 Updated template for the MPG pipeline's queue script runner.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5187 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-03 18:33:29 +00:00
depristo e510798bc2 Missed one uncomment
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5159 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 13:01:59 +00:00
depristo d9532ecf53 Better run reporting structure. Now text report is attached as well as inline in the email, so you can easily view it in fixed-width fonts!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5158 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 12:58:48 +00:00
depristo c50f39a147 V3 of the distributed GATK. High-efficiency implementation. Support for status tracking for debugging and display. Still not safe for production use due to an NFS file-lock problem. V4 will use an alternative file locking mechanism
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5063 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-24 16:45:07 +00:00
kshakir 6fbd18c759 Cleaning up obsolete code.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5044 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-21 16:27:35 +00:00
kshakir 8855f080c2 For the fullCallingPipeline.q:
- Reading the refseq table from the YAML if not specified on the command line.
- Removed obsolete -bigMemQueue now that CombineVariants runs in 4g.
- Added a -mountDir /broad/software option to work around adpr automount issues.
- Merged the LSF preexec used for automount into the shell script used to execute tasks.
- Using the LSF C Library to determine when jobs are complete instead of postexec.
- Updated queue.sh to match the changes above.
- Updated the FCPTest to match the changes above.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5036 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-20 22:34:43 +00:00
corin af60666f5d An example template for the shell script used to run the full calling pipeline on the broad system
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5022 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-19 20:28:59 +00:00
kshakir c901fb6d70 Now populating the refseq and dbsnp in awk instead of retrieving from firehose.
Added refseq table to the pipeline object.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5020 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-19 18:19:10 +00:00
depristo 0e089ce0b7 watch -n 30 shell/pipelineJobs.csh for those who want to watch the gsaadm's jobs progress
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4966 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-09 13:09:30 +00:00
depristo be67161b47 Deleting old shell code
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4962 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-07 21:36:16 +00:00
depristo 2448b859e4 no longer prints unnecessary table conversion failures that muck up emails. Run script now uses du not ls to display archive size
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4918 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-02 13:27:37 +00:00
depristo a61f0047f0 Better queue status messages
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4901 348d0f76-0448-11de-a6fe-93d51630548a
2010-12-22 20:17:28 +00:00
depristo 7ec805bef7 should be uncommented
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4900 348d0f76-0448-11de-a6fe-93d51630548a
2010-12-22 20:17:04 +00:00
depristo 46cd227613 Stability improvements to GATK run report system. R code is now robust. XML parser uses the C backend in python, 10x faster. Added shell script that runs the daily reports, and linked the /humgen/ runme.csh to this script. Script now emails the daily PDFs to gsamembers
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4845 348d0f76-0448-11de-a6fe-93d51630548a
2010-12-15 14:56:12 +00:00
depristo 5dabf73039 Useful script for me
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4704 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-18 15:21:06 +00:00
kshakir 7f25019f37 Inprocess functions by default now log what output files they are running for.
On -run, cleaning up .done and .fail files for jobs that will be run.
Added detection to Firehose YAML generator shell script for (g)awk versions that ignore "\n" in patterns.
Removed obsolete mergeText and splitIntervals shell scripts.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4452 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-07 19:08:02 +00:00
depristo d841f260eb minor improvements to queue status
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4439 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-06 14:34:35 +00:00
kshakir 20b38b38f3 Updated from SnakeYAML 1.6 to 1.7.
Added a pipeline java bean and YAML utility to serialize java beans.
Added a getFirehosePipelineYaml.sh that can pull firehose data into the pipeline yaml file format.
Updated the fullCallingPipeline.q to begin using the pipeline yaml file format for bams and reference.
More changes to come as this code gets tested out in the fullCallingPipeline.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4329 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-22 19:47:49 +00:00
kshakir 1c94a73434 Fixed header generation when lines contain spaces.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4304 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-17 17:14:59 +00:00
kshakir 6dcbf04378 Added gsa-firehose2
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4284 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-15 02:24:04 +00:00
depristo 8c16babe91 continuing improvements to the queueStatus helper script
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4261 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-12 14:04:59 +00:00
depristo 80e31df40d Useful script to see the status of gsa computing resources. Crontab'd and will be arriving as email at 8 am
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3965 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-07 12:36:28 +00:00
delangel be75b087ec a) Add input argument (-ncrate) to BeagleOutputToVCFWalker. If the genotype posterior error probability is higher than this threshold, we declare No-call at this genotype.
b) Add "OG" annotation to genotypes. If Beagle changes genotypes, this annotation gets the original genotype call, to ease performance comparisons. If not, this annotation gets an empty value.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3723 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-06 18:33:28 +00:00
kshakir 30cf78fdc0 Refactoring for a first version of scatter gather api with basic shell script implementations.
Modified build script so that queue is cleaned during "ant clean".



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3611 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-22 18:39:20 +00:00
depristo a08c68362e Renaming error to getNegLog10PError(); added Cached clearing method to GL; SSG now has a CallResult that counts calls; No more Adding class to System.out, now to logger.info; First major testing piece (and general approach too) to unit testing of a walker -- SingleSampleGenotyper now knows how many calls to make on a particular 1mb region on NA12878 for each call type and counts the number of calls *AND* compares the geli MD5 sum to the expected one!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1530 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-04 12:39:06 +00:00
depristo 5289230eb8 Version 0.2.1 (released) of the TableRecalibrator
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1108 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-25 22:50:55 +00:00
aaron 056fcdc31c Adding a script for diff'ing the output of samtools and the GATK for the whole genome and each individual chromosome.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@882 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-02 21:19:39 +00:00
hanna 2ee2623926 Move non-java code out of playground.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@154 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-23 19:31:38 +00:00