Commit Graph

18 Commits (302e8f023943c012a5702b6a21f1842f953614e2)

Author SHA1 Message Date
kshakir 302e8f0239 Fixed bug where the command directory was not being set to an absolute path, leading LSF to write some .done files to /tmp.
No longer using the command directory for temporary .done files, and instead using the user specified temporary directory.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4678 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-15 17:59:39 +00:00
kshakir 673fa841a4 Updated PluginManager so that during testing Queue can dynamically compile and load separately multiple class directories into the same class loader.
Removed obsolete usages of PackageUtils with updated PluginManager.
Ported Queue interval utilities written in scala over to Sting's java IntervalUtils.
Added a very basic intergration test to ensure that the fullCallingPipeline.q compiles.
Added options to specify the temporary directories without having to use -Djava.io.tmpdir (useful during the above integration test).
While adding tempDir added options to specify the run directory from the command line, for example "-runDir v1".
Upgraded to scala 2.8.1 and updated calls to deprecated functions.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4661 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-12 20:14:28 +00:00
kshakir f35d1aa43f Moving all file cleanup to IOUtils for easier debugging.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4646 348d0f76-0448-11de-a6fe-93d51630548a
2010-11-10 21:00:58 +00:00
kshakir d768c6558d Now that the user is required to set the java temp directory, it is safer for the LsfJobRunner to write to the java temp directory instead of the command directory.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4593 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-28 15:00:21 +00:00
kshakir e9c6f681a4 Instead of the pipeline's cleaner only writing BAMs with the target intervals, now pulling the list of contigs from the target intervals and outputing reads in those contigs.
Added a brute force -retry <count> option to Queue for transient errors.
Waiting up to 2 minutes for the LSF logs to appear before trying to display the errors from the logs.
Updates to the local job runner error logging when a job fails.
Refactored QGraph's settings as duplicate code was getting out of control.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4563 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-22 22:22:30 +00:00
kshakir b954a5a4d5 - After removing special code for intervals, instead of being of type File they are generated as List[File]. Changed previous checkin that was appending to this list and instead assigning a singleton list.
- More cleanup including removing the temporary classes and intermediate error files.  Quieting any errors using Apache Commons IO 2.0.
- Counting the contigs during the QScript generation instead of the end user having to pass a separate contig interval list.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4539 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-21 06:37:28 +00:00
kshakir 7157cb9090 While bkill'ing on the shutdown thread Queue will no longer try to submit more jobs on the original thread.
Updated pipeline output structure to current recommendations by Corin.
Directories are now automatically before the function runs.
Fixed several bugs with scatter gather binding when the script author needs to change the directories.
Fixed bug with tracking of log files for CloneFunctions.
More error handling and logging of exceptions (good test environment while LSF was down this early AM!)
Removed cleanup utility for scatter gather.  SG Output structure has changed significantly.  Will need to discuss and find a better approach for Queue programatically deleting files.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4504 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-15 17:01:36 +00:00
kshakir 5ee12875fb Emergency fix for Ryan:
- Catching errors when LSF fails and retrying.
- When LSF retries fail, catching the error, marking the job as failed, and no longer bkilling everything by exiting Queue.
- Caching function fields by class instead of each instance of a function saving a list of its fields.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4490 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-13 22:22:01 +00:00
kshakir db47230dd9 Wrapping ScatterGatherableFunctions with a facade instead of using slower clone library. Will require keeping Clone's facade code in sync with CommandLineFunction but runs *much* faster.
Shell invoking scripts so that even really long shell scripts make it through LSF.
Using the truncated (up to 1000 characters) of the command line for the job name for use with bjobs.
Switched the default from re-running everything to re-running only files that need to be regenerated.  --skip_up_to_date replaced with --start_clean for those who want to regenerate everything.
Updated logging to let users know when the scatter gather generator is running, which still takes a while but is orders of magnatudes faster for large lists of functions.  (40s for a 100 function graph exploding to a 2500 function graph)


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4448 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-07 01:19:18 +00:00
kshakir ca5db821ce Added the ability to Queue to run scala functions inside the JVM. NOTE: Extend from InProcessFunction instead of CommandLineFunction to use this functionality.
Queue now submits new LSF jobs only after previous functions have completed successfully.
When the Queue process is shutdown (ex: via Control-C) sends a bkill command for any running jobs.
Ported commands like creating directories and scatter/gather interval list to scala functions.
Updates to LSF status tracking by porting the python to internally generated bash scripts.
Temporarily disabled job name submission to LSF.  Plus side is that the full command is now available in "bjobs -w".  TODO: Put back jobName passing to LSF based on an option?
Changed BaseTest to allow scala to access paths to references.
Changed the extension generator to default the analysis name to the walker "name".

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4442 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-06 18:29:56 +00:00
chartl b24172c80f Queue now utilizes .[file].done to allow skipping of previous jobs, if they have been completed. This is, unfortunately, reliant on a python script to do the post-execution touching of .done files.
That is to say, proper resumability is live (but not extensively tested)



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4312 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-20 00:16:53 +00:00
kshakir 4eff69d95e Back to using the LSF job name during dry runs since when the real job ids weren't available '-w(null)' wasn't too informative.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4107 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-25 15:18:02 +00:00
kshakir 51678d48e4 Using job ids instead of job names for LSF dependency tracking.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4071 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-19 23:42:06 +00:00
kshakir f39dce1082 Exposed CommandLineFunction defaults to the Queue.jar command line (see -help).
Added ability to skip up-to-date jobs where the outputs are older than the inputs.
Changed -T CountDuplicates --quiet to --quietLocus so that Queue GATK extensions can use both short and full argument names.
Short names can be used to set values on Queue GATK extensions, for example: vf.XL :+= myFile
Moved Hidden from the GATK to StingUtils.
Updated ivy from 2.0.0 to 2.2.0-rc1 to fix sha1 issue: http://bit.ly/aX72w7
Added Queue to javadoc and testing build targets.
Added first Queue unit test.
Another pass at avoiding cycles in the DAG thanks to all function I/O being files.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4017 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-11 21:58:26 +00:00
kshakir 4f51a02dea Changed logging level to default at INFO instead of WARN.
Changes to StingUtils command line for use in Queue, replacing Queue's use of property files.
Updates to walkers used in existing QScripts to add @Input/@Output.
RMD used in @Required/@Allows now has a new default equal to "any" type.
New QueueGATKExtensions.jar generator for auto wrapping walkers as Queue CommandLineFunctions.
Added hooks to modify the functions that perform the Scattering and Gathering (setting their jar files, other arguments, etc.)
Removed dependency on BroadCore by porting LSF job submitter to scala.
Ivy now pulls down module dependencies from maven.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3984 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-09 16:42:48 +00:00
kshakir dce2c17404 Added "-bsubWait" where Queue waits for all the jobs to exit before exiting.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3661 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-28 19:52:17 +00:00
kshakir 30cf78fdc0 Refactoring for a first version of scatter gather api with basic shell script implementations.
Modified build script so that queue is cleaned during "ant clean".



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3611 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-22 18:39:20 +00:00
kshakir 32fc221ffe Replaced pattern matched pipeline spec with annotated objects.
Old version is no longer available.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3558 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-15 04:43:46 +00:00