47 Commits (f22ab033f6de11053a33bb7bbfa2e2e856d5ee57)

Author SHA1 Message Date
Phillip Dexheimer 296bcc7fb1 Changed name of jobs submitted to cluster job runners
-- Added 'jobRunnerJobName' definition to QFunction, defaults to value of shortDescription
-- Edited Lsf and Drmaa JobRunners to use this string instead of description for naming jobs in the scheduler

Signed-off-by: Joel Thibault <thibault@broadinstitute.org>
2013-11-12 14:34:56 -05:00
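
The defaulting rule described in the commit above can be sketched as follows. This is a hypothetical Python model, not the Queue source (which is Scala): the class QFunctionSketch and its constructor are assumptions, and only the names jobRunnerJobName and shortDescription come from the commit message.

```python
class QFunctionSketch:
    """Hypothetical stand-in for a Queue QFunction."""

    def __init__(self, short_description, job_runner_job_name=None):
        self.short_description = short_description
        self._job_runner_job_name = job_runner_job_name

    @property
    def job_runner_job_name(self):
        # Fall back to the short description when no explicit name was set.
        if self._job_runner_job_name is None:
            return self.short_description
        return self._job_runner_job_name
```

A job runner would then read job_runner_job_name when naming the job in the scheduler, so scripts that never set it keep a sensible name.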
Louis Bergelson c05208ecec Resolving warnings
--specifying exception types in cases where none was already specified
----mostly changed to catch Exception instead of Throwable
----EmailMessage has a point where it should only be expecting a RetryException but was catching everything

--changing build.xml so that it prints scala feature warning details

--added necessary imports needed to remove feature warnings

--updating a newly deprecated enum declaration to match the new syntax
2013-09-23 12:42:22 -04:00
David Roazen c3d59d890d Update licenses for new PbsEngine* classes 2013-07-01 15:50:20 -04:00
Francesco acf90ca027 corrected number of arguments passed to PbsEngineJobRunner when requesting multiple cores
Signed-off-by: Khalid Shakir <kshakir@broadinstitute.org>
2013-07-01 15:08:15 -04:00
Francesco 948b2fca20 added PbsEngine plugin into engine folders, to be called in Queue with -jobRunner PbsEngine; the plugin is written modifying the existing GridEngine plugin, used as a template
Signed-off-by: Khalid Shakir <kshakir@broadinstitute.org>
2013-07-01 15:08:14 -04:00
Mauricio Carneiro e5913e50b2 Updating licenses for all scala files
GSATDG-5
2013-01-10 17:46:10 -05:00
Mauricio Carneiro 6d22f4f737 Bringing latest performance updates from the GATK to CMI 2012-12-05 21:40:03 -05:00
Joel Thibault 97d29f203e Add walltime changes to LSF
- Check whether the specified attribute is available
- Add pipeline test (disabled due to missing attribute)
2012-11-29 15:23:37 -05:00
Johan Dahlberg daf6269b65 Setting the walltime
Signed-off-by: Joel Thibault <thibault@broadinstitute.org>
2012-11-29 15:23:36 -05:00
kshakir a6c1fcd151 Removed default use of @Output syntax.
If compile completes for QScripts, sending runtime errors during execute.
2012-11-29 13:40:36 -05:00
kshakir 2ec3852acd Scala classes were only returning direct subclasses (confirmed when inspected in debugger) so changed PluginManager to allow specifying the explicit subclass.
Removed some generics from PluginManager for now until able to figure out syntax for requesting explicit subclass.
QStatusMessenger uses a slightly more primitive Map[String, Seq[RemoteFile]] instead of Map[ArgumentSource, Seq[RemoteFile]].
Added a QCommandPlugin.initScript utility method for handling specialized script types.
2012-11-04 23:55:12 -05:00
kshakir 8dfa24df7b Sending a version of per job status messages.
In addition to outputs, inputs are passed to QStatusMessenger.done()
CloneFunction.cloneIndex has a new CloneFunction.cloneCount companion useful for display purposes.
2012-10-23 15:55:47 -04:00
kshakir f93b279151 Moved the class field caching from QScript to a ClassFieldCache utility.
Using ClassFieldCache to pull values from QScript for passing to done() method of QStatusMessenger.
2012-10-16 18:49:31 -04:00
kshakir c4ee31075c Fixed package error and a few deprecated scala warnings. 2012-10-15 15:29:40 -04:00
kshakir 213cc00abe Refactored argument matching to support other plugins in addition to file lists.
Added plugin support for sending Queue status messages.
Argument parsing can store subclasses of java.io.File, for example RemoteFile.
2012-10-15 15:10:45 -04:00
Mark DePristo 9eec33ec3b Complete GSA-497: Let Queue write out runInfo on the fly, after each job group finishes running
-- Queue will now incrementally write out its jobReport.txt file whenever jobs finish running (FAIL or DONE)
-- This makes it far easier to track what's going on, or to incrementally analyze performance results coming out of Queue
-- Generally cleaned up the QJobsReporting code, creating a new clean class QJobsReporter that holds all of the information on what to log and where to put it, which was previously scattered in QCommandLine and QJobReport
2012-08-21 14:44:18 -04:00
Mark DePristo 67ebd65512 Bugfix for potential SEGFAULT with JNA getting execution hosts for LSF with multiple hosts 2012-08-17 11:49:01 -04:00
Khalid Shakir 22b4466cf5 Added setupRetry() to modify jobs when Queue is run with '-retry' and jobs are about to restart after an error.
Implemented a mixin called "RetryMemoryLimit" which will by default double the memory.
GridEngine memory request parameter can be selected on the command line via '-resMemReqParam mem_free' or '-resMemReqParam virtual_free'.
Java optimizations now enabled by default:
- Only 4 GC threads instead of each job using java's default O(number of cores) GC threads. Previously on a machine with N cores if you have N jobs running and java allocates N GC threads by default, then the machines are using up to N^2 threads if all jobs are in heavy GC (thanks elauzier).
- Exit if the JVM spends more than 50% of its time in GC (thanks ktibbett).
- Exit if GC reclaims less than 10% of max heap (thanks ktibbett).
Added a -noGCOpt command line option to disable new java optimizations.
2012-08-13 15:43:05 -04:00
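
The thread-count argument and the retry behavior in the commit above can be checked with a little arithmetic. This is a rough sketch with hypothetical function names. It assumes an uncapped JVM starts exactly one GC thread per core (the commit only says O(number of cores)) and that the memory limit doubles once per retry.

```python
def worst_case_gc_threads(cores, jobs, gc_threads_per_job=None):
    """Upper bound on GC threads across all jobs on one machine."""
    # Assumption: with no cap, each JVM starts one GC thread per core.
    per_job = cores if gc_threads_per_job is None else gc_threads_per_job
    return jobs * per_job


def memory_after_retries(memory_limit_gb, retries, factor=2.0):
    """Memory limit after some retries, multiplied by factor each time."""
    return memory_limit_gb * factor ** retries
```

Under these assumptions, 16 jobs on a 16-core machine can reach 256 GC threads without a cap and 64 with the cap of 4. A 2 GB limit grows to 8 GB after two retries.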
Khalid Shakir 746a5e95f3 Refactored parsing of Rod/IntervalBinding. Queue S/G now uses all interval arguments passed to CommandLineGATK QFunctions including support for BED/tribble types, XL, ISR, and padding.
Updated HSP to use new padding arguments instead of flank intervals file, plus latest QC evals.
IntervalUtils return unmodifiable lists so that utilities don't mutate the collections.
Added a JavaCommandLineFunction.javaGCThreads option to test reducing java's automatic GC thread allocation based on num cpus.
Added comma to list of characters to convert to underscores in GridEngine job names so that GE JSV doesn't choke on the -N values.
JobRunInfo handles the null done times when jobs crash with strange errors.
2012-06-27 01:15:22 -04:00
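
The job-name cleanup mentioned in the commit above might look roughly like this. The character set is an assumption: the commit only confirms that the comma was added to an existing list, so the other characters here are illustrative, and sanitize_job_name is a hypothetical name.

```python
import re

# Assumed set of characters to replace; only the comma is confirmed above.
UNSAFE_CHARS = re.compile(r"[,\s/:@\\*?]")


def sanitize_job_name(name):
    """Replace each character the scheduler may reject with an underscore."""
    return UNSAFE_CHARS.sub("_", name)
```

Replacing rather than dropping characters keeps the name the same length and still recognizable in scheduler listings.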
Khalid Shakir a9a6516527 Merged bug fix from Stable into Unstable 2012-01-10 16:16:10 -05:00
Khalid Shakir ef50e77ee2 When running Queue jobs locally, merge the stderr to the stdout log if the error file is NOT specified.
Updated VE strats in the HSP for plotting Ka/Ks by AC.
2012-01-10 16:10:25 -05:00
Khalid Shakir 5793625592 No more "Q-<pid>@<host>". Generated log file names now use the first output + ".out" (ex. my.vcf.out) or the name of the first QScript plus the order the function was added (ex. MyScript-1.out). The same function added twice with the same outputs will now have the same default logs, meaning the 2nd instance of the function won't be added to the graph twice.
QScript accessor to QSettings to specify a default runName and other default function settings.
Because log files are no longer pseudo-random, their presence can be used to tell if a job without other file outputs is "done". For now still using the log's .done file in addition to original outputs.
Gathered log files concatenate all log files together into the stdout.
InProcessFunctions now have PrintStreams for stdout and stderr.
Updated ivy to use commons-io 2.1 for copying logs to the stdout PrintStream. Removed snakeyaml.
During graph tracking of outputs the Index files, and now BAM MD5s, are tracked with the gathering of the original file.
In Queue generated wrappers for the GATK the Index and MD5s used for tracking are switched to private scope.
Added more detailed output when running with -l DEBUG.
Simplified graphviz visualization for additional debugging.
Switched usage of the scala class 'List' to the trait 'Seq' (think java.util.ArrayList vs. using the interface java.util.List)
Minor cleanup to build including sending ant gsalib to R's default libloc.
2012-01-08 12:11:55 -05:00
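
The log naming rule from the commit above can be sketched as below. The function name and parameters are hypothetical; the two naming cases and the example file names are taken from the commit message.

```python
def default_log_name(outputs, script_name, add_order):
    """Pick a deterministic log file name for a function."""
    if outputs:
        # First output plus ".out", e.g. my.vcf -> my.vcf.out
        return outputs[0] + ".out"
    # No outputs: first QScript name plus the order the function was added.
    return f"{script_name}-{add_order}.out"
```

Because the name depends only on the outputs (or on the script name and order), adding the same function twice with the same outputs yields the same log name. That is what lets the second instance be recognized as a duplicate.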
Mark DePristo 0cc5c3d799 General improvements to Queue
-- Support for collecting resources info from DRMAA runners
-- Disabled the non-standard mem_free argument so that we can actually use our own SGE cluster gsa4
-- NCoresRequest is a testing queue script for this.
-- Added two command line arguments:
  -- multiCoreJerk: don't request multiple cores for jobs with nt > 1.  This was the old behavior but it's really not the best way to run parallel jobs.  Now with queue if you run nt = 4 the system requests 4 cores on your host.  If this flag is thrown, though, it will only request 1 and you'll just use 4, like a jerk
  -- job_parallel_env: parallel environment name used with SGE to request multicore jobs.  Equivalent to -pe job_parallel_env NT for NT > 1 jobs
2011-12-20 14:05:09 -05:00
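
How the two arguments in the commit above interact can be sketched as follows. The function is hypothetical and models only the decision described in the message; how Queue actually assembles native scheduler arguments is not shown in this log.

```python
def parallel_env_args(nt, job_parallel_env, multi_core_jerk=False):
    """Return the SGE arguments for requesting cores, or an empty list."""
    if nt > 1 and job_parallel_env and not multi_core_jerk:
        # Equivalent to: -pe <job_parallel_env> <NT>
        return ["-pe", job_parallel_env, str(nt)]
    # Single-threaded job, no environment configured, or multiCoreJerk set:
    # request nothing extra, so the scheduler reserves one core.
    return []
```

For example, parallel_env_args(4, "smp_pe") gives the arguments for a 4-core request, where smp_pe is a made-up environment name. Setting multi_core_jerk=True gives the old behavior.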
Khalid Shakir e25d40882a Swapping Thread.sleep(0) with Object.wait(0) caused Queue to lock up. Thanks to rpoplin for pointing it out. 2011-10-28 15:51:03 -04:00
Khalid Shakir b80d407dc7 No more hunting down R "resources". As a tradeoff Rscript cannot be specified on the commandline and will be found in the environment path.
Other minor cleanup.
2011-10-27 14:17:07 -04:00
Khalid Shakir fac9932938 Embedding gsalib source and queueJobReport R scripts in the dist and package jars.
Moved gsalib and queueJobReport.R to embeddable namespaced locations.
Updated packager dependencies/dir to add an @includes which filters the embedded fileset.
RScriptExecutor can now JIT compile the gsalib.
RScriptExecutor uses ProcessController and sends the Rscript output to java's stdout when run under -l DEBUG.
Refactored ProcessController and IOUtils from Queue to Sting Utils.
Added more unit tests to ProcessController along with a utility class to hard stop OutputStreams at a specified byte count.
Replaced uses of some IOUtils with Apache Commons IO.
ShellJobRunner refactored to use direct ProcessController and now kills jobs on shutdown.
Better QGraph responsiveness on shutdown by using Object.wait() instead of Thread.sleep().
2011-10-24 15:58:34 -04:00
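
The byte-limited stream utility mentioned in the commit above could work along these lines. This sketch is not the Sting implementation. It assumes that bytes past the limit are silently discarded, which the commit does not specify, and the class name is hypothetical.

```python
import io


class HardLimitWriter:
    """Wrap a binary sink and stop passing data after max_bytes."""

    def __init__(self, sink, max_bytes):
        self.sink = sink
        self.remaining = max_bytes

    def write(self, data):
        # Forward only as much as still fits under the limit.
        kept = data[: self.remaining]
        self.sink.write(kept)
        self.remaining -= len(kept)
        # Report the full length so the caller does not retry the rest.
        return len(data)
```

Wrapping an io.BytesIO() with a limit of 5 and writing b"hello world" leaves only b"hello" in the buffer. Output captured from a chatty process is bounded in the same way.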
Khalid Shakir 510d5e7730 Merged bug fix from Stable into Unstable 2011-09-09 01:34:55 -04:00
Khalid Shakir 367bbee25a Fixed typo when printing the contents or last N lines of a file. Thanks to larryns. 2011-09-09 01:33:25 -04:00
Mark DePristo 61633c95a8 Default jobreport is now jobPrefix, so you see logs like Q-2508.jobreport.txt 2011-08-28 19:19:45 -04:00
Mark DePristo b38de1fa35 Now captures the exechost in the job report
-- Works for in process, shell, and LSF runners
-- Cleanup of debugging output
2011-08-28 12:05:56 -04:00
Mark DePristo 0cb1605df0 Clean documentation for JobRunInfo 2011-08-26 09:22:58 -04:00
Mark DePristo 415d5d5301 LSF long times are in seconds, convert to milliseconds to meet standard 2011-08-26 09:18:28 -04:00
Mark DePristo e01273ca7c Queue now writes out queueJobReport.pdf
-- General purpose RScript executor in java (please use when invoking RScripts)
-- Removed groupName.  This is now analysisName
-- Explicitly added capability to enable/disable individual QFunction
2011-08-25 16:57:11 -04:00
Mark DePristo 0f4be2c4a4 Argument to disable queueJobReport entirely
-- Minor improvements to RodPerformanceGoals
2011-08-25 13:32:03 -04:00
Mark DePristo d65faf509c Default output name for Queue JobReport is queue_jobreport.gatkreport.txt 2011-08-25 13:15:20 -04:00
Mark DePristo a7d6946b22 Refactored QJobReport and QFunction, which is now automatically tracked
-- All QFunctions, including sg ones, are tracked
-- Removed memory information
2011-08-25 13:13:55 -04:00
Mark DePristo 08fb21f127 Removing hostname 2011-08-24 16:45:50 -04:00
Mark DePristo 06e30a81d1 Fixes throughout for getting job information
-- no more hostname -- it's just not going to be important
2011-08-24 15:30:09 -04:00
Mark DePristo 4918519a58 No more NPE in getRuntime() when you Ctrl-C out of Queue 2011-08-24 14:14:01 -04:00
Mark DePristo 16d8360592 QJobReport is now the official capability name 2011-08-24 13:59:14 -04:00
Mark DePristo b8bc03bb42 JobRunInfo improvements
-- dry-run now adds some info, for testing
-- InProcessRunner adds some, but not all, of the information we want
2011-08-23 17:11:22 -04:00
Mark DePristo 31ec6e316c First implementation of JobRunInfo
-- onExecutionDone(Map(QFunction, JobRunInfo)) is the new signature, so that you can walk over your jobs and inspect their success/failure and runtime characteristics
2011-08-23 16:51:54 -04:00
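
Walking over the jobs passed to a callback like the one in the commit above might look like this. The fields on JobRunInfoSketch are assumptions for illustration. The log shows only that JobRunInfo carries success/failure and runtime information, and the real callback is a Scala method keyed by QFunction.

```python
from dataclasses import dataclass


@dataclass
class JobRunInfoSketch:
    """Hypothetical subset of what a JobRunInfo might carry."""
    status: str       # assumed values: "DONE" or "FAILED"
    runtime_ms: int


def on_execution_done(jobs):
    """Summarize a mapping of job name -> JobRunInfoSketch."""
    failed = [name for name, info in jobs.items() if info.status != "DONE"]
    total_ms = sum(info.runtime_ms for info in jobs.values())
    return failed, total_ms
```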
Khalid Shakir c4c90c8826 Updates to JobRunners from the Queue developer community and from running the WholeGenomePipeline:
- Ability to pass a different resident memory reservation and limits. Useful for large pileups of low pass genome data that sometimes need high -Xmx6g but usually don't exceed 2-3g in actual heap size.
- Fixed jobPriority to work for all job runners. Now must be an integer between 0 and 100, even for GridEngine, and will be mapped to the correct values.
- Passing parallel environment and job resource requests to LSF and GridEngine. Useful for passing tokens like iodine_io=1 and -pe pe_slots 8
- Refactored GridEngine JobRunner to also provide basic support for other job dispatchers with DRMAA implementations such as Torque/PBS. Should work for basic running but advanced users must pass their own jobNativeArgs from the command line or in customized QScripts until someone maps properties like jobQueue, jobPriority, residentRequest, etc. into a Torque/PBS/etc. dispatcher.
2011-08-22 15:13:27 -04:00
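
One way to map the common 0 to 100 jobPriority onto a scheduler-specific range is a linear rescale. This is a sketch only: the commit does not say how Queue performs the mapping, and the target range in the usage note (-1023 to 1024) is an assumption for illustration.

```python
def map_priority(priority, lo, hi):
    """Linearly map an integer priority in 0..100 onto lo..hi."""
    if not isinstance(priority, int) or not 0 <= priority <= 100:
        raise ValueError("jobPriority must be an integer between 0 and 100")
    # Integer arithmetic keeps the result exact at both ends of the range.
    return lo + (hi - lo) * priority // 100
```

With the assumed range, map_priority(0, -1023, 1024) gives the lowest value and map_priority(100, -1023, 1024) the highest. QScripts can then use one scale regardless of the job runner.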
Khalid Shakir eaa2f16d83 When a job finishes successfully in the ShellJobRunner, mark it as DONE instead of FAILED. 2011-08-06 10:42:04 -04:00
Khalid Shakir 59eb1f4663 Memory limits changed from Int to Double.
Updated LSF calls to read memory units from config along with tweaks to select hosts.
Moved some common code from GridEngine and LSF to super classes.
2011-07-21 22:57:18 -04:00
Khalid Shakir e93052a51e When generating the QGraph, don't regenerate if there aren't scatter/gather jobs.
Fixed a display issue with the number of milliseconds that Queue has tried to contact LSF.
2011-07-11 19:17:58 -04:00
David Roazen 3c9497788e Reorganized the codebase beneath top-level public and private directories,
removing the playground and oneoffprojects directories in the process. Updated
build.xml accordingly.
2011-06-28 06:55:19 -04:00