282 Commits (ff180a8e02eaeffbc42226c789c6c6946affae68)

Author SHA1 Message Date
Eric Banks 843384e435 Rename hg19 files in bundle to b37 since that's what they are 2012-11-14 11:47:09 -05:00
Mauricio Carneiro e35fd1c717 Merging CMI-0.5.0 and GATK-2.2 together. 2012-11-14 10:42:03 -05:00
kshakir 6d59dd3455 Scala classes were only returning direct subclasses (confirmed when inspected in debugger) so changed PluginManager to allow specifying the explicit subclass.
Removed some generics from PluginManager for now until able to figure out syntax for requesting explicit subclass.
QStatusMessenger uses a slightly more primitive Map[String, Seq[RemoteFile]] instead of Map[ArgumentSource, Seq[RemoteFile]].
Added a QCommandPlugin.initScript utility method for handling specialized script types.
2012-11-14 10:33:20 -05:00
David Roazen 73157ae3d3 Allow each pipeline test the max of 10 hours to run
The runtime of these tests is extremely variable -- sometimes they will complete almost instantly,
other times they will wait in an LSF queue for 5-10+ hours. Minimize timeout errors by setting the
timeout for these tests to the maximum of 10 hours.
2012-11-02 12:40:56 -04:00
Guillermo del Angel 51a9ce28e1 Merge remote-tracking branch 'unstable/master' into develop 2012-10-31 10:29:48 -04:00
Eric Banks eccb76c304 Only run UG in the bundle for chr20 2012-10-30 15:09:46 -04:00
Eric Banks 8a402024c2 Updating bundle script to handle new naming convention of CEU trio best practices callset 2012-10-30 09:11:56 -04:00
Ryan Poplin 5ee2feb2a3 updating pipeline test md5s 2012-10-29 18:53:27 -04:00
Ami Levy Moonshine dde3060bb8 add the CEUtrio best practices results (UG + PBT) to the bundle 2012-10-25 15:36:17 -04:00
kshakir 8dfa24df7b Sending a version of per job status messages.
In addition to outputs, inputs are passed to QStatusMessenger.done()
CloneFunction.cloneIndex has a new CloneFunction.cloneCount companion useful for display purposes.
2012-10-23 15:55:47 -04:00
Guillermo del Angel 5fac5bf12e Fixed issues with Queue packaging of Picard QC classes: separate jars are needed from Picard. User needs to specify the -picardBase argument to point to input path for jars.
Also, reenable joint cleaning as now it works.
DEV-125 #resolve
DEV-90 #resolve
2012-10-23 14:08:31 -04:00
kshakir 0cce1ae8b2 When gathering VCFs, use CombineVariants from the current classpath rather than from the GATK used to run the command. This was a concern for external modules that bundled the engine but not CombineVariants. 2012-10-23 12:44:06 -04:00
Mauricio Carneiro c210b7cde4 Merge GATK repo into CMI-GATK
Bringing in the following relevant changes:
	* Fixes the indel realigner N-Way out null pointer exception DEV-10
	* Optimizations to ReduceReads that bring the run time to 1/3rd.

Conflicts:
	protected/java/src/org/broadinstitute/sting/gatk/walkers/compression/reducereads/SlidingWindow.java

DEV-10 #resolve #time 2m
2012-10-23 10:59:11 -04:00
Guillermo del Angel 7860ff7981 a) Resolve [#DEV-56] - test data with indels in new directory private/testdata/CMITestData/. b) Skeleton (not yet working) of fastq-BAM unit test, c) misc bug fixes for QC functions to work (not done yet) 2012-10-22 19:59:15 -04:00
Khalid Shakir fd59e7d5f6 Better error message when generic types are erased from scala collections. 2012-10-22 16:27:31 -04:00
Khalid Shakir 2ef456d51a Added explicit @ClassType annotations to @Argument for Option[Int] or Option[Double] since scala seems to change the reflected type to Option[Object] on some systems.
Changed ReflectionUtils.getGenericTypes' order of looking for @ClassType since the primitive generic wasn't completely erased, only changed to Object which is incorrect.
More fixes to @Arguments labeled as java.io.File via incorrect @Input annotation.
Put in a default undocumented implementation of @Argument doc() to match the one added to @Input.
2012-10-19 13:20:29 -04:00
Guillermo del Angel 4f768e2f58 redo QC picard parts 2012-10-19 12:25:46 -04:00
Khalid Shakir 403654d40a Fixed null checks in ArgumentTypeDescriptor due to ArgumentMatchValue updates.
Fixed @Arguments such as scatter count that were labeled as java.io.File via incorrect @Input annotation.
2012-10-18 16:57:15 -04:00
kshakir 55ac4ba70b Added another utility that can convert to RemoteFiles.
QScripts will now generate remote versions of files if the caller has not already passed in remote versions (or the QScript replaces the passed in remote references... not good)
Instead of having yet another plugin, combined QStatusMessenger and RemoteFileConverter under general QCommandPlugin trait.
2012-10-17 20:00:03 -04:00
kshakir 0196dbeaca Added more logging to push/pull of RemoteFiles. 2012-10-17 09:52:17 -04:00
kshakir f93b279151 Moved the class field caching from QScript to a ClassFieldCache utility.
Using ClassFieldCache to pull values from QScript for passing to done() method of QStatusMessenger.
2012-10-16 18:49:31 -04:00
kshakir c4ee31075c Fixed package error and a few deprecated scala warnings. 2012-10-15 15:29:40 -04:00
kshakir 213cc00abe Refactored argument matching to support other plugins in addition to file lists.
Added plugin support for sending Queue status messages.
Argument parsing can store subclasses of java.io.File, for example RemoteFile.
2012-10-15 15:10:45 -04:00
Kristian Cibulskis dad7ca281e upgraded mutation caller with VCF output
raw indel calls (non-filtered, non-VCF)
2012-10-15 13:49:08 -04:00
Guillermo del Angel 22b79fb4dd Resolve [DEV-7]: add single-sample VCF calling at end of FASTQ-BAM pipeline. Initial steps of [DEV-4]: queue extensions for Picard QC metrics 2012-10-15 13:49:08 -04:00
Kristian Cibulskis 658f355171 initial cancer pipeline with mutations and partial indel support 2012-10-15 13:49:07 -04:00
Mauricio Carneiro 322ea1262c First implementation of a generic 'bundled' Data Processing Pipeline for germline and cancer.
not ready for prime time yet!
2012-10-15 13:49:06 -04:00
Mauricio Carneiro f1fb51b222 Reverting the DPP to the original version, going to create a new simplified version for CMI in private. 2012-10-15 13:49:06 -04:00
Mauricio Carneiro 429c96e723 Generic input file name recognition (still need to implement support for FastQ, but it can now at least accept it) 2012-10-15 13:49:06 -04:00
Khalid Shakir f66284658d RetryMemoryLimit now works with Scatter/Gather. 2012-10-09 21:51:03 -04:00
Johan Dahlberg e9b9e2318c Fixed SortSam bug, for .done file
The *.bai.done file for the .bai file was written in the run directory instead of in the specified output directory.
Changing getName() to getAbsolutePath() fixes this.

Signed-off-by: Joel Thibault <thibault@broadinstitute.org>
2012-10-09 16:25:18 -04:00
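The difference between building the .done marker from a bare file name and from an absolute path can be sketched as follows. This is a Python analogue of the Java `getName()` versus `getAbsolutePath()` distinction, not the Queue code itself. The function names and the example path are hypothetical, and the marker naming (output path plus `.done`) is an assumption based on the `*.bai.done` wording in the commit message.

```python
from pathlib import Path


def done_marker_from_name(output: Path) -> Path:
    """Analogue of the buggy variant: only the bare file name is kept.

    The result is a relative path, so it resolves against the current
    working directory (the run directory) instead of the output directory.
    """
    return Path(output.name + ".done")


def done_marker_from_absolute_path(output: Path) -> Path:
    """Analogue of the fixed variant: the full path is kept.

    The marker ends up next to the output file, wherever the job runs.
    """
    return Path(str(output.absolute()) + ".done")


if __name__ == "__main__":
    # Hypothetical output location, for illustration only (POSIX-style path).
    bai = Path("/data/out/sample.bai")
    print(done_marker_from_name(bai))
    print(done_marker_from_absolute_path(bai))
```

With a relative marker path the file lands wherever the process happens to be running, which matches the symptom described in the commit message.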
Mauricio Carneiro 9a8f53e76c Probably the GATK's most seen typo in the world 2012-10-02 13:34:37 -04:00
David Roazen 3f44b3e019 Update DataProcessingPipelineTest MD5s 2012-09-24 15:38:07 -04:00
Eric Banks 277ba94c7b Update from dbsnp135 to dbsnp137. 2012-08-31 14:06:29 -04:00
Eric Banks 5ea7cd6dcc Updating resource bundle: no reason to include both genotype and sites files for Omni and HM3, sites are enough. Also, don't include duplicate entry for the Mills indels. 2012-08-31 14:01:54 -04:00
Khalid Shakir 2d1ea7124b One less Queue command line requirement: -tempDir now defaults to .queue/tmp.
Also moved queueScatterGather to .queue/scatterGather.
2012-08-27 12:04:50 -04:00
Mark DePristo 9eec33ec3b Complete GSA-497: Let Queue write out runInfo on the fly, after each job group finishes running
-- Queue will now incrementally write out its jobReport.txt file whenever jobs finish running (FAIL or DONE)
-- This makes it far easier to track what's going on, or to incrementally analyze performance results coming out of Queue
-- Generally cleaned up the QJobsReporting code, creating a new clean class QJobsReporter that holds all of the information on what to log and where to put it, which was previously scattered in QCommandLine and QJobReport
2012-08-21 14:44:18 -04:00
Khalid Shakir 3514fb6e66 Changed the default memory limit from none to 2GB upon suggestions from delangel, carneiro, and depristo. 2012-08-20 21:41:13 -04:00
Mark DePristo 67ebd65512 Bugfix for potential SEGFAULT with JNA getting execution hosts for LSF with multiple hosts 2012-08-17 11:49:01 -04:00
Khalid Shakir 22b4466cf5 Added setupRetry() to modify jobs when Queue is run with '-retry' and jobs are about to restart after an error.
Implemented a mixin called "RetryMemoryLimit" which will by default double the memory.
GridEngine memory request parameter can be selected on the command line via '-resMemReqParam mem_free' or '-resMemReqParam virtual_free'.
Java optimizations now enabled by default:
- Only 4 GC threads instead of each job using java's default O(number of cores) GC threads. Previously on a machine with N cores if you have N jobs running and java allocates N GC threads by default, then the machines are using up to N^2 threads if all jobs are in heavy GC (thanks elauzier).
- Exit if the JVM spends more than 50% of its time in GC (thanks ktibbett).
- Exit if GC reclaims less than 10% of max heap (thanks ktibbett).
Added a -noGCOpt command line option to disable new java optimizations.
2012-08-13 15:43:05 -04:00
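The arithmetic behind the retry and GC changes in the commit above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Queue implementation: the function names are hypothetical, the memory limit is assumed to double on every successive retry, and the mapping of the three GC settings onto the HotSpot options `-XX:ParallelGCThreads`, `-XX:GCTimeLimit`, and `-XX:GCHeapFreeLimit` is an assumption, since the commit message does not name the flags.

```python
def memory_limit_for_attempt(initial_gb: float, attempt: int, factor: float = 2.0) -> float:
    """Memory limit for a given attempt, where attempt 0 is the first run.

    Assumes the limit is multiplied by `factor` on each retry (doubling by default).
    """
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    return initial_gb * factor ** attempt


def worst_case_gc_threads(concurrent_jobs: int, gc_threads_per_job: int) -> int:
    """Upper bound on GC threads when every job on a machine is in heavy GC."""
    return concurrent_jobs * gc_threads_per_job


def gc_jvm_options(gc_threads: int = 4, time_limit_pct: int = 50,
                   heap_free_limit_pct: int = 10) -> list[str]:
    """Assumed HotSpot options matching the three settings in the commit message."""
    return [
        f"-XX:ParallelGCThreads={gc_threads}",
        f"-XX:GCTimeLimit={time_limit_pct}",
        f"-XX:GCHeapFreeLimit={heap_free_limit_pct}",
    ]


if __name__ == "__main__":
    cores = 16  # hypothetical machine size
    # Default behavior described in the commit: N jobs with N GC threads each.
    print(worst_case_gc_threads(cores, cores))
    # With the cap of 4 GC threads per job.
    print(worst_case_gc_threads(cores, 4))
    print([memory_limit_for_attempt(2.0, attempt) for attempt in range(4)])
    print(" ".join(gc_jvm_options()))
```

Capping the per-job GC threads turns the worst case from quadratic in the core count (N jobs times N threads) into a linear 4N.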
Eric Banks 0381fd7c83 Hmm, I thought I used the right md5s last time. Let's try again. 2012-08-02 11:25:10 -04:00
Eric Banks 05bf6e3726 Updating md5s in pipeline tests so that they finally pass 2012-08-01 10:27:00 -04:00
Eric Banks 7cf4b63d76 Disabling indel quals in BaseRecalibrator as it should be, not PrintReads. 2012-08-01 09:23:04 -04:00
Eric Banks 675ccab2fa Renaming BQSR to BaseRecalibrator 2012-07-23 10:17:17 -04:00
Mauricio Carneiro d446d34227 GATK Error messages now point to the new website instead of GetSatisfaction. 2012-07-20 17:27:11 -04:00
Eric Banks a9f27e5b02 Updated md5s for DPP test 2012-07-17 21:54:46 -04:00
Eric Banks 4e3780fd4f Updated md5 for PBPP 2012-07-17 15:47:43 -04:00
Eric Banks 863eb5b5c0 Use Context not Dinuc covariate 2012-07-17 15:18:11 -04:00
Eric Banks 17d627b86d Update the DPP and PBPP to use the BQSRv2 walkers 2012-07-17 13:15:32 -04:00
Joel Thibault 9ee58d323a Pass the original GATK unsafe parameter to the VcfGatherFunction 2012-07-02 16:03:11 -04:00