Commit Graph

39 Commits (e02aac07439a1e2e565fd235255afa28df36acdb)

Author SHA1 Message Date
ebanks 72c5b75460 Tribble exceptions can be generated outside of the normal codec parsing code because we now lazy load the VCF genotype fields. I'm not sure how else to account for this (to make sure they show up as user errors and not GATK system errors) besides catching them here.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4554 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-22 15:22:17 +00:00
kshakir 88a0d77433 Changed parsing engine to store the order the argument bindings based on their definition in the class, moving "-T" to the front of Queue command lines.
Queue GATK generated .intervals is now a List(File) again removing special case handling in the generator.
Instead of using @Scatter annotation, using ScatterFunction instance to determine if a job can be scattered.
Implemented special VcfGatherFunction which only uses the header from the first file, even if the other files differ in their headers.
Added a -deleteIntermediates to Queue to delete the outputs from intermediate commands after a successful run.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4536 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-20 21:43:52 +00:00
ebanks d8db48204e Fix typo and tell people not to post user errors
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4415 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-03 18:58:03 +00:00
hanna 8d25a5f9f2 A mechanism for supplying attribution text -- mainly useful for external
walkers.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4402 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 18:31:19 +00:00
hanna 732aa32758 Every Sting app from now on will be forced into the US English locale.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4385 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-29 21:55:21 +00:00
hanna 0c99c97685 The engine now automatically adds the command-line arguments to the header of every VCF, unless -NO_HEADER is specified.
Changed integration tests, adding the -NO_HEADER argument, for walkers that previously did not include the command-line
arg headers.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4326 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-22 15:27:58 +00:00
depristo 74d4f124b1 Bug fixes to allow us to generate GATKRunReports for very early errors that leave the engine in a corrupt state. Vastly better error handling of common command line problems. Analysis output now notes whether an exception is a a UserException or a StingException
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4278 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-14 22:45:15 +00:00
depristo dbb641280e CycleCovariate now tolerates SOLEXA as machine type. Also, exception handling is now written to stderr.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4274 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-14 12:35:57 +00:00
depristo fa3be2209f Improvements to the error display code to print out the SVN number in all messages. Fixes to CallableLoci and tests to check for that case
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4270 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-13 18:36:45 +00:00
depristo 7880863eb7 Final step in error refactoring. GATK exception is now ReviewedStingException, indicating that this exception is really what one wants. Only use this exception when you have thought about StingException vs. UserException and made a real decision.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4267 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-12 15:07:38 +00:00
depristo 7ad8fbdd5a Moved GATKException to exceptions
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4266 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-12 14:47:19 +00:00
depristo 595907e98e Moving StingException
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4262 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-12 14:34:15 +00:00
depristo 40e6179911 Penultimate step in exception system overhaul. UserError is now UserException. This class should be used for all communication with the USER for problems with their inputs. Engine now validates sequence dictionaries for compatibility, detecting not only lack of overlap but now inconsistent headers (b36 ref with v37 BAM, for example) as well as ref / bam order inconsistency. New -U option to allow users to tolerate dangerous seq dict issues. WalkerTest system now supports testing for exceptions (see email and wiki for docs). Tests for vcf and bam vs. ref incompatibility. Waiting on Tribble seq dict improvements to detect b36 VCF with b37 ref (currently cannot tell this is wrong.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4258 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-12 14:02:43 +00:00
depristo 8f1a32acae All exceptions thrown by the GATK have been reviewed and UserErrors replaced where appropriate. Shazam. Another check-in will remove the GATKException and restore the StingException.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4252 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-10 15:25:30 +00:00
depristo 1de713f354 Massive review of maybe 50% of the exceptions in the GATK. GATKException is a tmp. tracker so that I can tell which StingExceptions I've reviewed. Please don't use it. If you are working on new code and are considering throwing exceptions, it's either UserError or StingException, please
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4246 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-09 23:21:17 +00:00
depristo 6a30617a60 Initial implementation of UserError exceptions and error message overhaul. UserErrors and their subclasses UserError.MalFormedBam for example should be used when the GATK detects errors on part of the user. The output for errors is now much clearer and hopefully will reduce GS posts. Please start using UserError and its subclasses in your code. I've replace some, but not all, of the StingExceptions in the GATK with UserError where appropriate.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4239 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-09 11:32:20 +00:00
depristo 3fd2392090 Improved interface to getting command line options. Now fully traverses all objects to get all internal argument collections. Preliminary (but disabled version) of phoning home (see -et argument for more information). Captures correct and erroring out runs and writes out gzipped, xml report with lots of useful information. Needs a bit more information but is approximately working. Reports going to /humgen/gsa-hpprojects/GATK/reports/ in submitted directory that will be collated by some external tool. Only operating if -et STANDARD or -et STDOUT are provided currently and REPORT_DIR contains a file called ENABLE. WalkerTest now adds -et NO_ET to tests to avoid populating the reports with tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4155 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-28 22:53:32 +00:00
hanna bdb3a7ebe6 The tagger was automatically combining identical tags, but this is a problem
for the ROD system.  Eliminate tag combine operation.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4121 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-25 22:01:32 +00:00
hanna bf0b6bd486 Update integration tests to use the new ROD syntax.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4112 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-25 18:13:30 +00:00
hanna 3dc78855fd Command-line argument tagging is in, and the ROD system is hacked slightly to support the new syntax
(-B:name,type file) as well as the old syntax.  Also, a bonus feature: BAMs can now be tagged at the
command-line, which should allow us to get rid of some of the hackier calls in GenomeAnalysisEngine.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4105 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-25 03:47:57 +00:00
ebanks 44f3c5639a I have finally figured out that when you volunteer to do something in group meeting, you keep getting pestered about it on Mark's Omniplan doc until it gets done (except for contig aliasing, of course). As such...
We can now emit bzipped VCFs from the GATK.

Details: any walker that defines a VCFWriter for its @Output (i.e. pretty much every core walker from UG and on), also has associated with it the -bzip (--bzip_compression) boolean argument.  When set, it will emit a VCF that is compressed with bzip2.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4093 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-24 04:14:50 +00:00
hanna 691333f75c Force isRequired() to be false for @Deprecated args.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4092 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-23 23:50:30 +00:00
hanna 5d6a6420a9 New behavior for filling it output streams: if required==true for a field and the field
is an output stream, we'll automatically create it and point it to stdout.  Otherwise, 
we'll leave it empty.  
I think about it like this: marking a field 'required' indicates to the GATK that the 
walker author requires a value for this field, and if the GATK can provide one without 
end user intervention, it will.  Maybe this is hackish.  We'll try it and see.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4091 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-23 23:39:13 +00:00
hanna c177801d81 Add deprecated command-line arguments, and switched over UG to output to
-o/--out instead of -varout.  Let's watch as our intrepid support engineer
gracefully responds to all the incoming questions of the form: "the GATK told
me to use -o instead of -varout.  What do I do?"


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4078 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-22 21:01:44 +00:00
hanna b80cf7d1d9 Modifications to the output system for better interaction with @Output. Multiplexed arguments. More details in the Monday meeting.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4077 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-22 14:27:05 +00:00
kshakir 542d394e09 Cleaning up Queue debugging output.
-l DEBUG with local programs now prints out the stdout/stderr of the programs as they are run.
More documentation in the examples with a new even simpler CountReads example.
Took out unused option to build Queue GATK extensions separately.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4025 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-13 15:54:08 +00:00
kshakir f39dce1082 Exposed CommandLineFunction defaults to the Queue.jar command line (see -help).
Added ability to skip up-to-date jobs where the outputs are older than the inputs.
Changed -T CountDuplicates --quiet to --quietLocus so that Queue GATK extensions can use both short and full argument names.
Short names can be used to set values on Queue GATK extensions, for example: vf.XL :+= myFile
Moved Hidden from the GATK to StingUtils.
Updated ivy from 2.0.0 to 2.2.0-rc1 to fix sha1 issue: http://bit.ly/aX72w7
Added Queue to javadoc and testing build targets.
Added first Queue unit test.
Another pass at avoiding cycles in the DAG thanks to all function I/O being files.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4017 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-11 21:58:26 +00:00
kshakir 4f51a02dea Changed logging level to default at INFO instead of WARN.
Changes to StingUtils command line for use in Queue, replacing Queue's use of property files.
Updates to walkers used in existing QScripts to add @Input/@Output.
RMD used in @Required/@Allows now has a new default equal to "any" type.
New QueueGATKExtensions.jar generator for auto wrapping walkers as Queue CommandLineFunctions.
Added hooks to modify the functions that perform the Scattering and Gathering (setting their jar files, other arguments, etc.)
Removed dependency on BroadCore by porting LSF job submitter to scala.
Ivy now pulls down module dependencies from maven.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3984 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-09 16:42:48 +00:00
ebanks 594b7912f1 Added a generic method for returning the complete command-line used when calling a walker, to be used in the bam/vcf headers. As requested, every possible engine/walker argument is included. I've added it to the Unified Genotyper output, so people can try it out and let me know what they think. Something that needs to be discussed in group meeting: what happens when we merge VCFs? Do we keep all of the command-lines?
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3969 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-08 03:53:07 +00:00
hanna 78bfe6ac48 Added @Hidden annotation, a way to deliberately exclude experimental fields and
walkers from the help system.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3946 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-05 02:26:46 +00:00
hanna 5f1b67c1de Coping out and forcing the entire GATK (and associated JVM) to use US English
locale.  Method to force JVM into proper locale exists in CommandLineProgram
and is disabled by default, but implementers of CommandLineProgram can opt in
to the forced US locale by calling a static method.

Question for the VCF developers: I removed the code to explicitly output doubles
in US locale.  Do you / how do you want to handle this in applications that use
Tribble outside the GATK?


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3917 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-03 03:48:26 +00:00
ebanks 7c5a3836db Trivial changes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3875 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-25 04:00:47 +00:00
ebanks 56de475f11 Based on feedback from non-GSA users, who claim that our exceptions are 'scary and overwhelming,' I've cleaned up the error message to first describe the error and what users should do and then ask them to copy the subsequent stack trace into their GetSatisfaction posting.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3874 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-25 03:57:44 +00:00
kshakir 7be8c35eb2 Workaround for scala trait erasing parameterized types:
- Requiring explicit @ClassType on parameterized fields in traits.
- Scatter / Gather functions are now abstract classes since @ClassType can't be used on parameterized fields with type parameters.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3726 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-07 03:15:10 +00:00
kshakir 178cf64a0c Refactored ArgumentDefinition to absorb functionality from ArgumentDefinition and ArgumentTypeDescriptor.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3688 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-30 18:37:58 +00:00
kshakir 75c98c42b8 Started path of deprecation of Sting's @Argument by splitting the annotation into @Output and @Input. Anything that's not an @Output should be an @Input.
Checked in example qscripts that are basically todo integration tests.
Replaced use of queue @Input/@Output with Sting's new @Input/@Output.  This means you'll now have to doc-ument the annotations.
More work on dependency resolution cycles being created in the graph during scatter/gather.
Filtering nulls to avoid NPE exceptions in scala's 'Collection'.hashCode.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3643 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-25 20:51:13 +00:00
bthomas de9f1f575f Fixing command line parsing to accept negative number arguments. Command line definitions must now start with a letter or underscore; previously, they could start with a digit.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3603 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-21 21:54:31 +00:00
hanna c1e53d407d The copyright tag that I copied/pasted from a LaTeX document into IntelliJ had
unicode quote characters embedded in it.  These characters were invisible inside
IntelliJ but cause compile warnings for Ryan and Aaron, who for whatever reason
have a different default charset.  Fixed.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3203 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 15:26:32 +00:00
hanna 1bc26f69e9 An attempt to cleanup the Utils directory. Email to follow.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3198 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-19 23:00:08 +00:00