gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Roger Zurawicki	f3c504769b	Added the ability to update the Forum GATKDocs looks for a key on gsa4, and updates the forum with new walker if it exists. More changes were made to the GATKDocs. Works nicely with bootstrap on and offline. Cleaned up the code as well Signed-off-by: Mauricio Carneiro <carneiro@broadinstitute.org>	2012-07-23 17:17:33 -04:00
Mauricio Carneiro	116885a450	Removed the "Walker" suffix from all walkers that had it. * Did not touch archived walkers... those can be named whatever. * Kept abstract classes that end in Walker untouched (e.g. LocusWalker, ReadWalker, ...) * Renamed a few inner classes due to conflict when stripping off Walker from their outer classes: ContigStats, FlagStats and FastaStats.	2012-07-20 17:27:11 -04:00
Joel Thibault	abe74dc32d	Navel -> GXDB	2012-06-28 13:38:00 -04:00
Mark DePristo	31ee8aa01a	JEXL update -- Update to 2.1.1 from 2.0 -- VariantFiltrationWalker now allows you to run with type unsafe selects, which all default to false when matching. So "AF < 0.5" works even in the presence of multi-allelics now. --	2012-06-21 15:17:21 -04:00
Mark DePristo	e9c22b9aad	Final updates to integration tests for BCF2 -- Fully working version -- Use -generateShadowBCF to write out foo.bcf as well as foo.vcf anywhere you use -o foo.vcf -- Moved MedianUnitTest to its proper home in Utils -- Added reportng to ivy and testng, so build/report/X/html/ is a nicely formatted output for Unit and Integration tests. From this website it's easy to see md5 diffs, etc. This is a vastly better way to manage unit and integration test output	2012-05-24 10:58:59 -04:00
Joel Thibault	085588cb04	Not Nexus. Need new name. Navel?	2012-05-24 10:11:58 -04:00
David Roazen	9c6bccfd8b	build system overhaul * Added support for a protected directory whose contents are only made public in binary form * Simplified and reorganized build.xml to improve readability and maintainability * build.xml now autodetects most build properties: -Includes private/protected if they exist -No more STING_BUILD_TYPE or specialized targets for public-only, etc. * Build targets have changed! There are now two main build options: "ant" build everything (GATK and Queue) "ant gatk" build just the GATK It was too hard to build everything before -- now it is the default. * To run tests with debugging, use -Dtest.debug=true -Dtest.debug.port=XXXX on the command line. Much better than the old comment/uncomment method!	2012-05-17 15:16:29 -04:00
Joel Thibault	229d1aa904	Bjorn -> Nexus	2012-05-15 13:30:29 -04:00
Joel Thibault	aa4d41cce0	Minor cleanup before push	2012-05-01 14:16:44 -04:00
Joel Thibault	d93a413f2e	Add MongoDB dependency	2012-05-01 13:53:43 -04:00
Mark DePristo	3164c8dee5	S3 upload now directly creates the XML report in memory and puts that in S3 -- This is a partial fix for the problem with uploading S3 logs reported by Mauricio. There the problem is that the java.io.tmpdir is not accessible (network just hangs). Because of that the s3 upload fails because the underlying system uses tmpdir for caching, etc. As far as I can tell there's no way around this bug -- you cannot overload the java.io.tmpdir programmatically and even if I could what value would we use? The only solution seems to me is to detect that tmpdir is hanging (how?!) and fail with a meaningful error.	2012-01-29 15:14:58 -05:00
Mark DePristo	cb04c0bf11	Removing javaassist 3.7, lucene library dependencies	2012-01-27 08:24:22 -05:00
Khalid Shakir	5793625592	No more "Q-<pid>@<host>". Generated log file names now use the first output + ".out" (ex. my.vcf.out) or the name of the first QScript plus the order the function was added (ex. MyScript-1.out). The same function added twice with the same outputs will now have the same default logs, meaning the 2nd instance of the function won't be added to the graph twice. QScript accessor to QSettings to specify a default runName and other default function settings. Because log files are no longer pseudo-random their presense can be used to tell if a job without other file outputs is "done". For now still using the log's .done file in addition to original outputs. Gathered log files concatenate all log files together into the stdout. InProcessFunctions now have PrintStreams for stdout and stderr. Updated ivy to use commons-io 2.1 for copying logs to the stdout PrintStream. Removed snakeyaml. During graph tracking of outputs the Index files, and now BAM MD5s, are tracked with the gathering of the original file. In Queue generated wrappers for the GATK the Index and MD5s used for tracking are switched to private scope. Added more detailed output when running with -l DEBUG. Simplified graphviz visualization for additional debugging. Switched usage of the scala class 'List' to the trait 'Seq' (think java.util.ArrayList vs. using the interface java.util.List) Minor cleanup to build including sending ant gsalib to R's default libloc.	2012-01-08 12:11:55 -05:00
David Roazen	ea6e718cb8	SnpEff 2.0.5 support. Re-enabled SnpEff in the HybridSelectionPipeline. For now, we recommend only running with the GRCh37.64 database.	2012-01-03 15:18:36 -05:00
Mauricio Carneiro	f7a5752025	Let this one slip through my commits.	2011-12-26 21:55:02 -05:00
Mauricio Carneiro	4633637af6	Moved ReduceReads to static ReadClipper * all clipping done in ReduceReads is done using the static methods of the ReadClipper now.	2011-12-26 21:14:40 -05:00
David Roazen	68b2a0968c	Updating the HybridSelectionPipeline for SnpEff 2.0.4 RC3 This will have to be done again when the 2.0.4 release becomes official, but it's necessary to do now in order to re-enable the pipeline tests.	2011-11-17 14:46:12 -05:00
Khalid Shakir	b090751f62	Fixed Ant / PluginManager issue where reflections was picking up all class files under current working directory due to "." in jar manifest classpaths. Updates to HybridSelectionPipeline: - Added annotations back via snpEff - Minor updates to VQSR paths and lowered memory	2011-09-27 14:33:57 -04:00
Khalid Shakir	61b89e236a	To work around potential problem with invalid javax.mail 1.4.1 in ivy cache, added explicit javax.mail 1.4.4 along with build.xml code to remove 1.4.1.	2011-09-20 00:14:35 -04:00
David Roazen	bd5cdb8a43	The tribble dependency is now handled through ivy. Revved tribble to r18 and removed obsolete build targets in build.xml	2011-08-11 16:38:29 -04:00
Mark DePristo	45c73ff0e5	Runs and emits an HTML document	2011-07-20 17:16:33 -04:00
David Roazen	68e19edf59	Merged bug fix from Stable into Unstable, and resolved merge conflicts. Conflicts: build.xml settings/ivysettings.xml	2011-07-08 15:50:31 -04:00
David Roazen	a3c9d9c3ff	Fixing Contracts for Java, and enabling contracts by default for unit/integration tests. The NullPointerException we were seeing when trying to run with contracts enabled was being caused by an outdated version of the asm library. To run tests without contracts and disable their compilation, pass in "-Duse.contracts=false" to ant. Also did some minor unrelated cleanup in build.xml	2011-07-08 15:34:39 -04:00
Khalid Shakir	7b699f8b17	Switched GridEngine from looking from environment variable to using embedded jar.	2011-07-05 21:59:00 -04:00
droazen	cc1f94310d	A prototype script and library dependencies to extract a BAM list from a reasonably well-formed PM's xls{x}-format spreadsheet or tsv file. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@6036 348d0f76-0448-11de-a6fe-93d51630548a	2011-06-22 22:53:45 +00:00
hanna	c2e8c460cb	Factor out all testing dependencies into a separate test configuration and only download that test configuration when running unit/integration tests. This means that the build will (hopefully) never break because it can't fetch a file that isn't required for the GATK to run. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5775 348d0f76-0448-11de-a6fe-93d51630548a	2011-05-05 22:42:11 +00:00
hanna	45d8634522	Intermediate commit: bring Google Caliper into our private repository (even though sonatype is back up). This will tide us over until I figure out how to add caliper to test configuration, so that it's only swapped in when we actually run our unit / performance tests. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5770 348d0f76-0448-11de-a6fe-93d51630548a	2011-05-05 14:33:14 +00:00
hanna	b915520653	Updating to apache commons math v2. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5689 348d0f76-0448-11de-a6fe-93d51630548a	2011-04-26 17:31:49 +00:00
hanna	57a4700299	Ported small BAM performance test suite to the Google Caliper microbenchmarking suite. Looks promising, but I'm still not sure that GC is a good long-term solution. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5683 348d0f76-0448-11de-a6fe-93d51630548a	2011-04-22 22:09:17 +00:00
hanna	b8c3c3ae6e	Added commons math, for Kristian. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5238 348d0f76-0448-11de-a6fe-93d51630548a	2011-02-14 18:57:21 +00:00
depristo	197c91e2fb	Working implementation of GATKRunReport POSTing to Amazon Web Services S3 storage. Requires users to explicitly provide the secret key to do the upload. Am investigating options to avoid having to do this in the future. Pretty cool little experiment for those who are interested in S3 interaction (extremely trivial) git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5130 348d0f76-0448-11de-a6fe-93d51630548a	2011-01-30 21:23:54 +00:00
depristo	7b92cd5008	Adding lucene dependency for file locking -- may be removed in the near future git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5065 348d0f76-0448-11de-a6fe-93d51630548a	2011-01-24 18:59:42 +00:00
kshakir	b34e2f733f	Removed stochasticity from IndelRealigner by random sampling using and seed based on the read list. Updated the Queue scatter/gather for read walkers to include -L unmapped on the last scatter job when intervals aren't specified, and to map it correctly when it is explicitly set. Simplified the build.xml/ivy.xml to fix a bug reported with "ant clean dist test" where the scalac target wasn't found. Now building all scala code at the same time, just like all java code is compiled at the same time. Sped up the build for everyone by uncommenting a small bit of classes so that javac/scalac will not constantly launch trying to build .class files that will never compile. Moved some source files to their expected location so that the .java/.scala -> .class is a one-to-one match, again keeping the compilers from wasting cycles. Used <uptodate> and <touch> to skip extracting the help text and generating the GATK Queue extensions when the source files haven't been modified. Fixed a couple errors when the <javadoc> task is run. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4963 348d0f76-0448-11de-a6fe-93d51630548a	2011-01-07 22:03:36 +00:00
kshakir	56433ebf6b	Switched from LSF command line wrappers to JNA wrappers around the C API. Side effects: - bsub command line is no longer fully printed out. - extraBsubArgs hack is now a callback function updateJobRun. Updated FullCallingPipelineTest to reflect latest changes to fullCallingPipeline.q. Added a pipeline that tests the UGv2 runtimes at different bam counts and memory limits. Updated VE packages that live in oneoffs to compile to oneoffs. Added a hack to replace the deprecated symbol environ in Mac OS X 10.5+ which is needed by LSF7 on Mac. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4816 348d0f76-0448-11de-a6fe-93d51630548a	2010-12-10 04:36:06 +00:00
kshakir	673fa841a4	Updated PluginManager so that during testing Queue can dynamically compile and load separately multiple class directories into the same class loader. Removed obsolete usages of PackageUtils with updated PluginManager. Ported Queue interval utilities written in scala over to Sting's java IntervalUtils. Added a very basic intergration test to ensure that the fullCallingPipeline.q compiles. Added options to specify the temporary directories without having to use -Djava.io.tmpdir (useful during the above integration test). While adding tempDir added options to specify the run directory from the command line, for example "-runDir v1". Upgraded to scala 2.8.1 and updated calls to deprecated functions. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4661 348d0f76-0448-11de-a6fe-93d51630548a	2010-11-12 20:14:28 +00:00
hanna	861ee3e37a	Changing testing framework from junit -> testng, for its enhanced configurability. Initial test to see how Bamboo will respond. More detailed email to follow. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4609 348d0f76-0448-11de-a6fe-93d51630548a	2010-11-01 21:31:44 +00:00
kshakir	b954a5a4d5	- After removing special code for intervals, instead of being of type File they are generated as List[File]. Changed previous checkin that was appending to this list and instead assigning a singleton list. - More cleanup including removing the temporary classes and intermediate error files. Quieting any errors using Apache Commons IO 2.0. - Counting the contigs during the QScript generation instead of the end user having to pass a separate contig interval list. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4539 348d0f76-0448-11de-a6fe-93d51630548a	2010-10-21 06:37:28 +00:00
kshakir	63e3848187	Added status email support with -statusTo. Will send emails on failure of an individual function or success/failure of the whole pipeline. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4496 348d0f76-0448-11de-a6fe-93d51630548a	2010-10-14 15:58:52 +00:00
kshakir	db47230dd9	Wrapping ScatterGatherableFunctions with a facade instead of using slower clone library. Will require keeping Clone's facade code in sync with CommandLineFunction but runs much faster. Shell invoking scripts so that even really long shell scripts make it through LSF. Using the truncated (up to 1000 characters) of the command line for the job name for use with bjobs. Switched the default from re-running everything to re-running only files that need to be regenerated. --skip_up_to_date replaced with --start_clean for those who want to regenerate everything. Updated logging to let users know when the scatter gather generator is running, which still takes a while but is orders of magnatudes faster for large lists of functions. (40s for a 100 function graph exploding to a 2500 function graph) git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4448 348d0f76-0448-11de-a6fe-93d51630548a	2010-10-07 01:19:18 +00:00
kshakir	20b38b38f3	Updated from SnakeYAML 1.6 to 1.7. Added a pipeline java bean and YAML utility to serialize java beans. Added a getFirehosePipelineYaml.sh that can pull firehose data into the pipeline yaml file format. Updated the fullCallingPipeline.q to begin using the pipeline yaml file format for bams and reference. More changes to come as this code gets tested out in the fullCallingPipeline. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4329 348d0f76-0448-11de-a6fe-93d51630548a	2010-09-22 19:47:49 +00:00
bthomas	e5f81d25d4	Adding the --sample-metadata (-SM) command line argument and associated functionality. This is something Matt and I have been working on for a while. Basically, it allows you to integrate sample metadata into an analysis, by including a sample file. More detailed documentation is on the wiki: http://www.broadinstitute.org/gsa/wiki/index.php/Adding_Sample_data_to_an_analysis This commit adds two important classes: Sample, which contains data about one sample; and SampleDataSource, which manages sample data a la ReferenceDataSource and ReadsDataSource. This code should be stable, but it has not been integrated with existing walkers yet. That's the next commit. In the meantime, feel free to experiment with the code - there are two basic example walkers in the playground.sample package. And PLEASE let me know if you see any errors/inconsistencies. Note that this also adds a new dependency on SnakeYaml, a YAML parser. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4285 348d0f76-0448-11de-a6fe-93d51630548a	2010-09-15 11:50:22 +00:00
ebanks	43f1fb2380	Okay, finally done with VCF compression. Now: 1. Uses blocked gzip compression. 2. No more -bzip option available (since we can't compress to stdout). 3. Only file extensions that are compressed are .gz and .gzip. 4. No more need for CompressedVCFWriter.java git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4099 348d0f76-0448-11de-a6fe-93d51630548a	2010-08-24 16:36:54 +00:00
ebanks	44f3c5639a	I have finally figured out that when you volunteer to do something in group meeting, you keep getting pestered about it on Mark's Omniplan doc until it gets done (except for contig aliasing, of course). As such... We can now emit bzipped VCFs from the GATK. Details: any walker that defines a VCFWriter for its @Output (i.e. pretty much every core walker from UG and on), also has associated with it the -bzip (--bzip_compression) boolean argument. When set, it will emit a VCF that is compressed with bzip2. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4093 348d0f76-0448-11de-a6fe-93d51630548a	2010-08-24 04:14:50 +00:00
aaron	3dc4d3c3a9	removing the custom reflections library from the libs, and adding a release version. Hopefully this will fix the problem Menachem has been seeing with random JVM crashes. Also removed the auto-deletion of the reflections jar, and removed the very old OmniPlan document we had checked-in. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4056 348d0f76-0448-11de-a6fe-93d51630548a	2010-08-19 00:42:37 +00:00
kshakir	4f51a02dea	Changed logging level to default at INFO instead of WARN. Changes to StingUtils command line for use in Queue, replacing Queue's use of property files. Updates to walkers used in existing QScripts to add @Input/@Output. RMD used in @Required/@Allows now has a new default equal to "any" type. New QueueGATKExtensions.jar generator for auto wrapping walkers as Queue CommandLineFunctions. Added hooks to modify the functions that perform the Scattering and Gathering (setting their jar files, other arguments, etc.) Removed dependency on BroadCore by porting LSF job submitter to scala. Ivy now pulls down module dependencies from maven. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3984 348d0f76-0448-11de-a6fe-93d51630548a	2010-08-09 16:42:48 +00:00
aaron	72ae81c6de	VariantContext has now moved over to Tribble, and the VCF4 parser is now the only VCF parser in town. Other changes include: - Tribble is included directly in the GATK repo; those who have access to commit to Tribble can now directly commit from the GATK directory from Intellij; command line users can commit from inside the tribble directory. - Hapmap ROD now in Tribble; all mentions have been switched over. - VariantContext does not know about GenomeLoc; use VariantContextUtils.getLocation(VariantContext vc) to get a genome loc. - VariantContext.getSNPSubstitutionType is now in VariantContextUtils. - This does not include the checked-in project files for Intellij; still running into issues with changes to the iml files being marked as changes by SVN I'll send out an email to GSAMembers with some more details. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3954 348d0f76-0448-11de-a6fe-93d51630548a	2010-08-05 18:47:53 +00:00
aaron	35ce367898	adding the annotations for findbugs as dependencies in the GATK. They have to be in the default config so that we can annotate code without running findbugs. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3829 348d0f76-0448-11de-a6fe-93d51630548a	2010-07-19 16:34:57 +00:00
aaron	7ff6106c14	adding Ivy lines for findbug, and adding a build task (to run it locally you need to have installation of findbug). I'll put more information on the wiki when it's up and running. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3744 348d0f76-0448-11de-a6fe-93d51630548a	2010-07-08 19:10:19 +00:00
kshakir	30cf78fdc0	Refactoring for a first version of scatter gather api with basic shell script implementations. Modified build script so that queue is cleaned during "ant clean". git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3611 348d0f76-0448-11de-a6fe-93d51630548a	2010-06-22 18:39:20 +00:00
kshakir	32fc221ffe	Replaced pattern matched pipeline spec with annotated objects. Old version is no longer available. git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3558 348d0f76-0448-11de-a6fe-93d51630548a	2010-06-15 04:43:46 +00:00

84 Commits (874dbf5b58669dfe2d0b22aa954598ca89a77b41)