94 Commits (d55dddfdbab3a0faac4fa562fbc4090d8b7b331b)

Author SHA1 Message Date
bthomas e5f81d25d4 Adding the --sample-metadata (-SM) command line argument and associated functionality. This is something Matt and I have been working on for a while. Basically, it allows you to integrate sample metadata into an analysis, by including a sample file. More detailed documentation is on the wiki: http://www.broadinstitute.org/gsa/wiki/index.php/Adding_Sample_data_to_an_analysis
This commit adds two important classes: Sample, which contains data about one sample; and SampleDataSource, which manages sample data a la ReferenceDataSource and ReadsDataSource. 

This code should be stable, but it has not been integrated with existing walkers yet. That's the next commit. 

In the meantime, feel free to experiment with the code - there are two basic example walkers in the playground.sample package. And PLEASE let me know if you see any errors/inconsistencies.

Note that this also adds a new dependency on SnakeYaml, a YAML parser.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4285 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-15 11:50:22 +00:00
ebanks 43f1fb2380 Okay, finally done with VCF compression. Now:
1. Uses blocked gzip compression.
2. No more -bzip option available (since we can't compress to stdout).
3. The only file extensions that are compressed are .gz and .gzip.
4. No more need for CompressedVCFWriter.java



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4099 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-24 16:36:54 +00:00
ebanks 44f3c5639a I have finally figured out that when you volunteer to do something in group meeting, you keep getting pestered about it on Mark's Omniplan doc until it gets done (except for contig aliasing, of course). As such...
We can now emit bzipped VCFs from the GATK.

Details: any walker that defines a VCFWriter for its @Output (i.e. pretty much every core walker from UG and on), also has associated with it the -bzip (--bzip_compression) boolean argument.  When set, it will emit a VCF that is compressed with bzip2.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4093 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-24 04:14:50 +00:00
aaron 3dc4d3c3a9 removing the custom reflections library from the libs, and adding a release version. Hopefully this will fix the problem Menachem has been seeing with random JVM crashes. Also removed the auto-deletion of the reflections jar, and removed the very old OmniPlan document we had checked-in.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4056 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-19 00:42:37 +00:00
kshakir 4f51a02dea Changed logging level to default at INFO instead of WARN.
Changes to StingUtils command line for use in Queue, replacing Queue's use of property files.
Updates to walkers used in existing QScripts to add @Input/@Output.
RMD used in @Required/@Allows now has a new default equal to "any" type.
New QueueGATKExtensions.jar generator for auto wrapping walkers as Queue CommandLineFunctions.
Added hooks to modify the functions that perform the Scattering and Gathering (setting their jar files, other arguments, etc.)
Removed dependency on BroadCore by porting LSF job submitter to scala.
Ivy now pulls down module dependencies from maven.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3984 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-09 16:42:48 +00:00
aaron 72ae81c6de VariantContext has now moved over to Tribble, and the VCF4 parser is now the only VCF parser in town. Other changes include:
- Tribble is included directly in the GATK repo; those who have access to commit to Tribble can now directly commit from the GATK directory from Intellij; command line users can commit from inside the tribble directory.
- Hapmap ROD now in Tribble; all mentions have been switched over.
- VariantContext does not know about GenomeLoc; use VariantContextUtils.getLocation(VariantContext vc) to get a genome loc.
- VariantContext.getSNPSubstitutionType is now in VariantContextUtils.
- This does not include the checked-in project files for Intellij; still running into issues with changes to the iml files being marked as changes by SVN

I'll send out an email to GSAMembers with some more details.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3954 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-05 18:47:53 +00:00
aaron 35ce367898 adding the annotations for findbugs as dependencies in the GATK. They have to be in the default config so that we can annotate code without running findbugs.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3829 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-19 16:34:57 +00:00
aaron 7ff6106c14 adding Ivy lines for findbugs, and adding a build task (to run it locally you need to have findbugs installed). I'll put more information on the wiki when it's up and running.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3744 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-08 19:10:19 +00:00
kshakir 30cf78fdc0 Refactoring for a first version of scatter gather api with basic shell script implementations.
Modified build script so that queue is cleaned during "ant clean".



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3611 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-22 18:39:20 +00:00
kshakir 32fc221ffe Replaced pattern matched pipeline spec with annotated objects.
Old version is no longer available.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3558 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-15 04:43:46 +00:00
depristo 6eeb1693ca JEXL2 upgrade. Improvements to JEXL processing, including dynamically resolving variable -> value bindings instead of adding them to a map up front. Performance improvements and code cleanup throughout.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3494 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-07 00:33:02 +00:00
depristo 5f950dcc61 Added Apache Commons IO
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3467 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-31 16:25:27 +00:00
kshakir e9ee55d7dd A cleaned-up, functioning, very early access version of Queue for others to play with and provide feedback about next steps.
Current version only has syntactic sugar for accessing the graph via rules, e.g. "bam" -> "bam.bai", "samtools index ${bam}", and DOES NOT have sugar for constructing your own graph.
Usage info on the internal wiki at https://iwww.broadinstitute.org/gsa/wiki/index.php/Queue


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3420 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-23 20:21:09 +00:00
aaron 7467ec2fd6 updating the reflections library; Matt found a problem where the reflections library doesn't sort out non-java objects from the classpath (affects only OS X so far). I'll push back the changes to the reflections library people.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3307 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-06 02:08:41 +00:00
aaron 64c5f287c5 fixes for edge-cases when using reflections to find classes outside of the main jar. Will push as a patch to reflections
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3264 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-27 17:46:46 +00:00
aaron c647153b10 Adding Jama for Ryan.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3262 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-27 14:30:36 +00:00
aaron f6468f9143 a fix for a bug we've worked around in the reflections package: previously it didn't find classes that weren't in the main jar. Fixed in this version.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3261 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-27 04:49:49 +00:00
aaron 8fd59c8823 Modified the report system based on Ryan's feedback: tables are now created independently to avoid the permutation problem when they were all compressed in rows, and removed our dependency on FreeMarker. The Grep format stays the same.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3130 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-07 20:39:55 +00:00
aaron f455412ea8 adding a dependency that I forgot.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3033 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-18 13:32:37 +00:00
aaron 8fd3351971 adding a stripped down Tribble library for the start of integration
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2851 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-17 21:29:25 +00:00
ebanks 797bb83209 New VariantFiltration.
Wiki docs are updated.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2105 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-20 19:50:26 +00:00
hanna 8eff1cc436 Extract and include only the Tim Fennell-approved parts of picard private.
Hopefully this is a temporary solution and these classes will be migrated
into picard-public.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2041 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-13 19:42:33 +00:00
aaron de6ae51f7e Scala walkers can now be built and run like any other walker in the GATK. Added getUrlsForClasspath to PackageUtils: the Reflections package isn't getting the manifest files from jars in the classpath, so we weren't seeing any walkers outside of the GenomeAnalysisTK.jar.
A couple of notes:
-Commented out BaseTransitionTableCalculator.scala because it won't build; Chris, could you fix this one (or kill it if it's not needed).
-Removed the PrintReadsScala walker; moved the code over to a ScalaCountLoci walker (which is what the code was really doing).
-Added configuration items to the ivy xml file.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1956 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-02 06:02:41 +00:00
hanna c35a457a09 Delete duplicate jgrapht reference.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1935 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-29 17:38:01 +00:00
depristo 86573177d1 Reverting rod walkers to use underlying refwalker implementation while we work on ROD2 and reenable the system. Added some serious sparse file parsing to variant eval tests
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1929 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-29 01:04:37 +00:00
hanna e3b9114664 Added jgrapht dependency.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1923 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-28 14:29:17 +00:00
hanna b43925c01e Switched to Reflections (http://code.google.com/p/reflections/) project for
inspecting the source tree and loading walkers, rather than trying to roll
our own by hand.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1286 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-21 18:32:22 +00:00
hanna ed7fac1c90 Add bcel and cleanup.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1032 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-17 19:28:04 +00:00
hanna 5e8c08ee63 Update to latest version of picard. Change imports in all classes dependent on picard public from import edu.mit.broad.picard... to import net.sf.picard...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@849 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-28 20:13:01 +00:00
hanna aa17c4a468 Farewell, functionalj. You promised much, but you could not deliver.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@847 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-28 01:35:49 +00:00
hanna df8490a0cf Remove unused dependency on commons logging.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@829 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-27 14:12:26 +00:00
aaron 1a3ca97d29 remove the ivy command for dependency on BCEL, we're not using it right now.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@775 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-21 19:35:53 +00:00
aaron c34eaa6962 add javassist, which is a higher-level alternative to bcel.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@755 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-20 05:11:03 +00:00
aaron 21536df308 Change the sample XML marshalling code over to simple XML, and take out the castor lines in the ivy.xml
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@633 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 00:08:25 +00:00
aaron 5d9536d2b4 Added simple xml to replace castor (at least for now)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@632 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 23:57:18 +00:00
hanna ef211f96b1 Remove old Apache CLI-based arg system.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@604 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 18:37:51 +00:00
aaron 98f4920739 Added BCEL and some basic instrumentation code to the test library.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@602 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 17:18:23 +00:00
hanna abe2d25f10 Added castor dependency.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@321 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:27:39 +00:00
hanna 067ae09cd0 Bump picard and samtools to latest.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@207 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 02:26:28 +00:00
hanna 9e2a373184 Prototype, buggy implementation of walker command-line arguments. Doesn't
(yet) deal elegantly with even simple cases.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@180 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 00:12:00 +00:00
aaron 046cecb067 Switched our code over to the new command line style (gnu style args), added the initial logger code, and added apache commons CLI to the IVY script.
There will be a slow conversion of all the System.out and System.err in other files to the logger style output.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@145 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-22 21:06:22 +00:00
hanna 1fcf4c0cbf Update picard to work with new samtools.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@123 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-20 21:51:26 +00:00
aaron 7bc45b68aa Added dependencies on two libraries: the Colt package, which is a collection of high performance computing libraries from CERN; and Log4j, which will be our new logging platform.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@100 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-19 16:16:31 +00:00
hanna 1096bbd4d9 Moved build.xml, ivy.xml and settings to root of Sting repository.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@88 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-18 19:13:19 +00:00