Commit Graph

4389 Commits (7f1e44b764ccaeaf5eeaab5e885c9aa0a07b3ad9)

Author SHA1 Message Date
ebanks 7f1e44b764 update the example: /broad/1KG doesn't exist anymore
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4429 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-05 13:50:24 +00:00
hanna 250c18e679 Error message fixes for the following issues:
nvjpM4yOwQAu3fNGxi4oXLuVpKn6aAlf,1GL0OuXK2xKQfvbu34tWYgbojSVSLo0l,
ehEGBJOfgc4V7qj8W0Homf5ICuVK5Sm3,cZsreLm1CbY3aYKZhV7DOSvQNwur41zp,
GlrlyGEyP9kJDIRCQNFQp7BGJBXSzdDJ,hyz1uiHXr39ANmdZu9K1epOSX8EL3mDw,
q0n4EucZESCI4LZhQik306zD4VAuH2cb.  

Messages:
camrhG5tHzlY9WUSEVpVZGkU1tyJqKb5,s0OX2g7nYRctJxyFoQCa6clac9IsjHyi,
THIAtjllvYNlnTmiMnJEIHd2Ju4gqQIO,jwVk3JYZJNHloW7HO4LeGxFexknqro0v,
BFNRGOGmGGJNNPZqgeF1ikTNFfskbyLc,...

Were fixed in 4392.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4428 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-05 03:37:13 +00:00
corin e340be34d8 upping mem limit since something was unhappy with the lower limit
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4427 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-05 02:38:17 +00:00
kshakir bb44044ce0 Fixed re-builds of queue so that previously compiled classes are included. Fixes redundant case of "ant queue test" vs. "ant test".
Refactored temp directory utils.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4426 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-04 21:12:07 +00:00
kshakir 4dfed62e7d Generating the Queue GATK extensions using java, then compiling all the Queue scala code at once to allow circular dependencies between existing and generated scala code.
Will see how this behaves for those using IntelliJ as generated source code will disappear during an ant clean.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4425 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-04 19:38:29 +00:00
kiran 24cf6f9e36 Fix to handle situation where there are no filtered variants.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4424 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-04 18:34:01 +00:00
ebanks aa00801108 remove reference to -mrl
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4423 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-04 17:27:01 +00:00
chartl f978c25b9d Perhaps both, Eric. Perhaps both.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4422 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-04 13:56:04 +00:00
chartl 0eb777612a Swap "." over to VCFConstants.MISSING_DEPTH_v3
Why v3, you ask? Why not? Simply because v2 was a String so old and clunky, the sun would fizzle out and grow cold before any VCF could be successfully parsed.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4421 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-04 13:41:41 +00:00
chartl 74087c44ae Fixed a bug which caused a parsing exception when there was a variant with a dp field of ".", e.g. "GT:DP 0/1:." -- which can happen when using imputation.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4420 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-04 12:37:36 +00:00
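
Note on the two chartl commits above: the failure mode is a per-sample DP value of "." (the VCF missing-value marker) reaching an integer parser. A minimal sketch of the guard, in Python rather than the project's Java; `parse_depth` and `MISSING` are hypothetical names used only for illustration and are not the GATK code.

```python
MISSING = "."  # VCF marker for a missing value


def parse_depth(format_keys, sample_field):
    """Return the DP value as an int, or None if DP is absent or missing.

    format_keys  : the FORMAT column, e.g. "GT:DP"
    sample_field : one sample column, e.g. "0/1:."
    """
    keys = format_keys.split(":")
    values = sample_field.split(":")
    # zip() stops at the shorter list, so dropped trailing fields are tolerated.
    fields = dict(zip(keys, values))
    raw = fields.get("DP")
    if raw is None or raw == MISSING:
        return None
    return int(raw)


print(parse_depth("GT:DP", "0/1:."))   # None
print(parse_depth("GT:DP", "0/1:17"))  # 17
```

Returning None keeps "no depth reported" distinct from a depth of zero; the actual code uses a named constant (VCFConstants.MISSING_DEPTH_v3) for the missing case.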
ebanks 6448753cf7 Removed the SequenomValidationConvertor and renamed it VariantValidationAssessor since it no longer handles ped/sequenom files (but instead works on vcfs/variantcontexts). Updated all of the wiki docs, including adding instructions on how to convert ped files to vcf, a la Shaun Purcell. We now officially no longer support ped files, everyone. Other misc cleanup in the code.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4419 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-04 02:11:38 +00:00
kiran a15757b8e8 Obsoleted by VariantReport.R
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4418 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-04 01:00:59 +00:00
kshakir cf01f6d58a Renamed conflicting 'package.dir' in build.xml to 'package.xml.dir'.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4417 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-04 00:46:47 +00:00
kiran 62f5383859 * Added an R package, "gsalib", providing a place to store common, useful, documented R methods. To use this module, you must follow three steps:
1) Build the module with the following command:
$ ant gsalib

2) Add the module path to your ~/.Rprofile file:
.libPaths("/path/to/Sting/trunk/R/")

3) At the top of each R script that will use the library, include the line:
library(gsalib)

You can now use the package like any other R package.  To get high-level documentation, supply the following command to R:
help(gsalib)

The methods contained herein are:

    getargs         : A method to easily provide arguments to interactive and non-interactive scripts.
                        Prints out a help message specifying how the script should be run if no arguments
                        or "-h" is provided.  Very helpful when you're writing an R-script piecemeal in
                        interactive mode, then want to make it a command-line program.
    plot.venn       : Plots a two-way or three-way proportional Venn diagram.
    read.eval       : Reads VariantEval output that's formatted in R style.
    read.gatkreport : Reads GATKReport output.
    gsa.message     : Emits a message with the prefix "[gsalib]" to stdout.
    gsa.warn        : Emits a warning message with the prefix "[gsalib] Warning:" to stdout.
    gsa.error       : Emits an error message with the prefix "[gsalib] Error:" to stdout, calls traceback()
                        and halts execution.

Documentation on each of these methods can be obtained by typing "help(method_name)" at the R prompt.

* Retired GATKReport.R, as that functionality has now been moved to gsalib.
* Retired gsacommons, as that functionality has been split between gsalib and VariantReport.R.
* Modified VariantReport.R to make use of gsalib.  The script now uses the getargs() method to provide the user with some information as to the proper way to run the script.  Documentation on how to prepare output is given at http://www.broadinstitute.org/gsa/wiki/index.php/VariantEval .
* Added 'gsalib' target to build.xml file.  Running "ant gsalib" will compile this module and place the R-ready package in R/gsalib .



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4416 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-04 00:27:59 +00:00
ebanks d8db48204e Fix typo and tell people not to post user errors
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4415 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-03 18:58:03 +00:00
ebanks 490e5e1b0f Better error when bad ref bases are provided
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4414 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-03 05:40:37 +00:00
kiran 40b2f62a83 Changed precision on Ti/Tv in venn diagrams
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4413 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-02 05:27:13 +00:00
kiran d0e44b7a8e Lower precision on Ti/Tv in variant summary matrix
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4412 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-02 05:18:48 +00:00
kiran 6deb755164 Ti/Tv plots are restricted to a Ti/Tv range of 0.0-4.0. Added column to variant summary specifying the total variant counts (known+novel). Allele spectrum plots now show neutral expectation.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4411 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-02 05:15:34 +00:00
aaron 64b7b3f83b fix for a recent change to the indexing code where we ignore the results of locking the file (this is bad), and as a result don't write the index; this should fix the build.
Off to Yosemite in 4 hours, enjoy the week gsa folks!



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4410 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-02 04:35:11 +00:00
depristo 7551ba8249 Trivial refactoring in preparation for on-the-fly indexing
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4409 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 22:32:59 +00:00
hanna 399e6f1463 Make package dir configurable.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4408 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 22:25:17 +00:00
kiran 1d7e48c4b0 Venn diagrams are now oriented properly when a < b. Added a slide with callset summary table. All plots now show the present-in-a, filtered-in-b metrics. Added title page with project name, author, and timestamp.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4407 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 22:17:21 +00:00
rpoplin 2f7892601c Useful debugging argument added to VariantRecalibrator to only use sites whose qual field is above --qual
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4406 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 21:08:55 +00:00
hanna 575c38fc04 Accidentally failed to commit a missing file.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4405 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 20:26:51 +00:00
kiran fe29c8b09c Placeholder commit: improvements to VariantReport (now shows stats for variants that are called in one set and filtered in another). Better command-line argument support.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4404 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 18:46:53 +00:00
delangel d4398f2686 silly bug fix: if I'm to do a short term hack to avoid -infinity likelihoods I might as well do it right.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4403 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 18:39:45 +00:00
hanna 8d25a5f9f2 A mechanism for supplying attribution text -- mainly useful for external
walkers.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4402 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 18:31:19 +00:00
delangel e920badcc4 Temporary fix for case where genotype likelihoods are exactly (1,0,0) or (0,1,0) etc. at a site with new indel genotyper: this would make us blow up when converting to log space and try to assign genotypes at a site. A more robust solution is in the works.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4401 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 17:43:43 +00:00
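
Note on the commit above: a likelihood vector such as (1,0,0) contains exact zeros, and the logarithm of zero is not finite (Java's Math.log10 returns negative infinity; Python's math.log10 raises ValueError). The commit does not say how the temporary fix works, so the following is only one possible guard, sketched in Python; the `FLOOR` value and the function name are assumptions.

```python
import math

FLOOR = 1e-300  # hypothetical lower bound; any small positive value would do


def to_log10(likelihoods, floor=FLOOR):
    """Convert linear-space likelihoods to log10 space.

    Values below `floor` (including exact zeros) are clamped first, so the
    result never contains -infinity and never triggers a domain error.
    """
    return [math.log10(max(p, floor)) for p in likelihoods]


logs = to_log10([1.0, 0.0, 0.0])
print(all(math.isfinite(x) for x in logs))  # True
```

Clamping trades a small bias in the extreme tail for finite arithmetic downstream, which matches the "temporary fix" framing rather than a robust solution.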
rpoplin b83fdf8a17 Bug fix in AnalyzeAnnotations. Be sure the site is a biallelic, unfiltered SNP.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4400 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 13:09:46 +00:00
chartl 7639692e5b Sigh. Fix the source of even more UserErrors in the phone home directory: make sure to gunzip the beagle files before passing them into the conversion walker...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4399 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 03:28:36 +00:00
delangel fa9c21c020 More fixes for exact AF calculation model in new unified genotyper:
a) Fixed bugs in new dynamic programming-based genotyper
b) Fixed up temp hack that handles extended pileups for now.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4398 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 02:32:50 +00:00
delangel eb67aee732 bug fix: forgot to uncomment code to compute genotype likelihoods
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4397 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-30 21:38:22 +00:00
delangel ece694d0af Next iteration on new UG framework:
- Brought over exact AF estimation from branch (which is now dead). Exact model is default in UnifiedGenotyperV2.
- Implemented completely new genotyping algorithm given best AF estimate using dynamic programming, which in theory should be better than both greedy search and any HWE-based genotyper.
- Integrated and added new Dindel likelihood estimation model.
- Corrected annotators that would call readBasePileup: since we can be annotating extended events, best way is to interrogate context for kind of pileup and either readBasePileup or readExtendedEventPileup.

All changes above except last one are still in playground since they require more testing.




git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4396 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-30 21:33:59 +00:00
kshakir 027241e1ca Moved the test classes from java/classes/testclasses to java/testclasses. Update your IntelliJ settings!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4395 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-30 20:42:41 +00:00
hanna 4ea73bcfb1 Basic unit tests for WalkerManager.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4394 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-30 19:27:41 +00:00
hanna bf7fd08810 Fix newly-introduced bug in the PluginManager/DynamicClassResolutionException
where, when the system can't find a plugin of the correct name, the system
prefers to crap all over itself and throw an unintelligible NullPointerException
rather than displaying an intelligent error.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4393 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-30 19:07:05 +00:00
hanna 14e19f4605 (Slightly) better exception text when SAM/BAM output file can't be created.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4392 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-30 18:43:22 +00:00
hanna 1fb8c86f6d Looks like we've got two competing models for an empty interval list: null and
the empty list.  Score another victory for the integration tests.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4391 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-30 17:11:47 +00:00
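
Note on the commit above: when "no intervals" can arrive as either null or an empty list, callers that test only one of the two forms disagree. A common remedy is to normalize at the boundary. The sketch below uses Python's None in place of Java's null; the function names are hypothetical, and the choice of the empty list as the canonical form is an assumption, not something the commit states.

```python
def normalize_intervals(intervals):
    """Collapse the two 'empty' representations (None and []) into []."""
    if intervals is None:
        return []
    return list(intervals)


def has_intervals(intervals):
    """True only when at least one interval was supplied."""
    return len(normalize_intervals(intervals)) > 0


print(has_intervals(None))           # False
print(has_intervals([]))             # False
print(has_intervals(["chr20:1-5"]))  # True
```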
hanna 78343be52c At some time in the recent past, we lost our ability to process the '-L all'
argument.  Brought it back, and added an integration test to make sure it
stays around.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4390 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-30 15:58:43 +00:00
chartl f34b4c6b82 Be smarter if the beagle output is set such that getParent() returns null. Up the memory limit.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4389 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-30 12:48:47 +00:00
chartl 0142047da9 And a bugfix 3 seconds later. Don't tell java to use up to 20g while telling the farm to kill the job if it tries to exceed 4g.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4388 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-30 02:08:47 +00:00
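
Note on the commit above: the bug was a JVM heap ceiling (20g) larger than the memory limit requested from the farm (4g), so the scheduler would kill the job before Java ever hit its own limit. One way to avoid the mismatch is to derive the heap flag from the farm limit instead of setting the two independently. A Python sketch: `-Xmx` is the standard JVM maximum-heap option, while the function name and the 0.8 headroom factor are assumptions.

```python
def jvm_heap_flag(farm_limit_gb, headroom=0.8):
    """Build a -Xmx flag that stays under the scheduler's memory limit.

    farm_limit_gb : memory the batch system allows the job, in GB
    headroom      : fraction of that limit given to the Java heap; the rest
                    is left for non-heap JVM memory (hypothetical default)
    """
    if farm_limit_gb <= 0:
        raise ValueError("farm_limit_gb must be positive")
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in (0, 1]")
    heap_mb = int(farm_limit_gb * 1024 * headroom)
    return f"-Xmx{heap_mb}m"


print(jvm_heap_flag(4))  # -Xmx3276m
```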
chartl 06970ae039 A qscript that refines genotypes with beagle and merges them into one vcf (running currently on the recent chr20 production calls).
This will be librarized soon; but if you need to do something like this, feel free to cannibalize.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4387 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-30 02:05:30 +00:00
delangel e80742e72f Use -o as argument for output file in ProduceBeagleInputWalker, to be consistent with other walkers (you're welcome, chartl :)).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4386 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-29 22:46:39 +00:00
hanna 732aa32758 Every Sting app from now on will be forced into the US English locale.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4385 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-29 21:55:21 +00:00
fromer 20ffe484bc Added detection and INFO field marking of phasing inconsistencies (and optional filtration using --filterInconsistentSites)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4384 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-29 19:28:56 +00:00
rpoplin a6c7de95c8 By using the AC info field instead of parsing the genotypes we cut 78% off the runtime of VariantRecalibrator. There is a new argument to force the parsing of genotypes if necessary. Various other optimizations throughout.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4383 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-29 18:56:50 +00:00
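
Note on the commit above: the speedup comes from reading the precomputed alternate allele count (the AC key in the VCF INFO column) instead of walking every sample's genotype. The sketch below contrasts the two paths in Python. It assumes biallelic sites (a single integer AC) and is not the VariantRecalibrator code; both function names are hypothetical.

```python
import re


def ac_from_info(info):
    """Fast path: read AC directly from an INFO string such as 'AC=3;AN=6'."""
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == "AC":
            return int(value)  # assumes one ALT allele, so one integer
    return None  # AC not present; caller must fall back to genotypes


def ac_from_genotypes(genotypes):
    """Slow path: count ALT alleles across all sample GT strings."""
    count = 0
    for gt in genotypes:
        # Alleles are separated by '/' (unphased) or '|' (phased).
        count += sum(1 for allele in re.split(r"[/|]", gt) if allele == "1")
    return count


print(ac_from_info("AC=3;AN=6"))                  # 3
print(ac_from_genotypes(["0/1", "1|1", "0/0"]))   # 3
```

The fast path costs one string scan per site regardless of sample count, whereas the slow path grows with the number of samples; it remains necessary when AC is absent or not trusted, which is presumably why the commit keeps an argument to force genotype parsing.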
ebanks 2d1265771f Fix for G: make sure to generate the genotype conformations in the grid for the target frequency when not using grid search for anything except the conformations
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4382 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-29 16:44:53 +00:00
delangel 4556e3b273 First iteration in filling up exact AF calculation with new refactored UG. Code computes EM iterations of exact AF spectrum and returns to caller.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4381 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-29 16:21:54 +00:00
chartl 2708e83198 For show (Queue works nicely): An analysis script that runs QC for the omni chip
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4380 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-29 15:04:17 +00:00