chartl
f978c25b9d
Perhaps both, Eric. Perhaps both.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4422 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-04 13:56:04 +00:00
chartl
0eb777612a
Swap "." over to VCFConstants.MISSING_DEPTH_v3
Why v3, you ask? Why not? Simply because v2 was a String so old and clunky, the sun would fizzle out and grow cold before any VCF could be successfully parsed.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4421 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-04 13:41:41 +00:00
chartl
74087c44ae
Fixed a bug which caused a parsing exception when there was a variant with a dp field of ".", e.g. "GT:DP 0/1:." -- which can happen when using imputation.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4420 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-04 12:37:36 +00:00
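The DP fix described in the entry above can be illustrated with a minimal Java sketch. It is not the GATK code: the class `DepthField`, the methods `parseDepth` and `depthFromSample`, and the `-1` sentinel are all hypothetical names and values chosen for illustration. It only shows the general idea of treating "." as a missing value instead of handing it to an integer parser.

```java
// Hypothetical sketch: class name, method names and the -1 sentinel are
// assumptions made for illustration, not taken from the GATK source.
final class DepthField {
    static final String MISSING_VALUE = "."; // VCF marker for a missing value
    static final int MISSING_DEPTH = -1;     // assumed sentinel for "no depth"

    // Parse a per-sample DP value, tolerating the missing-value marker
    // instead of letting Integer.parseInt throw NumberFormatException.
    static int parseDepth(String raw) {
        if (raw == null || raw.isEmpty() || MISSING_VALUE.equals(raw)) {
            return MISSING_DEPTH;
        }
        return Integer.parseInt(raw);
    }

    // Extract DP from a sample column such as "0/1:." given the FORMAT
    // string "GT:DP"; returns MISSING_DEPTH when DP is absent or ".".
    static int depthFromSample(String format, String sample) {
        String[] keys = format.split(":");
        String[] values = sample.split(":");
        for (int i = 0; i < keys.length && i < values.length; i++) {
            if ("DP".equals(keys[i])) {
                return parseDepth(values[i]);
            }
        }
        return MISSING_DEPTH;
    }
}
```

With the record from the commit message, `DepthField.depthFromSample("GT:DP", "0/1:.")` yields the sentinel rather than an exception.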
ebanks
6448753cf7
Renamed SequenomValidationConvertor to VariantValidationAssessor, since it no longer handles ped/sequenom files (it works on vcfs/variantcontexts instead). Updated all of the wiki docs, including adding instructions on how to convert ped files to vcf, a la Shaun Purcell. We now officially no longer support ped files, everyone. Other misc cleanup in the code.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4419 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-04 02:11:38 +00:00
ebanks
d8db48204e
Fix typo and tell people not to post user errors
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4415 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-03 18:58:03 +00:00
ebanks
490e5e1b0f
Better error when bad ref bases are provided
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4414 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-03 05:40:37 +00:00
aaron
64b7b3f83b
fix for a recent change to the indexing code where we ignore the results of locking the file (this is bad), and as a result don't write the index; this should fix the build.
Off to Yosemite in 4 hours, enjoy the week gsa folks!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4410 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-02 04:35:11 +00:00
depristo
7551ba8249
Trivial refactoring in preparation for on-the-fly indexing
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4409 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 22:32:59 +00:00
rpoplin
2f7892601c
Useful debugging argument added to VariantRecalibrator to only use sites whose qual field is above --qual
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4406 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 21:08:55 +00:00
hanna
575c38fc04
Accidentally failed to commit a missing file.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4405 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 20:26:51 +00:00
delangel
d4398f2686
silly bug fix: if I'm to do a short term hack to avoid -infinity likelihoods I might as well do it right.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4403 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 18:39:45 +00:00
hanna
8d25a5f9f2
A mechanism for supplying attribution text -- mainly useful for external
walkers.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4402 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 18:31:19 +00:00
delangel
e920badcc4
Temporary fix for the case where genotype likelihoods are exactly (1,0,0) or (0,1,0) etc. at a site with the new indel genotyper: this would make us blow up when converting to log space and trying to assign genotypes at a site. A more robust solution is in the works.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4401 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 17:43:43 +00:00
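The failure mode in the entry above comes from taking the logarithm of an exact zero. A minimal Java sketch of one way to guard against it follows. The commit does not say how the temporary fix works, so the clamping approach, the class name `LogLikelihoods` and the floor value `1e-300` are all assumptions for illustration only.

```java
// Hypothetical sketch: the clamping strategy, the names and the floor value
// are assumptions for illustration; the commit does not describe the fix.
final class LogLikelihoods {
    static final double MIN_LIKELIHOOD = 1e-300; // assumed floor; keeps log10 finite

    // Convert linear-space likelihoods to log10 space, clamping zeros so that
    // a vector like (1, 0, 0) does not produce -Infinity entries.
    static double[] toLog10(double[] likelihoods) {
        double[] out = new double[likelihoods.length];
        for (int i = 0; i < likelihoods.length; i++) {
            out[i] = Math.log10(Math.max(likelihoods[i], MIN_LIKELIHOOD));
        }
        return out;
    }
}
```

Without the clamp, `Math.log10(0.0)` returns negative infinity, which then propagates through any sum or comparison done in log space.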
rpoplin
b83fdf8a17
Bug fix in AnalyzeAnnotations. Be sure the site is a biallelic, unfiltered SNP.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4400 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 13:09:46 +00:00
delangel
fa9c21c020
More fixes for exact AF calculation model in new unified genotyper:
a) Fixed bugs in new dynamic programming-based genotyper
b) Fixed up temp hack that handles extended pileups for now.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4398 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-01 02:32:50 +00:00
delangel
eb67aee732
bug fix: forgot to uncomment code to compute genotype likelihoods
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4397 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-30 21:38:22 +00:00
delangel
ece694d0af
Next iteration on new UG framework:
- Brought over exact AF estimation from branch (which is now dead). Exact model is default in UnifiedGenotyperV2.
- Implemented completely new genotyping algorithm given best AF estimate using dynamic programming, which in theory should be better than both greedy search and any HWE-based genotyper.
- Integrated and added new Dindel likelihood estimation model.
- Corrected annotators that would call readBasePileup: since we can be annotating extended events, best way is to interrogate context for kind of pileup and either readBasePileup or readExtendedEventPileup.
All changes above except last one are still in playground since they require more testing.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4396 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-30 21:33:59 +00:00
hanna
bf7fd08810
Fix newly-introduced bug in the PluginManager/DynamicClassResolutionException
where, when the system can't find a plugin of the correct name, the system
prefers to crap all over itself and throw an unintelligible NullPointerException
rather than displaying an intelligent error.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4393 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-30 19:07:05 +00:00
hanna
14e19f4605
(Slightly) better exception text when SAM/BAM output file can't be created.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4392 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-30 18:43:22 +00:00
hanna
1fb8c86f6d
Looks like we've got two competing models for an empty interval list: null and
the empty list. Score another victory for the integration tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4391 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-30 17:11:47 +00:00
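The "two competing models" problem in the entry above is a common one: callers that check only for null miss the empty list, and vice versa. A minimal Java sketch of collapsing both into one representation follows. The commit does not say which model was kept, so choosing the empty list here is an assumption, as are the names `Intervals`, `normalize` and `isEmpty`.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: names are illustrative, and picking the empty list as
// the single representation is an assumption, not what the commit states.
final class Intervals {
    // Collapse the two representations of "no intervals" (null and the
    // empty list) into one, so callers need only a single check.
    static <T> List<T> normalize(List<T> intervals) {
        return intervals == null ? Collections.<T>emptyList() : intervals;
    }

    // True for both null and an empty list.
    static <T> boolean isEmpty(List<T> intervals) {
        return normalize(intervals).isEmpty();
    }
}
```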
hanna
78343be52c
At some time in the recent past, we lost our ability to process the '-L all'
argument. Brought it back, and added an integrationtest to make sure it
stays around.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4390 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-30 15:58:43 +00:00
delangel
e80742e72f
Use -o as argument for output file in ProduceBeagleInputWalker, to be consistent with other walkers (you're welcome, chartl :)).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4386 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-29 22:46:39 +00:00
hanna
732aa32758
Every Sting app from now on will be forced into the US English locale.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4385 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-29 21:55:21 +00:00
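Forcing a locale, as in the entry above, typically matters because number formatting follows the JVM default: in many locales `String.format("%.2f", 0.5)` prints a comma as the decimal separator, which breaks downstream parsers. A minimal Java sketch using the standard `java.util.Locale` API follows. The class `LocaleSetup` and its methods are hypothetical, and the commit does not say where in Sting the call is made.

```java
import java.util.Locale;

// Hypothetical sketch: the class and method names are illustrative; only
// Locale.setDefault and String.format are real JDK calls.
final class LocaleSetup {
    // Force the JVM-wide default locale so number formatting no longer
    // depends on the user's environment.
    static void forceUsEnglish() {
        Locale.setDefault(Locale.US);
    }

    // Formats with the default locale; after forceUsEnglish() the decimal
    // separator is '.'.
    static String formatFraction(double value) {
        return String.format("%.2f", value);
    }
}
```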
fromer
20ffe484bc
Added detection and INFO field marking of phasing inconsistencies (and optional filtration using --filterInconsistentSites)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4384 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-29 19:28:56 +00:00
rpoplin
a6c7de95c8
By using the AC info field instead of parsing the genotypes we cut 78% off the runtime of VariantRecalibrator. There is a new argument to force the parsing of genotypes if necessary. Various other optimizations throughout.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4383 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-29 18:56:50 +00:00
ebanks
2d1265771f
Fix for G: make sure to generate the genotype conformations in the grid for the target frequency when not using grid search for anything except the conformations
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4382 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-29 16:44:53 +00:00
delangel
4556e3b273
First iteration in filling up exact AF calculation with new refactored UG. Code computes EM iterations of exact AF spectrum and returns to caller.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4381 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-29 16:21:54 +00:00
ebanks
0d71dff928
Small bug fix to the new UG (need to initialize the entire posteriors array) means that we also get identical results as old UG when calling with 60 samples in the pilot1 data. Now that I'm happier with UGv2, I've transitioned it to use the correct AF priors instead of the busted ones still in the old UG.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4379 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-29 14:24:50 +00:00
hanna
eee134baf2
Chris found a bug in the downsampler where, if the number of reads entering
the pileup at the next alignment start is large, we don't add as many of those
incoming reads as we should. No integration tests were affected.
Thanks, Chris!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4378 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-29 11:18:12 +00:00
ebanks
0ec07ad99a
Initial version of refactored Unified Genotyper. Using SNP genotype likelihoods and GRID_SEARCH AF estimation models, achieves the exact same results as original UG on 1-2 samples with the exception of strand bias (not implemented yet); other than that I have no idea. Needs tons more testing. Do not use. For Guillermo only.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4377 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-29 08:42:25 +00:00
kshakir
6df7f9318f
For enums generate the full path to the Enum type to avoid collisions such as enum Model and enum Model used in the same class.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4376 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-29 05:28:59 +00:00
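The collision in the entry above arises when two enums share a simple name. A minimal Java sketch of the difference between the simple and the fully qualified name follows. The nested classes `GridSearch` and `Exact`, their constants, and the helper `generatedTypeName` are made up for illustration, and the actual generator code is not shown in the commit.

```java
// Hypothetical sketch: all class, enum and method names here are invented
// for illustration; only the java.lang.Class methods are real JDK calls.
final class EnumNames {
    static final class GridSearch { enum Model { A, B } }
    static final class Exact { enum Model { C, D } }

    // Name to emit in generated source: fully qualified, so that two enums
    // sharing the simple name "Model" stay distinguishable.
    static String generatedTypeName(Class<?> type) {
        return type.getCanonicalName();
    }
}
```

Here `getSimpleName()` returns `Model` for both enums, while `getCanonicalName()` includes the enclosing types and therefore differs.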
fromer
e322e71c2f
Restored SVN history for phasing
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4373 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-29 00:02:02 +00:00
fromer
720aaca8a0
Trying to restore SVN history for phasing
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4372 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-28 23:50:28 +00:00
fromer
bf88117ead
Trying to restore SVN history for phasing directory
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4371 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-28 23:48:24 +00:00
fromer
dfb5143a41
Restore folder
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4370 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-28 23:46:07 +00:00
fromer
7c909bef82
Moved phasing classes out of playground! The code is still under production, though...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4369 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-28 23:21:28 +00:00
fromer
8d8980e8eb
Fixed phasing algorithm to: 1. More correctly weed out irrelevant reads and sites; 2. Crudely flag sites with large phase discrepancies between reads
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4368 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-28 23:02:53 +00:00
chartl
5a5c72c80d
Accidentally committed some debug output to PackageUtils, reverting change.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4367 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-28 21:58:42 +00:00
chartl
862c94c8ce
Small change for Matt -- output partition types in lexicographic order.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4365 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-28 20:08:03 +00:00
ebanks
7ad87d328d
Make sure to uppercase ref bases since they aren't coming from the engine
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4364 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-28 19:05:46 +00:00
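Reference sequences often carry lower-case (soft-masked) bases, so code that reads them outside the engine has to normalize case itself, as the entry above notes. A minimal Java sketch follows. It assumes the bases are held as ASCII bytes, and the class `RefBases` and method `toUpperCase` are hypothetical names, not the GATK ones.

```java
// Hypothetical sketch: the byte[] representation and the names are
// assumptions for illustration, not taken from the GATK source.
final class RefBases {
    // Return an upper-cased copy of ASCII reference bases
    // (e.g. soft-masked "acgt" becomes "ACGT"); other bytes are unchanged.
    static byte[] toUpperCase(byte[] bases) {
        byte[] out = new byte[bases.length];
        for (int i = 0; i < bases.length; i++) {
            byte b = bases[i];
            out[i] = (b >= 'a' && b <= 'z') ? (byte) (b - 32) : b;
        }
        return out;
    }
}
```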
bthomas
96cccafb0d
Adding a few helper methods for accessing sample metadata, and associated unit tests. These are motivated by discussion with Ryan about how he'll use sample metadata in VariantEvalwalker - hopefully will make it easier for him. Methods are:
-- getToolkit().subContextFromSampleProperty(): filters a VariantContext to genotypes that come from samples that have a given property value
-- getToolkit().getSamplesWithProperty(): gets all samples with a given property
-- getToolkit().getSamplesFromVariantContext(): sample objects that are referenced by name in a VariantContext
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4361 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-28 02:16:25 +00:00
ebanks
1034853a84
Adding 'solexa' to list of known/supported platforms
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4357 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-27 02:38:38 +00:00
aaron
70f03a7113
first pass of well-formatted tribble exceptions
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4352 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-25 03:29:33 +00:00
kshakir
edaa278edd
Removed cases where various toolkit functions were accessing GenomeAnalysisEngine.instance.
This will allow other programs like Queue to reuse the functionality.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4351 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-25 02:49:30 +00:00
hanna
497bcbcbb7
Recent changes to the build system make the build system complain loudly about
pieces of core that depend on playground. Most of these have been eliminated by
(temporarily) promoting Aaron's report system to core in this checkin. I'll
follow up with other changes separately.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4350 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-24 22:09:12 +00:00
hanna
6ebca5d219
Enhancements to build external projects for walker sharing.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4348 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-24 21:17:16 +00:00
corin
eb1fa4bff3
changes an argument to an output so I can use it to track dependencies in queue
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4347 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-24 21:07:09 +00:00
depristo
745b8cc6d3
GATK now detects lexicographically sorted human data and throws a UserException when it is provided
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4343 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-24 15:19:48 +00:00
rpoplin
1931b2e1bd
Three fixes for VariantFiltrationWalker: Trying to filter an empty VCF file will produce a well-formed VCF file with zero records instead of a blank file, needed for pipelines. The first record's genotype info fields are now in the same order as all the others. The VCF header lines are pulled from just the input variant rod instead of from all rods.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4341 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-24 13:52:56 +00:00
kshakir
4ed9f437e9
Sliced the GAE in half like a Gordian knot to avoid the constant merge conflicts.
The GAE half has all the walker specific code. The new "Abstract" GAE has the rest of the logic.
More refactoring to come, with the end goal of having a tool that other java analysis programs (Queue, etc.) can use to read in genomic data.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4339 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-23 23:28:55 +00:00