Mark DePristo
282b4afdca
Licenses for GATK 1 and 2 beta
2012-04-17 11:40:02 +01:00
Mark DePristo
87be63c7e4
Improve variantCallQC.R
...
-- Refactor plotting utilities into master utility in gsalib. Everyone can use it now
-- Better plots for standard variantCallQC
2012-04-13 17:00:37 -04:00
Mark DePristo
3f6b2423d8
Update VE IT to reflect new fields and bugfixes
2012-04-13 17:00:37 -04:00
Mark DePristo
f9190b6fcd
VariantEvalUnitTest is better named VariantEvalWalkerUnitTest
2012-04-13 17:00:37 -04:00
Mark DePristo
23ccf772d4
IndelSummary now emits all of the underlying counts for the ratios, percentages, etc. that it computes
2012-04-13 17:00:36 -04:00
Mark DePristo
542a8e3306
No longer stratify ratio_of_1_and_2_to_3_bp_insertions by 1bp context
2012-04-13 17:00:36 -04:00
Mark DePristo
84d1e8713a
Infrastructure for combining VariantEvaluations
...
-- Not hooked up yet, so the output of VariantEval should be the same as before
-- Implemented a VariantEvalUnitTest that tests the low level strat / eval combinatorics and counting routines
-- Better docs throughout
2012-04-13 17:00:36 -04:00
Mark DePristo
38986e4240
Documentation for StratificationManager
2012-04-13 17:00:36 -04:00
Mark DePristo
ab06d53867
Useful test constructor for unit tests in RefMetaDataTracker
2012-04-13 17:00:36 -04:00
Mark DePristo
285e61a227
Bugfix for IndelSummary
...
-- multi allelic count should be % not ratio
2012-04-13 17:00:35 -04:00
Mark DePristo
adc519f7d1
VariantCallQC improvements
...
-- More meaningful histogram axes
-- Counts of multi-allelic variants
-- Removed from stratified plots the metrics that didn't mean anything (SNP : indel ratio, for example)
2012-04-13 17:00:35 -04:00
Mark DePristo
e6d5cb46d2
Improvements and bugfixes to IndelSummary
...
-- Now properly includes both bi- and multi-allelic variants. These are actually counted as well, and emitted as counts and % of sites with multiple alleles
-- Bug fix for gold standard rate
2012-04-13 17:00:35 -04:00
Mark DePristo
bfa966a4e9
Bugfix for OneBPIndel
...
-- Previously was only including 1 bp insertions in stratification
2012-04-13 17:00:35 -04:00
Mark DePristo
2aa2d9aec0
Merged bug fix from Stable into Unstable
2012-04-13 09:25:43 -04:00
Mark DePristo
27e7e17dc7
New way to handle exceptions in multi-threaded GATK
...
-- HMS no longer tries to grab and throw all exceptions. Exceptions are just thrown directly now.
-- Proper error handling is now done by functions in HMS, which are used by ShardTraverser and TreeReducer
-- Better printing of stack traces in WalkerTest
2012-04-13 09:23:33 -04:00
Mark DePristo
e85e9a8cf5
More extensive testing of type of error thrown in multi-threaded walker test
...
-- Unfortunately the result of the multi-threaded test is non-deterministic, so run the test 10 times to check that the right exception is always thrown
-- Now prints the stack trace and exception message of the caught exception of the wrong type, if this occurs
2012-04-13 09:23:33 -04:00
Eric Banks
297afc7911
Added unit test to ensure that we genotype correctly cases with really large GLs
2012-04-12 15:43:14 -04:00
Eric Banks
818e8c2fb9
Resolving merge conflicts
2012-04-12 15:19:44 -04:00
Eric Banks
0dd571928d
Let's not have the indel model emit more than the max possible number of genotypable alt alleles (since we may not be able to subset down to the best ones).
2012-04-12 15:16:29 -04:00
Guillermo del Angel
4004ec3e6f
Preemptive bug fixes in CalibrateGenotypeLikelihoods: prevent a null pointer exception if we're in indels mode and the truth file happens to have a SNP or mixed record
2012-04-12 14:10:09 -04:00
Eric Banks
f77a6d18b8
Bad conflict merge before
2012-04-12 09:56:49 -04:00
Eric Banks
33a8bdd75f
Resolving merge conflicts
2012-04-12 09:51:55 -04:00
Eric Banks
b659b16b31
Generate User Error for bad POS value
2012-04-12 09:49:35 -04:00
Eric Banks
cc71baf691
Don't allow users to try to genotype more than the max possible value (catch and throw a User Error at startup). Better docs explaining that users shouldn't play with this value unless they know what they are doing.
2012-04-12 09:18:44 -04:00
Eric Banks
5bf9dd2def
A framework to get annotations working in the HaplotypeCaller (and ART walkers in general).
...
Adding support for active-region-based annotation for most standard annotations. I need to discuss with Ryan what to do about tests that require offsets into the reads (since I don't have access to the offsets), such as the ReadPosRankSumTest.
IMPORTANT NOTE: this is still very much a dev effort and can only be accessed through private walkers (i.e. the HaplotypeCaller). The interface is in flux and so we are making no attempt at all to make it clean or to merge this with the Locus-Traversal-based annotation system. When we are satisfied that it's working properly and have settled on the proper interface, we will clean it up then.
2012-04-11 16:22:12 -04:00
Eric Banks
5b7da3831f
Not sure why this didn't make it into the last push, but here's a working MD5 for the NDA annotation in UG
2012-04-11 13:49:50 -04:00
Eric Banks
7aa654d13f
New interface for some dev work that Ryan and I are doing; only accessible from private walkers right now
2012-04-11 13:49:09 -04:00
Eric Banks
dc90508104
Adding a new annotation to UG calls: NDA = number of discovered (but not necessarily genotyped) alleles for the site. This could help downstream analysis esp. of indels for wonky sites (since we only use the top 2-3 alleles). Not enabled by default but we can change that if this turns out to be useful.
2012-04-11 13:47:10 -04:00
Eric Banks
d2142c3aa7
Adding integration test for Flag Stat
2012-04-10 22:40:38 -04:00
Eric Banks
f560611fe8
Merged bug fix from Stable into Unstable
2012-04-10 22:26:53 -04:00
Eric Banks
f46f7d0590
Fix the stats coming out of FlagStat. I will add an integration test in unstable
2012-04-10 22:26:10 -04:00
Mauricio Carneiro
cd842b650e
Optimizing DiagnoseTargets
...
* Fixed output format to get a valid VCF
* Optimized the per-sample pileup routine from O(n^2) to O(n)
* Added support for overlapping intervals
* Removed expand target functionality (for now)
* Removed total depth (pointless metric)
2012-04-10 17:43:59 -04:00
Ryan Poplin
1df0adf862
Fixing ActivityProfile unit test.
2012-04-10 15:28:27 -04:00
Ryan Poplin
e3cc7cc59c
Resolving merge conflict.
2012-04-10 14:50:27 -04:00
Ryan Poplin
a4634624b7
There are now three triggering options in the HaplotypeCaller. The default (mismatches, insertions, deletions, high quality soft clips), an external alleles file (from the UG for example), or extended triggers which include low quality soft clips, bad mates and unmapped mates. Added better algorithm for band pass filtering an ActivityProfile and breaking them apart when they get too big. Greatly increased the specificity of the caller by battening down the hatches on things like base quality and mapping quality thresholds for both the assembler and the likelihood function.
2012-04-10 14:48:23 -04:00
Eric Banks
10e74a71eb
We now allow arbitrary annotations other than dbSNP (e.g. HM3) to come out of the Unified Genotyper. This was already set up in the Variant Annotator Engine and was just a matter of hooking UG up to it. Added integration test to ensure correct behavior.
2012-04-10 12:30:35 -04:00
David Roazen
ec9822b2a7
Revert "Disable HaplotypeCaller tests in Stable"
...
These tests should remain enabled in Unstable
This reverts commit 0e250b050f88777b5f0bbfccf93a3315701d3ab0.
2012-04-10 09:47:58 -04:00
David Roazen
2091a30b8f
Merged bug fix from Stable into Unstable
2012-04-10 09:47:40 -04:00
David Roazen
bb1dff4ea4
Disable HaplotypeCaller tests in Stable
2012-04-10 09:46:08 -04:00
Mark DePristo
b43d21056b
Merged bug fix from Stable into Unstable
2012-04-10 09:42:09 -04:00
Mark DePristo
6885e2d065
UserException fixes for GATK_logs recent errors
...
-- SamFileReader.java:525
-- BlockCompressedInputStream:376
These were both instances where we weren't catching and rethrowing Picard exceptions as UserExceptions.
2012-04-10 07:37:42 -04:00
Mark DePristo
8507cd7440
Throw UserException for bad dict / chain files
2012-04-10 07:22:43 -04:00
Ryan Poplin
cd9bf1bfc3
Changing IndelSummary eval module so that PostCallingQC.scala can run with MIXED-record VCFs.
2012-04-10 00:22:40 -04:00
Roger Zurawicki
9ece93ae9c
DiagnoseTargets now outputs a VCF file
...
- refactored the statistics classes
- concurrent callable statuses by sample are now available.
Signed-off-by: Mauricio Carneiro <carneiro@broadinstitute.org>
2012-04-09 16:40:20 -04:00
Guillermo del Angel
719ec9144a
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-04-09 14:53:19 -04:00
Guillermo del Angel
550179a1f7
Major refactorings/optimizations of pool caller, output still bit-true to older version: a) Move DEFAULT_PLOIDY from UnifiedGenotyperEngine to VariantContextUtils. b) Optimize iteration through all possible allele combinations. c) Don't store log PL's in hashmap from allele conformations to double, it was too slow. Things can still be optimized much more down the line if needed. d) Remove remaining traces of genotype priors.
2012-04-09 14:53:05 -04:00
Eric Banks
d312fcdae8
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-04-09 14:28:49 -04:00
Eric Banks
f82986ee62
Adding unit tests for the very important log10sumLog10 util method.
2012-04-09 14:28:25 -04:00
Mark DePristo
63b080e353
Added index for end_time
2012-04-09 14:11:32 -04:00
Eric Banks
ea4300d583
Refactoring so that Unified Argument Collection doesn't use deprecated classes.
2012-04-09 13:45:17 -04:00