Eric Banks
827fe6130c
Adding hidden printing option. Also, always run UG in mode GENOTYPE_GIVEN_ALLELES given that we don't actually test for the correct alleles (otherwise UG may choose a different allele and we may falsely validate the wrong one).
2011-09-01 11:40:35 -04:00
Mark DePristo
ac49b8d26b
Conditional support for PerformanceTrackingQuerySource to measure Tribble / GATK bridge performance
...
-- Removed DEBUG option, instead use MEASURE_TRIBBLE_QUERY_PERFORMANCE in RMDTrackerBuilder
2011-09-01 10:41:55 -04:00
Mauricio Carneiro
4b5a7046c5
Making ReadLengthDistribution Public
...
Found this neat little walker Kiran wrote stashed in the private tree. Very useful. Generalized it a bit, added GATKDocs and moved it to public. I might include it as a QC step on the pacbio processing pipeline.
* generalize it so it works with non-paired-end reads.
* generalize it to work with no read group information
2011-08-31 15:52:28 -04:00
Eric Banks
c2f0db969b
Don't use the default deletion value from UG if not asking to have it set
2011-08-29 13:48:10 -04:00
Eric Banks
bb7a37e8f2
We need to allow reference calls in the input VCF for the GenotypeAndValidate walker when using the BAM as truth so that we can test supposed monomorphic calls against the truth.
2011-08-29 13:19:35 -04:00
Ryan Poplin
bc252a0d62
misc minor bug fixes in assembly. Increasing the minimum number of bad variants to be used in negative model training in the VQSR
2011-08-29 08:11:31 -04:00
Mark DePristo
a5c65fc133
Debugging information to print out the Query tracks
2011-08-28 18:54:49 -04:00
Mark DePristo
7bf006278d
Moved ResolveHostname to general utils as a static function
2011-08-28 12:04:16 -04:00
Mark DePristo
e37a638e09
Fix for disallowed characters in GATKReportTable
...
-- Illegal characters are automatically replaced with _
2011-08-26 13:24:06 -04:00
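The replacement rule in e37a638e09 can be pictured with a small sketch. The commit does not say which characters GATKReportTable disallows, so the character class below (anything other than ASCII letters, digits, and underscore) is an assumption, and `ReportNameSanitizer` is a hypothetical name, not a GATK class.

```java
import java.util.regex.Pattern;

public class ReportNameSanitizer {
    // ASSUMPTION: the real set of disallowed characters is not given in the
    // commit message; here everything outside [A-Za-z0-9_] counts as illegal.
    private static final Pattern DISALLOWED = Pattern.compile("[^A-Za-z0-9_]");

    /** Replaces every disallowed character with an underscore. */
    public static String sanitize(String raw) {
        return DISALLOWED.matcher(raw).replaceAll("_");
    }

    public static void main(String[] args) {
        System.out.println(sanitize("read group:A/1"));
    }
}
```

Replacing silently, rather than throwing, keeps report generation from failing on an awkward sample or read-group name, at the cost that two distinct names can collapse to the same sanitized value.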
Mark DePristo
eef1ac415a
Merge branch 'master' into rodTesting
...
Conflicts:
public/java/src/org/broadinstitute/sting/gatk/walkers/variantutils/VariantsToTable.java
2011-08-26 00:35:41 -04:00
Eric Banks
9b7512fd94
Just because there's a ref base doesn't mean the VC needs to be padded
2011-08-25 22:42:14 -04:00
Ryan Poplin
29c7b10f7b
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-24 15:18:58 -04:00
Guillermo del Angel
e618cb1e79
a) Renamed/expanded the SelectVariants arguments that choose particular kinds of variants and particular allelic types. Instead of -Indels or -SNPs, we can now specify, for example, -selectType [MIXED|INDEL|SNP|MNP|SYMBOLIC]. To select biallelic or multiallelic variants, use -restrictAllelesTo [BIALLELIC|MULTIALLELIC]. Corresponding gatkdocs changes.
...
b) More useful AC,AF logging in VariantsToTable with multiallelic sites: instead of logging comma-separated values, log max value by default. Hidden, experimental argument -logACSum to log sum of ACs instead. This is due to extreme slowness of R in parsing strings to tokens and computing max/sum itself (~100x slower than gatk).
c) Added integration test for new SelectVariants commands
2011-08-24 12:25:50 -04:00
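Item (b) of e618cb1e79 describes collapsing the per-allele AC values at a multiallelic site into one number: the maximum by default, or the sum when -logACSum is given. A minimal sketch of that choice follows; `AlleleCountSummary` and `summarize` are hypothetical names and do not reflect the actual VariantsToTable code.

```java
public class AlleleCountSummary {
    /**
     * Collapses the alternate-allele counts of one site into a single value.
     * ASSUMPTION: counts are non-negative, so starting from 0 is safe for max.
     *
     * @param altAlleleCounts one AC value per alternate allele
     * @param logSum          true mimics -logACSum (sum), false the default (max)
     */
    public static int summarize(int[] altAlleleCounts, boolean logSum) {
        int result = 0;
        for (int ac : altAlleleCounts) {
            result = logSum ? result + ac : Math.max(result, ac);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] acs = {3, 7, 2};
        System.out.println("max=" + summarize(acs, false));
        System.out.println("sum=" + summarize(acs, true));
    }
}
```

Emitting a single integer per site means downstream R code can read the column as numeric without splitting comma-separated strings, which is the slowdown the commit message cites.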
Mark DePristo
28ee6dac41
Fixed spelling mistake
2011-08-24 10:14:45 -04:00
Mark DePristo
569e1a1089
Walker.isDone() aborts execution early
...
-- Useful if you want to have a parameter like MAX_RECORDS that wants the walker to stop after some number of map calls without having to resort to the old System.exit() call directly.
2011-08-23 16:53:06 -04:00
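The early-abort behaviour described in 569e1a1089 can be sketched as a traversal loop that polls the walker before each map call. The `SketchWalker` interface, `CountingWalker`, and `traverse` below are hypothetical stand-ins, not the GATK engine's real types; only the idea of an `isDone()` check comes from the commit message.

```java
import java.util.Arrays;
import java.util.List;

public class EarlyStopSketch {
    /** Hypothetical, stripped-down walker contract. */
    interface SketchWalker<T> {
        void map(T record);
        boolean isDone();
    }

    /** Stops after a fixed number of map calls, like a MAX_RECORDS parameter. */
    static class CountingWalker implements SketchWalker<String> {
        private final int maxRecords;
        private int seen = 0;

        CountingWalker(int maxRecords) { this.maxRecords = maxRecords; }

        public void map(String record) { seen++; }
        public boolean isDone() { return seen >= maxRecords; }
        int seen() { return seen; }
    }

    /** Engine-side loop: ask the walker whether it is done before each record. */
    static <T> void traverse(Iterable<T> records, SketchWalker<T> walker) {
        for (T record : records) {
            if (walker.isDone()) {
                break; // clean shutdown path instead of System.exit()
            }
            walker.map(record);
        }
    }

    /** Convenience wrapper: returns how many records were actually mapped. */
    static int run(List<String> records, int maxRecords) {
        CountingWalker walker = new CountingWalker(maxRecords);
        traverse(records, walker);
        return walker.seen();
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("r1", "r2", "r3", "r4"), 2));
    }
}
```

Because the loop simply breaks, any code that follows the traversal (closing writers, reporting results) still runs, which a direct `System.exit()` call would skip.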
Ryan Poplin
a1a1fac9e4
Likelihood engine now gives non-zero likelihoods. Using HMM function that can handle context specific gap open and gap continuation penalties
2011-08-23 13:43:07 -04:00
Guillermo del Angel
6e2552a9ef
Merge fix
2011-08-23 12:40:43 -04:00
Guillermo del Angel
8b7a0b3b62
Two new arguments to SelectVariants to exclude either multiallelic or biallelic sites from input vcf
2011-08-23 12:40:01 -04:00
Guillermo del Angel
ee68713267
Further bug fixes to CountVariants: stratifications were wrong when genotypes had no-calls. For example, if we stratified by sample and a sample had a no-call, the no-call was treated as a true variant and the counts were incorrectly increased.
2011-08-22 20:42:47 -04:00
Guillermo del Angel
c270384b2e
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-22 20:39:32 -04:00
Guillermo del Angel
8ae24912f4
a) Misc fixes in Phase1 indel vqsr script,
...
b) More R-friendly VariantsToTable printing of AC in case of multiple alt alleles
c) Renamed FixPLOrderingWalker to FixGenotypesWalker and rewrote it: the older code is no longer needed, so it now replaces genotypes whose PLs are all zero with a no-call.
2011-08-22 20:39:06 -04:00
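Item (c) of 8ae24912f4 replaces genotypes whose PLs are all zero with a no-call. A minimal sketch of that rule, assuming genotypes are plain strings and PLs a plain `int[]`; `AllZeroPlSketch` and `fixGenotype` are hypothetical names and the real walker operates on GATK genotype objects.

```java
public class AllZeroPlSketch {
    static final String NO_CALL = "./.";

    /**
     * Returns a no-call when every PL is zero, otherwise the genotype unchanged.
     * ASSUMPTION: a genotype with missing or empty PLs is left as-is.
     */
    public static String fixGenotype(String genotype, int[] pls) {
        if (pls == null || pls.length == 0) {
            return genotype;
        }
        for (int pl : pls) {
            if (pl != 0) {
                return genotype; // at least one informative likelihood
            }
        }
        return NO_CALL; // all-zero PLs carry no information about the genotype
    }

    public static void main(String[] args) {
        System.out.println(fixGenotype("0/1", new int[]{0, 0, 0}));
        System.out.println(fixGenotype("0/1", new int[]{20, 0, 30}));
    }
}
```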
Mark DePristo
1eab9be35d
Now with accurate javadoc
2011-08-22 17:25:15 -04:00
Ryan Poplin
f93a554b01
updating exome specific parameters in MDCP
2011-08-21 10:25:36 -04:00
Ryan Poplin
dbff84c54e
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-21 10:09:19 -04:00
Eric Banks
a8cbced71b
Bug fix for Ryan: check for no context
2011-08-20 22:49:51 -04:00
Eric Banks
0ccd173967
Fixing the recent SelectVariants fix
2011-08-20 21:30:08 -04:00
Ryan Poplin
b008676878
fixing the previous fix
2011-08-20 21:21:55 -04:00
Guillermo del Angel
782453235a
Updated VariantEvalIntegrationTest since there's a new column separating nMixed and nComplex in CountVariants
...
Misc updates to WholeGenomeIndelCalling.scala
Bug fix in VariantEval (may be temporary, need more investigation): if -disc option is used in sites-only vcf's then a null pointer exception is produced, caused by recent introduction of -xl_sf options.
2011-08-20 12:24:22 -04:00
Ryan Poplin
539e157ecd
Fixing misc parameters in MDCP. The pipeline now does VariantEval of output by default. Fix for NaN vqslod values in VQSR
2011-08-20 11:28:48 -04:00
Guillermo del Angel
4939648fd4
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-20 08:50:43 -04:00
Ryan Poplin
ddb5045e14
Updating the methods development calling pipeline for the new rod binding syntax and the new best practices.
2011-08-19 19:29:51 -04:00
Mark DePristo
b08d63a6b8
Documentation and code cleanup for ClipReads, CallableLoci, and VariantsToTable
...
-- Swapped -o [summary] and -ob [bam] for more standard -o [bam] and -os [summary] arguments.
-- @Advanced arguments
2011-08-19 15:06:37 -04:00
Mark DePristo
49e831a13b
Should have checked in
2011-08-19 14:35:16 -04:00
Mauricio Carneiro
7b5fa4486d
GenotypeAndValidate - Added docs to the @Arguments
2011-08-19 13:35:11 -04:00
Ryan Poplin
0f25167efd
minor fix in VariantEval docs
2011-08-19 11:01:04 -04:00
Guillermo del Angel
269ed1206c
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-19 09:32:20 -04:00
Eric Banks
40e67cff1b
I like the @Advanced annotation
2011-08-18 22:27:34 -04:00
Mark DePristo
2457c7b8f5
Merge branch 'master' into help
2011-08-18 22:20:43 -04:00
Eric Banks
77fa2c1546
Renaming read filters with a superfluous 'Read' in their names. Kept the ones that made sense to have it (e.g. MalformedReadFilter).
2011-08-18 22:01:33 -04:00
Mark DePristo
1d3799ddf7
Merge branch 'master' into help
2011-08-18 22:00:29 -04:00
Mark DePristo
f7414e39bc
Improvements to GATKDocs
...
-- Allowed values for RodBinding<T> are displayed in the GATKDocs
-- Longest name up to 30 characters is chosen for main argument list (suggested by Ryan/Mauricio)
-- Features are listed in alphabetical order
-- Moved useful getParameterizedType() function to JVMUtils
-- Tests of these features in the Documentation Test
2011-08-18 21:20:09 -04:00
Ryan Poplin
09d099cada
Added GATKDocs to the UnifiedGenotyper.
2011-08-18 20:57:02 -04:00
Guillermo del Angel
626cbf9411
Bug fixes and cleanups for IndelStatistics
2011-08-18 16:28:40 -04:00
Guillermo del Angel
58560a6d50
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-18 16:17:52 -04:00
Guillermo del Angel
3dfb60a46e
Fixing up and refactoring usage of indel categories. On a variant context, isInsertion() and isDeletion() are now removed because behavior before was wrong in case of multiallelic sites. Now, methods isSimpleInsertion() and isSimpleDeletion() will return true only if sites are biallelic. For multiallelic sites, isComplex() will return true in all cases.
...
VariantEval module CountVariants is corrected and an additional column is added so that we log mixed events and complex indels separately (before they were being conflated).
VariantEval module IndelStatistics is considerably simplified, as the sample stratification was wrong and redundant; it should now work with the VE-generic Sample stratification. Several columns are renamed or removed since they're not really useful.
2011-08-18 16:17:38 -04:00
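The indel categories introduced in 3dfb60a46e can be illustrated with a simplified classifier. This is a sketch under stated assumptions, not the VariantContext implementation: alleles are plain strings, an indel is detected purely by a length difference from the reference, and `IndelCategorySketch`, `Category`, and `classify` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

public class IndelCategorySketch {
    enum Category { SIMPLE_INSERTION, SIMPLE_DELETION, COMPLEX, NOT_INDEL }

    /**
     * ASSUMPTION: an alternate allele whose length differs from the reference
     * marks the site as an indel. Biallelic indel sites are "simple";
     * multiallelic indel sites are always reported as complex.
     */
    public static Category classify(String ref, List<String> alts) {
        boolean anyLengthChange = false;
        for (String alt : alts) {
            if (alt.length() != ref.length()) {
                anyLengthChange = true;
            }
        }
        if (!anyLengthChange) {
            return Category.NOT_INDEL;
        }
        if (alts.size() > 1) {
            return Category.COMPLEX; // multiallelic: never simple
        }
        return alts.get(0).length() > ref.length()
                ? Category.SIMPLE_INSERTION
                : Category.SIMPLE_DELETION;
    }

    public static void main(String[] args) {
        System.out.println(classify("A", Arrays.asList("ATT")));
        System.out.println(classify("ATT", Arrays.asList("A")));
        System.out.println(classify("A", Arrays.asList("AT", "ATT")));
    }
}
```

Returning a single category per site avoids the earlier ambiguity where a multiallelic site could look like both an insertion and a deletion.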
Chris Hartl
6b256a8ac5
Merge branch 'master' of ssh://gsa2/humgen/gsa-scr1/chartl/dev/git
2011-08-18 15:29:24 -04:00
Chris Hartl
a8935c99fc
Adding docs for DepthOfCoverage and ValidationAmplicons
2011-08-18 15:28:35 -04:00
Mark DePristo
f2f51e35e3
Merge branch 'master' into help
2011-08-18 14:05:33 -04:00
Mark DePristo
faa3f8b6f6
Only concrete classes are now documented
2011-08-18 14:04:47 -04:00
Ryan Poplin
7c4ce6d969
Added GATKDocs for the VQSR walkers.
2011-08-18 14:00:39 -04:00