Mark DePristo
e37a638e09
Fix for disallowed characters in GATKReportTable
...
-- Illegal characters are automatically replaced with _
2011-08-26 13:24:06 -04:00
Mark DePristo
c0503283df
Spelling fix requires md5 updates
2011-08-26 07:40:44 -04:00
Mark DePristo
eef1ac415a
Merge branch 'master' into rodTesting
...
Conflicts:
public/java/src/org/broadinstitute/sting/gatk/walkers/variantutils/VariantsToTable.java
2011-08-26 00:35:41 -04:00
Eric Banks
9b7512fd94
Just because there's a ref base doesn't mean the VC needs to be padded
2011-08-25 22:42:14 -04:00
Mark DePristo
e01273ca7c
Queue now writes out queueJobReport.pdf
...
-- General purpose RScript executor in java (please use when invoking RScripts)
-- Removed groupName. This is now analysisName
-- Explicitly added capability to enable/disable individual QFunction
2011-08-25 16:57:11 -04:00
Eric Banks
09a729da3a
Removing incorrect comment
2011-08-25 15:42:52 -04:00
Eric Banks
8bbef79fc2
Create clipped alleles during allele parsing instead of creating a full VC, clipping alleles, and regenerating the VC from scratch.
2011-08-25 15:37:26 -04:00
Ryan Poplin
29c7b10f7b
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-24 15:18:58 -04:00
Ryan Poplin
e5008aba00
Output the top two haplotypes as a variant call by running Smith-Waterman alignment against the reference and calling any difference as variation. This is the first version that runs end-to-end, taking in reads as a BAM file and writing out variant calls in VCF.
2011-08-24 15:18:44 -04:00
Guillermo del Angel
e618cb1e79
a) Renamed/expanded the SelectVariants arguments that choose particular kinds of variants and particular allelic types. Instead of -Indels or -SNPs, we can now specify, for example, -selectType [MIXED|INDEL|SNP|MNP|SYMBOLIC]. To select biallelic or multiallelic variants, use -restrictAllelesTo [BIALLELIC|MULTIALLELIC]. Corresponding gatkdocs changes.
...
b) More useful AC,AF logging in VariantsToTable with multiallelic sites: instead of logging comma-separated values, log max value by default. Hidden, experimental argument -logACSum to log sum of ACs instead. This is due to extreme slowness of R in parsing strings to tokens and computing max/sum itself (~100x slower than gatk).
c) Added integration test for new SelectVariants commands
2011-08-24 12:25:50 -04:00
Mark DePristo
28ee6dac41
Fixed spelling mistake
2011-08-24 10:14:45 -04:00
Ryan Poplin
f37875600a
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-24 09:02:44 -04:00
Khalid Shakir
1ecbf05aae
Avoid segfaults due to out-of-date and possibly abandoned LSF DRMAA implementation when use'ing LSF instead of .combined_LSF_SGE
2011-08-23 23:49:36 -04:00
Mark DePristo
569e1a1089
Walker.isDone() aborts execution early
...
-- Useful if you have a parameter like MAX_RECORDS and want the walker to stop after some number of map calls without resorting to calling System.exit() directly.
2011-08-23 16:53:06 -04:00
Ryan Poplin
a1a1fac9e4
Likelihood engine now gives non-zero likelihoods. Using HMM function that can handle context-specific gap open and gap continuation penalties.
2011-08-23 13:43:07 -04:00
Guillermo del Angel
6e2552a9ef
Merge fix
2011-08-23 12:40:43 -04:00
Guillermo del Angel
8b7a0b3b62
Two new arguments to SelectVariants to exclude either multiallelic or biallelic sites from input vcf
2011-08-23 12:40:01 -04:00
Mark DePristo
6d6feb5540
Better error message when you cannot determine a ROD type because the file doesn't exist or cannot be read
2011-08-23 10:56:37 -04:00
Mauricio Carneiro
feeab6075f
Merging ReduceReads development with unstable repo
...
It is time to bring the ReadClipper class to the main repo. ReadClipper has tested functionality for soft and hard clipping reads. I will prepare thorough documentation for it as it will be very useful for the assembler and the GATK in general.
2011-08-22 23:03:03 -04:00
Guillermo del Angel
ee68713267
Further bug fixes to CountVariants: stratifications were wrong when genotypes had no-calls. For example, if we stratified by sample and a sample had a no-call, the no-call was considered a true variant and counts were incorrectly increased.
2011-08-22 20:42:47 -04:00
Guillermo del Angel
c270384b2e
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-22 20:39:32 -04:00
Guillermo del Angel
8ae24912f4
a) Misc fixes in Phase1 indel vqsr script
...
b) More R-friendly VariantsToTable printing of AC in case of multiple alt alleles
c) Renamed FixPLOrderingWalker to FixGenotypesWalker and rewrote it: the older code is no longer needed and is replaced with code that turns genotypes with all-zero PLs into no-calls.
2011-08-22 20:39:06 -04:00
Mark DePristo
85c5a6f890
Merge branch 'rodTesting'
...
Conflicts:
private/java/src/org/broadinstitute/sting/gatk/walkers/performance/ProfileRodSystem.java
2011-08-22 17:43:47 -04:00
Mark DePristo
1eab9be35d
Now with accurate javadoc
2011-08-22 17:25:15 -04:00
Mark DePristo
3612a3501d
info, not warn, about dynamic type determination
2011-08-22 17:24:51 -04:00
Eric Banks
dc42571dd9
Only create the genotype map when necessary
2011-08-22 15:40:36 -04:00
Khalid Shakir
c4c90c8826
Updates to JobRunners from the Queue developer community and from running the WholeGenomePipeline:
...
- Ability to pass a different resident memory reservation and limits. Useful for large pileups of low pass genome data that sometimes need high -Xmx6g but usually don't exceed 2-3g in actual heap size.
- Fixed jobPriority to work for all job runners. It must now be an integer between 0 and 100, even for GridEngine, and will be mapped to the correct values.
- Passing parallel environment and job resource requests to LSF and GridEngine. Useful for passing tokens like iodine_io=1 and -pe pe_slots 8
- Refactored GridEngine JobRunner to also provide basic support for other job dispatchers with DRMAA implementations such as Torque/PBS. Should work for basic running but advanced users must pass their own jobNativeArgs from the command line or in customized QScripts until someone maps properties like jobQueue, jobPriority, residentRequest, etc. into a Torque/PBS/etc. dispatcher.
2011-08-22 15:13:27 -04:00
Eric Banks
2c24b68a96
Working implementation of DecodeLoc for VCF parsing. Makes indexing 3x faster.
2011-08-22 15:11:21 -04:00
Eric Banks
518b3dd291
Don't let the genotypes map be null
2011-08-22 15:10:30 -04:00
Ryan Poplin
f93a554b01
updating exome specific parameters in MDCP
2011-08-21 10:25:36 -04:00
Ryan Poplin
dbff84c54e
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-21 10:09:19 -04:00
Khalid Shakir
22ca44c015
Fixed Queue's tagging of RodBindings.
...
Fixed argument definition names.
2011-08-21 02:34:20 -04:00
Eric Banks
a8cbced71b
Bug fix for Ryan: check for no context
2011-08-20 22:49:51 -04:00
Eric Banks
0ccd173967
Fixing the recent SelectVariants fix
2011-08-20 21:30:08 -04:00
Ryan Poplin
b008676878
fixing the previous fix
2011-08-20 21:21:55 -04:00
Guillermo del Angel
782453235a
Updated VariantEvalIntegrationTest since there's a new column separating nMixed and nComplex in CountVariants
...
Misc updates to WholeGenomeIndelCalling.scala
Bug fix in VariantEval (may be temporary, needs more investigation): if the -disc option is used with sites-only VCFs, a null pointer exception is produced, caused by the recent introduction of the -xl_sf options.
2011-08-20 12:24:22 -04:00
Ryan Poplin
539e157ecd
Fixing misc parameters in MDCP. The pipeline now does VariantEval of output by default. Fix for NaN vqslod values in VQSR
2011-08-20 11:28:48 -04:00
Guillermo del Angel
4939648fd4
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-20 08:50:43 -04:00
Ryan Poplin
a96ecbab71
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-19 19:30:05 -04:00
Ryan Poplin
ddb5045e14
Updating the methods development calling pipeline for the new rod binding syntax and the new best practices.
2011-08-19 19:29:51 -04:00
Mark DePristo
ff018c7964
Swapped argument order but not MD5 order
2011-08-19 16:55:56 -04:00
Mark DePristo
8b3cfb2f1c
Final documented version of GATKDoclet and associated classes
...
-- Docs on everything.
-- Feature complete. At this point only minor improvements and bugfixes are anticipated
2011-08-19 16:52:17 -04:00
Mark DePristo
b08d63a6b8
Documentation and code cleanup for ClipReads, CallableLoci, and VariantsToTable
...
-- Swapped -o [summary] and -ob [bam] for more standard -o [bam] and -os [summary] arguments.
-- @Advanced arguments
2011-08-19 15:06:37 -04:00
Mark DePristo
49e831a13b
Should have checked in
2011-08-19 14:35:16 -04:00
Mauricio Carneiro
7b5fa4486d
GenotypeAndValidate - Added docs to the @Arguments
2011-08-19 13:35:11 -04:00
Mark DePristo
9f7d4beb89
Merge branch 'help'
2011-08-19 13:14:02 -04:00
Mark DePristo
4d1fd17a97
GATKDoclet cleanup and documentation
...
-- Fixed bug in the way ArgumentCollections were handled that led to failure in handling the dbsnp argument collection.
2011-08-19 13:13:41 -04:00
Ryan Poplin
0f25167efd
minor fix in VariantEval docs
2011-08-19 11:01:04 -04:00
Mark DePristo
198955f752
GATKDoc descriptions for all standard codecs, or TODO for their owners
...
-- Also added vcf.gz support in the VCF codec. This wasn't committed in the last round, because it was missed by the parallel documentation effort.
2011-08-19 09:57:21 -04:00
Guillermo del Angel
269ed1206c
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-19 09:32:20 -04:00