Mark DePristo
28ee6dac41
Fixed spelling mistake
2011-08-24 10:14:45 -04:00
Ryan Poplin
f37875600a
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-24 09:02:44 -04:00
Khalid Shakir
1ecbf05aae
Avoid segfaults due to out of date and possibly abandoned LSF DRMAA implementation when using LSF instead of .combined_LSF_SGE
2011-08-23 23:49:36 -04:00
Mark DePristo
569e1a1089
Walker.isDone() aborts execution early
...
-- Useful if you want a parameter like MAX_RECORDS that makes the walker stop after some number of map calls, without having to resort to calling System.exit() directly.
2011-08-23 16:53:06 -04:00
Ryan Poplin
a1a1fac9e4
Likelihood engine now gives non-zero likelihoods. Using HMM function that can handle context specific gap open and gap continuation penalties
2011-08-23 13:43:07 -04:00
Guillermo del Angel
6e2552a9ef
Merge fix
2011-08-23 12:40:43 -04:00
Guillermo del Angel
8b7a0b3b62
Two new arguments to SelectVariants to exclude either multiallelic or biallelic sites from input vcf
2011-08-23 12:40:01 -04:00
Roger Zurawicki
ac36271457
Fixed extra reads showing up in Variable Sites
...
Reads that were not hard clipped for the variable site no longer show up in output file
Walker now uses unclippedStart of Read to determine position in the sliding Window
2011-08-23 11:26:00 -04:00
Mark DePristo
6d6feb5540
Better error message when you cannot determine a ROD type because the file doesn't exist or cannot be read
2011-08-23 10:56:37 -04:00
Mauricio Carneiro
feeab6075f
Merging ReduceReads development with unstable repo
...
It is time to bring the ReadClipper class to the main repo. ReadClipper has tested functionality for soft and hard clipping reads. I will prepare thorough documentation for it as it will be very useful for the assembler and the GATK in general.
2011-08-22 23:03:03 -04:00
Guillermo del Angel
ee68713267
Further bug fixes to CountVariants: stratifications were wrong when genotypes had no-calls. For example, if we stratified by sample and a sample had a no-call, the no-call was considered a true variant and counts were incorrectly increased
2011-08-22 20:42:47 -04:00
Guillermo del Angel
c270384b2e
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-22 20:39:32 -04:00
Guillermo del Angel
8ae24912f4
a) Misc fixes in Phase1 indel vqsr script,
...
b) More R-friendly VariantsToTable printing of AC in case of multiple alt alleles
c) Renamed FixPLOrderingWalker to FixGenotypesWalker and rewrote it: the older code is no longer needed and is replaced with code that replaces genotypes having all-zero PLs with a no-call.
2011-08-22 20:39:06 -04:00
Mark DePristo
85c5a6f890
Merge branch 'rodTesting'
...
Conflicts:
private/java/src/org/broadinstitute/sting/gatk/walkers/performance/ProfileRodSystem.java
2011-08-22 17:43:47 -04:00
Mark DePristo
1eab9be35d
Now with accurate javadoc
2011-08-22 17:25:15 -04:00
Mark DePristo
3612a3501d
info, not warn, about dynamic type determination
2011-08-22 17:24:51 -04:00
Eric Banks
dc42571dd9
Only create the genotype map when necessary
2011-08-22 15:40:36 -04:00
Khalid Shakir
c4c90c8826
Updates to JobRunners from the Queue developer community and from running the WholeGenomePipeline:
...
- Ability to pass a different resident memory reservation and limits. Useful for large pileups of low pass genome data that sometimes need high -Xmx6g but usually don't exceed 2-3g in actual heap size.
- Fixed jobPriority to work for all job runners. Now must be an integer between 0 and 100, even for GridEngine, and will be mapped to the correct values.
- Passing parallel environment and job resource requests to LSF and GridEngine. Useful for passing tokens like iodine_io=1 and -pe pe_slots 8
- Refactored GridEngine JobRunner to also provide basic support for other job dispatchers with DRMAA implementations such as Torque/PBS. Should work for basic running but advanced users must pass their own jobNativeArgs from the command line or in customized QScripts until someone maps properties like jobQueue, jobPriority, residentRequest, etc. into a Torque/PBS/etc. dispatcher.
2011-08-22 15:13:27 -04:00
Eric Banks
2c24b68a96
Working implementation of DecodeLoc for VCF parsing. Makes indexing 3x faster.
2011-08-22 15:11:21 -04:00
Eric Banks
518b3dd291
Don't let the genotypes map be null
2011-08-22 15:10:30 -04:00
Ryan Poplin
f93a554b01
updating exome specific parameters in MDCP
2011-08-21 10:25:36 -04:00
Ryan Poplin
dbff84c54e
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-21 10:09:19 -04:00
Khalid Shakir
22ca44c015
Fixed Queue's tagging of RodBindings.
...
Fixed argument definition names.
2011-08-21 02:34:20 -04:00
Eric Banks
a8cbced71b
Bug fix for Ryan: check for no context
2011-08-20 22:49:51 -04:00
Eric Banks
0ccd173967
Fixing the recent SelectVariants fix
2011-08-20 21:30:08 -04:00
Ryan Poplin
b008676878
fixing the previous fix
2011-08-20 21:21:55 -04:00
Guillermo del Angel
782453235a
Updated VariantEvalIntegrationTest since there's a new column separating nMixed and nComplex in CountVariants
...
Misc updates to WholeGenomeIndelCalling.scala
Bug fix in VariantEval (may be temporary, need more investigation): if -disc option is used in sites-only vcf's then a null pointer exception is produced, caused by recent introduction of -xl_sf options.
2011-08-20 12:24:22 -04:00
Ryan Poplin
539e157ecd
Fixing misc parameters in MDCP. The pipeline now does VariantEval of output by default. Fix for NaN vqslod values in VQSR
2011-08-20 11:28:48 -04:00
Guillermo del Angel
4939648fd4
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-20 08:50:43 -04:00
Ryan Poplin
a96ecbab71
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-19 19:30:05 -04:00
Ryan Poplin
ddb5045e14
Updating the methods development calling pipeline for the new rod binding syntax and the new best practices.
2011-08-19 19:29:51 -04:00
Mark DePristo
ff018c7964
Swapped argument order but not MD5 order
2011-08-19 16:55:56 -04:00
Mark DePristo
8b3cfb2f1c
Final documented version of GATKDoclet and associated classes
...
-- Docs on everything.
-- Feature complete. At this point only minor improvements and bugfixes are anticipated
2011-08-19 16:52:17 -04:00
Mark DePristo
b08d63a6b8
Documentation and code cleanup for ClipReads, CallableLoci, and VariantsToTable
...
-- Swapped -o [summary] and -ob [bam] for more standard -o [bam] and -os [summary] arguments.
-- @Advanced arguments
2011-08-19 15:06:37 -04:00
Mark DePristo
49e831a13b
Should have checked in
2011-08-19 14:35:16 -04:00
Mauricio Carneiro
7b5fa4486d
GenotypeAndValidate - Added docs to the @Arguments
2011-08-19 13:35:11 -04:00
Mark DePristo
9f7d4beb89
Merge branch 'help'
2011-08-19 13:14:02 -04:00
Mark DePristo
4d1fd17a97
GATKDoclet cleanup and documentation
...
-- Fixed bug in the way ArgumentCollections were handled that led to failure in handling the dbsnp argument collection.
2011-08-19 13:13:41 -04:00
Ryan Poplin
0f25167efd
minor fix in VariantEval docs
2011-08-19 11:01:04 -04:00
Mark DePristo
198955f752
GATKDoc descriptions for all standard codecs, or TODO for their owners
...
-- Also added vcf.gz support in the VCF codec. This wasn't committed in the last round, because it was missed by the parallel documentation effort.
2011-08-19 09:57:21 -04:00
Guillermo del Angel
269ed1206c
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-19 09:32:20 -04:00
Mark DePristo
a5e279d697
Dynamic typing of vcf.gz files
...
-- CombineVariantsIntegrationTests now use dynamic typing of vcf.gz files
-- FeatureManagerUnitTests tests for correctness.
2011-08-19 09:05:11 -04:00
Eric Banks
40e67cff1b
I like the @Advanced annotation
2011-08-18 22:27:34 -04:00
Mark DePristo
2457c7b8f5
Merge branch 'master' into help
2011-08-18 22:20:43 -04:00
Mark DePristo
5fbdf968f7
ArgumentSource no longer comparable. Arguments sorted by GATKDoclet
2011-08-18 22:20:14 -04:00
Eric Banks
77fa2c1546
Renaming read filters with a superfluous 'Read' in their names. Kept the ones that made sense to have it (e.g. MalformedReadFilter).
2011-08-18 22:01:33 -04:00
Mark DePristo
1d3799ddf7
Merge branch 'master' into help
2011-08-18 22:00:29 -04:00
Mark DePristo
d1892cd0d7
Bug fixes
...
-- Sorting of ArgumentSources now done in GATKDoclet, not in the ParsingEngine, as the system depends on the LinkedTreeMap
-- Fixed broken exception throwing in the case where a file's type could not be determined
2011-08-18 21:58:36 -04:00
Mark DePristo
c5efb6f40e
Usability improvements to GATKDocs
...
-- ArgumentSources are now sorted by case insensitive names, so arguments are shown in alphabetical order (Ryan)
-- @Advanced annotation can be used to indicate that an argument is an advanced option and should be visually deemphasized in the GATKDocs. There's now an advanced section. Mauricio or Ryan -- could you figure out how to make this section less prominent in the style.css?
2011-08-18 21:39:11 -04:00
Mark DePristo
d94da0b1cf
Moved CG and SOAP codecs to private
2011-08-18 21:20:26 -04:00
Mark DePristo
f7414e39bc
Improvements to GATKDocs
...
-- Allowed values for RodBinding<T> are displayed in the GATKDocs
-- Longest name up to 30 characters is chosen for main argument list (suggested by Ryan/Mauricio)
-- Features are listed in alphabetical order
-- Moved useful getParameterizedType() function to JVMUtils
-- Tests of these features in the Documentation Test
2011-08-18 21:20:09 -04:00
Ryan Poplin
09d099cada
Added GATKDocs to the UnifiedGenotyper.
2011-08-18 20:57:02 -04:00
Mauricio Carneiro
6ef01e40b8
Complete rewrite of Hard Clipping (ReadClipper)
...
Hard clipping is now completely independent from softclipping and plows through previously hard or soft clipped reads.
2011-08-18 18:35:45 -04:00
Guillermo del Angel
626cbf9411
Bug fixes and cleanups for IndelStatistics
2011-08-18 16:28:40 -04:00
Guillermo del Angel
58560a6d50
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-18 16:17:52 -04:00
Guillermo del Angel
3dfb60a46e
Fixing up and refactoring usage of indel categories. On a variant context, isInsertion() and isDeletion() are now removed because behavior before was wrong in case of multiallelic sites. Now, methods isSimpleInsertion() and isSimpleDeletion() will return true only if sites are biallelic. For multiallelic sites, isComplex() will return true in all cases.
...
VariantEval module CountVariants is corrected and an additional column is added so that we log mixed events and complex indels separately (before they were being conflated).
VariantEval module IndelStatistics is considerably simplified, as the sample stratification was wrong and redundant; it should now work with the VE-generic Sample stratification. Several columns are renamed or removed since they're not really useful.
2011-08-18 16:17:38 -04:00
Chris Hartl
6b256a8ac5
Merge branch 'master' of ssh://gsa2/humgen/gsa-scr1/chartl/dev/git
2011-08-18 15:29:24 -04:00
Chris Hartl
a8935c99fc
Adding docs for DepthOfCoverage and ValidationAmplicons
2011-08-18 15:28:35 -04:00
Mark DePristo
f2f51e35e3
Merge branch 'master' into help
2011-08-18 14:05:33 -04:00
Mark DePristo
faa3f8b6f6
Only concrete classes are now documented
2011-08-18 14:04:47 -04:00
Ryan Poplin
7c4ce6d969
Added GATKDocs for the VQSR walkers.
2011-08-18 14:00:39 -04:00
Mark DePristo
5772766dd5
Improvements to GATKDocs
...
-- Now supports a static list of root classes / interfaces that should receive docs. A complementary approach to documenting features to the DocumentedGATKFeature annotation
-- Tribble codecs are now documented!
-- No longer displays sub and super classes
2011-08-18 14:00:09 -04:00
Mark DePristo
e03db30ca0
Now uses DocumentedGATKFeatureObject instead of annotation directly
...
-- Step 1 on the way to creating a static list of additional classes that we want to document.
2011-08-18 12:31:04 -04:00
Mark DePristo
d4511807ed
Merge branch 'master' into help
2011-08-18 11:53:37 -04:00
Mark DePristo
c787fd0b70
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-18 11:52:45 -04:00
Mark DePristo
c797616c65
If you have one sample in your BAM, getToolkit().getSamples().size() == 2
...
Also deleted double initialization, where a line of code was duplicated in creating the GATK engine.
2011-08-18 11:51:53 -04:00
Mark DePristo
cbec69a130
Merge branch 'master' into help
...
Conflicts:
public/java/src/org/broadinstitute/sting/utils/help/HelpUtils.java
2011-08-18 11:33:27 -04:00
Eric Banks
aa21fc7c9c
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-18 11:30:59 -04:00
Mark DePristo
f5d7cabb20
Fix for reintroducing an already solved problem.
2011-08-18 11:20:12 -04:00
Eric Banks
a45498150a
Remove non-ascii char
2011-08-18 11:18:29 -04:00
Ryan Poplin
c08a9964d4
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-18 10:58:04 -04:00
Ryan Poplin
bb79d3edae
Added GATKDocs for the BQSR walkers.
2011-08-18 10:57:48 -04:00
Mark DePristo
47bbddb724
Now provides type-specific user feedback
...
For RodBinding<VariantContext> error messages now list only the Tribble types that produce VariantContexts
2011-08-18 10:47:16 -04:00
Mark DePristo
2d41ba15a4
Vastly better Tribble help message
...
Here's a new example:
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 1.1-520-g76495cd):
##### ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
##### ERROR Please do not post this error to the GATK forum
##### ERROR
##### ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
##### ERROR Visit our wiki for extensive documentation http://www.broadinstitute.org/gsa/wiki
##### ERROR Visit our forum to view answers to commonly asked questions http://getsatisfaction.com/gsa
##### ERROR
##### ERROR MESSAGE: Invalid command line: Failed to parse value /humgen/gsa-hpprojects/GATK/data/refGene_b37.filtered.sorted.txt for argument refSeqRodBinding. Message: Invalid command line: No tribble type was provided on the command line and the type of the file could not be determined dynamically. Please add an explicit type tag :TYPE listing the correct type from among the supported types:
##### ERROR Name FeatureType Documentation
##### ERROR BEAGLE BeagleFeature http://www.broadinstitute.org/gsa/gatkdocs/release/org_broadinstitute_sting_utils_codecs_beagle_BeagleCodec.html
##### ERROR BED BEDFeature http://www.broadinstitute.org/gsa/gatkdocs/release/org_broad_tribble_bed_BEDCodec.html
##### ERROR BEDTABLE TableFeature http://www.broadinstitute.org/gsa/gatkdocs/release/org_broadinstitute_sting_utils_codecs_table_BedTableCodec.html
##### ERROR CGVAR VariantContext http://www.broadinstitute.org/gsa/gatkdocs/release/org_broadinstitute_sting_utils_codecs_completegenomics_CGVarCodec.html
##### ERROR DBSNP DbSNPFeature http://www.broadinstitute.org/gsa/gatkdocs/release/org_broad_tribble_dbsnp_DbSNPCodec.html
##### ERROR GELITEXT GeliTextFeature http://www.broadinstitute.org/gsa/gatkdocs/release/org_broad_tribble_gelitext_GeliTextCodec.html
##### ERROR MAF MafFeature http://www.broadinstitute.org/gsa/gatkdocs/release/org_broadinstitute_sting_gatk_features_maf_MafCodec.html
##### ERROR MILLSDEVINE VariantContext http://www.broadinstitute.org/gsa/gatkdocs/release/org_broadinstitute_sting_utils_codecs_MillsDevineCodec.html
##### ERROR RAWHAPMAP RawHapMapFeature http://www.broadinstitute.org/gsa/gatkdocs/release/org_broadinstitute_sting_utils_codecs_hapmap_RawHapMapCodec.html
##### ERROR REFSEQ RefSeqFeature http://www.broadinstitute.org/gsa/gatkdocs/release/org_broadinstitute_sting_utils_codecs_refseq_RefSeqCodec.html
##### ERROR SAMPILEUP SAMPileupFeature http://www.broadinstitute.org/gsa/gatkdocs/release/org_broadinstitute_sting_utils_codecs_sampileup_SAMPileupCodec.html
##### ERROR SAMREAD SAMReadFeature http://www.broadinstitute.org/gsa/gatkdocs/release/org_broadinstitute_sting_utils_codecs_samread_SAMReadCodec.html
##### ERROR SNPEFF SnpEffFeature http://www.broadinstitute.org/gsa/gatkdocs/release/org_broadinstitute_sting_utils_codecs_snpEff_SnpEffCodec.html
##### ERROR SOAPSNP VariantContext http://www.broadinstitute.org/gsa/gatkdocs/release/org_broadinstitute_sting_utils_codecs_soapsnp_SoapSNPCodec.html
##### ERROR TABLE TableFeature http://www.broadinstitute.org/gsa/gatkdocs/release/org_broadinstitute_sting_utils_codecs_table_TableCodec.html
##### ERROR VCF VariantContext http://www.broadinstitute.org/gsa/gatkdocs/release/org_broadinstitute_sting_utils_codecs_vcf_VCFCodec.html
##### ERROR VCF3 VariantContext http://www.broadinstitute.org/gsa/gatkdocs/release/org_broadinstitute_sting_utils_codecs_vcf_VCF3Codec.html
##### ERROR ------------------------------------------------------------------------------------------
2011-08-18 10:31:32 -04:00
Mark DePristo
c2287c93d7
Cleanup of codec locations. No more dbSNPHelper
...
-- refdata/features now in utils/codecs with the other codecs
-- Deleted dbsnpHelper. rsID function now in VCFutils. Remaining code either deleted or put into VariantContextAdaptors
-- Many associated import updates due to code move
2011-08-18 10:02:46 -04:00
Mark DePristo
9c17d54cb6
getFeatureClass() now returns Class<T> not Class to avoid yesterday's runtime error
2011-08-18 09:39:20 -04:00
Mark DePristo
c30e1db744
Better location for help utils
2011-08-18 09:38:51 -04:00
Mark DePristo
4da42d9f39
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-18 09:32:57 -04:00
Eric Banks
c91a442be1
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-17 22:40:16 -04:00
Eric Banks
b75a1807e3
Adding integration test to cover sample exclusion
2011-08-17 22:40:09 -04:00
Eric Banks
a7b70e6bb4
Adding feature for Khalid: ability to exclude particular samples.
2011-08-17 22:28:22 -04:00
Mauricio Carneiro
cc3df8f11a
Moving GAV walker to public
...
Walker is updated to the new RodBinding system and has the new GATKDocs layout.
2011-08-17 21:55:17 -04:00
Eric Banks
fa1db3913b
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-17 21:49:25 -04:00
Eric Banks
8e83b6646b
Bug fix for Chris: don't validate ref base for complex events.
2011-08-17 21:49:14 -04:00
Matt Hanna
c104dd7a09
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-17 16:59:12 -04:00
Matt Hanna
81a792afeb
Reverting optimization disable in unstable.
2011-08-17 16:58:24 -04:00
Mark DePristo
2e35592295
GATKDocs for CallableLoci
2011-08-17 16:32:01 -04:00
Guillermo del Angel
c193f52e5d
Fixed up examples: pasting from wiki still had old rod syntax
2011-08-17 16:29:45 -04:00
Matt Hanna
297c9e513c
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable into unstable
2011-08-17 16:24:02 -04:00
Matt Hanna
a210a62ab9
Merged bug fix from Stable into Unstable
2011-08-17 16:23:31 -04:00
Mark DePristo
d59e6ed274
Fix for RefSeqCodec bug and better error messages
...
-- RefSeqCodec bug: getFeatureClass() returned RefSeqCodec.class, not RefSeqFeature.class. Really should change this in Tribble to require Class<T extends Feature> to get compile time type checking
-- Better error messages that actually list the available tribble types, when there's a type error
2011-08-17 16:22:07 -04:00
Matt Hanna
d170187896
Disable optimization that increases marginal speed of the GATK slightly but
...
can produce data loss in a narrow corner case where the BGZF block(s) locations
and offsets in the last index bucket of contig n overlap exactly with the BGZF
block locations and offset in the last index bucket of contig n+1.
A proper fix that keeps the optimization has already been introduced into
unstable, but disabling the optimization is a low risk way to make sure that
users of stable experience no data loss.
2011-08-17 16:16:05 -04:00
David Roazen
53006da9a5
Improved descriptions for the SnpEff annotations in the VCF header
...
(based on Eric's feedback).
2011-08-17 16:09:10 -04:00
Guillermo del Angel
784fb148b9
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-17 15:47:01 -04:00
Guillermo del Angel
671330950d
Updated Beagle walker for gatkdocs format. Pushed unsupported, undocumented arguments to @Hidden
2011-08-17 15:46:31 -04:00
Andrey Sivachenko
0af68e052a
Merge branch 'master' of ssh://cga1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-17 15:17:47 -04:00
Andrey Sivachenko
a423546cdd
fix: RefSeq contains records with zero coding length and the RefSeq codec/feature used to crash on those; now such records are ignored, with a warning printed (once)
2011-08-17 15:17:31 -04:00
Andrey Sivachenko
710d34633e
now the reads that are too long are truly ignored (fix of the fix)
2011-08-17 15:16:23 -04:00
Eric Banks
2f19046f0c
Adding docs to the 2 beasts. Saved the worst for last.
2011-08-17 14:19:14 -04:00
Andrey Sivachenko
069554efe5
somatic indel detector no longer dies on reads that are too long (likely containing a huge deletion); instead it prints a warning and ignores the read
2011-08-17 14:05:19 -04:00