Guillermo del Angel
55ffbec000
Fixes to set output QC files, prepare for GridEngine usage
2012-10-31 07:55:04 -04:00
Christopher Hartl
ec17bb1a75
Merge branch 'master' of gsa2:/humgen/gsa-scr1/chartl/dev/unstable
2012-10-30 16:52:14 -04:00
Eric Banks
eccb76c304
Only run UG in the bundle for chr20
2012-10-30 15:09:46 -04:00
Eric Banks
e1e480a0b9
Bug fix: don't add no-call alleles to the list of ALT alleles being validated.
2012-10-30 14:54:29 -04:00
Eric Banks
2aa28abe0a
Fixing md5s to reflect the new HapMap file
2012-10-30 14:27:10 -04:00
Guillermo del Angel
c8e17a7adf
totally experimental UG feature, to be removed
2012-10-30 13:57:54 -04:00
Kristian Cibulskis
a3ba3e0a61
fixed invocation of indel single sample VCF filter to supply GATK classpath
2012-10-30 11:14:08 -04:00
Eric Banks
8a402024c2
Updating bundle script to handle new naming convention of CEU trio best practices callset
2012-10-30 09:11:56 -04:00
Eric Banks
c95e893920
Better error message for unused ALT alleles
2012-10-29 21:51:35 -04:00
Eric Banks
b6a1967f12
Better documentation for ValidateVariants so that people realize it's used for strict validation of the VCF file. Added an option to turn off strict validation and an integration test to cover it.
2012-10-29 21:47:09 -04:00
Ryan Poplin
21fa5f70ca
Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-10-29 18:53:41 -04:00
Ryan Poplin
5ee2feb2a3
updating pipeline test md5s
2012-10-29 18:53:27 -04:00
Eric Banks
be902375ac
'Bug' fix: fix the error message from the vcf validator so people realize that the file fails strict validation but still adheres to the spec.
2012-10-29 16:29:27 -04:00
Ryan Poplin
4e661847b2
DelocalizedBaseRecalibrator becomes the BaseRecalibrator.
2012-10-29 12:53:39 -04:00
Kristian Cibulskis
cb7fe3f881
implementation of
...
DEV-15 #resolve #time 1m
DEV-16 #resolve #time 3m
DEV-17 #resolve #time 1m
DEV-18 #resolve #time 3m
2012-10-29 12:00:36 -04:00
Menachem Fromer
5070304ef4
Merge branch 'master' of ssh://gsa3.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-10-29 11:50:29 -04:00
Eric Banks
ac99437eec
Bug fixes to hapmap conversion in VariantsToVCF
2012-10-29 01:45:33 -04:00
Guillermo del Angel
43f7e8dc47
Queue installed in aws refuses to see new GATK extension, so disable for now. Reenable QC now that Rscript is installed in cloud
2012-10-28 15:58:36 -04:00
Guillermo del Angel
9fe79ec74b
Reenable parallelization in BQSR
2012-10-27 19:56:46 -04:00
Guillermo del Angel
c21bc5ac4c
Hook up duplicate metrics to script outputs. Disable GC and multiple metrics until R is installed in nodes
2012-10-27 19:48:48 -04:00
Eric Banks
43625f652e
Shoot, mixed up the md5s last time.
2012-10-27 19:43:46 -04:00
Mauricio Carneiro
468640476f
Merge remote-tracking branch 'unstable/master' into develop
2012-10-27 13:50:03 -04:00
Andrey Sivachenko
f3ac5d404d
updating vcf header attribute descriptions in order to reflect correctly what's actually being written...
2012-10-26 23:52:21 -04:00
Andrey Sivachenko
b4fbf6280a
fixing missing sample genotype bug, missing AD/DP bug, and putting annotations in more natural order (Ref/Alt)
2012-10-26 23:48:40 -04:00
Mark DePristo
ac5e58a265
Bugfix for GSA-540 / Update metadata maps when adding lines to VCFHeader
...
-- https://jira.broadinstitute.org/browse/GSA-540
-- http://gatkforums.broadinstitute.org/discussion/1433/possible-bug-and-fix-in-java-code-of-vcfheader-org-broadinstitute-sting-utils-codecs-vcf-vcfheader
2012-10-26 16:34:16 -04:00
Mark DePristo
fa9b2a91d0
Bugfix for GSA-552
...
-- https://jira.broadinstitute.org/browse/GSA-552
-- User reports a null exception while using VariantsToVCF:
http://gatkforums.broadinstitute.org/discussion/1461/nullpointerexception-converting-vcf3-to-vcf-using-variantstovcf
The problem is that he left out an input VCF file for the --variant argument and the command-line argument parsing code didn't catch this, so we NPE out later on.
2012-10-26 16:34:16 -04:00
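The GSA-552 entry above describes a required argument (--variant) that was left out without the parser noticing, leading to a NullPointerException later in the run. A minimal sketch of the general remedy, checking required arguments right after parsing, might look like the following. The class and method names and the map-based representation of parsed arguments are assumptions made for illustration; this is not the GATK command-line parsing code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative only: hypothetical names, not the GATK argument-parsing engine. */
final class RequiredArgumentCheckSketch {

    /** Returns the required argument names that are absent or empty, in the order given. */
    static List<String> missingRequired(Map<String, String> parsed, List<String> required) {
        List<String> missing = new ArrayList<>();
        for (String name : required) {
            String value = parsed.get(name);
            if (value == null || value.isEmpty()) {
                missing.add(name);
            }
        }
        return missing;
    }

    /** Fails right after parsing with a readable message, rather than with an NPE later on. */
    static void validate(Map<String, String> parsed, List<String> required) {
        List<String> missing = missingRequired(parsed, required);
        if (!missing.isEmpty()) {
            throw new IllegalArgumentException("Missing required argument(s): " + missing);
        }
    }
}
```

Calling validate(...) before any walker code touches the arguments turns the late null dereference into an immediate user-facing error.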
Eric Banks
682a72faf7
Hmm, thought I got all the md5s last time. Apparently not.
2012-10-26 16:10:12 -04:00
Christopher Hartl
db8cbb5444
Merge branch 'master' of gsa2:/humgen/gsa-scr1/chartl/dev/unstable
2012-10-26 15:15:10 -04:00
Ryan Poplin
b0dcc2c78e
Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-10-26 15:14:41 -04:00
Ryan Poplin
610e003b93
Fixing a bug in the BQSR where the reference context gets out of sync after a read is adapter clipped inside the walker.
2012-10-26 15:14:27 -04:00
David Roazen
35483a7eef
Update MD5s for PrintReads with BQSR Integration Test
...
The MD5s for these tests were changed in commit 87435f1074615b2cd016f042980109fd53962c8d
to match the output of a broken version of BaseRecalibration. With the patch in
commit c397102ecc1fd1d2cd8f209a8f358ab4a60b50a7, the output once again matches the
*original* MD5s for these tests, and does not vary as you increase -nct.
Final resolution to GSA-632
2012-10-26 14:25:25 -04:00
Eric Banks
f66d812778
Merge branch 'master' of ssh://gsa2/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-10-26 13:20:41 -04:00
Eric Banks
a8704ca73f
Adding TODO notes for Ami
2012-10-26 13:20:27 -04:00
Mark DePristo
251983b8fb
Add GATK-wide command line argument to control the maximum runtime allowed for the GATK
...
-- Providing this optional argument -maxRuntime (in -maxRuntimeUnits units) causes the GATK to exit cleanly when the max. runtime has been exceeded. By cleanly I mean that the engine simply stops at the next available cycle in the walker as though the end of processing had been reached. This means that all output files are closed properly, etc.
-- Emits an info message that looks like "INFO 10:36:52,723 MicroScheduler - Aborting execution (cleanly) because the runtime has exceeded the requested maximum 10.0000 s". Apart from this message there's currently no way to differentiate a truly completed run from a run that exceeded the time limit; adding one may be a useful thing for a future update
-- Resolves GSA-630 / GATK max runtime to deal with bad LSA calling?
-- Added new JIRA entry for Ami to restart chr1 macarthur with this argument set to -maxRuntime 1 -maxRuntimeUnits DAYS to see if we can do all of chr1 in one weekend.
2012-10-26 13:18:34 -04:00
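The -maxRuntime behavior described in the entry above (stop at the next available cycle as though the input had ended, so that outputs are closed normally) can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the class name, the injected clock, and the counting loop are all hypothetical, and the real MicroScheduler logic is not shown in this log.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Illustrative only: hypothetical names, not the GATK MicroScheduler. */
final class RuntimeLimitSketch {
    private final long startNanos;
    private final long maxRuntimeNanos; // negative means "no limit requested"

    RuntimeLimitSketch(long maxRuntime, TimeUnit units, long startNanos) {
        this.startNanos = startNanos;
        this.maxRuntimeNanos = maxRuntime < 0 ? -1 : units.toNanos(maxRuntime);
    }

    /** True once the elapsed time is strictly greater than the requested maximum. */
    boolean exceeded(long nowNanos) {
        return maxRuntimeNanos >= 0 && nowNanos - startNanos > maxRuntimeNanos;
    }

    /**
     * Processes up to totalUnits work units, checking the limit before each one.
     * On timeout the loop simply ends, exactly as it would at end of input,
     * so the caller's normal shutdown path (closing outputs) still runs.
     * Returns the number of units actually processed.
     */
    static int run(int totalUnits, RuntimeLimitSketch limit, LongSupplier clockNanos) {
        int processed = 0;
        while (processed < totalUnits) {
            if (limit.exceeded(clockNanos.getAsLong())) {
                break; // clean stop: no exception, no special exit path
            }
            processed++; // stand-in for one walker cycle
        }
        return processed;
    }
}
```

In production code the clock would typically be System::nanoTime; it is passed in here only so the behavior can be exercised deterministically.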
Eric Banks
46099af8db
Merge branch 'master' of ssh://gsa2/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-10-26 12:10:53 -04:00
Eric Banks
ed11b7dab2
Fix UG parallelization test
2012-10-26 12:10:44 -04:00
Eric Banks
7a706ed345
Fix some of the broken integration tests
2012-10-26 11:23:44 -04:00
Yossi Farjoun
27a4d6d90e
Merge branch 'master' of ssh://gsa4/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-10-26 09:33:10 -04:00
Yossi Farjoun
a3193a1743
fixed LargeScaleValidationCallingSingle.scala for new version of scala
2012-10-26 09:32:19 -04:00
Menachem Fromer
28393e9b30
Merge branch 'master' of ssh://gsa3.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-10-26 03:55:02 -04:00
Eric Banks
ebebec7fdb
Accidentally left one test disabled
2012-10-26 02:15:32 -04:00
Eric Banks
b06f689d4b
Merge branch 'master' of ssh://gsa2/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-10-26 02:13:26 -04:00
Eric Banks
a53e03d525
Do not let reduced reads get removed in the contamination down-sampling
2012-10-26 02:13:04 -04:00
Menachem Fromer
9af4b34fd8
Changed @Input to @Argument for non-File types
2012-10-26 01:21:05 -04:00
Eric Banks
bf3d61ce82
The default value for --contamination_fraction_to_filter is now 0.05 (5%) in both UG and HC. Users of GATK-lite get pushed down to 0% by default (since it's not enabled) or get a user error if they try to set it.
2012-10-26 01:04:51 -04:00
Eric Banks
91f2c847a3
Fixing problem reported on forum for VF: DP couldn't be filtered from the FORMAT field, only from the INFO field. Fixed and added integration test.
2012-10-26 00:57:40 -04:00
Menachem Fromer
e0fa7d1497
Merge branch 'master' of ssh://gsa3.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-10-26 00:14:58 -04:00
Mark DePristo
d879c77aca
Don't scale up memory requirements by nct for PrintReads tests
2012-10-25 17:43:49 -04:00
Mark DePristo
6b8b7df651
Queue now understands -nct and requests the appropriate number of cores from LSF, SGE, etc
...
-- NCT wasn't previously recognized by Queue as needing more processors per machine. This commit fixes this. This was also a potential cause of poor GATKPerformanceOverTime, in that runs with -nct could flood a node and cause it to have hundreds of cores in contention.
2012-10-25 17:26:58 -04:00
David Roazen
422e16c62e
BaseRecalibration: don't cache instances of ReadCovariates across reads
...
Caching and reusing ReadCovariates instances across reads sounds good in theory, but:
-it doesn't work unless you zero out the internal arrays before each read
-the internal arrays must be sized proportionally to the maximum POSSIBLE
recalibrated read length (5000!!!), instead of the ACTUAL read lengths
By contrast, creating a new instance per read is basically equivalent to doing an
efficient low-level memset-style clear on a much smaller array (since we use the actual
rather than the maximum read length to create it). So this should be faster than caching
instances and calling clear() but slower than caching instances and not calling clear().
Credit to Ryan for proposing this approach.
2012-10-25 17:02:55 -04:00
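The trade-off described in the ReadCovariates entry above can be illustrated with a small sketch. It assumes, purely for illustration, that the per-read state is a single int array; the class and method names are hypothetical and the real ReadCovariates layout is not shown in this log. The 5000 figure is the maximum possible read length quoted in the commit message.

```java
import java.util.Arrays;

/** Illustrative only: hypothetical names, not the real ReadCovariates class. */
final class CovariateBufferSketch {
    // Maximum possible recalibrated read length, as quoted in the commit message.
    static final int MAX_POSSIBLE_READ_LENGTH = 5000;

    /** Cached strategy: one buffer sized for the worst case and reused across reads. */
    static final class Reused {
        final int[] keys = new int[MAX_POSSIBLE_READ_LENGTH];

        /** Must zero every slot before each read, or stale values from the previous read leak through. */
        int[] reset() {
            Arrays.fill(keys, 0);
            return keys;
        }
    }

    /** Per-read strategy: a fresh array sized to the actual read, zeroed by the JVM on allocation. */
    static int[] fresh(int readLength) {
        return new int[readLength];
    }
}
```

For a read of around 100 bases, fresh(...) zeroes about 100 slots while reset() zeroes all 5000, which is the cost difference the commit message points to.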