Mark DePristo
7dd7afe739
Use BQSR in SingleExomeCalling evaluation
2012-11-01 15:34:12 -04:00
Mark DePristo
1444cd753b
Bugfix for GSA-647 HaplotypeCaller misses good variant because the active region doesn't trigger for an exome
...
-- The logic for determining active regions was a bit broken in the HC when intervals were used in the system
-- TraverseActiveRegions now uses the AllLocus view, since we always want to see all reference sites, not just those covered. Simplifies logic of TAR
-- Non-overlapping intervals are always treated as separate objects for determining active / inactive state. This means that each exon will stand on its own when deciding if it should be active or inactive
-- Misc. cleanup, docs of some TAR infrastructure to make it safer and easier to debug in the future.
-- Committing the SingleExomeCalling script that I used to find this problem, and will continue to use in evaluating calling of a single exome with the HC
-- Make sure to get all of the reads into the set of potentially active reads, even for genomic locations that themselves don't overlap the engine intervals but may have reads that overlap the regions
-- Remove excessively expensive calls to check bases are upper cased in ReferenceContext
-- Update md5s after a lot of manual review and discussion with Ryan
2012-11-01 15:34:04 -04:00
Mark DePristo
9cd04c335c
Work on GSA-508 / CachingIndexedFastaReader should internally upper case bases loading data
...
-- As one might expect, CachingIndexedFastaSequenceFile now internally upper cases the FASTA reference bases. This is now done by default, unless requested explicitly to preserve the original bases.
-- This is really the correct place to do this for a variety of reasons. First, you don't need to worry about upper casing bases throughout the code. Second, the cache is only upper cased once, no matter how often the bases are accessed, which walkers cannot optimize themselves. Finally, this uses the fastest function for this -- Picard's toUpperCase(byte[]), which is way better than String.toUpperCase()
-- Added unit tests to ensure this functionality works correctly.
-- Removing unnecessary upper casing of bases in some core GATK tools, now that RefContext guarantees that the reference bases are all upper case.
-- Added contracts to ensure this is the case.
-- Remove a ton of sh*t from BaseUtils that was so old I had no idea what it was doing any longer, and didn't have any unit tests to ensure it was correct, and wasn't used anywhere in our code
2012-11-01 15:34:03 -04:00
Guillermo del Angel
b9d796f502
More memory/parallelization tweaks: BQSR scatter-gather broken in latest GATK, so disabled. Used multithreading with bigger memory instead
2012-11-01 12:55:00 -04:00
Eric Banks
94a13c05ed
Merged bug fix from Stable into Unstable
2012-10-31 22:57:26 -04:00
Eric Banks
47a0f5859e
Don't run these tests if not GATK lite
2012-10-31 22:56:38 -04:00
Guillermo del Angel
2db2747723
More tweaks in pipeline parameters
2012-10-31 21:32:19 -04:00
Eric Banks
881c843307
Merged bug fix from Stable into Unstable
2012-10-31 21:28:27 -04:00
Eric Banks
f8af8a2355
Moving UG integration tests to protected since they use protected-only contamination filtering. Adding a new UGLite integration test to confirm that contamination filtering is ignored in lite.
2012-10-31 21:28:07 -04:00
Guillermo del Angel
9a1667f31b
Tweaks in SGE usage
2012-10-31 20:01:33 -04:00
Guillermo del Angel
6aec172c16
Merge branch 'develop' of github.com:broadinstitute/cmi-gatk into develop
2012-10-31 19:51:53 -04:00
Guillermo del Angel
74e842ca95
Merge branch 'develop' of https://github.com/broadinstitute/cmi-gatk into develop
2012-10-31 19:48:52 -04:00
Guillermo del Angel
24e6da25cc
Merge branch 'master' of ssh://gsa3/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-10-31 14:17:41 -04:00
Eric Banks
96344c6b62
Add note to realigner docs
2012-10-31 12:35:45 -04:00
Eric Banks
0a56fe5bc3
Merge remote-tracking branch 'unstable/master'
2012-10-31 12:17:24 -04:00
Guillermo del Angel
651b1dbb97
Merge branch 'unstable' into develop
2012-10-31 12:06:09 -04:00
Guillermo del Angel
3dd290b142
Merge branch 'develop' of github.com:broadinstitute/cmi-gatk into develop
2012-10-31 12:00:05 -04:00
kshakir
f5697532d6
Added mvninstall.queue.all target which includes private, along with supporting sub-targets.
2012-10-31 11:49:50 -04:00
Guillermo del Angel
7dc0e26549
Merge remote-tracking branch 'unstable/master' into unstable
2012-10-31 11:47:51 -04:00
Guillermo del Angel
4580e99c0c
Merge branch 'master' of ssh://gsa3/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-10-31 10:50:54 -04:00
Guillermo del Angel
02b790c8db
Merge fix
2012-10-31 10:50:36 -04:00
Guillermo del Angel
51a9ce28e1
Merge remote-tracking branch 'unstable/master' into develop
2012-10-31 10:29:48 -04:00
Guillermo del Angel
55ffbec000
Fixes to set output QC files, prepare for GridEngine usage
2012-10-31 07:55:04 -04:00
Christopher Hartl
ec17bb1a75
Merge branch 'master' of gsa2:/humgen/gsa-scr1/chartl/dev/unstable
2012-10-30 16:52:14 -04:00
Eric Banks
eccb76c304
Only run UG in the bundle for chr20
2012-10-30 15:09:46 -04:00
Eric Banks
e1e480a0b9
Bug fix: don't add no-call alleles to the list of ALT alleles being validated.
2012-10-30 14:54:29 -04:00
Eric Banks
2aa28abe0a
Fixing md5s to reflect the new HapMap file
2012-10-30 14:27:10 -04:00
Guillermo del Angel
c8e17a7adf
totally experimental UG feature, to be removed
2012-10-30 13:57:54 -04:00
Kristian Cibulskis
a3ba3e0a61
fixed invocation of indel single sample VCF filter to supply GATK classpath
2012-10-30 11:14:08 -04:00
Eric Banks
8a402024c2
Updating bundle script to handle new naming convention of CEU trio best practices callset
2012-10-30 09:11:56 -04:00
Eric Banks
c95e893920
Better error message for unused ALT alleles
2012-10-29 21:51:35 -04:00
Eric Banks
b6a1967f12
Better documentation for ValidateVariants so that people realize it's used for strict validation of the VCF file. Added an option to turn off strict validation and an integration test to cover it.
2012-10-29 21:47:09 -04:00
Ryan Poplin
21fa5f70ca
Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-10-29 18:53:41 -04:00
Ryan Poplin
5ee2feb2a3
updating pipeline test md5s
2012-10-29 18:53:27 -04:00
Eric Banks
be902375ac
'Bug' fix: fix the error message from the vcf validator so people realize that the file fails strict validation but still adheres to the spec.
2012-10-29 16:29:27 -04:00
Ryan Poplin
4e661847b2
DelocalizedBaseRecalibrator becomes the BaseRecalibrator.
2012-10-29 12:53:39 -04:00
Kristian Cibulskis
cb7fe3f881
implementation of
...
DEV-15 #resolve #time 1m
DEV-16 #resolve #time 3m
DEV-17 #resolve #time 1m
DEV-18 #resolve #time 3m
2012-10-29 12:00:36 -04:00
Menachem Fromer
5070304ef4
Merge branch 'master' of ssh://gsa3.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-10-29 11:50:29 -04:00
Eric Banks
ac99437eec
Bug fixes to hapmap conversion in VariantsToVCF
2012-10-29 01:45:33 -04:00
Guillermo del Angel
43f7e8dc47
Queue installed in AWS refuses to see new GATK extension, so disable for now. Reenable QC now that Rscript is installed in cloud
2012-10-28 15:58:36 -04:00
Guillermo del Angel
9fe79ec74b
Reenable parallelization in BQSR
2012-10-27 19:56:46 -04:00
Guillermo del Angel
c21bc5ac4c
Hook up duplicate metrics to script outputs. Disable GC and multiple metrics until R is installed in nodes
2012-10-27 19:48:48 -04:00
Eric Banks
43625f652e
Shoot, mixed up the md5s last time.
2012-10-27 19:43:46 -04:00
Mauricio Carneiro
468640476f
Merge remote-tracking branch 'unstable/master' into develop
2012-10-27 13:50:03 -04:00
Andrey Sivachenko
f3ac5d404d
updating vcf header attribute descriptions in order to reflect correctly what's actually being written...
2012-10-26 23:52:21 -04:00
Andrey Sivachenko
b4fbf6280a
fixing missing sample genotype bug, missing AD/DP bug, and putting annotations in more natural order (Ref/Alt)
2012-10-26 23:48:40 -04:00
Mark DePristo
ac5e58a265
Bugfix for GSA-540 / Update metadata maps when adding lines to VCFHeader
...
-- https://jira.broadinstitute.org/browse/GSA-540
-- http://gatkforums.broadinstitute.org/discussion/1433/possible-bug-and-fix-in-java-code-of-vcfheader-org-broadinstitute-sting-utils-codecs-vcf-vcfheader
2012-10-26 16:34:16 -04:00
Mark DePristo
fa9b2a91d0
Bugfix for GSA-552
...
-- https://jira.broadinstitute.org/browse/GSA-552
-- User reports a NullPointerException while using VariantsToVCF:
http://gatkforums.broadinstitute.org/discussion/1461/nullpointerexception-converting-vcf3-to-vcf-using-variantstovcf
The problem is that he left out an input VCF file for the --variant argument and the command-line argument parsing code didn't catch this, so we NPE out later on.
2012-10-26 16:34:16 -04:00
Eric Banks
682a72faf7
Hmm, thought I got all the md5s last time. Apparently not.
2012-10-26 16:10:12 -04:00
Christopher Hartl
db8cbb5444
Merge branch 'master' of gsa2:/humgen/gsa-scr1/chartl/dev/unstable
2012-10-26 15:15:10 -04:00