Eric Banks
05b44dd017
The genotypeCounts array wasn't always being initialized before it was accessed, leading to an NPE (which got caught and thrown as a JEXL expression when used in selection). Added unit test to cover all genotype count methods.
2012-04-27 10:49:36 -04:00
Khalid Shakir
9801dd114f
Bug fix for: https://getsatisfaction.com/gsa/topics/problem_with_indelrealigner_and_l_unmapped
...
The GATK -L unmapped is for GenomeLocs with SAMRecord.NO_ALIGNMENT_REFERENCE_NAME, not SAMRecord.getReadUnmappedFlag()
Previously, reads with the unmapped flag in the last bin were being printed while also seeking the reads without a reference contig.
2012-04-27 09:58:38 -04:00
Khalid Shakir
005cdcad5b
Merge branch 'master' of ssh://gsa3.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-04-27 09:58:14 -04:00
Menachem Fromer
64077ec7c8
Add option to use XHMM to genotype all possible consecutive sub-segments of CNVs
2012-04-27 01:42:20 -04:00
Khalid Shakir
2ad1aa2a2c
Merge branch 'master' of ssh://gsa3.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-04-26 19:31:47 -04:00
Eric Banks
db41d10f54
First (very rough) version of a haplotype-based resolver of alleles from two provided VCF files. It works well on standard cases (e.g. MNP vs. 2 SNPs) but still needs to be tested more thoroughly and I need to add support for multi-allelics (although I know how to do that now). Being committed in private for Ryan's benefit, but no one else should be using it now.
2012-04-26 16:30:22 -04:00
Khalid Shakir
b8c0405715
Updates to the WGP to only run eval on chr20.
...
PicardIntervals object now prints out a meaningful toString when HSP batches return multiple interval lists.
2012-04-26 16:28:45 -04:00
Guillermo del Angel
2f86ccb086
Correct md5s for previous code change
2012-04-26 16:20:41 -04:00
Guillermo del Angel
972d6531b6
Corner case fix for indel GL computation: sometimes (depending on surrounding context) reads which are not informative of two candidate haplotypes end up having marginally higher likelihoods with one haplotype as opposed to another, depending on uncertainty on alignments in surrounding regions. So, a sample whose GL is -0.0001,-0.0005,-0.001 may have its genotype set to 1/1 due to this statistical noise. We already have a tolerance comparing max(gl)-min(gl) to avoid genotyping, so this tolerance is now increased from 0.001 to 0.1 (equivalent to 1 PL unit) to avoid genotyping a sample if all PLs are within this threshold. Changed 2 integration test md5s that hit this case.
2012-04-26 10:15:26 -04:00
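The tolerance check described in commit 972d6531b6 can be illustrated with a minimal sketch. This is not the GATK implementation: the function names, the strict `>=` comparison, and the PL conversion helper are assumptions made for illustration. It only shows why a spread of 0.1 in log10 likelihood corresponds to 1 PL unit, and how a sample with nearly flat GLs would be left ungenotyped.

```python
def should_genotype(gls, tolerance=0.1):
    """Return True when the log10 genotype likelihoods (GLs) differ enough to call.

    Hypothetical helper: a sample is genotyped only if max(gl) - min(gl)
    reaches the tolerance (0.1 in log10 space after the change above).
    """
    return max(gls) - min(gls) >= tolerance


def gl_to_pl(gls):
    """Convert log10 GLs to phred-scaled PLs, normalized so the best genotype is 0."""
    best = max(gls)
    return [round(-10 * (gl - best)) for gl in gls]


# The noisy sample from the commit message is no longer genotyped.
print(should_genotype([-0.0001, -0.0005, -0.001]))
# A GL difference of 0.1 is exactly one PL unit.
print(gl_to_pl([0.0, -0.1, -1.0]))
```

With the earlier tolerance of 0.001, small alignment-driven differences could still decide a genotype; the wider threshold treats anything within one PL unit as uninformative.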
Laurent Francioli
ab2a952ad1
PED support for Inbreeding Coefficient annotation
...
Signed-off-by: Eric Banks <ebanks@broadinstitute.org>
2012-04-25 12:56:47 -04:00
Laurent Francioli
219b0a128b
PED support for ChromosomeCounts annotation
...
Signed-off-by: Eric Banks <ebanks@broadinstitute.org>
2012-04-25 12:50:04 -04:00
Laurent Francioli
19d5213d5a
Added function to get founders IDs in SampleDB
...
Signed-off-by: Eric Banks <ebanks@broadinstitute.org>
2012-04-25 12:49:36 -04:00
Mark DePristo
120deaa010
Remove old licensing
2012-04-25 12:23:08 -04:00
Mark DePristo
dab25afc88
Add warning message about ratios in variantQCreport, give ratio for MAF > 10%
2012-04-25 12:22:32 -04:00
Mauricio Carneiro
902277856e
fix for RBP getPileupsForSamples()
...
do not differentiate per sample pileups from generic pileups. Do the same for both -- it's O(n) either way.
2012-04-24 17:20:30 -04:00
Mauricio Carneiro
82b4798913
CountBasesWalker -- a quick QC walker.
2012-04-24 17:20:30 -04:00
Mauricio Carneiro
e440d0ce69
BQSR triage #4
...
* fixed queue script plot file names
* updated the ReadGroupCovariate to use the platform unit instead of sample + lane.
* fixed plotting of marginalized reported qualities
2012-04-24 17:19:54 -04:00
Eric Banks
d6277b70d8
Forgot to consider the optimized case in hasAllele
2012-04-24 11:32:28 -04:00
Eric Banks
91bad244d5
Using a VCF whose ALT is the reference in GGA mode is a User Error
2012-04-24 11:08:37 -04:00
Eric Banks
74ad008163
Adding VariantContext.hasAlternateAllele functionality
2012-04-24 11:07:46 -04:00
Eric Banks
66f3315548
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-04-24 09:39:55 -04:00
Eric Banks
bcb93dda5f
Fixing docs (rank sum test values are not phred-scaled)
2012-04-24 09:39:42 -04:00
Mauricio Carneiro
e39a59594a
BQSR triage and test routines
...
* updated BQSR queue script for faster turnaround
* implemented plot generation for scatter/gathered runs
* adjusted output file names to be cooperative with the queue script
* added the recalibration report file to the argument table in the report
* added ReadCovariates unit test -- guarantees that all the covariates are being generated for every base in the read
* added RecalibrationReport unit test -- guarantees the integrity of the delta tables
2012-04-23 11:23:00 -04:00
Eric Banks
a733723439
Merged bug fix from Stable into Unstable
2012-04-23 10:30:30 -04:00
Eric Banks
2761da975e
Handle null VCs (which can arise when indels are present in the file)
2012-04-23 10:30:00 -04:00
Eric Banks
cd63bcb1b8
Fixing unit tests to register the user exception being thrown (instead of the NumberFormatException)
2012-04-23 10:06:51 -04:00
Eric Banks
63aa79df82
Slightly better error message
2012-04-23 09:37:28 -04:00
Eric Banks
7b5fbf9567
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-04-23 09:34:08 -04:00
Eric Banks
4edb005411
Catch poorly formatted PL/GL fields
2012-04-23 09:33:50 -04:00
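Commits 4edb005411 and cd63bcb1b8 together suggest that a malformed PL/GL field now surfaces as a user-facing error rather than a raw number-parsing exception. A minimal sketch of that pattern follows; `UserError` and `parse_pl` are hypothetical names, not GATK classes or methods.

```python
class UserError(Exception):
    """Hypothetical stand-in for a user-facing error type."""


def parse_pl(field):
    """Parse a comma-separated PL field into a list of integers.

    A low-level ValueError is converted into a UserError that names
    the offending field, so the message points at the input file
    rather than at the parser.
    """
    try:
        return [int(token) for token in field.split(",")]
    except ValueError as exc:
        raise UserError(f"Malformed PL field: {field!r}") from exc


print(parse_pl("0,30,255"))
```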
Ryan Poplin
35bb55f562
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-04-22 13:23:36 -04:00
Ryan Poplin
18e4532d10
Turning down the amount of assembly graph pruning slightly in the case of low coverage.
2012-04-22 13:23:24 -04:00
Menachem Fromer
10e0647347
Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-04-21 23:19:16 -04:00
Menachem Fromer
0630aa1cf0
Only build valid GenomeLoc loci; throw explicit exception when sample's BAM is not found
2012-04-21 23:18:41 -04:00
Eric Banks
1f23d99dfa
If we are subsetting alleles in the UG (either because there were too many or because some were not polymorphic), then we may need to trim the alleles (because the original VariantContext may have had to pad at the end). Thanks to Ryan for reporting this. Only one of the integration tests had even partially covered this case, so I added one that did.
2012-04-20 17:00:05 -04:00
Eric Banks
4b81c75642
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-04-20 14:30:19 -04:00
Eric Banks
f1c5510ec0
When running SelectVariants with the excludeNonVariants option, remove alleles from the ALT field that are no longer polymorphic.
2012-04-20 14:30:04 -04:00
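The ALT-pruning behaviour described in commit f1c5510ec0 could look roughly like the sketch below. It assumes genotypes are tuples of allele indices (0 for REF, None for a no-call), which is an illustrative representation and not the VariantContext API.

```python
def prune_alt_alleles(alts, genotypes):
    """Drop ALT alleles that no remaining genotype carries and renumber the rest.

    alts: list of ALT allele strings, where ALT number i is alts[i - 1].
    genotypes: list of tuples of allele indices (0 = REF, None = no-call).
    Returns the reduced ALT list and the genotypes with remapped indices.
    """
    used = {i for gt in genotypes for i in gt if i is not None and i > 0}
    kept = [i for i in range(1, len(alts) + 1) if i in used]
    # Map old allele indices to new ones; REF always stays at 0.
    remap = {0: 0}
    for new_index, old_index in enumerate(kept, start=1):
        remap[old_index] = new_index
    new_alts = [alts[i - 1] for i in kept]
    new_genotypes = [
        tuple(None if i is None else remap[i] for i in gt) for gt in genotypes
    ]
    return new_alts, new_genotypes


# After subsetting samples, only the second ALT ("G") is still observed.
print(prune_alt_alleles(["T", "G"], [(0, 2), (2, 2), (0, None)]))
```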
Ryan Poplin
a1596791af
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-04-20 14:03:04 -04:00
Ryan Poplin
a57295eb75
Fixing a bug when breaking up active regions where the resulting regions would overlap by one base. Adding quality score manipulation from the UG into the haplotype caller (qual capped by mapping quality, min qual threshold).
2012-04-20 14:02:55 -04:00
Menachem Fromer
40a247e860
Run HC and UG for comparison; run HC with genotypeFullActiveRegion to get phased genotypes
2012-04-20 13:33:09 -04:00
Ryan Poplin
aa903de892
Hooking up the haplotype genotyping option requested by Menachem.
2012-04-20 11:46:37 -04:00
Guillermo del Angel
de68363c23
Removed experimental feature (aka hack) that was meant for 1000G consensus but remained in VQSR data manager: QD was being scaled by indel length. There's no longer any evidence that QD is length-dependent, either in CEU trio data or in the latest 1000G P2 calls.
2012-04-20 10:58:34 -04:00
Mauricio Carneiro
a1561a97c4
Changing the name of the integration test (too long to type) and disabling tests during the triage...
2012-04-19 20:36:51 -04:00
Guillermo del Angel
d2488dfb81
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-04-19 19:40:03 -04:00
Guillermo del Angel
c44c7b9a97
Restored the Pair HMM optimization that computes the HMM matrices only from the index where the haplotypes start to diverge. This saves about 15-20% of runtime, which is what we lost by disabling banding in the latest version, so runtime should now be about the same as it was before refactoring. Output is bit-true to the previous commit.
2012-04-19 19:39:43 -04:00
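The optimization in commit c44c7b9a97 relies on finding where two haplotypes first differ, so that matrix cells depending only on the shared prefix can be reused from the previous haplotype. A minimal sketch of that index computation, under the assumption that haplotypes are compared as plain strings (the function name is hypothetical):

```python
def first_divergence(previous_haplotype, next_haplotype):
    """Return the first index at which two haplotypes differ.

    If one haplotype is a prefix of the other, the length of the
    shorter one is returned. HMM cells for haplotype positions before
    this index do not change between the two haplotypes.
    """
    limit = min(len(previous_haplotype), len(next_haplotype))
    for i in range(limit):
        if previous_haplotype[i] != next_haplotype[i]:
            return i
    return limit


print(first_divergence("ACGTAC", "ACGAAC"))
```

The saving grows with the length of the shared prefix, which fits candidate haplotypes that differ only at a few sites.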
Mauricio Carneiro
0f8c77391d
BQSR bug triage #3
...
* fixed context covariate famous "off by one" error
* reduced maximum quality score to Q50 (following Eric/Ryan's suggestion)
* remove context downsampling in BQSR R script
2012-04-19 17:31:04 -04:00
Khalid Shakir
df5dd841af
AC strat now checks if evals will be merged before throwing an error on multiple eval files.
...
Minor tweaks to WGP script based on new recal VCF format.
2012-04-19 16:08:55 -04:00
Guillermo del Angel
3fa9089085
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-04-19 14:37:34 -04:00
Menachem Fromer
6d5b05c123
Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-04-19 14:18:34 -04:00
Menachem Fromer
53ebde2c3b
Added Queue script to run HaplotypeCaller on user-defined sets of samples at particular loci
2012-04-19 14:17:57 -04:00
Guillermo del Angel
1ae2ab5b63
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-04-19 12:50:29 -04:00