Mark DePristo
2b601571e7
Better error handling in NanoScheduler
...
-- The previous NanoScheduler would deadlock when an Error, rather than an Exception, was thrown. Errors, such as out of memory, would cause the whole system to die. This bugfix resolves that issue.
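A minimal sketch of the failure mode described above, not the actual NanoScheduler code. It assumes a worker thread hands its result to a waiting consumer through a queue; the class and method names (ErrorAwareWorker, runAndWait, Result) are hypothetical. If the worker only caught Exception, an Error would kill the worker silently and the consumer would block forever; catching Throwable lets the failure be reported instead.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration only; none of these names come from the GATK source.
final class ErrorAwareWorker {
    /** Holds either the computed value or the Throwable that prevented it. */
    static final class Result {
        final Integer value;
        final Throwable failure;

        Result(Integer value, Throwable failure) {
            this.value = value;
            this.failure = failure;
        }
    }

    /** Runs the task on a new thread and blocks until the worker reports back. */
    static Result runAndWait(final Callable<Integer> task) {
        final BlockingQueue<Result> queue = new LinkedBlockingQueue<>();
        Thread worker = new Thread(() -> {
            try {
                queue.add(new Result(task.call(), null));
            } catch (Throwable t) {
                // Catching only Exception would miss Errors (e.g. OutOfMemoryError),
                // nothing would be enqueued, and take() below would never return.
                queue.add(new Result(null, t));
            }
        });
        worker.start();
        try {
            return queue.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
    }
}
```

The caller can then inspect `failure` and rethrow it on the main thread, so an Error surfaces as a normal failure rather than a hang.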
2012-12-05 14:49:22 -05:00
Eric Banks
0c925856cb
Merge branch 'master' of github.com:broadinstitute/gsa-unstable
2012-12-05 02:00:39 -05:00
Eric Banks
ef87b18e09
In retrospect, it wasn't a good idea to have FisherStrand handle reduced reads since they are always on the forward strand. For now, FS ignores reduced reads but I've added a note (and JIRA) to make this work once the RR het compression is enabled (since we will have directionality in reads then).
2012-12-05 02:00:35 -05:00
Mauricio Carneiro
30f013aeb0
Added a copy() method for ReadBackedPileups
...
Necessary to create new alignment contexts with hard copies of the pileup.
2012-12-05 01:32:18 -05:00
Mauricio Carneiro
6feda540a4
Better error message for SimpleGATKReports
2012-12-05 01:32:18 -05:00
Randal Moore
8d2d0253a2
introduce a level of indirection for the forum URLs - this new function will allow me a place to morph the URL into something that is supported by Confluence
...
Signed-off-by: Eric Banks <ebanks@broadinstitute.org>
2012-12-03 22:33:02 -05:00
Eric Banks
67932b357d
Bug fix for RR: don't let the softclip start position be less than 1
2012-12-03 15:59:14 -05:00
Ryan Poplin
a47da9bb2f
Merge branch 'master' of github.com:broadinstitute/gsa-unstable
2012-12-03 14:30:14 -05:00
Eric Banks
5fed9df295
Quick fix: base qual array in the GATKSAMRecord stores the actual phred values (-33) and not the original bytes (duh).
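For reference, a small sketch of the relationship between the two representations mentioned above: the SAM text format stores each quality as the character with ASCII code Q + 33, while the decoded array holds Q itself. The class and method names here are hypothetical and are not the GATKSAMRecord API.

```java
// Hypothetical helper; shows only the Phred+33 arithmetic.
final class PhredDecode {
    static final int SAM_OFFSET = 33;

    /** Converts a SAM-style quality string into actual Phred values. */
    static byte[] fromAscii(String qualString) {
        byte[] quals = new byte[qualString.length()];
        for (int i = 0; i < quals.length; i++) {
            quals[i] = (byte) (qualString.charAt(i) - SAM_OFFSET);
        }
        return quals;
    }
}
```

For example, the string "!I" decodes to the Phred values 0 and 40.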
2012-12-03 12:18:20 -05:00
Eric Banks
b6839b3049
Added checking in the GATK for mis-encoded quality scores.
...
The check is performed by a Read Transformer that samples (currently set to once
every 1000 reads so that we don't hurt overall GATK performance) from the input
reads and checks to make sure that none of the base quals is too high (> Q60). If
we encounter such a base then we fail with a User Error.
* Can be over-ridden with --allow_potentially_misencoded_quality_scores.
* Also, the user can choose to fix his quals on the fly (presumably using PrintReads
to write out a fixed bam) with the --fix_misencoded_quality_scores argument.
Added unit tests.
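The sampled check described above could be sketched roughly as follows. This is not the Read Transformer itself: the class name, method names, and exception type are hypothetical, and the shift of 31 used by the fix is an assumption (the difference between a Phred+64 and a Phred+33 encoding). The sampling interval and the Q60 ceiling come from the description above.

```java
// Hypothetical sketch of a sampled quality-encoding check.
final class MisencodedQualChecker {
    static final int SAMPLING_INTERVAL = 1000; // check one read in every 1000
    static final int MAX_PLAUSIBLE_QUAL = 60;  // anything above Q60 is suspicious
    static final int ENCODING_SHIFT = 31;      // assumption: 64 - 33

    private long readsSeen = 0;

    /** Inspects every 1000th read (starting with the first) and fails on a qual above Q60. */
    void check(byte[] quals) {
        if (readsSeen++ % SAMPLING_INTERVAL != 0) {
            return; // this read is not sampled
        }
        for (byte q : quals) {
            if (q > MAX_PLAUSIBLE_QUAL) {
                throw new IllegalArgumentException(
                        "base quality " + q + " exceeds Q" + MAX_PLAUSIBLE_QUAL
                        + "; the input may use a different quality encoding");
            }
        }
    }

    /** Returns a copy of the qualities shifted down by the assumed encoding difference. */
    static byte[] fix(byte[] quals) {
        byte[] fixed = new byte[quals.length];
        for (int i = 0; i < quals.length; i++) {
            fixed[i] = (byte) (quals[i] - ENCODING_SHIFT);
        }
        return fixed;
    }
}
```

Because only sampled reads are inspected, a single bad read between samples goes unnoticed; the check is aimed at inputs that are mis-encoded throughout.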
2012-12-03 11:18:41 -05:00
Ryan Poplin
18b002c99c
Merge branch 'master' of github.com:broadinstitute/gsa-unstable
2012-12-03 10:08:56 -05:00
Ryan Poplin
1bdf17ef53
Reworking of how the likelihood calculation is organized in the HaplotypeCaller to facilitate the inclusion of per allele downsampling. We now use the downsampling for both the GL calculations and the annotation calculations.
2012-12-02 11:58:32 -05:00
Mark DePristo
8020ba14db
Minor cleanup of SAMDataSource as part of my system review
...
-- Changed a few functions from public to protected, as they are only used by the package contents, to simplify the SAMDataSource interface
2012-11-30 15:04:41 -05:00
Mauricio Carneiro
fc7fab5f3b
Fixed ReadBackedPileup downsampling
...
Downsampling in the PerSampleReadBackedPileup was broken: it didn't downsample anything, always returning a copy of the original pileup.
2012-11-30 00:42:05 -05:00
Joel Thibault
97d29f203e
Add walltime changes to LSF
...
- Check whether the specified attribute is available
- Add pipeline test (disabled due to missing attribute)
2012-11-29 15:23:37 -05:00
Joel Thibault
198923b597
Add ActiveRegionReadState handling
2012-11-28 13:59:57 -05:00
Ryan Poplin
f0395b457a
Adding the work-in-progress, experimental RepeatLengthCovariate to the BQSR so Chris can continue the development.
2012-11-28 13:56:32 -05:00
Eric Banks
3463774f2a
Merged bug fix from Stable into Unstable
2012-11-28 13:26:52 -05:00
Eric Banks
6030605242
Added quick check for creation of bad BAQ values associated with badly encoded base qualities; hopefully this can help us debug the non-reproducible issue seen by many users.
2012-11-28 13:26:31 -05:00
Mark DePristo
c676853731
Merged bug fix from Stable into Unstable. Updating md5s
...
Conflicts:
protected/java/test/org/broadinstitute/sting/gatk/walkers/genotyper/UnifiedGenotyperIntegrationTest.java
2012-11-28 12:54:36 -05:00
Mark DePristo
a1d6461121
Critical bugfix to AFCalcResult affecting UG/HC quality score emission thresholds
...
As reported by Menachem Fromer: a critical bug in AFCalcResult:
Specifically, the implementation:
    public boolean isPolymorphic(final Allele allele, final double log10minPNonRef) {
        return getLog10PosteriorOfAFGt0ForAllele(allele) >= log10minPNonRef;
    }
seems incorrect and should probably be:
    getLog10PosteriorOfAFEq0ForAllele(allele) <= log10minPNonRef
The issue here is that the 30 represents a Phred-scaled probability of *error* and it's currently being compared to a log probability of *non-error*.
Instead, we need to require that our probability of error be less than the error threshold.
This bug has only a minor impact on the calls -- hardly any sites change -- which is good. But the inverted logic affects multi-allelic sites significantly. Basically you only hit this logic with multiple alleles, and in that case it's including extra alt alleles incorrectly, and throwing out good ones.
Change was to create a new function that properly handles thresholds that are PhredScaled quality scores:
    /**
     * Same as #isPolymorphic but takes a phred-scaled quality score as input
     */
    public boolean isPolymorphicPhredScaledQual(final Allele allele, final double minPNonRefPhredScaledQual) {
        if ( minPNonRefPhredScaledQual < 0 ) throw new IllegalArgumentException("phredScaledQual " + minPNonRefPhredScaledQual + " < 0 ");
        final double log10Threshold = Math.log10(QualityUtils.qualToProb(minPNonRefPhredScaledQual));
        return isPolymorphic(allele, log10Threshold);
    }
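To make the unit conversion concrete, here is a standalone sketch of the arithmetic. It assumes QualityUtils.qualToProb returns the probability of being correct, 1 - 10^(-Q/10); the local qualToProb below is a stand-in written for this example, not the GATK method.

```java
// Hypothetical standalone version of the threshold conversion.
final class PhredThreshold {
    /** Assumed behaviour of qualToProb: probability that the call is not an error. */
    static double qualToProb(double phredQual) {
        return 1.0 - Math.pow(10.0, -phredQual / 10.0);
    }

    /** Converts a Phred-scaled error threshold into a log10 probability of non-error. */
    static double log10Threshold(double phredQual) {
        if (phredQual < 0) {
            throw new IllegalArgumentException("phredScaledQual " + phredQual + " < 0");
        }
        return Math.log10(qualToProb(phredQual));
    }
}
```

Under that assumption, a threshold of Q30 corresponds to an error probability of 0.001, a non-error probability of 0.999, and a log10 threshold of roughly -0.000435. Comparing a log10 posterior against 30 directly, or against log10 of the error probability (-3), would be testing a different quantity.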
2012-11-28 12:08:02 -05:00
Menachem Fromer
79bc878e6a
Allow debugging to be set from the command line
2012-11-27 22:37:41 -05:00
Eric Banks
b40d3eb8aa
Merged bug fix from Stable into Unstable
2012-11-27 14:41:07 -05:00
Eric Banks
01abcc3e0f
Tests didn't like my note to Geraldine in the output logs; apparently it's tested in integration tests
2012-11-27 14:40:49 -05:00
Joel Thibault
d83ad906ef
Add profile range contract
2012-11-27 13:03:13 -05:00
Eric Banks
9531e58445
Merged bug fix from Stable into Unstable
2012-11-27 11:00:50 -05:00
Eric Banks
4543ece088
Fixing parsing of genomelocs that contain colons in the contig names (which is allowed by the spec) as reported on the forum. Added unit test for this case.
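One way to handle such names, sketched under the assumption that the interval string has the form contig:start-stop: split on the last colon rather than the first. The class and method names are hypothetical and this is not the actual parser. Note that this simple version would misread a bare contig name that itself contains a colon and carries no position, so a real parser needs an additional check against the known contig names.

```java
// Hypothetical sketch: split "contig:start-stop" on the LAST colon.
final class LocParser {
    /** Returns {contig, start, stop}; start and stop are null when no position is given. */
    static String[] parse(String loc) {
        int colon = loc.lastIndexOf(':');
        if (colon < 0) {
            return new String[] {loc, null, null}; // whole-contig interval
        }
        String contig = loc.substring(0, colon);
        String range = loc.substring(colon + 1);
        int dash = range.indexOf('-');
        String start = dash < 0 ? range : range.substring(0, dash);
        String stop = dash < 0 ? start : range.substring(dash + 1);
        return new String[] {contig, start, stop};
    }
}
```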
2012-11-27 11:00:33 -05:00
Eric Banks
a82ec7ad80
Merged bug fix from Stable into Unstable
2012-11-27 10:27:08 -05:00
Eric Banks
e199562c25
I have pulled out all of the documentation URLs and put them into the HelpUtils class as static variables; this way, Appistry can change links as needed to point commercial users to their own internal forum without having to muck things up all over our source. Added some TODOs for Geraldine to update links in the GATK docs that still point to the old wiki. Sorry that I am pushing into stable, but that's what Appistry is pulling from for their release next week (and unstable has been failing forever).
2012-11-27 10:26:17 -05:00
Mauricio Carneiro
97fd5de260
Merging latest CMI updates with UNSTABLE
2012-11-27 09:08:00 -05:00
Eric Banks
b1969a66bd
Update docs
2012-11-27 08:24:41 -05:00
Eric Banks
cc72aaefeb
Minor efficiency: use >= instead of > in test
2012-11-27 01:11:23 -05:00
Eric Banks
405f3c675d
Fix for GSA-649: GenomeLocSortedSet.overlaps is crazy slow. Also improved GenomeLocSortedSet.sizeBeforeLoc.
2012-11-27 01:07:00 -05:00
Ryan Poplin
e27d677c13
Merge branch 'master' of github.com:broadinstitute/gsa-unstable
2012-11-26 12:20:32 -05:00
Ryan Poplin
c3b7dd1374
Misc cleanup in the HaplotypeCaller. Cleaning up unused arguments after recent changes to HC-GenotypingEngine
2012-11-26 12:19:11 -05:00
Eric Banks
4f7fa3009a
I forget why I thought that the VariantAnnotator couldn't run multi-threaded because it works just fine. Now you can specify -nt with VA.
2012-11-26 11:34:59 -05:00
Mauricio Carneiro
a3f5932501
Fixed null pointer exception in Integration Tests
...
When running Utils.setupWriter with NO_PG_TAG set, the writer was attempting to create a program record with the null pointer. Fixed.
2012-11-26 11:12:27 -05:00
Ryan Poplin
fedc4fde6c
Merged bug fix from Stable into Unstable
2012-11-25 21:55:55 -05:00
Ryan Poplin
d978cfe835
Soft clipped bases shouldn't be counted in the delocalized BQSR.
2012-11-25 21:55:29 -05:00
Eric Banks
9719ba7adc
Remove -number example from the docs since it's no longer supported.
2012-11-22 21:53:42 -05:00
Menachem Fromer
2306518ab6
Fix to deal with 'proper' options of casting
2012-11-22 01:45:18 -05:00
Menachem Fromer
d33a412b5f
Merge branch 'master' of github.com:broadinstitute/gsa-unstable
2012-11-22 01:42:29 -05:00
Mark DePristo
48f271c5bd
Adding 80% support for multi-allelic variants
...
-- Multi-allelic variants are split into their bi-allelic version, trimmed, and we attempt to provide a meaningful genotype for NA12878 here. It's not perfect and needs some discussion on how to handle het/alt variants
-- Adding splitInBiallelic function to VariantContextUtils as well as extensive unit tests that also indirectly test reverseTrimAlleles (which worked perfectly FYI)
2012-11-21 17:24:59 -05:00
Joel Thibault
c08b782743
Count isActive calls directly
2012-11-21 17:16:45 -05:00
Eric Banks
4f2229d399
As per the TODO message, I removed a check that was no longer necessary. Now ID is an allowable INFO field key.
2012-11-21 16:01:26 -05:00
Menachem Fromer
06261b58c2
Merge branch 'master' of github.com:broadinstitute/gsa-unstable
2012-11-21 15:57:08 -05:00
Eric Banks
ed50814ccb
Finally found a case where user errors were being masked behind other errors and could debug. It turns out that the checkForMaskedUserErrors() method needs to run recursively over all levels (calling exception.getCause()) to check for the original cause.
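The recursive walk over the cause chain might look like the sketch below. UserError is a hypothetical stand-in for the user-error exception type, and findUserError is not the actual checkForMaskedUserErrors() implementation.

```java
// Hypothetical sketch of unwrapping nested exceptions to find a masked user error.
final class MaskedErrorFinder {
    /** Stand-in for the real user-error exception type. */
    static class UserError extends RuntimeException {
        UserError(String message) {
            super(message);
        }
    }

    /** Returns the first UserError in the cause chain, or null if there is none. */
    static UserError findUserError(Throwable t) {
        if (t == null) {
            return null; // reached the end of the chain
        }
        if (t instanceof UserError) {
            return (UserError) t;
        }
        return findUserError(t.getCause()); // look one level deeper
    }
}
```

Checking only the top-level exception, or only its immediate cause, misses a user error that has been wrapped two or more times.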
2012-11-21 15:57:05 -05:00
Menachem Fromer
c8be7c3102
Keep SNPs and indels separate for batch merging; Add options to DepthOfCoverage to count fragments (to not double-count overlapping reads of the same fragment); DepthOfCoverage should now support ReducedReads; Replace recursion with loop in DoC/package.scala (for lists longer than 5000 elements)
2012-11-21 15:56:53 -05:00
Eric Banks
2e1a055aca
Merged bug fix from Stable into Unstable
2012-11-20 23:20:33 -05:00
Eric Banks
c54fc94505
Protect against features that start off the end of the read (otherwise, Arrays.fill fails)
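A guard of this kind could be sketched as follows, assuming 0-based inclusive feature coordinates relative to the read; the class and method names are hypothetical. Arrays.fill(array, from, to, value) throws if the range falls outside the array or if from > to, so the range is checked and clamped first.

```java
import java.util.Arrays;

// Hypothetical sketch: mark the read positions covered by a feature.
final class FeatureMask {
    /** Returns a per-base mask; features entirely off the read leave it all false. */
    static boolean[] mask(int readLength, int featureStart, int featureStop) {
        boolean[] covered = new boolean[readLength];
        if (featureStart >= readLength || featureStop < 0 || featureStop < featureStart) {
            return covered; // feature starts past the end, ends before the start, or is empty
        }
        int from = Math.max(0, featureStart);
        int to = Math.min(readLength, featureStop + 1); // exclusive upper bound
        Arrays.fill(covered, from, to, true);
        return covered;
    }
}
```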
2012-11-20 23:19:59 -05:00