This bug was uncovered by some untrimmed alleles in the single-sample pipeline output.
Note, however, that this change does not fix untrimmed alleles in general.
Story:
https://www.pivotaltracker.com/story/show/65481104
Changes:
1. Fixed the bug itself.
2. Fixed non-working tests (silently skipped due to an exception in the dataProvider).
Note that this tool is still a work in progress and very experimental, so it isn't 100% stable. Most of
the features are untested (both by people and by unit/integration tests) because Chris Hartl implemented
it right before he left, and we're going to need to add tests at some point soon. I added a first
integration test in this commit, but it's just a start.
The fixes include:
1. Stop having the genotyping code strip out AD values. It doesn't make sense that it should do this so
I don't know why it was doing that at all.
Updated GenotypeGVCFs so that it doesn't need to manually recover them anymore.
This also helps CalculateGenotypePosteriors which was losing the AD values.
Updated code in LeftAlignAndTrimVariants to strip out PLs and AD, since it wasn't doing that before.
Updated the integration test for that walker to include such data.
2. Chris was calling Math.pow directly on the normalized posteriors, which isn't safe.
Instead, the normalization routine itself can revert to log scale in a safe manner, so let's use it.
Also, renamed the variable to posteriorProbabilities (and not likelihoods).
3. Have CGP update the AC/AF/AN counts after fixing GTs.
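As an illustration of item 2, here is a minimal sketch of a normalization routine that handles the scale conversion itself. It is not the GATK implementation: the class name, the method name, and the `keepLog10` flag are hypothetical, and the sketch assumes the input contains at least one finite value. The point is that subtracting the maximum before exponentiating keeps the largest term at 10^0 = 1, so very negative log10 values do not all underflow to zero.

```java
import java.util.Arrays;

public class Log10NormalizationSketch {
    /**
     * Normalizes log10-scaled values so that they sum to 1 in real space.
     * Hypothetical helper for illustration; assumes at least one finite input.
     *
     * @param log10Values unnormalized values in log10 space
     * @param keepLog10   if true, return the normalized values in log10 space,
     *                    otherwise return real-space probabilities
     */
    public static double[] normalizeFromLog10(double[] log10Values, boolean keepLog10) {
        // Find the largest value so everything can be shifted relative to it.
        double max = Double.NEGATIVE_INFINITY;
        for (double v : log10Values) {
            max = Math.max(max, v);
        }

        // Exponentiate the shifted values; the largest one becomes exactly 1.0.
        double[] shifted = new double[log10Values.length];
        double sum = 0.0;
        for (int i = 0; i < log10Values.length; i++) {
            shifted[i] = Math.pow(10.0, log10Values[i] - max);
            sum += shifted[i];
        }

        double log10Sum = Math.log10(sum);
        double[] result = new double[log10Values.length];
        for (int i = 0; i < log10Values.length; i++) {
            result[i] = keepLog10
                    ? log10Values[i] - max - log10Sum  // stay in log10 space
                    : shifted[i] / sum;                // real-space probability
        }
        return result;
    }

    public static void main(String[] args) {
        // Calling Math.pow(10, -400) directly would underflow to 0.0.
        double[] log10Posteriors = {-400.0, -401.0, -403.0};
        System.out.println(Arrays.toString(normalizeFromLog10(log10Posteriors, false)));
        System.out.println(Arrays.toString(normalizeFromLog10(log10Posteriors, true)));
    }
}
```

Letting the caller choose the output scale means no code outside the routine needs to call Math.pow on the results.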
commit 5e73b94eed3d1fc75c88863c2cf07d5972eb348b
Merge: e12593a d04a585
Author: Nicholas Clarke <nc6@sanger.ac.uk>
Date: Fri Feb 14 09:25:22 2014 +0000
Merge pull request #1 from broadinstitute/checkpoint
SimpleTimer passes tests, with formatting
commit d04a58533f1bf5e39b0b43018c9db3302943d985
Author: kshakir <github@kshakir.org>
Date: Fri Feb 14 14:46:01 2014 +0800
SimpleTimer passes tests, with formatting
Fixed getNanoOffset() to offset nano to nano, instead of nano to seconds.
Updated warning message with comma separated numbers, and exact values of offsets.
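A rough sketch of the unit fix described above, assuming the offset is the difference between wall-clock time and the JVM nano clock. The class and method names here are hypothetical and not taken from SimpleTimer; the only points illustrated are converting both operands to nanoseconds with `TimeUnit` before subtracting, and formatting the value with comma separators.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;

public class NanoOffsetSketch {
    /**
     * Offset between wall-clock time and the nano clock, with both
     * operands expressed in nanoseconds before the subtraction.
     */
    public static long nanoOffset(long wallClockMillis, long nanoTime) {
        return TimeUnit.MILLISECONDS.toNanos(wallClockMillis) - nanoTime;
    }

    /** Formats an offset with comma-separated digits, e.g. for a warning message. */
    public static String describe(long offsetNanos) {
        return String.format(Locale.US, "offset = %,d ns", offsetNanos);
    }

    public static void main(String[] args) {
        long offset = nanoOffset(System.currentTimeMillis(), System.nanoTime());
        System.out.println(describe(offset));
    }
}
```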
commit e12593ae66a5e6f0819316f2a580dbc7ae5896ad
Author: Nicholas Clarke <nc6@sanger.ac.uk>
Date: Wed Feb 12 13:27:07 2014 +0000
Remove instance of 'Timer'.
commit 47a73e0b123d4257b57cfc926a5bdd75d709fcf9
Author: Nicholas Clarke <nc6@sanger.ac.uk>
Date: Wed Feb 12 12:19:00 2014 +0000
Revert a couple of changes that survived somehow.
- CheckpointableTimer,Timer -> SimpleTimer
commit d86d9888ae93400514a8119dc2024e0a101f7170
Author: Nicholas Clarke <nc6@sanger.ac.uk>
Date: Mon Jan 20 14:13:09 2014 +0000
Revised commits following comments.
- All utility merged into `SimpleTimer`.
- All tests merged into `SimpleTimerUnitTest`.
- Behaviour of `getElapsedTime` should now be consistent with `stop`.
- Use 'TimeUnit' class for all unit conversions.
- A bit more tidying.
commit 354ee49b7fc880e944ff9df4343a86e9a5d477c7
Author: Nicholas Clarke <nc6@sanger.ac.uk>
Date: Fri Jan 17 17:04:39 2014 +0000
Add a new CheckpointableTimerUnitTest.
Revert SimpleTimerUnitTest to the version before any changes were made.
commit 2ad1b6c87c158399ededd706525c776372bbaf6e
Author: Nicholas Clarke <nc6@sanger.ac.uk>
Date: Tue Jan 14 16:11:18 2014 +0000
Add test specifically checking behaviour under checkpoint/restart.
Slight alteration to the checkpointable timer based on observations
during the testing - it seems that there's a fair amount of drift
between the sources anyway, so each time we stop we resynchronise the
offset. Hopefully this should avoid gradual drift building up and
presenting as checkpoint/restart drift.
commit 1c98881594dc51e4e2365ac95b31d410326d8b53
Author: Nicholas Clarke <nc6@sanger.ac.uk>
Date: Tue Jan 14 14:11:31 2014 +0000
Should use consistent time units
commit 6f70d42d660b31eee4c2e9d918e74c4129f46036
Author: Nicholas Clarke <nc6@sanger.ac.uk>
Date: Tue Jan 14 14:01:10 2014 +0000
Add a new timer supporting checkpoint mechanisms.
The issue with this is that the current timer is locked to JVM nanoTime. This can be reset after
a checkpoint/restart and result in negative elapsed times, which causes an error.
This patch addresses the issue in two ways:
- Moves the check on timer information in GenomeAnalysisEngine.java to only occur if a time limit has been
set.
- Create a new timer (CheckpointableTimer) which keeps track of the relation between system and nano time. If
this changes drastically, then the assumption is that there has been a JVM restart owing to checkpoint/restart.
Any time straddling a checkpoint/restart event will not be counted towards total running time.
Signed-off-by: Khalid Shakir <kshakir@broadinstitute.org>
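The checkpoint-aware timing idea in the commit above can be sketched roughly as follows. This is not the CheckpointableTimer or SimpleTimer code: the `Clock` interface, `ManualClock`, the `maxDriftNanos` threshold, and all method names are assumptions made so the example is self-contained. The sketch records the offset between wall-clock and nano time when an interval starts, and discards the interval if that offset has shifted by more than the threshold when it stops.

```java
import java.util.concurrent.TimeUnit;

public class CheckpointAwareTimerSketch {
    /** Hypothetical clock abstraction so a restart can be simulated. */
    public interface Clock {
        long nanoTime();
        long currentTimeMillis();
    }

    /** Hand-driven clock for demonstrations and tests. */
    public static class ManualClock implements Clock {
        public long nanos;
        public long millis;
        public long nanoTime() { return nanos; }
        public long currentTimeMillis() { return millis; }
    }

    private final Clock clock;
    private final long maxDriftNanos;
    private long startNanos;
    private long startOffsetNanos;
    private long elapsedNanos;
    private boolean running;

    public CheckpointAwareTimerSketch(Clock clock, long maxDriftNanos) {
        this.clock = clock;
        this.maxDriftNanos = maxDriftNanos;
    }

    /** Relation between system time and nano time, both in nanoseconds. */
    private long offsetNanos() {
        return TimeUnit.MILLISECONDS.toNanos(clock.currentTimeMillis()) - clock.nanoTime();
    }

    public void start() {
        startNanos = clock.nanoTime();
        // Resynchronise the offset for every interval so slow drift between
        // the two clocks does not accumulate and look like a restart.
        startOffsetNanos = offsetNanos();
        running = true;
    }

    public void stop() {
        if (!running) {
            return;
        }
        running = false;
        long drift = Math.abs(offsetNanos() - startOffsetNanos);
        if (drift <= maxDriftNanos) {
            elapsedNanos += clock.nanoTime() - startNanos;
        }
        // Otherwise the interval is assumed to straddle a checkpoint/restart
        // and is not counted towards total running time.
    }

    public long getElapsedNanos() {
        return elapsedNanos;
    }
}
```

In real use the clock would delegate to `System.nanoTime()` and `System.currentTimeMillis()`; a suitable drift threshold is a judgment call that this sketch leaves to the caller.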
Updated path for output gatkdocs in nightly build script.
Removed patch in plugin manager that contained a workaround for gatkdocs running in the top level directory.
After extensive detective work, Joel determined that these tests were failing
due to changes in the implementation of Math.pow() in newer versions of
Java 1.7.
All GSA members should ensure that they're using a JDK that is at least
as current as the one in the Java-1.7 dotkit on the Broad servers
(build 1.7.0_51-b13).
This change should allow us to test that the GATK jar has been
correctly packaged at release time, by ensuring that only the
packaged jar + a few test-related dependencies are on the classpath
when tests are run.
Note that we still need to actually test that this works as intended
before we can make this live in the Bamboo release plan.
1. AD values now propagate up (they weren't before).
2. MIN_DP gets transferred over to DP and removed.
3. SB gets removed after FS is calculated.
Also, added a bunch of new integration tests for GenotypeGVCFs.
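To make items 2 and 3 concrete, here is a small sketch of that kind of attribute cleanup on a plain map. It does not use the GATK or htsjdk variant classes: the class and method names are hypothetical, the map stands in for a record's attributes, and letting MIN_DP overwrite any existing DP is an assumption. The FS calculation itself is out of scope and is presumed to have already consumed SB.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AttributeCleanupSketch {
    /**
     * Returns a copy of the attributes with MIN_DP moved to DP and SB dropped.
     * The input map is left unmodified.
     */
    public static Map<String, Object> cleanUp(Map<String, Object> attributes) {
        Map<String, Object> cleaned = new LinkedHashMap<String, Object>(attributes);

        // Transfer MIN_DP to DP, then forget MIN_DP (assumption: it replaces DP).
        Object minDp = cleaned.remove("MIN_DP");
        if (minDp != null) {
            cleaned.put("DP", minDp);
        }

        // SB is only needed as input to FS, so it is dropped afterwards.
        cleaned.remove("SB");
        return cleaned;
    }

    public static void main(String[] args) {
        Map<String, Object> attributes = new LinkedHashMap<String, Object>();
        attributes.put("GQ", 30);
        attributes.put("MIN_DP", 12);
        attributes.put("SB", "1,2,3,4");
        System.out.println(cleanUp(attributes));
    }
}
```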
This tool will take any number of gVCFs and create a merged gVCF (as opposed to
GenotypeGVCFs which produces a standard VCF).
Added unit/integration tests and fixed up GATK docs.
New properties to disable regenerating example resources artifact when each parallel test runs under packagetest.
Moved collection of packagetest parameters from shell scripts into maven profiles.
Fixed necessity of test-utils jar by removing incorrect dependenciesToScan element during packagetests.
When building picard libraries, run clean first.
Fixed tools jar dependency in picard pom.
Integration tests properly use the ant-bridge.sh test.debug.port variable, like unit tests.
Story:
https://www.pivotaltracker.com/story/show/65048706
https://www.pivotaltracker.com/story/show/65116908
Changes:
ActiveRegionTrimmer is now an argument collection, and it returns not only the trimmed-down active region but also the flanking regions that contain no variants.
HaplotypeCaller code has been simplified significantly by pushing some functionality to other classes such as ActiveRegion and AssemblyResultSet.
Fixed a problem with the way the trimming was done that caused some gVCF non-variant records to have conservative 0,0,0 PLs.
These changes happened in Tribble, but Joel clobbered them with his commit.
We can now change the logging priority on failures to validate the sequence dictionary to WARN.
Thanks to Tim F for indirectly pointing this out.
1. Throw a user error when the input data for a given genotype does not contain PLs.
2. Add VCF header line for --dbsnp input
3. Need to check that the UG result is not null
4. Don't error out at positions with no gVCFs (which is possible when using a dbSNP rod)
Joel is working on these failures in a separate branch. Since
maven (currently! we're working on this..) won't run the whole
test suite to completion if there's a failure early on, we need
to temporarily disable these tests in order to allow group members
to run tests on their branches again.
Added pom.xml workarounds for a duplicate classpath error caused by the gatk-framework dependency containing the required BaseTest, and by jarred *UnitTest/*IntegrationTest classes that also exist as files under target/test-classes.