Laurent Francioli
16cc2b864e
- Corrected bug causing cases where both parents are HET to be counted twice in the TDT calculation
- Adapted TDT integration test to the corrected version of TDT
...
Signed-off-by: Ryan Poplin <rpoplin@broadinstitute.org>
2011-12-19 10:30:59 -05:00
Eric Banks
5fd19ae734
Commented exactly how the results are represented from the exact model so developers can know how to use them.
2011-12-19 10:19:00 -05:00
Mark DePristo
5383c50654
Protect ourselves when iteration is present but there's only a single iteration in queueJobReport.R
2011-12-19 10:08:38 -05:00
Eric Banks
3069a689fe
Bug fix: if there are multiple records at a given position, it turns out that SelectVariants would drop all variants that follow after one that fails filters (instead of dropping just the failing one). Added an integration test to cover this case.
2011-12-19 10:04:33 -05:00
Mauricio Carneiro
728d66cca4
Adding Picard imports to the Haplotype Caller
...
Not sure how this passed my tests before, but clearly these imports got deleted by extra-aggressive 'unused imports cleanup' by IntelliJ.
2011-12-19 09:47:48 -05:00
Mauricio Carneiro
5b678e3b94
Remove ClippingOp UnitTests
...
* all testing functionality is in the ReadClipperUnitTest, no need to double test.
* class and package naming cleanup
2011-12-19 07:49:26 -05:00
Matt Hanna
1ead00cac5
New fork of SamFileHeaderMerger should be cached at the thread level to enable fast (and valid) thread lookups.
2011-12-18 19:04:26 -05:00
Ryan Poplin
bc842ab3a5
Adding option to VariantAnnotator to do strict allele matching when annotating with comp track concordance.
2011-12-18 15:27:23 -05:00
Ryan Poplin
953998dcd0
Now that getSampleDB is public in the walker base class this override in VariantAnnotator isn't necessary.
2011-12-18 14:38:59 -05:00
Eric Banks
76bd13a1ed
Forgot to update the unit test
2011-12-18 01:13:49 -05:00
Eric Banks
07f9d14d9f
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-12-18 00:43:15 -05:00
Eric Banks
4d11b20118
updating HC test too
2011-12-18 00:43:01 -05:00
Eric Banks
c5ffe0ab04
No reason to sum the normalized posteriors array to get Pr(AF>0) given that we can just compute 1.0 - array[0]. Integration tests change only because of trivial precision artifacts for reference calls using EMIT_ALL_SITES.
2011-12-18 00:31:47 -05:00
Eric Banks
6dc52d42bf
Implemented the proper QUAL calculation for multi-allelic calls. Integration tests pass except for the ones making multi-allelic calls (duh) and one of the SLOD tests (which used to print 0 when one of the LODs was NaN but now we just don't print the SB annotation for that record).
2011-12-18 00:01:42 -05:00
Khalid Shakir
6059ca76e8
Removing cruft that snuck in last commit.
2011-12-16 23:00:16 -05:00
Khalid Shakir
7486696c07
When using bam list mode in HSP, derive the VCF name from the bam list instead of requiring an additional parameter.
...
Creating a single temporary directory per ant test run instead of putting temp files across all runs in the same directory.
Updated various tests for above items and other small fixes.
2011-12-16 18:09:25 -05:00
Mauricio Carneiro
e5df9e0684
cleaner test output
...
cleaned up the debug "pass" messages in the unit tests
2011-12-16 18:04:00 -05:00
Mauricio Carneiro
fcc21180e8
Added hardClipLeadingInsertions UnitTest for the ReadClipper
...
fixed issue where a read starting with an insertion followed by a deletion would break; the clipper can now safely clip both the insertion and the deletion in that case.
note: test is turned off until contract changes to allow hanging insertions (left/right).
2011-12-16 18:02:47 -05:00
Mauricio Carneiro
075be52adc
Added hardClipByReferenceCoordinates (left and right tails) UnitTest for the ReadClipper
2011-12-16 18:01:33 -05:00
Mauricio Carneiro
5bba44d693
Added hardClipByReferenceCoordinates UnitTest for the ReadClipper
...
* fixed edge case when requested to hard clip beginning of a read that had hanging soft clipped bases on the left tail.
* fixed edge case when requested to hard clip end of a read that had hanging soft clipped bases on the right tail.
* fixed AlignmentStart of a clipped read that results in only hard clips and soft clips
note: added tests to all these beautiful cases...
2011-12-16 18:01:33 -05:00
Mauricio Carneiro
5838ba529d
Added hardClipByReadCoordinates UnitTest for the ReadClipper
2011-12-16 18:01:33 -05:00
Mauricio Carneiro
c26295919e
Added hardClipBothEndsByReferenceCoordinates UnitTest for the ReadClipper
2011-12-16 18:01:33 -05:00
Mark DePristo
1994c3e3bc
Only print warning about allele incompatibility when there are genotypes in the file in CombineVariants
2011-12-16 16:50:51 -05:00
Mark DePristo
1863da4d18
Spawn a more reasonable number of jobs in GATKPerformanceOverTime
2011-12-16 16:50:40 -05:00
Mark DePristo
b6067be952
Support for selecting only variants with specific IDs from a file in SelectVariants
...
-- Cleaned up unused variables as well
2011-12-16 16:50:39 -05:00
Mark DePristo
d6d2f49c88
Don't print log if there are no BAMs
2011-12-16 16:50:36 -05:00
Mark DePristo
dbc2ed2887
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-12-16 15:12:22 -05:00
Mark DePristo
1179588475
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-12-16 15:11:59 -05:00
Mark DePristo
0e0d022e58
RandomForest.R now caches trees to disk to save cpu and storage costs
...
-- Vastly more efficient to write out trees to disk than recompute them all of the time
2011-12-16 15:10:51 -05:00
Mark DePristo
78e0950a77
Minor bug fix for printing in SAMDataSource
2011-12-16 11:45:40 -05:00
Mark DePristo
7bc0d18418
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-12-16 11:42:42 -05:00
Ryan Poplin
5aa79dacfc
Changing hidden optimization argument to advanced.
2011-12-16 10:29:20 -05:00
Matt Hanna
3642a73c07
Performance improvements for dynamically merging BAMs in read walkers.
...
This change and my previous change have dropped runtime when dynamically merging 2k BAM files from 72.6min/1M reads to 46.8sec/1M reads.
Note that many of these changes are stopgaps -- the real problem is the way ReadWalkers interface with Picard, and I'll have to work with
Tim&Co to produce a more maintainable patch.
2011-12-16 09:37:44 -05:00
Mark DePristo
3414ecfe2e
Restored serial version of reader initialization. Serial mode is default, as the performance gains aren't so huge.
...
-- Parallel version can be re-enabled with a static boolean, if we decide to return to the parallel version
-- Comparison of serial and parallel reader with cached and uncached files:
Initialization time: serial with 500 fully cached BAMs: 8.20 seconds
Initialization time: serial with 500 uncached BAMs : 197.02 seconds
Initialization time: parallel with 500 fully cached BAMs: 30.12 seconds
Initialization time: parallel with 500 uncached BAMs : 75.47 seconds
2011-12-16 09:22:10 -05:00
Mark DePristo
fb1c9d2abc
Restored serial version of reader initialization. Parallel mode is default.
...
-- Serial version can be re-enabled with a static boolean, if we decide to return to the serial version
2011-12-16 09:05:28 -05:00
Mark DePristo
16a563889f
GATKPerformance by default runs 10 iterations of each job
2011-12-16 08:27:19 -05:00
Mauricio Carneiro
e61e5c7589
Refactor of ReadClipper unit tests
...
* expanded the systematic cigar string space test framework Roger wrote to all tests
* moved utility functions into Utils and ReadUtils
* cleaned up unused classes
2011-12-15 19:05:43 -05:00
Ryan Poplin
7c58d8e37d
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-12-15 12:52:46 -05:00
Ryan Poplin
f38ed69fd0
Workaround for a known adapter clipping issue. Temporary fix while adapter clipping is being rewritten.
2011-12-15 12:52:34 -05:00
Mauricio Carneiro
4748ae0a14
Bugfix: Softclips before Hardclips weren't being accounted for
...
caught a bug in the hard clipper where it does not account for hard clipping softclipped bases in the resulting cigar string, if there is already a hard clipped base immediately after it.
* updated unit test for hardClipSoftClippedBases with corresponding test-case.
2011-12-15 12:17:25 -05:00
Mauricio Carneiro
62a2e335bc
Changing HardClipper contract to allow UNMAPPED reads
...
shifted the contract to functions that operate on reference based coordinates. The clipper should do the right thing with unmapped reads, but it needs more testing (Ryan is using it at the moment and says it works). Will write some unit tests.
2011-12-15 11:08:19 -05:00
Mark DePristo
4b80e3c034
Support for VariantEval in GATK performance assessment
...
-- Added nt analysis
2011-12-15 09:44:20 -05:00
Mark DePristo
7dc552c1d0
Now subsets down to jobs that actually ran, to avoid numeric problems in printing
2011-12-15 09:03:10 -05:00
Ryan Poplin
598a21d01c
Adding new downsampling argument to haplotype caller qscript.
2011-12-15 08:49:21 -05:00
Mark DePristo
897d44abf9
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-12-15 08:45:33 -05:00
Ryan Poplin
568d972991
Adding downsample per region argument to the haplotype caller
2011-12-15 08:43:48 -05:00
Mark DePristo
550fb498be
Support for NT testing (default up to 4) for CC and UG
...
-- Added convenience function addJobReportBinding to just add a new binding to the map (x -> y) as well
2011-12-14 18:45:00 -05:00
Matt Hanna
9333b678b5
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-12-14 18:05:44 -05:00
Matt Hanna
6fb4be1a09
Cache header merger.
2011-12-14 18:05:31 -05:00
Ryan Poplin
9dbd0ef06a
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-12-14 17:11:56 -05:00