gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Ryan Poplin	26e35e5ee2	updating BQSR integration tests	2012-09-19 14:10:34 -04:00
Ryan Poplin	b99099f05c	The BaseRecalibrator and DelocalizedBaseRecalibrator have gotten out of sync. Fixing.	2012-09-19 12:30:26 -04:00
Ryan Poplin	7a7103a757	Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-09-19 10:39:18 -04:00
Ryan Poplin	0ea543e1fd	Removing testing scaffolding from delocalized BQSR. The output recal table reports the data as doubles instead of integers. This changes the mapping-based BQSR integration tests. Final intermediate push before delocalized BQSR replaces previous BQSR.	2012-09-19 10:39:06 -04:00
Guillermo del Angel	bebd5c14b8	Update general ploidy md5's due to bad merge of md5's in previous commit, and new shortened interval definition for EMIT_ALL_CONFIDENT_SITES was buggy	2012-09-18 20:12:15 -04:00
Ami Levy Moonshine	ccc3f4ff8d	Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-09-17 09:58:27 -04:00
Ami Levy Moonshine	ebf609f757	new R script for summary tables of the pipeline	2012-09-17 09:57:10 -04:00
Ami Levy Moonshine	ee0b17d98f	typo in VE	2012-09-17 09:51:51 -04:00
Guillermo del Angel	ca010160a9	Merge fix	2012-09-14 14:05:21 -04:00
Guillermo del Angel	6b37350bc0	Two hairy bugs in pool caller: a) Site error model wasn't counting errors in insertions correctly - Alleles passed in had padded ref byte, but event base in PileupElement doesn't have it. As a result, mismatch rate was grossly overestimated with insertions and we missed several calls we should have made. Integration test reflects changes. b) Adding a ref GL to the exact model is correct mathematically but AFResult wasn't filled properly. As a result, QUAL was junk in pure ref sites, and in all other sites the last ref GL introduced wasn't properly updating Pr(AF>0). c) Added integration test that covers -out_mode EMIT_ALL_CONFIDENT_SITES. Not fully sure if the math is 100% correct (for both diploid and generalized case) but at least now diploid and non-diploid cases behave similarly. md5 of this new test will fail since it's taking me a long time to run so I'll update from Bamboo output shortly	2012-09-14 13:13:22 -04:00
Ryan Poplin	f4ac92e95c	Add clipping of the adaptor sequence to the delocalized BQSR.	2012-09-14 11:51:54 -04:00
Ryan Poplin	3585f5375e	Bug fix so that the delocalized BAQ GOP parameter is actually used by the BQSR.	2012-09-14 11:02:14 -04:00
Eric Banks	86be50f18d	Add note to docs that the --list argument requires full command-line	2012-09-14 10:58:44 -04:00
Menachem Fromer	182344ad89	Merge branch 'master' of ssh://gsa3.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-09-12 23:56:44 -04:00
Menachem Fromer	3d3578b1de	Deal with empty Seq	2012-09-12 23:54:41 -04:00
Ryan Poplin	d380ef9956	revert 82b0bab5fbc4e57e0db30b0ec3d4676fccef40ba, bad idea	2012-09-12 15:42:29 -04:00
Ryan Poplin	e7200f1a40	adding verbose debug statements in BQSR	2012-09-12 15:40:07 -04:00
Eric Banks	0206e09a6a	Merge branch 'master' of ssh://gsa2/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-09-12 15:18:27 -04:00
Eric Banks	d94d0d15c2	Complete overhaul of previous commits to make it all work with scatter-gather. Now tracks output files correctly and can print to stdout.	2012-09-12 15:15:40 -04:00
Ryan Poplin	699a7801b6	Force the in-walker BAQ calculation to use the new BAQGOP parameter.	2012-09-12 14:59:31 -04:00
Ryan Poplin	c9111bb23e	Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-09-12 14:46:50 -04:00
Ryan Poplin	fafecf4ffd	Adding BAQGOP parameter to the delocalized BQSR.	2012-09-12 14:46:18 -04:00
Ryan Poplin	bc1e03a6d8	Adding HC integration test for _structural_ insertions and deletions.	2012-09-12 12:25:39 -04:00
Ryan Poplin	faad2972d6	Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-09-12 12:23:24 -04:00
Ryan Poplin	849a2b8839	Adding HC integration test for _structural_ insertions and deletions.	2012-09-12 12:23:00 -04:00
Eric Banks	4bb7a99f08	Given that all classes implementing output stubs already have getters for the underlying OutputStream and File, it makes sense to unify that functionality into the Stub interface. Now it is possible to have an Engine utility method that iterates over all registered stubs to find the one representing a given OutputStream and return the File associated with it.	2012-09-12 11:51:44 -04:00
Eric Banks	994a4ff387	Track all outputs from BQSR (.table, .csv, and .pdf) as @Output arguments. Updated integration tests because we no longer have command-line options not to generate plots (now just don't provide a pdf) or to keep the intermediate csv (now, just provide a filename on the command-line). This is currently busted because we can't access the original filenames from the Engine's storage/stub system and therefore cannot call out to the Rscript with the executor (which requires filename strings).	2012-09-12 11:24:53 -04:00
Christopher Hartl	96be1cbea9	My own integration test isn't passing with a clean checkout. This fix to the walker ought to do it.	2012-09-12 10:11:06 -04:00
Christopher Hartl	546586b70e	Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-09-12 10:09:42 -04:00
Mark DePristo	bfbf1686cd	Fixed nasty bug with defaulting to diploid no-call genotypes -- For the pooled caller we were writing diploid no-calls even when other samples were haploid. Changed maxPloidy function to return a defaultPloidy, rather than 0, in the case where all samples are missing. -- VCF/BCF Writers now create missing genotypes with the ploidy of other samples, or 2 if none are available at all. -- Updating integration tests for general ploidy, as previously we wrote ./. even when other calls were 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/1/1/1/1/1, but now we write ./././././././././././././././././././././././. (ugly but correct)	2012-09-12 07:08:03 -04:00
Mark DePristo	d1ba17df5d	Fixed nasty bug in BCF2 writer for case where all genotypes are missing -- Previous code was looking for a -1 result from maxPloidy() but the result was actually 0, so instead of writing a diploid no call we were actually writing "unavailable" genotypes, and failing the BCF == VCF test in integration tests. Fixed.	2012-09-12 06:46:27 -04:00
Mark DePristo	91f3204534	VCF/BCF writers once again automatically write out no-call genotypes for samples in the VCFHeader but not in the VC itself -- Turns out this was consuming 30% of the UG runtime, and causing problems elsewhere. -- Removed addMissingSamples from VariantContextUtils, and calls to it -- Updated VCF / BCF writers to automatically write out a diploid no call for missing samples -- Added unit tests for this behavior in VariantContextWritersUnitTest	2012-09-12 06:46:26 -04:00
Menachem Fromer	d3bdb9c67e	Choose queue based on assumed run time expectation	2012-09-12 03:36:57 -04:00
Menachem Fromer	5764f1037c	Added control of memory for matrix merging	2012-09-12 03:01:01 -04:00
Menachem Fromer	625fb25eca	Updated import	2012-09-12 02:17:24 -04:00
Menachem Fromer	2ea28499e2	Merge branch 'master' of ssh://gsa3.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-09-12 01:58:53 -04:00
Menachem Fromer	5cb08fd17c	Added XHMM option to outputTargetsBySamples	2012-09-12 01:58:04 -04:00
Christopher Hartl	5d19fca649	A couple of bug-fixy changes. 1) SelectVariants could throw a ReviewedStingException (one of the nasty "Bug:" ones) if the user requested a sample that wasn't present in the VCF. The walker now checks for this in the initialize() phase, and throws a more informative error if the situation is detected. If the user simply wants to subset the VCF to all the samples requested that are actually present in the VCF, the --ALLOW_NONOVERLAPPING_COMMAND_LINE_SAMPLES flag changes this UserException to a Warning, and does the appropriate subsetting. Added integration tests for this. 2) GenotypeLikelihoods has an unsafe method getLog10GQ(GenotypeType), which is completely broken for multi-allelic sites. I marked that method as deprecated, and added methods that use the context of the allele ordering (either directly specified or as a VC) to retrieve the appropriate GQ, and added a unit test to cover this case. VariantsToBinaryPed needs to dynamically calculate the GQ field sometimes (because I have some VCFs with PLs but no GQ).	2012-09-11 23:01:00 -04:00
Ryan Poplin	35d15278af	Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-09-11 14:34:17 -04:00
Ryan Poplin	c23b794904	I find these per-readgroup plots to be useful. Not sure why they were turned off by default.	2012-09-11 14:31:59 -04:00
Guillermo del Angel	0dd745bb9b	Merge branch 'master' of ssh://gsa4/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-09-11 11:01:41 -04:00
Guillermo del Angel	13831106d5	Fix GSA-535: storing likelihoods in allele map was busted when running HaplotypeCaller, only the last likelihood of a haplotype was being stored, as opposed to the max likelihood of all haplotypes mapping to an allele	2012-09-11 11:01:26 -04:00
David Roazen	6fad0f25bb	Merge Eric's LocusIteratorByStateUnitTest changes into LocusIteratorByStateExperimentalUnitTest	2012-09-11 10:47:09 -04:00
Mark DePristo	e25e617d1a	Fixes GSA-515 Nanoscheduler GSA-560 / Fix display of NanoScheduler and MonitoringEfficiency -- Now prints out a single combined NanoScheduler runtime profile report across all nano schedulers in use. So now if you run with -nt 4 you'll get one combined NanoScheduler profiler across all 4 instances of the NanoScheduler within TraverseXNano.	2012-09-11 07:38:34 -04:00
Mark DePristo	64ee0a10fe	Fix bad include in package.scala	2012-09-10 20:14:31 -04:00
Mark DePristo	d6e42d839c	Fixes GSA-558 GATK ReadShards don't handle unmapped reads correctly.	2012-09-10 20:14:14 -04:00
Mark DePristo	641c6a361e	Fix nasty memory leak in new data thread x cpu thread parallelism -- Basically you cannot safely use instance specific ThreadLocal variables, as these cannot be safely cleaned up. The old implementation kept pointers to old writers, with huge tribble block indexes, and eventually we crashed out of integration tests -- See http://weblogs.java.net/blog/jjviana/archive/2010/06/10/threadlocal-thread-pool-bad-idea-or-dealing-apparent-glassfish-memor for more information -- New implementation uses a borrow/return schedule with a list of N TraversalEngines managed by the MicroScheduler directly.	2012-09-10 20:14:14 -04:00
Mark DePristo	195cf6df7e	Attempting to fix out of memory errors with new traversal engine creator	2012-09-10 20:14:14 -04:00
Mark DePristo	f713d400e2	Fixed GSA-515 Nanoscheduler GSA-555 / Make NT and NCT work together -- Can now say -nt 4 and -nct 4 to get 16 threads running for you! -- TraversalEngines are now ThreadLocal variables in the MicroScheduler. -- Misc. code cleanup, final variables, some contracts.	2012-09-10 20:14:14 -04:00
Mark DePristo	233f70f8ba	Final cleanup of TraversalProgressMeters, moved to utils.progressmeter -- TraversalProgressMeter now completely generalized, named ProgressMeter in utils.progressmeter. Now just takes "nRecordsProcessed" as an argument to print reads. Completely removes dependence on complex data structures from TraversalProgressMeter. Can be used to measure progress on any task with processing units in genomic locations. -- A fairly simple class with no dependency on GATK engine or other features. -- Currently only used by the TraversalEngine / MicroScheduler but could be used for any purpose now, really.	2012-09-10 20:14:14 -04:00
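
The no-call rule described in the 2012-09-12 writer commits (bfbf1686cd, d1ba17df5d, 91f3204534) can be illustrated with a small sketch. This is not the GATK code: it is a minimal Python illustration that assumes genotypes are held as a plain dict of sample name to allele list, and the function names (`no_call`, `max_ploidy`, `fill_missing_samples`) are hypothetical. It shows only the stated behaviour: samples listed in the header but absent from the record get a no-call at the ploidy of the other samples, or 2 if no sample is available.

```python
def no_call(ploidy):
    """Build a no-call genotype string with one '.' per chromosome copy."""
    return "/".join(["."] * ploidy)


def max_ploidy(genotypes, default_ploidy=2):
    """Largest ploidy among the samples present, or default_ploidy if there are none.

    Returning a default (rather than 0 or -1) mirrors the fix described in the log.
    """
    return max((len(alleles) for alleles in genotypes.values()), default=default_ploidy)


def fill_missing_samples(header_samples, genotypes, default_ploidy=2):
    """Render one genotype string per header sample, adding no-calls where missing."""
    ploidy = max_ploidy(genotypes, default_ploidy)
    return {
        sample: "/".join(genotypes[sample]) if sample in genotypes else no_call(ploidy)
        for sample in header_samples
    }
```

With one tetraploid sample present, a missing sample is rendered with four no-call alleles; with no samples at all, the result falls back to a diploid no-call.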

10581 Commits (26e35e5ee2ab8c77d13ab47342cdecf73b617d8a)
