gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Eric Banks	e93ff3ea6e	Let's go back to having the SB/SLOD NOT computed by default. If you recall, it was only enabled by default because we thought we were going to use it when we made VQSR use random forests. But since we decided not to change VQSR, there's no reason to triple the computation for every variant site anymore.	2012-10-25 12:45:23 -04:00
Guillermo del Angel	69f2f1ef29	Merge fix	2012-10-25 12:08:49 -04:00
Guillermo del Angel	2e8366d2d8	Bug fixes in setting @outputs, fix to make IndelRealigner scatter-gather correctly	2012-10-25 12:07:35 -04:00
Guillermo del Angel	a838653822	Merge branch 'master' of ssh://gsa3/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-10-25 10:35:58 -04:00
Guillermo del Angel	596c1723ae	Hidden, unsupported ability of VariantEval to run AlleleCount stratification on sites-only VCFs. I'll expose it/add tests on it if people think this is generally useful. User needs to specify total # of samples as command line argument since genotypes are not available. Also, fixes to large-scale validation script: lower -minIndelFrac threshold or else we'll kill most indels since default 0.25 is too high for pools, fix also VE stratifications and add one VE run where eval=1KG, comp=pool data and AC stratification based on 1KG annotation	2012-10-25 10:35:43 -04:00
Eric Banks	6dc7d872ec	Fix GenotypeAndValidate to handle SNPs and indels as reported on the forum. Recent changes to the UnifiedArgumentCollection made this stop working. Adding in JIRA to create integration tests for this tool.	2012-10-25 10:06:13 -04:00
Eric Banks	c53c55da12	Re-enable tests	2012-10-25 09:37:08 -04:00
Eric Banks	e6652f7777	Added integration test for contamination down-sampling	2012-10-25 09:36:05 -04:00
Kristian Cibulskis	3db9703a65	Merge branch 'develop' of github.com:broadinstitute/cmi-gatk into develop	2012-10-25 09:21:08 -04:00
Kristian Cibulskis	1788e596c7	DEV-14 #resolve #time 3m	2012-10-25 09:20:48 -04:00
Eric Banks	df9e0b7045	Merge branch 'master' of ssh://gsa2/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-10-25 02:49:54 -04:00
Eric Banks	72714ee43e	Minor patches to get the contamination down-sampling working for indels. Adding @Hidden logging output for easy debugging.	2012-10-25 02:47:42 -04:00
Eric Banks	c6b57fffda	Added allele biased down-sampling capabilities to the PerReadAlleleLikelihoodMap object, which means that both the UG and HC can use this functionality. Note that it's only available in protected, so GATK-lite users won't be allowed to enable it. Needs more testing.	2012-10-24 22:52:25 -04:00
Scott Frazer	327d6f98d7	fixed duplicate argument names	2012-10-24 22:18:18 -04:00
Ami Levy Moonshine	bcf3582095	Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-10-24 21:50:41 -04:00
Guillermo del Angel	660fc120a3	QC fixes in fastq-bam: a) Connect picard outputs with @Output b) Do QC by default and correct picard path in instances. c) EXPERIMENTAL: bump up numThreads to 8 by default and see what happens DEV-155 #resolve #time 1m	2012-10-24 21:37:25 -04:00
David Roazen	2d9e2e6b8e	Delete ExperimentalNestedIntegerArray Forgot to delete this in my last push. This class was only used for profiling purposes to try out different ideas and is no longer needed.	2012-10-24 19:59:46 -04:00
Eric Banks	9da7bbf689	Refactoring the PerReadAlleleLikelihoodMap in preparation for adding contamination downsampling into protected only.	2012-10-24 15:49:07 -04:00
David Roazen	d9aa9855f8	Better comments in NestedIntegerArray	2012-10-24 15:29:13 -04:00
David Roazen	02018ca764	Legacy BaseRecalibrator walker is neither TreeReducible nor NanoSchedulable The old BaseRecalibrator walker is not and never will be thread-safe, since it's a LocusWalker that uses read attributes to track state. ONLY the newer DelocalizedBaseRecalibrator is believed likely to be thread-safe at this point. It is safe to run the DelocalizedBaseRecalibrator with -nct > 1 for testing purposes, but wait for further testing to be done before using it for production purposes in multithreaded mode.	2012-10-24 15:22:50 -04:00
David Roazen	32a6d7000a	Thread-safe ReadGroupCovariate The ReadGroupCovariate class was not thread-safe. This led to horrible race conditions in multithreaded runs of the BQSR where (for example) the same read group could get inserted into the reverse lookup table twice with different IDs. Should fix the intermittent crash reported in GSA-492.	2012-10-24 15:22:50 -04:00
David Roazen	991658acf4	BQSR: use more granular locking for concurrency control -With this change, BQSR performance scales properly by thread rather than gaining nothing from additional threads. -Benefits are seen when using either -nt (HierarchicalMicroScheduler) or -nct (NanoScheduler) -Removes high-level locks in the recalibration engines and NestedIntegerArray in favor of maximally-granular locks on and around manipulation of the leaf nodes of the NestedIntegerArray. -NestedIntegerArray now creates all interior nodes upfront rather than on the fly to avoid the need for locking during tree traversals. This uses more memory in the initial part of BQSR runs, but the BQSR would eventually converge to use this memory anyway over the course of a typical run. IMPORTANT NOTE: This does not mean it's safe to run the old BaseRecalibrator walker with multiple threads. The BaseRecalibrator walker is not and never will be thread-safe, as it's a LocusWalker that uses read attributes to track state information. ONLY the newer DelocalizedBaseRecalibrator can be made thread-safe (and will hopefully be made so in my subsequent commits). This commit addresses performance, not correctness.	2012-10-24 15:22:50 -04:00
Ryan Poplin	a27ee26481	updating HC integration test.	2012-10-24 14:08:39 -04:00
Ryan Poplin	094db7bf24	We now require at least 10 samples to merge variants into complex events in the HC. Added a new population based bam for the complex event integration test.	2012-10-24 14:07:36 -04:00
Eric Banks	5b7b42356b	Fix bug in GenotypeAndValidate where it doesn't check vc.hasAttribute() before using vc.getAttribute().	2012-10-24 14:02:50 -04:00
kshakir	72d92c1d1e	Fixed typo.	2012-10-24 12:10:52 -04:00
kshakir	53ab02d1fc	Added @Input and @Output indexes for bam/vcf.	2012-10-24 11:24:46 -04:00
Mark DePristo	6e421a72d6	Add more exhaustive unit tests for input errors to NanoScheduler -- Resolves issue GSA-515 / Nanoscheduler GSA-605 / Seems that -nct may deadlock as not reproducible -- It seems that it's not an input error problem (or at least cannot be provoked with unit tests) -- I'll keep an eye on this later	2012-10-23 20:11:29 -04:00
kshakir	8dfa24df7b	Sending a version of per job status messages. In addition to outputs, inputs are passed to QStatusMessenger.done() CloneFunction.cloneIndex has a new CloneFunction.cloneCount companion useful for display purposes.	2012-10-23 15:55:47 -04:00
Guillermo del Angel	5fac5bf12e	Fixed issues with Queue packaging of Picard QC classes: separate jars are needed from Picard. User needs to specify the -picardBase argument to point to input path for jars. > Also, reenable joint cleaning as now it works. > DEV-125 #resolve > DEV-90 #resolve	2012-10-23 14:08:31 -04:00
Kristian Cibulskis	41b5f25bf3	Merge branch 'develop' of github.com:broadinstitute/cmi-gatk into develop	2012-10-23 13:34:43 -04:00
Kristian Cibulskis	9d451b5154	DEV-116 #resolve #time 3m	2012-10-23 13:34:11 -04:00
kshakir	0cce1ae8b2	When gathering VCFs, using CombineVariants from the current classpath, and not the GATK used to run the command. This was a concern for external modules that bundled the engine but not CombineVariants.	2012-10-23 12:44:06 -04:00
Mauricio Carneiro	4cd1a92358	Updating RR integration tests Forgot to update the integration tests after merging DEV-117 with optimizations from GATK main repo.	2012-10-23 11:26:26 -04:00
Mauricio Carneiro	c210b7cde4	Merge GATK repo into CMI-GATK Bringing in the following relevant changes: * Fixes the indel realigner N-Way out null pointer exception DEV-10 * Optimizations to ReduceReads that bring the run time to 1/3rd. Conflicts: protected/java/src/org/broadinstitute/sting/gatk/walkers/compression/reducereads/SlidingWindow.java DEV-10 #resolve #time 2m	2012-10-23 10:59:11 -04:00
Mauricio Carneiro	bbf7a0fb09	Adding integration test to ReduceReads coreduction DEV-117 #resolve	2012-10-23 10:56:33 -04:00
Menachem Fromer	1ec137a40c	Add support for calculating DoC with flanking intervals added	2012-10-23 10:41:07 -04:00
Guillermo del Angel	118da1cd1c	More test data for fastq-bam: new short.interval_list with correct header based on reference dict. b) New interval/bait sets with corrected header for decoy contig in b37 c) New metadata file for test fastqs d) Updated test fastqs as previous version was generated with bad intervals	2012-10-23 09:24:09 -04:00
Mark DePristo	f838815343	Updating MD5s for confidence ref site estimation in IndependentAllelesDiploidExactAFCalc -- Included logic to only add priors for alleles with sufficient evidence to be called polymorphic. If no alleles are poly make sure to add priors of first allele	2012-10-23 06:47:53 -04:00
Guillermo del Angel	7860ff7981	a) Resolve [#DEV-56] - test data with indels in new directory private/testdata/CMITestData/. b) Skeleton (not yet working) of fastq-BAM unit test, c) misc bug fixes for QC functions to work (not done yet)	2012-10-22 19:59:15 -04:00
Mark DePristo	15b28e61cd	Retiring TraverseReads and TraverseLoci after testing confirms nano scheduler version in single threaded version is fine -- There's been no report of problems with the nano scheduled version of TraverseLoci and TraverseReads, so I'm removing the old versions since they are no longer needed -- Removing unnecessary intermediate base classes -- GSA-515 / Nanoscheduler GSA-549 / https://jira.broadinstitute.org/browse/GSA-549	2012-10-22 16:55:06 -04:00
Khalid Shakir	fd59e7d5f6	Better error message when generic types are erased from scala collections.	2012-10-22 16:27:31 -04:00
Ryan Poplin	008df54575	Bug fix in GATKSAMRecord.getSoftEnd() for reads that are entirely clipped.	2012-10-22 14:21:52 -04:00
Mark DePristo	90f59803fd	MaxAltAlleles now defaults to 6, no more MaxAltAllelesForIndels -- Updated StandardCallerArgumentCollection to remove MaxAltAllelesForIndels. Previous argument is deprecated with meaningful doc message for people to use maxAltAlleles -- All constructors, factory methods, and test builders and their users updated to provide just a single argument -- Updating MD5s for integration tests that change due to genotyping more alleles -- Adding more alleles to genotyping results in slight changes in the QUAL value for multi-allelic loci where one or more alleles aren't polymorphic. That's simply due to the way that alternative hypotheses contribute as reference evidence against each true allele. The effect can be large (new qual = old qual / 2 in one case here). -- If we want more precision in our estimates we could decide (Eric, should we discuss?) to actually separately do a discovery phase in the genotyping, eliminate all variants not considered polymorphic, and then do a final round of calling to get the exact QUAL value for only those that are segregating. This would have the value of having the QUAL stay constant as more alleles are genotyped, at the cost of some code complexity increase and runtime. Might be worth it though	2012-10-22 13:47:56 -04:00
Khalid Shakir	97dc3664c9	Fixed yet another NPE related to the ArgumentTypeDescriptor vs. ArgumentMatchValue. Added integration test based on GSA-621.	2012-10-22 12:05:32 -04:00
Eric Banks	ccae6a5b92	Fixed the RR bug I (knowingly) introduced last week: turns out we can't trust a context size's worth of data from the previous marking. I think Mauricio warned me about this but I forgot.	2012-10-22 11:48:34 -04:00
Ami Levy Moonshine	1da4ad9607	new class to test the quals of reduced bam vs non-reduced bam	2012-10-22 10:45:53 -04:00
Mark DePristo	9f2851d769	Updating UnifiedGenotyperGeneralPloidyIntegrationTest following rebasing -- Created a JIRA ticket https://jira.broadinstitute.org/browse/GSA-623 for Guillermo to look at the differences as the multi-allelic nature of many sites seems to change with the new more protected infrastructure. This may be due to implementation issues in the pooled caller, problems with my interface, or could be a genuine improvement.	2012-10-21 20:23:11 -04:00
Mark DePristo	eb6c9a1a79	Disable EfficiencyMonitoringThreadFactoryUnitTest -- This is no longer a core GATK activity, and the tests need to run for so long (2 min each) that it's just too painful to run them. Should be re-enabled if we come to care about this capability again, or if we can run these tests all in parallel in the future.	2012-10-21 12:43:46 -04:00
Mark DePristo	5296de8251	Fix UnifiedArgumentCollection constructor logic error -- The old way of overloading constructors and calling super didn't work (might have been a consequence of merge). This is the right way to do the copy constructor with the call to super()	2012-10-21 12:43:46 -04:00

11158 Commits (4ced2e4ffc7d457cb9a8aad4c4aa2cb3cd3fb705)