gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Joel Thibault	a29df3e094	oops	2012-12-18 19:03:12 -05:00
Joel Thibault	ee22c1bf44	More TODOs	2012-12-18 18:47:43 -05:00
Joel Thibault	2b1db519d7	Add reads which overstep a boundary by a single base	2012-12-18 18:47:43 -05:00
Joel Thibault	9828b2990f	Reads off the end of a contig fail SAM validation when using actual BAMs	2012-12-18 18:47:43 -05:00
Joel Thibault	72e2394b26	Create actual BAM	2012-12-18 18:47:43 -05:00
Joel Thibault	d69d1f8988	Fun with varargs	2012-12-18 18:47:42 -05:00
Joel Thibault	1158c1529f	Refactor region/read comparisons	2012-12-18 18:47:42 -05:00
Yossi Farjoun	19dd2d628a	some changes. some changes.	2012-12-14 17:21:32 -05:00
Eric Banks	696bf95fba	Fix for PBT bug reported on the forum: the AD is actually output correctly now (rather than with 'null' or some gibberish memory pointer).	2012-12-13 23:28:30 +00:00
Ami Levy-Moonshine	2f99569dda	change the md5 in one of the CV intergration tests, since it wasn't use the priority list when printing the origin of the annotation (the setValue field)	2012-12-10 22:48:15 -05:00
David Roazen	46edab6d6a	Use the new downsampling implementation by default -Switch back to the old implementation, if needed, with --use_legacy_downsampler -LocusIteratorByStateExperimental becomes the new LocusIteratorByState, and the original LocusIteratorByState becomes LegacyLocusIteratorByState -Similarly, the ExperimentalReadShardBalancer becomes the new ReadShardBalancer, with the old one renamed to LegacyReadShardBalancer -Performance improvements: locus traversals used to be 20% slower in the new downsampling implementation, now they are roughly the same speed. -Tests show a very high level of concordance with UG calls from the previous implementation, with some new calls and edge cases that still require more examination. -With the new implementation, can now use -dcov with ReadWalkers to set a limit on the max # of reads per alignment start position per sample. Appropriate value for ReadWalker dcov may be in the single digits for some tools, but this too requires more investigation.	2012-12-10 09:44:50 -05:00
Eric Banks	574d5b467f	Bug fix for indel HMM: protect against situation where long reads (e.g. Sanger) in a pileup can lead to a read starting after the haplotype end for a given haplotype.	2012-12-09 02:09:34 -05:00
Mark DePristo	dbf721968d	PrintReads large-scale test to protect against another major low-level performance issue	2012-12-05 21:36:27 -05:00
Joel Thibault	c76c808268	Reads are required to be sorted - Remove the extended_only case because it's outside intervals	2012-11-28 13:59:58 -05:00
Joel Thibault	198923b597	Add ActiveRegionReadState handling	2012-11-28 13:59:57 -05:00
Joel Thibault	9bfe39411e	Equal overlap should match right/later region	2012-11-27 13:03:13 -05:00
Joel Thibault	d83ad906ef	Add profile range contract	2012-11-27 13:03:13 -05:00
Joel Thibault	cc550b4145	Add a read and interval on a different contig	2012-11-27 13:03:13 -05:00
Eric Banks	4f7fa3009a	I forget why I thought that the VariantAnnotator couldn't run multi-threaded because it works just fine. Now you can specify -nt with VA.	2012-11-26 11:34:59 -05:00
Joel Thibault	c68bc95db6	Initial read mapping tests - Failing tests are commented out	2012-11-21 17:16:46 -05:00
Joel Thibault	3ad9128800	Add some reads - Move intervals and reads to init - Update intervals and reads	2012-11-21 17:16:46 -05:00
Joel Thibault	3fa3b00f4a	Add ActiveRegion tests and refactor	2012-11-21 17:16:45 -05:00
Joel Thibault	e8defcb20d	Test multiple bases and intervals	2012-11-21 17:16:45 -05:00
Joel Thibault	c08b782743	Count isActive calls directly	2012-11-21 17:16:45 -05:00
Joel Thibault	b70fd4a242	Initial testing of the Active Region Traversal contract - TODO: many more tests and test cases	2012-11-15 10:08:00 -05:00
Eric Banks	e9183d9fe0	Fix bugs as reported on the forum: BED needs to be explicitly set as the default output format and the output didn't actually adhere to the BED spec.	2012-11-08 15:07:47 -05:00
David Roazen	6185e8c432	Allow large-scale tests 5 hours each to run	2012-11-01 17:48:58 -04:00
Eric Banks	47a0f5859e	Don't run these tests if not GAKT lite	2012-10-31 22:56:38 -04:00
Eric Banks	f8af8a2355	Moving UG integration tests to protected since they use protected-only contamination filtering. Adding a new UGLite integration test to confirm that contamination filtering is ignored in lite.	2012-10-31 21:28:07 -04:00
Eric Banks	2aa28abe0a	Fixing md5s to reflect the new HapMap file	2012-10-30 14:27:10 -04:00
Eric Banks	b6a1967f12	Better documentation for ValidateVariants so that people realize it's used for strict validation of the VCF file. Added an option to turn off strict validation and an integration test to cover it.	2012-10-29 21:47:09 -04:00
Eric Banks	43625f652e	Shoot, mixed up the md5s last time.	2012-10-27 19:43:46 -04:00
Eric Banks	682a72faf7	Hmm, thought I got all the md5s last time. Apparently not.	2012-10-26 16:10:12 -04:00
Mark DePristo	251983b8fb	Add GATK-wide command line argument to control the maximum runtime allowed for the GATK -- Providing this optional argument -maxRuntime (in -maxRuntimeUnits units) causes the GATK to exit gracefully when the max. runtime has been exceeded. By cleanly I mean that the engine simply stops at the next available cycle in the walker as through the end of processing had been reached. This means that all output files are closed properly, etc. -- Emits an info message that looks like "INFO 10:36:52,723 MicroScheduler - Aborting execution (cleanly) because the runtime has exceeded the requested maximum 10.0000 s". Otherwise there's currently no way to differentiate a truly completed run from a timelimit exceeded run, which may be a useful thing for a future update -- Resolves GSA-630 / GATK max runtime to deal with bad LSA calling? -- Added new JIRA entry for Ami to restart chr1 macarthur with this argument set to -maxRuntime 1 -maxRuntimeUnits DAYS to see if we can do all of chr1 in one weekend.	2012-10-26 13:18:34 -04:00
Eric Banks	ed11b7dab2	Fix UG parallelization test	2012-10-26 12:10:44 -04:00
Eric Banks	7a706ed345	Fix some of the broken integration tests	2012-10-26 11:23:44 -04:00
Eric Banks	ebebec7fdb	Accidentally left one test disabled	2012-10-26 02:15:32 -04:00
Eric Banks	a53e03d525	Do not let reduced reads get removed in the contamination down-sampling	2012-10-26 02:13:04 -04:00
Eric Banks	bf3d61ce82	The default value for --contamination_fraction_to_filter is now 0.05 (5%) in both UG and HC. Users of GATK-lite get pushed down to 0% by default (since it's not enabled) or get a user error if they try to set it.	2012-10-26 01:04:51 -04:00
Eric Banks	91f2c847a3	Fixing problem reported on forum for VF: DP couldn't be filtered from the FORMAT field, only from the INFO field. Fixed and added integration test.	2012-10-26 00:57:40 -04:00
Eric Banks	e93ff3ea6e	Let's go back to having the SB/SLOD NOT computed by default. If you recall, it was only enabled by default because we thought we were going to use it when we made VQSR use random forests. But since we decided not to change VQSR, there's no reason to triple the computation for every variant site anymore.	2012-10-25 12:45:23 -04:00
Eric Banks	c53c55da12	Re-enable tests	2012-10-25 09:37:08 -04:00
Eric Banks	e6652f7777	Added integration test for contamination down-sampling	2012-10-25 09:36:05 -04:00
Mark DePristo	f838815343	Updating MD5s for confidence ref site estimation in IndependentAllelesDiploidExactAFCalc -- Included logic to only add priors for alleles with sufficient evidence to be called polymorphic. If no alleles are poly make sure to add priors of first allele	2012-10-23 06:47:53 -04:00
Mark DePristo	15b28e61cd	Retiring TraverseReads and TraverseLoci after testing confirms nano scheduler version in single threaded version is fine -- There's been no report of problems with the nano scheduled version of TraverseLoci and TraverseReads, so I'm removing the old versions since they are no longer needed -- Removing unnecessary intermediate base classes -- GSA-515 / Nanoscheduler GSA-549 / https://jira.broadinstitute.org/browse/GSA-549	2012-10-22 16:55:06 -04:00
Mark DePristo	90f59803fd	MaxAltAlleles now defaults to 6, no more MaxAltAllelesForIndels -- Updated StandardCallerArgumentCollection to remove MaxAltAllelesForIndels. Previous argument is deprecated with meaningful doc message for people to use maxAltAlleles -- All constructores, factory methods, and test builders and their users updated to provide just a single argument -- Updating MD5s for integration tests that change due to genotyping more alleles -- Adding more alleles to genotyping results in slight changes in the QUAL value for multi-allelic loci where one or more alleles aren't polymorphic. That's simply due to the way that alternative hypotheses contribute as reference evidence against each true allele. The effect can be large (new qual = old qual / 2 in one case here). -- If we want more precision in our estimates we could decide (Eric, should we discuss?) to actually separately do a discovery phase in the genotyping, eliminate all variants not considered polymorphic, and then do a final round of calling to get the exact QUAL value for only those that are segregating. This would have the value of having the QUAL stay constant as more alleles are genotyped, at the cost of some code complexity increase and runtime. Might be worth it through	2012-10-22 13:47:56 -04:00
Khalid Shakir	97dc3664c9	Fixed yet another NPE related to the ArgumentTypeDescriptor vs. ArgumentMatchValue. Added integration test based on GSA-621.	2012-10-22 12:05:32 -04:00
Mark DePristo	d21e42608a	Updating integration tests for minor changes due to switching to EXACT_INDEPENDENT model by default	2012-10-21 12:43:46 -04:00
Mark DePristo	6b6caf8e3a	Bugfix for indel DP calculations using reduced reads -- Adding tests for SNP and indel calling on reduced BAM	2012-10-21 12:42:32 -04:00
Ryan Poplin	b4e69239dd	In order to be considered an informative read in the PerReadAlleleLikelihoodMap it has to be informative compared to all other alleles not just the worst allele. Also, fixing a bug when there is only one allele in the map.	2012-10-18 14:31:15 -04:00

1 2 3 4 5 ...

707 Commits (aa39037be8c00a0dfdbe92d4f15951b1cfb81fba)