gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Joel Thibault	47e620dfbc	Create BAM index to test shard boundaries	2013-01-03 17:00:12 -05:00
Joel Thibault	dcb7735d3c	Active Region extensions must stay on contig	2013-01-02 14:46:24 -05:00
Joel Thibault	a15f368bdc	Re-enable testIsActiveRangeLow/High	2013-01-02 11:57:50 -05:00
Joel Thibault	429567cd3f	Rename to TraverseActiveRegionsUnitTest	2013-01-01 19:20:30 -05:00
Joel Thibault	57d38aac8a	Temporarily disable due to unknown contracts problem	2013-01-01 19:20:04 -05:00
Joel Thibault	7748b3816f	Delete the test BAI file as well as the BAM	2013-01-01 19:20:02 -05:00
Joel Thibault	5afeb465aa	TODOs	2013-01-01 19:19:17 -05:00
Joel Thibault	a29df3e094	oops	2012-12-18 19:03:12 -05:00
Joel Thibault	ee22c1bf44	More TODOs	2012-12-18 18:47:43 -05:00
Joel Thibault	2b1db519d7	Add reads which overstep a boundary by a single base	2012-12-18 18:47:43 -05:00
Joel Thibault	9828b2990f	Reads off the end of a contig fail SAM validation when using actual BAMs	2012-12-18 18:47:43 -05:00
Joel Thibault	72e2394b26	Create actual BAM	2012-12-18 18:47:43 -05:00
Joel Thibault	d69d1f8988	Fun with varargs	2012-12-18 18:47:42 -05:00
Joel Thibault	1158c1529f	Refactor region/read comparisons	2012-12-18 18:47:42 -05:00
David Roazen	46edab6d6a	Use the new downsampling implementation by default -Switch back to the old implementation, if needed, with --use_legacy_downsampler -LocusIteratorByStateExperimental becomes the new LocusIteratorByState, and the original LocusIteratorByState becomes LegacyLocusIteratorByState -Similarly, the ExperimentalReadShardBalancer becomes the new ReadShardBalancer, with the old one renamed to LegacyReadShardBalancer -Performance improvements: locus traversals used to be 20% slower in the new downsampling implementation, now they are roughly the same speed. -Tests show a very high level of concordance with UG calls from the previous implementation, with some new calls and edge cases that still require more examination. -With the new implementation, can now use -dcov with ReadWalkers to set a limit on the max # of reads per alignment start position per sample. Appropriate value for ReadWalker dcov may be in the single digits for some tools, but this too requires more investigation.	2012-12-10 09:44:50 -05:00
Eric Banks	574d5b467f	Bug fix for indel HMM: protect against situation where long reads (e.g. Sanger) in a pileup can lead to a read starting after the haplotype end for a given haplotype.	2012-12-09 02:09:34 -05:00
Joel Thibault	c76c808268	Reads are required to be sorted - Remove the extended_only case because it's outside intervals	2012-11-28 13:59:58 -05:00
Joel Thibault	198923b597	Add ActiveRegionReadState handling	2012-11-28 13:59:57 -05:00
Joel Thibault	9bfe39411e	Equal overlap should match right/later region	2012-11-27 13:03:13 -05:00
Joel Thibault	d83ad906ef	Add profile range contract	2012-11-27 13:03:13 -05:00
Joel Thibault	cc550b4145	Add a read and interval on a different contig	2012-11-27 13:03:13 -05:00
Joel Thibault	c68bc95db6	Initial read mapping tests - Failing tests are commented out	2012-11-21 17:16:46 -05:00
Joel Thibault	3ad9128800	Add some reads - Move intervals and reads to init - Update intervals and reads	2012-11-21 17:16:46 -05:00
Joel Thibault	3fa3b00f4a	Add ActiveRegion tests and refactor	2012-11-21 17:16:45 -05:00
Joel Thibault	e8defcb20d	Test multiple bases and intervals	2012-11-21 17:16:45 -05:00
Joel Thibault	c08b782743	Count isActive calls directly	2012-11-21 17:16:45 -05:00
Joel Thibault	b70fd4a242	Initial testing of the Active Region Traversal contract - TODO: many more tests and test cases	2012-11-15 10:08:00 -05:00
Mark DePristo	15b28e61cd	Retiring TraverseReads and TraverseLoci after testing confirms nano scheduler version in single threaded version is fine -- There's been no report of problems with the nano scheduled version of TraverseLoci and TraverseReads, so I'm removing the old versions since they are no longer needed -- Removing unnecessary intermediate base classes -- GSA-515 / Nanoscheduler GSA-549 / https://jira.broadinstitute.org/browse/GSA-549	2012-10-22 16:55:06 -04:00
Mark DePristo	2e94a0a201	Refactor TraversalEngine to extract the progress meter functions -- Previously these core progress metering functions were all in TraversalEngine, and available to subclasses like TraverseLoci via inheritance. The problem here is that the upcoming data threads x cpu threads parallelism requires one master copy of the progress metering shared among all traversals, but multiple instantiations of traverse engines themselves. -- Because the progress metering code was horrible anyway, I've refactored and vastly cleaned up and simplified all of these capabilities into the TraversalProgressMeter class. I've simplified down the classes it uses to work (STILL SOME TODOs in there) so that it doesn't reach into the core GATK engine all the time. It should be possible to write some nice tests for it now. By making it its own class, it can protect itself from multi-threaded access with a single synchronized printProgress function instead of carrying around multiple lock objects as before -- Cleaned up the start up of the progress meter. It's now handled when the meter is created, so each micro scheduler doesn't have to deal with proper initialization timing any longer -- Simplified and made clear the interface for shutting down the traversal engines. There's now a shutdown method in TraversalEngine that's called once by the MicroScheduler when the entire traversal is over. Nano traversals now properly shut down (was a subtle bug I uncovered here). The printing of on-traversal-done metering is now handled by MicroScheduler -- The MicroScheduler holds the single master copy of the progress meter, and doles it out to the TraversalEngines (currently 1 but in a future commit there will be N). -- Added a nice function to GenomeAnalysisEngine that returns the regions we will be processing, either the intervals requested or the whole genome. Useful for progress meter but also probably for other infrastructure as well -- Remove a lot of the sh*ting Bean interface getting and setting in MicroScheduler that's no longer useful. The generic bean is just a shell interface with nothing in it. -- By removing a lot of these bean accessors and setters many things are now final that used to be dynamic.	2012-09-10 20:14:13 -04:00
Mauricio Carneiro	116885a450	Removed the "Walker" suffix from all walkers that had it. * Did not touch archived walkers... those can be named whatever. * Kept abstract classes that end in Walker untouched (e.g. LocusWalker, ReadWalker, ...) * Renamed a few inner classes due to conflict when stripping off Walker from their outer classes: ContigStats, FlagStats and FastaStats.	2012-07-20 17:27:11 -04:00
Matt Hanna	8bb4d4dca3	First pass of the asynchronous block loader. Block loads are only triggered on queue empty at this point. Disabled by default (enable with nt:io=?).	2011-11-18 15:02:59 -05:00
Mark DePristo	9127849f5d	BugFix for unit test	2011-09-07 14:54:10 -04:00
David Roazen	3c9497788e	Reorganized the codebase beneath top-level public and private directories, removing the playground and oneoffprojects directories in the process. Updated build.xml accordingly.	2011-06-28 06:55:19 -04:00

33 Commits (ffbd4d85f2e0112b32df0bbba00330b00a0806cf)