Commit Graph

33 Commits (ffbd4d85f2e0112b32df0bbba00330b00a0806cf)

Author SHA1 Message Date
Joel Thibault 47e620dfbc Create BAM index to test shard boundaries 2013-01-03 17:00:12 -05:00
Joel Thibault dcb7735d3c Active Region extensions must stay on contig 2013-01-02 14:46:24 -05:00
Joel Thibault a15f368bdc Re-enable testIsActiveRangeLow/High 2013-01-02 11:57:50 -05:00
Joel Thibault 429567cd3f Rename to TraverseActiveRegionsUnitTest 2013-01-01 19:20:30 -05:00
Joel Thibault 57d38aac8a Temporarily disable due to unknown contracts problem 2013-01-01 19:20:04 -05:00
Joel Thibault 7748b3816f Delete the test BAI file as well as the BAM 2013-01-01 19:20:02 -05:00
Joel Thibault 5afeb465aa TODOs 2013-01-01 19:19:17 -05:00
Joel Thibault a29df3e094 oops 2012-12-18 19:03:12 -05:00
Joel Thibault ee22c1bf44 More TODOs 2012-12-18 18:47:43 -05:00
Joel Thibault 2b1db519d7 Add reads which overstep a boundary by a single base 2012-12-18 18:47:43 -05:00
Joel Thibault 9828b2990f Reads off the end of a contig fail SAM validation when using actual BAMs 2012-12-18 18:47:43 -05:00
Joel Thibault 72e2394b26 Create actual BAM 2012-12-18 18:47:43 -05:00
Joel Thibault d69d1f8988 Fun with varargs 2012-12-18 18:47:42 -05:00
Joel Thibault 1158c1529f Refactor region/read comparisons 2012-12-18 18:47:42 -05:00
David Roazen 46edab6d6a Use the new downsampling implementation by default
-Switch back to the old implementation, if needed, with --use_legacy_downsampler

-LocusIteratorByStateExperimental becomes the new LocusIteratorByState, and
the original LocusIteratorByState becomes LegacyLocusIteratorByState

-Similarly, the ExperimentalReadShardBalancer becomes the new ReadShardBalancer,
with the old one renamed to LegacyReadShardBalancer

-Performance improvements: locus traversals used to be 20% slower in the new
downsampling implementation, now they are roughly the same speed.

-Tests show a very high level of concordance with UG calls from the previous
implementation, with some new calls and edge cases that still require more examination.

-With the new implementation, can now use -dcov with ReadWalkers to set a limit
on the max # of reads per alignment start position per sample. Appropriate value
for ReadWalker dcov may be in the single digits for some tools, but this too
requires more investigation.
2012-12-10 09:44:50 -05:00
Eric Banks 574d5b467f Bug fix for indel HMM: protect against situation where long reads (e.g. Sanger) in a pileup can lead to a read starting after the haplotype end for a given haplotype. 2012-12-09 02:09:34 -05:00
Joel Thibault c76c808268 Reads are required to be sorted
- Remove the extended_only case because it's outside intervals
2012-11-28 13:59:58 -05:00
Joel Thibault 198923b597 Add ActiveRegionReadState handling 2012-11-28 13:59:57 -05:00
Joel Thibault 9bfe39411e Equal overlap should match right/later region 2012-11-27 13:03:13 -05:00
Joel Thibault d83ad906ef Add profile range contract 2012-11-27 13:03:13 -05:00
Joel Thibault cc550b4145 Add a read and interval on a different contig 2012-11-27 13:03:13 -05:00
Joel Thibault c68bc95db6 Initial read mapping tests
- Failing tests are commented out
2012-11-21 17:16:46 -05:00
Joel Thibault 3ad9128800 Add some reads
- Move intervals and reads to init
- Update intervals and reads
2012-11-21 17:16:46 -05:00
Joel Thibault 3fa3b00f4a Add ActiveRegion tests and refactor 2012-11-21 17:16:45 -05:00
Joel Thibault e8defcb20d Test multiple bases and intervals 2012-11-21 17:16:45 -05:00
Joel Thibault c08b782743 Count isActive calls directly 2012-11-21 17:16:45 -05:00
Joel Thibault b70fd4a242 Initial testing of the Active Region Traversal contract
- TODO: many more tests and test cases
2012-11-15 10:08:00 -05:00
Mark DePristo 15b28e61cd Retiring TraverseReads and TraverseLoci after testing confirms nano scheduler version in single threaded version is fine
-- There's been no report of problems with the nano scheduled version of TraverseLoci and TraverseReads, so I'm removing the old versions since they are no longer needed
-- Removing unnecessary intermediate base classes
-- GSA-515 / Nanoscheduler GSA-549 / https://jira.broadinstitute.org/browse/GSA-549
2012-10-22 16:55:06 -04:00
Mark DePristo 2e94a0a201 Refactor TraversalEngine to extract the progress meter functions
-- Previously these core progress metering functions were all in TraversalEngine, and available to subclasses like TraverseLoci via inheritance.  The problem here is that the upcoming data threads x cpu threads parallelism requires one master copy of the progress metering shared among all traversals, but multiple instantiations of traverse engines themselves.
-- Because the progress metering code was horrible anyway, I've refactored and vastly cleaned up and simplified all of these capabilities into a TraversalProgressMeter class.  I've simplified down the classes it uses to work (STILL SOME TODOs in there) so that it doesn't reach into the core GATK engine all the time.  It should be possible to write some nice tests for it now.  By making it its own class, it can protect itself from multi-threaded access with a single synchronized printProgress function instead of carrying around multiple lock objects as before
-- Cleaned up the start up of the progress meter.  It's now handled when the meter is created, so each micro scheduler doesn't have to deal with proper initialization timing any longer
-- Simplified and made clear the interface for shutting down the traversal engines.  There's now a shutdown method in TraversalEngine that's called once by the MicroScheduler when the entire traversal is over.  Nano traversals now properly shut down (a subtle bug I uncovered here).  The printing of on traversal done metering is now handled by MicroScheduler
-- The MicroScheduler holds the single master copy of the progress meter, and doles it out to the TraversalEngines (currently 1 but in future commit there will be N).
-- Added a nice function to GenomeAnalysisEngine that returns the regions we will be processing, either the intervals requested or the whole genome.  Useful for progress meter but also probably for other infrastructure as well
-- Remove a lot of the sh*ting Bean interface getting and setting in MicroScheduler that's no longer useful.  The generic bean is just a shell interface with nothing in it.
-- By removing a lot of these bean accessors and setters many things are now final that used to be dynamic.
2012-09-10 20:14:13 -04:00
Mauricio Carneiro 116885a450 Removed the "Walker" suffix from all walkers that had it.
* Did not touch archived walkers... those can be named whatever.
   * Kept abstract classes that end in Walker untouched (e.g. LocusWalker, ReadWalker, ...)
   * Renamed a few inner classes due to conflict when stripping off Walker from their outer classes: ContigStats, FlagStats and FastaStats.
2012-07-20 17:27:11 -04:00
Matt Hanna 8bb4d4dca3 First pass of the asynchronous block loader.
Block loads are only triggered on queue empty at this point.  Disabled by
default (enable with nt:io=?).
2011-11-18 15:02:59 -05:00
Mark DePristo 9127849f5d BugFix for unit test 2011-09-07 14:54:10 -04:00
David Roazen 3c9497788e Reorganized the codebase beneath top-level public and private directories,
removing the playground and oneoffprojects directories in the process. Updated
build.xml accordingly.
2011-06-28 06:55:19 -04:00