Commit Graph

33 Commits (5f7564bf0ae441e2a689f42772b710ac63d2b115)

Author SHA1 Message Date
hanna a7ba88e649 Rework the way the MicroScheduler handles locus shards to handle intervals that span shards
with less memory consumption.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2981 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-11 18:40:31 +00:00
hanna 02f48b6457 Fix bug that's been in the GATK for a very long time: update nReads (as well
as nRecords), so that INFO logging doesn't say 'skipped 0 of 0 reads'.  While
I'm in there, update TraversalStatistics to store longs.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2959 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-08 22:44:54 +00:00
aaron 790d2a7776 adding the initial ROD for Reads support; more convenience methods in ReadMetaDataTracker to come.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2918 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-03 15:56:44 +00:00
hanna 199b43fcf2 Reduce by interval alterations to interface with new sharding system. This checkin will be followed by a
simplification of some of the locus traversal code.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2886 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-25 00:16:50 +00:00
hanna b19bb19f3d First successful test of new sharding system prototype. Can traverse over reads from a single
BAM file.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2587 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-15 03:35:55 +00:00
hanna 05deb8796b Simplify handling of reference sequence for unmapped reads. Improvement made based on a suggestion from Alec.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2475 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-29 21:06:20 +00:00
aaron c3c001e02e cleanup of the traversal output code
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2026 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-12 06:18:10 +00:00
aaron 2ed423ed56 print the current location in read walkers (in addition to the number of reads processed), along with some refactoring to support the change.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2006 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-10 05:57:01 +00:00
aaron d21b582b18 memory leak, where the Resource Pool was releasing based on the value and not the key, resulting in the resourceAssignments map growing with each additional shard
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1880 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-20 00:39:42 +00:00
hanna 21d1eba502 Cleaned division of responsibilities between arguments to map function. Reference has been changed
from an array of bases to an object (ReferenceContext), and LocusContext has been renamed to reflect
the fact that it contains contextual information only about the alignments, not the locus in general.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1376 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-04 21:01:37 +00:00
aaron d86717db93 Refactoring of the traversal engine base class, I removed a lot of old code.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1209 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-09 21:57:00 +00:00
hanna 5735c87581 Basic infrastructure for filtering malformed reads.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1178 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-06 22:50:22 +00:00
ebanks e5e249d4ac temporary fix to deal with screwy SOLiD reads
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1168 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-05 03:25:57 +00:00
aaron bcb64d92e9 Aaron: 1, GenomeLoc: 0. I changed our GenomeLoc class, separating the creation of a genome loc (with the reference setup) to a parser class. GenomeLoc now just represents the actual genomic position. The constructors are now package-protected (to enforce using the parser), but we may want to expose some constructors in the future.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1069 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-22 14:39:41 +00:00
depristo 3c40db260d Added REFERENCE_BASES required annotation for performance
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1047 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-18 21:03:57 +00:00
aaron 7db4497013 fixing the readTraversal output
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1019 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-16 19:44:38 +00:00
aaron 63b5c12cbd Changed dataSources to datasources, to be consistent with the rest of our package names. Also, this makes me champion in the largest check-in contest.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@985 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-11 18:13:22 +00:00
aaron a62bc6b05d fixed some documentation and attached a correct license
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@953 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-09 14:44:27 +00:00
aaron 3c3cd5bb64 Moving some of the data sharding around. A new shard category now exists, INTERVAL. This saved a lot of code that was mirroring the same approach in both the read and locus shard strategies.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@840 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-27 18:24:31 +00:00
hanna 2a5be1debe Cleanup in datasources.providers namespace. Make it easier for others writing traversal engines to use.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@803 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-22 19:12:00 +00:00
hanna 2c4de7b5c5 Switch TraverseByLoci over to new sharding system, and cleanup some code in passing read files along
the pathway from command line to traversal engine.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@727 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-15 21:02:12 +00:00
aaron d8c1b010f1 Fixing the naming of the function I checked in earlier.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@713 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-14 19:27:10 +00:00
aaron 7aa90757ac Moved the iterators over to the StingSAMIterator interface. This will help us ensure that iterators that need to be closed get closed.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@702 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-14 16:52:18 +00:00
hanna 608948210c Check for a reference before extraction.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@661 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 13:29:44 +00:00
hanna 23e9e29964 Changed reads traversals from providing a LocusContext from which the reference sequence
could be extracted to a char[] containing the reference bases.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@657 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 22:45:11 +00:00
aaron e8b8ab5985 Added code to extend Matt's getReferenceBases out to the read walkers, so they can see the corresponding reference for each read.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@652 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 03:42:38 +00:00
hanna 6e394490cb Cleanup in preparation for ByLoci traversal. Also did some work minimizing unit tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@643 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 21:27:54 +00:00
hanna 483a58627b More cleanup -- pushing shared functions down into the traversal engine.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@639 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 14:12:45 +00:00
aaron 5136724884 Added code to the schedulers, one step closer to turning on the new reads traversals
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@613 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 22:36:25 +00:00
aaron f5880109a7 Added TraverseReads test, some bug fixes discovered in the traversal test
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@594 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-05 20:36:00 +00:00
aaron 63403d32cd Changes to the interface to the simple data source rippled out to a bunch of files.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@572 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-30 20:35:56 +00:00
aaron d4de68e260 added changes for the readsTraversal to accommodate design changes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@553 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-28 19:49:58 +00:00
aaron 395aaf48b0 Added the new by reads traversal, still needs to be sewn into the micromanager code.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@551 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-28 17:55:08 +00:00