gatk-3.8

Commit Graph

Author	SHA1	Message	Date
David Roazen	f57256b6c2	Delete unused FastaSequenceIndexBuilder class and accompanying test. This class, being unused, was no longer getting packaged into the GATK release jar by bcel, and so attempting to run its unit test on the release jar was producing an error.	2013-05-01 01:02:01 -04:00
Mauricio Carneiro	2a4ccfe6fd	Updated all JAVA file licenses accordingly GSATDG-5	2013-01-10 17:06:41 -05:00
David Roazen	133085469f	Experimental, downsampler-friendly read shard balancer. - Only used when experimental downsampling is enabled. - Persists read iterators across shards, creating a new set only when we've exhausted the current BAM file region(s). This prevents the engine from revisiting regions discarded by the downsamplers / filters, as could happen in the old implementation. - SAMDataSource no longer tracks low-level file positions in experimental mode. Can strip out all related code when the engine fork is collapsed. - Defensive implementation that assumes BAM file regions coming out of the BAM Schedule can overlap; should be able to improve performance if we can prove they cannot possibly overlap. - Tests a bit on the extreme side (~8 minute runtime) for now; will scale these back once confidence in the code is gained.	2012-09-21 22:17:58 -04:00
Eric Banks	1acf0f0b2c	Fixing bug in fasta .fai generation: trim the contig names to the first whitespace if one appears. We now generate indexes identical to samtools.	2012-08-29 22:36:27 -04:00
Eric Banks	3253fc216b	FindBugs 'Maintainability' fixes	2012-08-16 15:53:06 -04:00
David Roazen	85d31f80a2	Merged bug fix from Stable into Unstable	2012-02-13 16:37:11 -05:00
David Roazen	03e5184741	Fix serious engine bug that could cause reads to be dropped under certain circumstances. When aggregating raw BAM file spans into shards, the IntervalSharder tries to combine file spans when it can. Unfortunately, the method that combines two BAM file spans was seriously flawed, and would produce a truncated union if the file spans overlapped in certain ways. This could cause entire regions of the BAM file containing reads within the requested intervals to be dropped. Modified GATKBAMFileSpan.union() to correct this problem, and added unit tests to verify that the correct union is produced regardless of how the file spans happen to overlap. Thanks to Khalid, who did at least as much work on this bug as I did.	2012-02-13 16:25:21 -05:00
Matt Hanna	5b58fe741a	Retiring Picard customizations for async I/O and cleaning up parts of the code to use common Picard utilities I recently discovered. Also embedded bug fix for issues reading sparse shards and did some cleanup based on comments during BAM reading code transition meetings.	2012-02-08 08:34:37 -05:00
Matt Hanna	e923a2e512	Revving Picard to incorporate final version of ReadWalker performance improvements.	2012-01-10 12:12:33 -05:00
Matt Hanna	3642a73c07	Performance improvements for dynamically merging BAMs in read walkers. This change and my previous change have dropped runtime when dynamically merging 2k BAM files from 72.6min/1M reads to 46.8sec/1M reads. Note that many of these changes are stopgaps -- the real problem is the way ReadWalkers interface with Picard, and I'll have to work with Tim&Co to produce a more maintainable patch.	2011-12-16 09:37:44 -05:00
Matt Hanna	8bb4d4dca3	First pass of the asynchronous block loader. Block loads are only triggered on queue empty at this point. Disabled by default (enable with nt:io=?).	2011-11-18 15:02:59 -05:00
Mark DePristo	f3ad4ec94b	Removed annoying FastaSequenceIndexBuilderProgressListener infrastructure that was just a boolean switch on whether to print progress or not.	2011-07-27 22:06:23 -04:00
David Roazen	3c9497788e	Reorganized the codebase beneath top-level public and private directories, removing the playground and oneoffprojects directories in the process. Updated build.xml accordingly.	2011-06-28 06:55:19 -04:00

13 Commits (ec206eccfca28d7dbbabdf351794b21653c355e9)
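
Commit 1acf0f0b2c describes trimming contig names to the first whitespace when generating a fasta .fai index. The idea can be illustrated with a minimal Python sketch. This is not the GATK code (which is Java); the function names contig_name and build_fai are hypothetical, and the five-column record layout (name, length, offset, linebases, linewidth) follows the samtools faidx convention.

```python
def contig_name(header_line: str) -> str:
    """Return the text after '>' up to the first whitespace (hypothetical helper)."""
    text = header_line.rstrip("\r\n")
    if not text.startswith(">"):
        raise ValueError("not a FASTA header line")
    fields = text[1:].split(None, 1)  # split once on any run of whitespace
    return fields[0] if fields else ""


def build_fai(fasta_bytes: bytes) -> list:
    """Build (name, length, offset, linebases, linewidth) records for a FASTA.

    Illustrative only: assumes well-formed input with uniform line lengths
    within each contig, and does no validation.
    """
    records = []
    name = None
    length = offset = linebases = linewidth = 0
    position = 0  # byte offset of the current line within the file
    for raw in fasta_bytes.splitlines(keepends=True):
        if raw.startswith(b">"):
            if name is not None:
                records.append((name, length, offset, linebases, linewidth))
            name = contig_name(raw.decode("ascii"))
            length = linebases = linewidth = 0
            offset = position + len(raw)  # first base starts after the header
        elif name is not None:
            bases = len(raw.rstrip(b"\r\n"))
            if linebases == 0 and bases > 0:
                # The first sequence line defines the line geometry.
                linebases, linewidth = bases, len(raw)
            length += bases
        position += len(raw)
    if name is not None:
        records.append((name, length, offset, linebases, linewidth))
    return records
```

With this sketch, a header such as ">chr1 Homo sapiens chromosome 1" is indexed under the name "chr1", so the description after the first space or tab does not end up in the index.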