gatk-3.8/java/src/org/broadinstitute/sting/utils
hanna 41d57b7139 Massive cleanup of read filtering.
- Eliminate redundancy of filter application.
- Track filter metrics per-shard to facilitate merging.
- Flatten counting iterator hierarchy for easier debugging.
- Rename Reads class to ReadProperties and track it outside of the Sting iterators.
Note: because shards are currently tied so closely to reads and not the merged triplet of <reads,ref,RODs>, the metrics
classes are managed by the SAMDataSource when they should be managed by something more general.  For now, we're hacking
the reads data source to manage the metrics; in the future, something more general should manage the metrics classes.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4015 348d0f76-0448-11de-a6fe-93d51630548a
2010-08-11 20:17:11 +00:00
..
analysis Initial commit of items for analyzing amino acid transitions in variant eval. Blew up my subversion by coding locally while I did not have internet. I hope this doesn't bust any integration tests since I changed no existing code but...who knows. Crossing my fingers. 2010-06-29 20:57:18 +00:00
bed making 'parseLocation' public static - as simple as the logic is, it's better kept in one place and I need it! 2010-06-11 18:19:59 +00:00
classloader VariantEval now uses the "standard" modules only by default. You can add other modules with the -E argument and not use all of the standard ones with -noStandard (they can be added back individually with -E). 2010-08-03 16:51:10 +00:00
collections Official parallel CountCovariates, passes all integration tests. Now poster-child example of parallelism in GATK (Matt H). Apparent general performance improvements throughout too. 2010-07-19 22:13:18 +00:00
duplicates A refactoring / unification of ReadBackedPileup and ReadBackedExtendedEventPileup. 2010-06-20 04:42:26 +00:00
fasta IndexedFastaSequenceFile is now in Picard; transitioning to that implementation. 2010-07-01 04:40:31 +00:00
fastq Improvements to make this work with uncompressed fastq files. Pulled the fastq parser out into its own SAMFileReader-like entity. 2009-09-03 17:20:16 +00:00
file Attempt to determine whether underlying filesystem supports file locking and 2010-08-04 19:28:27 +00:00
genotype Moved into Tribble to be with VC 2010-08-11 17:14:32 +00:00
help Added @Hidden annotation, a way to deliberately exclude experimental fields and 2010-08-05 02:26:46 +00:00
instrumentation A Java sizeof, implemented using the Java instrumentation API. Can get the memory consumed either only by a single 2010-07-27 18:44:15 +00:00
interval protect against nulls 2010-08-09 19:21:39 +00:00
pileup Cleanup for Steve Hershman's issue. In the midst of doing this, I discovered 2010-07-16 18:57:58 +00:00
sam Massive cleanup of read filtering. 2010-08-11 20:17:11 +00:00
text The copyright tag that I copied/pasted from a LaTeX document into IntelliJ had 2010-04-20 15:26:32 +00:00
threading Put a major.minor version into the GATK Javadoc for reading. Also, 2010-01-15 21:48:30 +00:00
vcf Unit, integration, and performance tests are all busted, so this is a good time to make a big commit... 2010-08-10 04:18:29 +00:00
wiggle Support for generating (very basic) wiggle files for use with IGV (see UCSC for wiggle spec); and a walker to take in a variant track and create a transition transversion rate track for the whole genome (due to the wiggle spec, this has to be done by chromosome). It's interesting to see the effect of genes! 2010-07-21 18:04:30 +00:00
BaseUtils.java Solid processing in base quality recalibrator now has several options for how to handle no calls in the color space. --ignore_nocall_colorspace is removed and replaced by --solid_nocall_strategy. Fixed some of the @Deprecated tags in BaseUtils. LocusWalkers now filter out FailsVendorQualityCheck reads. HLA caller integration test bam file had bad vendor reads so its integration test changed. 2010-07-19 19:10:29 +00:00
GenomeLoc.java this.intersect(that) method added to GenomeLoc (returns intersection of two intervals or dies if the locations do not overlap) 2010-07-22 16:00:30 +00:00
GenomeLocParser.java Bug fix for Chris: added method createPotentiallyInvalidGenomeLoc() to the GenomeLocParser that doesn't check that the contig exists in the sequence dictionary. This is crucial for lifting over from one reference to another, as sometimes contig names change in the liftover (e.g. chrM to MT). 2010-08-05 03:19:02 +00:00
GenomeLocSortedSet.java add a fix so that XL arguments won't cancel out -BTI arguments, fixed a bug for Ben where the ROD -> interval list conversion was throwing an exception, and some old code removal. 2010-04-15 16:31:43 +00:00
HeapSizeMonitor.java Checking in downsampling iterator alongside LocusIteratorByState, and removing 2010-05-17 21:00:44 +00:00
MalformedGenomeLocException.java A fix for the 'rod blows up when it hits a GenomeLoc outside the reference' issu 2009-06-02 18:14:46 +00:00
MathUtils.java A utility class that computes running average and standard deviation for a stream of numbers it is being fed with. Updates mean/stddev on the fly and does not cache the observations, so it uses no memory and also should be stable against overflow/loss of precision. Simple unit test is also provided (does *not* stress-test the engine with millions of numbers though). 2010-08-04 21:39:02 +00:00
PathUtils.java Added a method to refresh an NFS mount point (necessary to prevent NFS flakiness when running on the LSF farm). 2009-05-21 19:31:54 +00:00
QualityUtils.java Checking in everyone's changes to the variant recalibrator. We now calculate the variant quality score as a LOD score between the true and false hypothesis. Allele Count prior is changed to be (1 - 0.5^ac). Known prior breaks out HapMap sites 2010-08-05 14:12:19 +00:00
ReservoirDownsampler.java Rethinking DownsamplingLocusIteratorByState with a flattened read structure. Samples are kept 2010-06-13 01:47:02 +00:00
SWPairwiseAlignment.java Reorganization of SW code for clarity. Total failure at raw optimization. Discovered that ~50% of reads being cleaned were perfect reference matches. New code comes with flag to look at NM field and not clean perfect matches. Can be turned off with command line option (needed for 1KG bams with bad NM fields). Going to rerun cleaning jobs due to accidental rebuilding of the stable codebase and loss of 2 days of runtime. 2010-05-27 23:16:00 +00:00
SampleUtils.java Starting the clean up of the sting.utils.genotype code which is all either moving to Tribble, moving to sting.utils.vcf, or being removed. 2010-08-10 02:16:05 +00:00
StingException.java Changed Sting exception from a base exception to a runtime exception. This makes it so you can throw it without the consumer having to check it, and hopefully people will be more inclined to use it. 2009-04-29 22:09:41 +00:00
Utils.java A couple of type specific implementations of a single extend() method: takes an array (byte[] or short[] currently) and "extends" it to the left or to the right by the specified number of elements. Returns newly allocated array, with the content of original array copied in (if we extend by n elements to the left, then the returned array will have n default-filled elements *followed* by the content of the old array). 2010-08-04 15:30:48 +00:00
WilcoxonRankSum.java The copyright tag that I copied/pasted from a LaTeX document into IntelliJ had 2010-04-20 15:26:32 +00:00
package-info.java Put a major.minor version into the GATK Javadoc for reading. Also, 2010-01-15 21:48:30 +00:00
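
The MathUtils.java entry above describes a utility that keeps a running average and standard deviation for a stream of numbers without caching the observations. The listing does not show how MathUtils does this; one well-known approach with those properties is Welford's online algorithm, sketched below. The class name RunningStats and its methods are hypothetical and are not taken from the GATK source.

```java
/**
 * Sketch of a streaming mean / standard deviation accumulator (Welford's method).
 * Names are hypothetical; this is not the code in MathUtils.java.
 */
final class RunningStats {
    private long n = 0;        // number of observations seen so far
    private double mean = 0.0; // running mean
    private double m2 = 0.0;   // running sum of squared deviations from the mean

    /** Folds one observation into the running statistics; the value itself is not stored. */
    void add(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean); // second factor uses the already-updated mean
    }

    long count() { return n; }

    double mean() { return mean; }

    /** Sample standard deviation (n - 1 denominator); 0.0 for fewer than two observations. */
    double stddev() {
        return n > 1 ? Math.sqrt(m2 / (n - 1)) : 0.0;
    }

    public static void main(String[] args) {
        RunningStats stats = new RunningStats();
        for (double x : new double[] {2, 4, 4, 4, 5, 5, 7, 9}) {
            stats.add(x);
        }
        System.out.println("n=" + stats.count()
                + " mean=" + stats.mean()
                + " stddev=" + stats.stddev());
    }
}
```

Because only the count, the mean, and the sum of squared deviations are kept, memory use is constant regardless of stream length, and the update avoids accumulating a large sum of squares that could lose precision.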