gatk-3.8/public/java/src/org/broadinstitute/sting/utils
Ryan Poplin b8709d8c67 Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-08-06 11:41:28 -04:00
R Refactoring/fixing up UG HMM code: a) Make code use PairHMM class instead of having duplicated code. That way UG and HaplotypeCaller now use same core code. Changes to be able to do this: 1. Compute context-dependent GOP as a function of read, not of haplotype, b) Extracted code to initialize HMM arrays into separate method, c) Move PairHMM class and unit test to public, d) Reenable banded code in PairHMM, inverted sense of flag (true=enable feature) but leave off in HaplotypeCaller. 2012-04-17 14:22:48 -04:00
activeregion HaplotypeCaller now uses an excessive number of high quality soft clips as a triggering signal in order to capture both end points of a large deletion in a single active region. 2012-07-27 12:44:02 -04:00
analysis Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
baq On the fly base quality score recalibration now happens up front in a SAMIterator on input instead of in a lazy-loading fashion if the BQSR table is provided as an engine argument. On the fly recalibration is now completely hooked up and live. 2012-02-13 12:35:09 -05:00
classloader Cleanup BQSR classes 2012-07-31 08:11:03 -04:00
clipping Cleanup BQSR classes 2012-07-31 08:11:03 -04:00
codecs Prevent NumberFormatExceptions when parsing the VCF POS field 2012-08-06 11:19:54 -04:00
collections Refactored/renamed the nested integer array; cleaned up code a bit. 2012-07-03 00:12:33 -04:00
crypt Public-key authorization scheme to restrict use of NO_ET 2012-03-06 00:09:43 -05:00
duplicates GATKSAMRecord refactor 2011-11-03 15:43:26 -04:00
exceptions Requested by Geraldine: adding a utility to register deprecated walkers (and the major version of the first release since they were removed) so that the User Error printed out for e.g. CountCovariates now states: Walker CountCovariates is no longer available in the GATK; it has been deprecated since version 2.0. 2012-08-01 09:50:00 -04:00
fasta Fix for ref 0 bases for Chris 2012-01-24 10:55:09 -05:00
file Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
fragments First pass of implementation of Reduced Reads with HaplotypeCaller. Main changes: a) Active region: scale PL's by representative count to determine whether region is active. b) Scale per-read, per-haplotype likelihoods by read representative counts. A read representative count is (temporarily) defined as the average representative count over all bases in read, TBD whether this is good enough to avoid biases in GL's. c) DeBruijn assembler inserts kmers N times in graph, where N is min representative count of read over kmer span - TBD again whether this is the best approach. d) Bug fixes in FragmentUtils: logic to merge fragments was wrong in cases where there is discrepancy of overlaps between unclipped/soft clipped bases. Didn't affect things before but RR makes hard-clipped bases in CIGARs more prevalent so this was exposed. e) Cache read representative counts along with read likelihoods associated with a Haplotype. Code can/should be cleaned up and unified with PairHMMIndelErrorModelCode, as well as refactored to support arbitrary ploidy in HaplotypeCaller 2012-08-03 12:24:23 -04:00
help Removed Categories. 2012-07-25 13:46:24 -04:00
instrumentation Optimize imports run on the whole project, public and private. I just got too tired of all of the unused imports floating around. Confirmed that the system builds after the changes. 2011-07-17 20:29:58 -04:00
interval Refactored parsing of Rod/IntervalBinding. Queue S/G now uses all interval arguments passed to CommandLineGATK QFunctions including support for BED/tribble types, XL, ISR, and padding. 2012-06-27 01:15:22 -04:00
io Public-key authorization scheme to restrict use of NO_ET 2012-03-06 00:09:43 -05:00
pileup Officially removing all code associated with extended events. Note that I still have a longer term project on my plate to refactor the ReadBackedPileup, but that's a much larger effort. 2012-06-15 15:55:03 -04:00
pileup2 Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
recalibration Ready for full-scale evaluation of adaptive BQSR contexts 2012-08-03 16:02:53 -04:00
runtime No more hunting down R "resources". As a tradeoff Rscript cannot be specified on the commandline and will be found in the environment path. 2011-10-27 14:17:07 -04:00
sam Fixed AlignmentUtils bug for handling Ns in the CIGAR string. Added a UG integration test that calls a BAM with such reads (provided by a user on GetSatisfaction). 2012-07-31 15:37:22 -04:00
text Refactoring/fixing up UG HMM code: a) Make code use PairHMM class instead of having duplicated code. That way UG and HaplotypeCaller now use same core code. Changes to be able to do this: 1. Compute context-dependent GOP as a function of read, not of haplotype, b) Extracted code to initialize HMM arrays into separate method, c) Move PairHMM class and unit test to public, d) Reenable banded code in PairHMM, inverted sense of flag (true=enable feature) but leave off in HaplotypeCaller. 2012-04-17 14:22:48 -04:00
threading Removed GATK use of distributed parallelism framework. 2011-07-20 16:26:09 -04:00
variantcontext Update BCF2 to include a minor version number so we can rev (and report errors) with BCF2 2012-08-02 17:30:30 -04:00
wiggle Optimize imports run on the whole project, public and private. I just got too tired of all of the unused imports floating around. Confirmed that the system builds after the changes. 2011-07-17 20:29:58 -04:00
AminoAcid.java Removing the Genomic Annotator and its supporting classes 2011-07-25 15:10:25 -04:00
AminoAcidTable.java Removing the Genomic Annotator and its supporting classes 2011-07-25 15:10:25 -04:00
BaseUtils.java Initial checkpoint commit of VariantContext/Allele refactoring. There were just too many problems associated with the different representation of alleles in VCF (padded) vs. VariantContext (unpadded). We are moving VC to use the VCF representation. No more reference base for indels in VC and no more trimming and padding of alleles. Even reverse trimming has been stopped (the theory being that writers of VCF now know what they are doing and often want the reverse padding if they put it there; this has been requested on GetSatisfaction). Code compiles but presumably pretty much all tests with indels will fail at this point. 2012-07-26 01:50:39 -04:00
BitSetUtils.java Refactoring of BQSRv2 to use longs (and standard bit fiddling techniques) instead of Java BitSets for performance improvements. 2012-06-12 09:19:36 -04:00
ContigComparator.java Documented following the new gatkdoc framework 2011-07-25 00:25:08 -04:00
GenomeLoc.java make the size of a GenomeLoc int instead of long 2012-02-03 17:12:42 -05:00
GenomeLocComparator.java Optimized interval iteration 2011-09-28 16:07:34 -04:00
GenomeLocParser.java As a user pointed out, it is not valid for a GenomeLoc to have a start or stop equal to 0. 2012-07-17 22:18:43 -04:00
GenomeLocSortedSet.java Active region walkers can now see the reads in a buffer around their active regions. This buffer size is specified as a walker annotation. Intervals are internally extended by this buffer size so that the extra reads make their way through the traversal engine but the walker author only needs to see the original interval. Also, several corner case bug fixes in active region traversal. 2012-01-19 22:05:08 -05:00
Haplotype.java Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2012-08-06 11:41:28 -04:00
HasGenomeLocation.java Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
HeapSizeMonitor.java Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
IndelUtils.java Improvements to indel analysis capabilities of VariantEval 2012-04-06 16:07:46 -04:00
MannWhitneyU.java Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00
MathUtils.java First pass of implementation of Reduced Reads with HaplotypeCaller. Main changes: a) Active region: scale PL's by representative count to determine whether region is active. b) Scale per-read, per-haplotype likelihoods by read representative counts. A read representative count is (temporarily) defined as the average representative count over all bases in read, TBD whether this is good enough to avoid biases in GL's. c) DeBruijn assembler inserts kmers N times in graph, where N is min representative count of read over kmer span - TBD again whether this is the best approach. d) Bug fixes in FragmentUtils: logic to merge fragments was wrong in cases where there is discrepancy of overlaps between unclipped/soft clipped bases. Didn't affect things before but RR makes hard-clipped bases in CIGARs more prevalent so this was exposed. e) Cache read representative counts along with read likelihoods associated with a Haplotype. Code can/should be cleaned up and unified with PairHMMIndelErrorModelCode, as well as refactored to support arbitrary ploidy in HaplotypeCaller 2012-08-03 12:24:23 -04:00
Median.java ReadGroupProperties walker and associated infrastructure 2012-03-01 15:01:11 -05:00
MendelianViolation.java Efficient Genotype object Intermediate commit 2012-06-14 16:42:24 -04:00
NGSPlatform.java Stabilized NGSPlatform code: don't assume all reads have read groups (e.g. artificial SAM records) 2012-06-06 15:17:30 -04:00
PairHMM.java Revert some bad merge changes 2012-04-18 16:35:09 -04:00
PathUtils.java GATKPerformanceOverTime script update 2012-01-02 09:58:46 -05:00
QualityUtils.java Extensive unit tests, contracts, and documentation for RecalDatum 2012-07-31 08:11:03 -04:00
ReservoirDownsampler.java Optimize imports run on the whole project, public and private. I just got too tired of all of the unused imports floating around. Confirmed that the system builds after the changes. 2011-07-17 20:29:58 -04:00
SWPairwiseAlignment.java Optimize imports run on the whole project, public and private. I just got too tired of all of the unused imports floating around. Confirmed that the system builds after the changes. 2011-07-17 20:29:58 -04:00
SampleUtils.java Phase I commit to get shadowBCFs passing tests 2012-06-21 15:16:26 -04:00
SequenceDictionaryUtils.java Cleanup Genotypes 2012-06-14 16:42:36 -04:00
SimpleTimer.java Removing contracts for the SimpleTimer 2011-11-06 22:22:49 -05:00
Utils.java Algorithmically faster version of DiffEngine 2012-06-14 16:42:30 -04:00
package-info.java Reorganized the codebase beneath top-level public and private directories, 2011-06-28 06:55:19 -04:00