The last classic release of GATK3, version 3.8
 
 
 
 
delangel ca7810f11d First major update of indel genotyper:
a) Really fix the strand bias computation for indels this time; the previous version was only a partial fix.
b) Change the way we deal with bad bases at the edges of reads. Even if a base is soft-clipped in the CIGAR string, there may still be dangling bases with Q=2 that can throw off the QUAL computation at some sites. So we're stricter, and we also trim those bases off read edges even if they are not officially soft-clipped.
c) First feeble-minded attempt at runtime optimization: don't compute log and 10^base_qual every time. Instead, cache 10^(-k/10) and log(1-10^(-k/10)) for all k <= 60. This speeds up the code about 4x.
d) Further optimization: don't compute log(10^x+10^y) directly, but use the softMax function recently put into ExactAFCalculationModel.
e) Skip bad reads where all Q=2 (sic)
f) Avoid log-to-linear and back-to-log conversions of genotype likelihoods. This was legacy code from back when the exact model did its work in the linear domain. This improves precision overall.
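
Items c) and d) can be illustrated with a minimal Java sketch. This is not the code from this commit: the class name IndelMathSketch and the methods errorProb and log10OneMinusError are hypothetical, base-10 logarithms are an assumption, and the softMax shown here is the exact identity log10(10^x+10^y) = max(x,y) + log10(1+10^(-|x-y|)), which may differ from the version in ExactAFCalculationModel.

```java
// Illustrative sketch only; class and method names are hypothetical, not GATK's.
final class IndelMathSketch {
    // Highest base quality covered by the cache (the commit message says k <= 60).
    static final int MAX_CACHED_QUAL = 60;

    // ERROR_PROB[k] = 10^(-k/10)
    private static final double[] ERROR_PROB = new double[MAX_CACHED_QUAL + 1];
    // LOG10_ONE_MINUS_ERROR[k] = log10(1 - 10^(-k/10)); this is -Infinity at k = 0
    private static final double[] LOG10_ONE_MINUS_ERROR = new double[MAX_CACHED_QUAL + 1];

    static {
        // Fill both tables once, so the per-base code path does only array lookups.
        for (int k = 0; k <= MAX_CACHED_QUAL; k++) {
            ERROR_PROB[k] = Math.pow(10.0, -k / 10.0);
            LOG10_ONE_MINUS_ERROR[k] = Math.log10(1.0 - ERROR_PROB[k]);
        }
    }

    private IndelMathSketch() {}

    // Error probability for a Phred-scaled quality; computed directly above the cached range.
    static double errorProb(int qual) {
        if (qual < 0) throw new IllegalArgumentException("negative quality: " + qual);
        if (qual <= MAX_CACHED_QUAL) return ERROR_PROB[qual];
        return Math.pow(10.0, -qual / 10.0);
    }

    // log10 of the probability that the base is correct.
    static double log10OneMinusError(int qual) {
        if (qual < 0) throw new IllegalArgumentException("negative quality: " + qual);
        if (qual <= MAX_CACHED_QUAL) return LOG10_ONE_MINUS_ERROR[qual];
        return Math.log10(1.0 - Math.pow(10.0, -qual / 10.0));
    }

    // log10(10^x + 10^y) without leaving log space for the larger term.
    static double softMax(double x, double y) {
        if (x == Double.NEGATIVE_INFINITY) return y;
        if (y == Double.NEGATIVE_INFINITY) return x;
        double hi = Math.max(x, y);
        double lo = Math.min(x, y);
        return hi + Math.log10(1.0 + Math.pow(10.0, lo - hi));
    }
}
```

Under these assumptions, a caller would replace repeated Math.pow and Math.log10 calls on base qualities with IndelMathSketch.errorProb(q) and IndelMathSketch.log10OneMinusError(q), and would combine two log10 likelihoods with IndelMathSketch.softMax(a, b), so the larger term is never exponentiated.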




git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4802 348d0f76-0448-11de-a6fe-93d51630548a
2010-12-07 18:35:22 +00:00
R 1) 2010-11-30 21:08:25 +00:00
archive Fisher exact makes a return. Seems to be working properly. Currently tagged as a work in progress. Needs to take the filtered context to be truly correct. 2010-10-22 20:35:53 +00:00
c Reduce file handle usage. 2010-01-05 18:03:01 +00:00
doc removing the custom reflections library from the libs, and adding a release version. Hopefully this will fix the problem Menachem has been seeing with random JVM crashes. Also 2010-08-19 00:42:37 +00:00
java First major update of indel genotyper: 2010-12-07 18:35:22 +00:00
matlab Another matlab script -- this time for making power and coverage plots over a specific gene region. Lots of fun file reading, string manipulation, and exploration of the set() function 2009-11-30 20:02:25 +00:00
packages Updated the Queue GATK generator and packaging to include more dependencies for fullCallingPipeline.q. 2010-11-30 15:29:40 +00:00
perl A helper script to merge two VCFs, run VariantEval, and the VariantReport.R script. 2010-11-28 00:45:21 +00:00
python Trivial changes to data processing paper python 2010-12-01 14:57:14 +00:00
ruby Accidentally committed an old tool 2010-08-25 15:42:02 +00:00
scala With multi-sample genotyping must come scatter+gather. Also Khalid informed me of the .group(size) method, so removing my useless (but pretty) code. 2010-12-06 20:12:23 +00:00
settings Added status email support with -statusTo. Will send emails on failure of an individual function or success/failure of the whole pipeline. 2010-10-14 15:58:52 +00:00
shell Useful script for me 2010-11-18 15:21:06 +00:00
testdata VQSR now operates on LOD scores in the INFO field directly, and doesn't adjust the QUAL field. New format for tranches file uses LOD score. Old file format no longer supported. log10sumlog10() function, a very useful utility in MathUtils. No more ExtendedPileupElement! Robust math calculations in GMM so that no infinities are generated! HaplotypeScore refactored to enable use of filtered context. Not yet enabled... InferredContext getDouble and getInteger arguments now parse values from Strings if necessary 2010-11-15 22:19:22 +00:00
LICENSE Adding a license to the root directory in case BOSC checks for one. Has the 2010-04-20 16:04:29 +00:00
build.xml Added the ability to test pipelines in dry or live mode via 'ant pipelinetest' and 'ant pipelinetest -Dpipeline.run=run'. 2010-11-22 22:59:42 +00:00
ivy.xml Updated PluginManager so that during testing Queue can dynamically compile and load separately multiple class directories into the same class loader. 2010-11-12 20:14:28 +00:00