The last classic release of GATK3, version 3.8
 
 
 
 
delangel c604ed9440 Several improvements to new indel genotyper (more to come soon):
a) It turns out the previous change of centering the haplotype around the indel was a bad idea. Context to the left of the indel is important, but not as important as the context to the right: by definition all alleles start at the same location, so the haplotype is the same to the left of the indel regardless of allele. So, go back to having a constant-size window to the left of the event.
b) Expand reference context so we can test larger haplotypes.
c) Optimize computation of read likelihoods by doing them in linear array instead of in a matrix - no difference in biallelic sites but could be significantly faster in multiallelic sites.
d) Bug fix: read alignment wasn't being computed correctly if (1) we were at an insertion, (2) the read started right at the insertion, and (3) the read CIGAR didn't include the insertion. More of these corner conditions are lurking, so a revamped computation of how reads align to candidate haplotypes is in the works.
e) Add debug option not to use prior haplotype likelihoods.
f) Don't hard-code NA12878 for genotyping, now sample name is a required input argument.
g) Bug fix: if there are no reads covering a candidate indel event, just output NO_CALL (didn't notice this in HiSeq, but in P1 data it happens all the time). I need to add a confidence threshold for calling later on.






git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4291 348d0f76-0448-11de-a6fe-93d51630548a
2010-09-15 21:53:08 +00:00
R improvements to the report code 2010-09-15 00:45:13 +00:00
archive Cleaning up playground utils and tests 2010-08-27 01:25:47 +00:00
c Reduce file handle usage. 2010-01-05 18:03:01 +00:00
doc removing the custom reflections library from the libs, and adding a release version. Hopefully this will fix the problem Menachem has been seeing with random JVM crashes. Also 2010-08-19 00:42:37 +00:00
java Several improvements to new indel genotyper (more to come soon): 2010-09-15 21:53:08 +00:00
matlab Another matlab script -- this time for making power and coverage plots over a specific gene region. Lots of fun file reading, string manipulation, and exploration of the set() function 2009-11-30 20:02:25 +00:00
packages Fix for DoC issue with multiplexer -- will retire use of multiplexer when 2010-09-07 00:44:07 +00:00
perl - Update DoC to support output to /dev/null. 2010-09-08 23:43:18 +00:00
python improvements to the report code 2010-09-15 00:45:13 +00:00
ruby accidentally committed an old tool 2010-08-25 15:42:02 +00:00
scala Fix for Queue 2010-09-12 15:18:08 +00:00
settings The battle is over. Picard is revved. 2010-09-03 05:28:01 +00:00
shell Added gsa-firehose2 2010-09-15 02:24:04 +00:00
testdata and add changes to the vcf used in testing 2010-06-25 02:56:02 +00:00
LICENSE Adding a license to the root directory in case BOSC checks for one. Has the 2010-04-20 16:04:29 +00:00
build.xml - Include the fasta index builder in the package. 2010-09-08 20:37:54 +00:00
ivy.xml Adding the --sample-metadata (-SM) command line argument and associated functionality. This is something Matt and I have been working on for a while. Basically, it allows you to integrate sample metadata into an analysis, by including a sample file. More detailed documentation is on the wiki: http://www.broadinstitute.org/gsa/wiki/index.php/Adding_Sample_data_to_an_analysis 2010-09-15 11:50:22 +00:00