The last classic release of GATK3, version 3.8
 
 
 
 
Eric Banks cc175bad40 Improve the accuracy of dangling head merging in the HC assembler.
Dangling head merging (like with tails) is now enabled by default.
The --recoverDanglingHeads argument is now deprecated so that users know not to use it anymore.
We now also allow the user to set the minimum branch length for merging.  This will be different
for exomes and RNA (see below).

The other changes in the code itself:
1. We no longer allow an arbitrarily large number of mismatches in the dangling head for merging.
2. The max number of mismatches allowed in a dangling head is proportional to the kmer size.

There will be a difference in the RNA calling pipeline.  Instead of invoking '--recoverDanglingHeads'
the user will instead want to use '--minDanglingBranchLength 0'.
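
The argument swap described above can be sketched as a small helper that rewrites an existing argument list. This is only an illustration: `migrate_rna_args` is a hypothetical name, not part of GATK, and the file names `ref.fasta` and `rna.bam` are placeholders. Only the two flags being exchanged come from this commit message.

```python
def migrate_rna_args(args):
    """Return a copy of args with the deprecated flag replaced.

    Hypothetical helper for illustration; it does not run GATK.
    """
    migrated = []
    for arg in args:
        if arg == "--recoverDanglingHeads":
            # The commit message says RNA users should pass this instead.
            migrated.extend(["--minDanglingBranchLength", "0"])
        else:
            migrated.append(arg)
    return migrated


# Placeholder RNA invocation arguments (file names are assumptions).
old = ["-T", "HaplotypeCaller", "-R", "ref.fasta", "-I", "rna.bam",
       "--recoverDanglingHeads"]
new = migrate_rna_args(old)
print(" ".join(new))
```

The helper leaves every other argument untouched, so it can be applied to an existing pipeline definition without reordering anything.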

Below are the knowledgebase results of the master branch vs. this one.

For NA12878 DNA Exome:

master  SNPS         TRUE_POSITIVE                                36722
master  SNPS         CALLED_NOT_IN_DB_AT_ALL                       2699
master  SNPS         REASONABLE_FILTERS_WOULD_FILTER_FP_SITE        292
master  SNPS         FALSE_POSITIVE_SITE_IS_FP                       70

branch  SNPS         TRUE_POSITIVE                                36867
branch  SNPS         CALLED_NOT_IN_DB_AT_ALL                       2952
branch  SNPS         REASONABLE_FILTERS_WOULD_FILTER_FP_SITE        387
branch  SNPS         FALSE_POSITIVE_SITE_IS_FP                       94

As I discussed with Ryan in person, there are a good number of FPs that are called in the new
code, but they nearly all have bad strand bias and should be easily filtered by VQSR.
Note that there is no change for indels.

For NA12878 RNA from Ami:

master  SNPS         TRUE_POSITIVE                                11055
master  SNPS         CALLED_NOT_IN_DB_AT_ALL                        831
master  SNPS         REASONABLE_FILTERS_WOULD_FILTER_FP_SITE         44
master  SNPS         FALSE_POSITIVE_SITE_IS_FP                       96

branch  SNPS         TRUE_POSITIVE                                11113
branch  SNPS         CALLED_NOT_IN_DB_AT_ALL                        874
branch  SNPS         REASONABLE_FILTERS_WOULD_FILTER_FP_SITE         47
branch  SNPS         FALSE_POSITIVE_SITE_IS_FP                       92

Again, there's basically no change for indels.
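
To make the SNP comparison above easier to read, the following sketch computes the branch-minus-master difference for each category. The counts are copied from the tables in this message; the `RESULTS` layout and the `deltas` function are illustrative names, not part of the knowledgebase tooling.

```python
# SNP counts copied from the knowledgebase tables in this commit message.
CATEGORIES = [
    "TRUE_POSITIVE",
    "CALLED_NOT_IN_DB_AT_ALL",
    "REASONABLE_FILTERS_WOULD_FILTER_FP_SITE",
    "FALSE_POSITIVE_SITE_IS_FP",
]

RESULTS = {
    "exome": {
        "master": [36722, 2699, 292, 70],
        "branch": [36867, 2952, 387, 94],
    },
    "rna": {
        "master": [11055, 831, 44, 96],
        "branch": [11113, 874, 47, 92],
    },
}


def deltas(dataset):
    """Return {category: branch - master} for one dataset."""
    master = RESULTS[dataset]["master"]
    branch = RESULTS[dataset]["branch"]
    return {cat: b - m for cat, m, b in zip(CATEGORIES, master, branch)}


for name in RESULTS:
    print(name)
    for cat, diff in deltas(name).items():
        print(f"  {cat:<42}{diff:+d}")
```

For the exome data this gives 145 additional true positives against 24 additional confirmed false positives. For RNA it gives 58 additional true positives and 4 fewer confirmed false positives.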
2014-09-07 08:55:59 -04:00
Name | Last commit | Date
licensing | deleted old license files | 2013-07-02 16:36:47 -04:00
protected | Improve the accuracy of dangling head merging in the HC assembler. | 2014-09-07 08:55:59 -04:00
public | Moved arguments controlling options in output files into the engine | 2014-09-05 21:18:11 -04:00
settings/helpTemplates | changed the GATKDocs format to PHP | 2014-08-18 18:04:07 -04:00
.gitignore | Fixed bug using GraphBased due to infinite likelihoods resulting from the calculation of alignment cost of very long insertion or deletions (done in linear scale) | 2014-04-01 16:14:52 -04:00
README.md | Update README file for the 2.6 release | 2013-06-20 13:08:29 -04:00
ant-bridge.sh | Refactored maven directories and java packages replacing "sting" with "gatk". | 2014-05-19 17:36:39 -04:00
intellij_example.tar.bz2 | Removed the intellij files from the root and made an example package for new users. This allows users to start at the same page and then change it as they see fit without interfering with the repo (thanks guillermo!) | 2012-09-27 11:04:56 -04:00
pom.xml | Now passing in the path to the GATK directory to tests. | 2014-09-02 01:40:59 +08:00

README.md

The Genome Analysis Toolkit

See http://www.broadinstitute.org/gatk/