Several improvements to the NA12878 knowledge base.

1. All NA12878DBWalkers that export/emit sites must emit them in order, and -L must restrict
them to the requested intervals rather than having them iterate over all possible sites.
Updated ExportReviews and ExtractConsensusSites to adhere to these constraints.
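
As an illustration of the constraint in item 1 only (not the actual walker code), here is a
minimal sketch that emits sites in order while touching only the requested intervals. The
function name, the per-contig sorted index, and the tuple layouts are all hypothetical, and
the intervals are assumed not to overlap each other.

```python
from bisect import bisect_left, bisect_right

def sites_in_intervals(index, intervals, contig_order):
    """Return (contig, pos) sites inside the intervals, in contig/position order.

    index:        hypothetical dict mapping contig -> sorted list of site positions
    intervals:    list of (contig, start, end) tuples, inclusive, non-overlapping
    contig_order: list of contig names in reference order
    """
    rank = {contig: i for i, contig in enumerate(contig_order)}
    emitted = []
    # Visit intervals in reference order so the output is ordered too.
    for contig, start, end in sorted(intervals, key=lambda iv: (rank[iv[0]], iv[1])):
        positions = index.get(contig, [])
        # Binary search for the slice of sites inside this interval instead of
        # scanning every site on the contig.
        lo = bisect_left(positions, start)
        hi = bisect_right(positions, end)
        emitted.extend((contig, pos) for pos in positions[lo:hi])
    return emitted
```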

2. Added an option to AssessNA12878 to ignore FNs (false negatives) that overlap sites in a provided VCF.
This is useful when you have a list of reviewed sites that are okay to miss in
particular techs only (because, for some reason, there is coverage but no evidence of the
alternate allele in them). Intended to be used with Jenkins.
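
A rough sketch of the filtering idea in item 2, assuming sites are represented as
(contig, start, end) tuples with inclusive coordinates. The function names are hypothetical,
and this is not how AssessNA12878 reads or matches the VCF.

```python
def overlaps(a, b):
    """True if two (contig, start, end) sites share at least one base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]

def drop_ignorable_fns(false_negatives, ignorable_sites):
    """Keep only the false negatives that do not overlap any ignorable site."""
    return [
        fn for fn in false_negatives
        if not any(overlaps(fn, site) for site in ignorable_sites)
    ]
```

The linear scan over ignorable_sites is fine for a short review list; a longer list would
call for an interval index.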

3. Hooked up the logic for complex events all the way through the KB.
The consensus now incorporates whether a call is complex, and the assessor does not penalize for complex calls.

4. Fixed a long-standing bug that I found by accident:
AssessNA12878 was closing its DB connection before its final call to includeMissingCalls().
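
The ordering problem in item 4 can be illustrated with a toy example. FakeConnection and both
function names are stand-ins invented for this sketch; the real code is the Java walker and
its DB client.

```python
class FakeConnection:
    """Stand-in for a DB connection that rejects queries once closed."""

    def __init__(self):
        self.is_open = True

    def fetch_missing_calls(self):
        if not self.is_open:
            raise RuntimeError("connection already closed")
        return ["missing-call"]

    def close(self):
        self.is_open = False

def finish_buggy(conn):
    """Buggy ordering: the connection is closed before the final query."""
    conn.close()
    return conn.fetch_missing_calls()

def finish_fixed(conn):
    """Fixed ordering: run the final query first, then always close."""
    try:
        return conn.fetch_missing_calls()
    finally:
        conn.close()
```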

5. Hooked up the per-call confidences through the KB.
The KB no longer uses a 2-tiered priority system (reviews and everything else); instead it
uses a quasi-Bayesian estimator (will update to a proper Bayesian treatment if needed).
ImportCallset and ImportReviews now assign confidences as appropriate.
Also fixed up the consensus logic for calls with UNKNOWN status.
This commit is contained in:
Eric Banks 2013-07-12 11:46:34 -04:00
parent 6440f926d3
commit 5d1454c6b0
