Added 2 new fields to the MongoVariantContext: confidence and isComplex.

isComplex will be used to flag calls that represent complex events, which have multiple
correct allele representations.  Call sets can then get points for including them but will
not get penalized for missing them (because they may have used a different representation).
Such events are currently the biggest obstacle when trying to characterize false negatives (FNs).
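
To illustrate the intended scoring rule (this is only a sketch, since none of it is hooked up
in this commit): a matched call always earns credit, while a missed call counts as an FN only
when it is not flagged complex.  TruthCall and score() are hypothetical stand-ins, not the
real MongoVariantContext API.

```java
import java.util.List;
import java.util.Set;

final class ComplexAwareScoring {
    // Hypothetical stand-in for a truth-set call; not the real MongoVariantContext.
    static final class TruthCall {
        final String id;
        final boolean isComplex;

        TruthCall(String id, boolean isComplex) {
            this.id = id;
            this.isComplex = isComplex;
        }
    }

    // Returns {truePositives, falseNegatives} for one call set.
    static int[] score(List<TruthCall> truth, Set<String> calledIds) {
        int tp = 0;
        int fn = 0;
        for (TruthCall t : truth) {
            if (calledIds.contains(t.id)) {
                tp++;            // credit for any matched call, complex or not
            } else if (!t.isComplex) {
                fn++;            // only non-complex misses are penalized
            }
            // A missed complex call is skipped: the call set may have used
            // a different, equally correct representation.
        }
        return new int[] {tp, fn};
    }
}
```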

The confidence will be used to refactor the consensus-making algorithm for the truth status
of the NA12878 KB.  The previous version allowed for only 2 tiers: reviews and everything else.
That is problematic when some of the input sets are of higher quality than others:
when they disagree, the call is simply marked discordant and the quality difference is lost.
The new framework will allow each call to have its own associated confidence.  Then, when
determining the consensus truth status, we calculate it probabilistically from the
various confidences, so that nothing is hard-coded anymore.
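
One possible way to combine per-call confidences is sketched below.  It is NOT the estimator
in this commit; it is a naive-Bayes illustration that assumes the calls are independent, the
prior is uniform, and each confidence is the probability (strictly between 0 and 1) that the
call's asserted status is correct.  Vote and posteriorTrue() are hypothetical names.

```java
import java.util.List;

final class ConsensusSketch {
    // Hypothetical vote: one call asserting true-positive or false-positive status.
    static final class Vote {
        final boolean saysTrue;
        final double confidence;

        Vote(boolean saysTrue, double confidence) {
            if (!(confidence > 0.0 && confidence < 1.0)) {
                throw new IllegalArgumentException("confidence must be in (0, 1)");
            }
            this.saysTrue = saysTrue;
            this.confidence = confidence;
        }
    }

    // Posterior probability that the site is a true variant, obtained by
    // summing signed log-odds over all votes (uniform prior => start at 0).
    static double posteriorTrue(List<Vote> votes) {
        double logOdds = 0.0;
        for (Vote v : votes) {
            double weight = Math.log(v.confidence / (1.0 - v.confidence));
            logOdds += v.saysTrue ? weight : -weight;
        }
        return 1.0 / (1.0 + Math.exp(-logOdds));
    }
}
```

Under these assumptions a high-confidence call outweighs a disagreeing low-confidence one
instead of the site collapsing to discordant, while two equally confident calls that disagree
leave the posterior at 0.5.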

Note that I added some unit tests to ensure the outcome that I expect for various scenarios
and then implemented a very rough version of the estimator that successfully produced those
outcomes.

HOWEVER, THIS IS NOT COMPLETE AND NEITHER FUNCTIONALITY IS HOOKED UP AT ALL.
Rather, this is an interim commit.  The only goal here is to get these fields added to the MVC
for the upcoming release so that Jacob (who prefers to work with stable) can add the
necessary functionality to IGV for us.
Eric Banks 2013-06-11 14:20:10 -04:00
parent 4151753718
commit 9ec71bba26
