-- The previous approach in VQSR was to build a GMM with the same maximum number of Gaussians for the positive and negative models. However, we usually have many more positive sites than negative ones, so we'd prefer a more detailed GMM for the positive model and a coarser GMM, trained on few sites, for the negative model.
-- The maxGaussians argument now applies only to the positive model.
-- The negative model's GMM is now built with a default maximum of 4 Gaussians (this can be controlled via a command-line parameter).
-- Removed the percentBadVariants argument. The only way to control how many variants are included in the negative model is with minNumBad.
-- Reduced the minNumBad default from 2500 to 1000.
-- Updated the MD5s for VQSR. They changed significantly because of underlying changes in the default GMM. As expected, only sites with NEGATIVE_TRAINING_LABELs and the resulting VQSLOD values differ.
-- minNumBad is now named numBad.
-- All negative training points are now plotted as well, since this significantly changes our view of the GMM PDF.
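
The asymmetric setup described in the commit message can be sketched as follows. This is a minimal illustration, not the GATK implementation. It assumes that the negative training set is the numBad lowest-scoring variants under the positive model, and that VQSLOD is the log-odds of the positive mixture against the negative one (natural log is used here for simplicity). The function names and all mixture parameters are hypothetical, and the sketch is one-dimensional for brevity.

```python
import math

def select_negative_training(lods, num_bad=1000):
    """Indices of the num_bad lowest-scoring variants (assumed selection rule)."""
    order = sorted(range(len(lods)), key=lambda i: lods[i])
    return order[:min(num_bad, len(lods))]

def gmm_log_density(x, weights, means, variances):
    """Log density of a 1-D Gaussian mixture, computed with log-sum-exp."""
    terms = [
        math.log(w) - 0.5 * math.log(2.0 * math.pi * v) - (x - m) ** 2 / (2.0 * v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def vqslod(x, positive, negative):
    """Log-odds of the positive model against the negative model."""
    return gmm_log_density(x, *positive) - gmm_log_density(x, *negative)

# Hypothetical models: a detailed positive mixture (3 components) and a
# coarse negative mixture (1 component), mirroring the asymmetry above.
positive_model = ([0.5, 0.3, 0.2], [0.0, 1.0, -1.0], [1.0, 0.5, 0.5])
negative_model = ([1.0], [5.0], [4.0])

worst = select_negative_training([2.0, -1.0, 0.5, -3.0], num_bad=2)
score_good = vqslod(0.0, positive_model, negative_model)
score_bad = vqslod(5.0, positive_model, negative_model)
```

Here `worst` holds the indices of the two lowest scores, and a site near the positive components scores higher than one near the negative component. Giving the negative mixture fewer components reflects the point that it is trained on far fewer sites and cannot support as detailed a model.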