Improvements to the Random Forest pipeline based on Marathon results.
-- We no longer use QUAL because it scales insidiously with AC.
-- By default we exclude sites in which NA12878 is polymorphic to prevent overfitting to the knowledgebase.
-- Tweaks to training parameters were required because of the QUAL change.
-- We now test for model convergence instead of specifying the number of iterations at the command line.
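
The convergence test mentioned in the last item could look roughly like the sketch below. This is not the pipeline's code: the function name `train_until_converged`, the `tol`, `patience`, and `max_iters` parameters, and the toy scoring function are all hypothetical, and the stopping rule (score change below a tolerance for several consecutive iterations) is only one plausible criterion.

```python
from typing import Callable, List, Tuple


def train_until_converged(
    step: Callable[[int], float],
    tol: float = 1e-3,
    patience: int = 3,
    max_iters: int = 1000,
) -> Tuple[int, List[float]]:
    """Call step(i) until the score it returns stops changing.

    Stops once `patience` consecutive iterations each change the score
    by less than `tol`, or when `max_iters` is reached as a safety cap.
    All names and defaults here are illustrative assumptions.
    """
    history: List[float] = []
    stable = 0
    for i in range(max_iters):
        score = step(i)
        # Count consecutive iterations with a negligible change in score.
        if history and abs(score - history[-1]) < tol:
            stable += 1
        else:
            stable = 0
        history.append(score)
        if stable >= patience:
            break
    return len(history), history


# Toy stand-in for one training iteration: the score approaches 1.0 geometrically.
def toy_step(i: int) -> float:
    return 1.0 - 0.5 ** (i + 1)


iters, scores = train_until_converged(toy_step)
```

With a rule like this the iteration count falls out of the data rather than being fixed on the command line, while `max_iters` still bounds the run if the score never settles.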
parent 6d58e61f23
commit 04ddbac585