a) Sites were numerically well behaved now, but another hard fact of life is that the AF iteration is defined in linear Pr space, not in log-likelihood space, and the math doesn't work out in log space. So we need to convert back and forth between linear and log space.
b) As a consequence of a), the code got a major slowdown: calling the 629 samples was about 15 times slower than before (sic).
c) To solve b), log10 values of integers are now cached at init, and numerical approximations are now made. Most importantly, I'm using the approximation log(exp(a) + exp(b)) ~= max(a,b), which seems almost inconsequential in practical performance but reduces computation time to what it was before. More detailed analyses are forthcoming. This approximation can be refined later to avoid expensive log-exp conversions if further profiling and analysis deem it necessary.

Also, two other issues were solved:
a) Strand bias computation was actually wrong in the case where the optimal AC was bigger than max(forward reads, reverse reads). Now the code is exactly as buggy as the grid search model (all bugs are equal, but some are more equal than others).
b) Genotype likelihoods are now computed in a better way: if a likelihood is < 0, we don't just cap it to 0 but do something a bit smarter.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4600 348d0f76-0448-11de-a6fe-93d51630548a
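
For readers who want to see what the two speedups in c) amount to, here is a minimal sketch in Python. It is not the committed code. The names `MAX_CACHED`, `log10_int`, `log10_sum_exact` and `log10_sum_approx` are hypothetical, the cache size is an arbitrary choice, and base 10 is assumed throughout because the message mentions log10 (the message itself writes the approximation with log and exp).

```python
import math

# Hypothetical cache size; the commit message does not say how many
# integers are cached at init.
MAX_CACHED = 1000

# log10(0) is undefined, so index 0 holds -inf as a placeholder.
LOG10_CACHE = [float("-inf")] + [math.log10(i) for i in range(1, MAX_CACHED + 1)]


def log10_int(n):
    """Return log10(n) for an integer n >= 0, using the cache when possible."""
    if 0 <= n <= MAX_CACHED:
        return LOG10_CACHE[n]
    return math.log10(n)


def log10_sum_exact(a, b):
    """Exact log10(10**a + 10**b), factored so the power term cannot overflow."""
    hi, lo = max(a, b), min(a, b)
    if hi == float("-inf"):
        return hi
    return hi + math.log10(1.0 + 10.0 ** (lo - hi))


def log10_sum_approx(a, b):
    """Approximation from the message: keep only the dominant term."""
    return max(a, b)
```

In this base-10 form the approximation never overshoots, and it undershoots the exact value by at most log10(2), about 0.301, which happens when a == b. The gap shrinks quickly as the two terms move apart: for a = -3 and b = -10 it is on the order of 1e-8.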
| | | |
|---|---|---|
| R | ||
| archive | ||
| c | ||
| doc | ||
| java | ||
| matlab | ||
| packages | ||
| perl | ||
| python | ||
| ruby | ||
| scala | ||
| settings | ||
| shell | ||
| testdata | ||
| LICENSE | ||
| build.xml | ||
| ivy.xml | ||