Commit Graph

3 Commits (c191d7de8c15fac2a23cbc7e86bd2cf369906a34)

Author SHA1 Message Date
Mauricio Carneiro 0de6f55660 PairHMM rework
The current implementation of the PairHMM had issues with the probabilities and the state machines. Probabilities were not adding up to one because:
   # Initial conditions were not being set properly
   # Emission probabilities in the last row were not adding up to 1

The following commit fixes both by
   # averaging all potential start locations (giving an equal prior to the state machine in it's first iteration -- allowing the read to start it's alignment anywhere in the haplotype with equal probability)
   # discounting all paths that end in deletions by not adding the last row of the deletion matrix and summing over all paths ending in matches and insertions (this saves us from a fourth matrix to represent the end state)

Summarized changes:
   * Fix LoglessCachingPairHMM and Log10PairHMM according to the new algorithm
   * Refactor probabilities check to throw exception if we ever encounter probabilities greater than 1.
   * Rename LoglessCachingPairHMM to LoglessPairHMM (this is the default implementation in the HC now)
   * Rename matrices to matchMatrix, insertionMatrix and deletionMatrix for clarity
   * Rename metric lengths to read and haplotype lengths for clarity
   * Rename private methods to initializePriors (distance) and initializeProbabilities (constants) for clarity
   * Eliminate first row constants (because they're not used anyway!) and directly assign initial conditions in the deletionMatrix
   * Remove unnecessary parameters from updateCell()
   * Fix the expected probabilities coming from the exact model in PairHMMUnitTest
   * Neatify PairHMM class (removed unused methods) and PairHMMUnitTest (removed unused variables)
   * Update MD5s: Probabilities have changed according to the new PairHMM model and as expected HC and UG integration tests have new MD5s.

[fix 47164949]
2013-03-30 10:50:06 -04:00
David Roazen 12a3d7ecad Fix licenses on files modified in 2.4-1 2013-02-26 21:53:17 -05:00
David Roazen 65d31ba4ad Fix runtime public -> protected dependencies in the test suite
-replace unnecessary uses of the UnifiedGenotyper by public integration tests
 with PrintReads

-move NanoSchedulerIntegrationTest to protected, since it's completely dependent
 on the UnifiedGenotyper
2013-02-26 21:19:12 -05:00