gatk-3.8

History

Eric Banks 379a9841ce Various bug fixes for recent Reduce Reads additions plus solution implemented for low MQ reads. 1. Using cumulative binomial probability was not working at high coverage sites (because p-values quickly got out of hand) so instead we use a hybrid system for determining significance: at low coverage sites use binomial prob and at high coverage sites revert to using the old base proportions. Then we get the best of both worlds. As a note, coverage refers to just the individual base counts and not the entire pileup. 2. Reads were getting lost because of the comparator being used in the SlidingWindow. When read pairs had the same alignment end position the 2nd one encountered would get dropped (but added to the header!). We now use a PriorityQueue instead of a TreeSet to allow for such cases. 3. Each consensus keeps track of its own number of softclipped bases. There was no reason that that number should be shared between them. 4. We output consensus filtered (i.e. low MQ) reads whenever they are present for now. Don't lose that information. Maybe we'll decide to change this in the future, but for now we are conservative. 5. Also implemented various small performance optimizations based on profiling. Added unit tests to cover these changes; systematic assessment now tests against low MQ reads too.	2013-04-24 18:18:50 -04:00
..
java	Various bug fixes for recent Reduce Reads additions plus solution implemented for low MQ reads.	2013-04-24 18:18:50 -04:00
scala/test/org/broadinstitute/sting/queue/pipeline/examples	Fix improper dependencies in QScripts used by pipeline tests, and attempt to fix the flawed MisencodedBaseQualityUnitTest	2013-02-27 04:45:53 -05:00

Eric Banks 379a9841ce Various bug fixes for recent Reduce Reads additions plus solution implemented for low MQ reads.

1. Using cumulative binomial probability was not working at high coverage sites (because p-values quickly
got out of hand) so instead we use a hybrid system for determining significance: at low coverage sites
use binomial prob and at high coverage sites revert to using the old base proportions.  Then we get the
best of both worlds.  As a note, coverage refers to just the individual base counts and not the entire pileup.

2. Reads were getting lost because of the comparator being used in the SlidingWindow. When read pairs had
the same alignment end position the 2nd one encountered would get dropped (but added to the header!). We
now use a PriorityQueue instead of a TreeSet to allow for such cases.

3. Each consensus keeps track of its own number of softclipped bases.  There was no reason that that number
should be shared between them.

4. We output consensus filtered (i.e. low MQ) reads whenever they are present for now.  Don't lose that
information.  Maybe we'll decide to change this in the future, but for now we are conservative.

5. Also implemented various small performance optimizations based on profiling.

Added unit tests to cover these changes; systematic assessment now tests against low MQ reads too.

2013-04-24 18:18:50 -04:00

java

Various bug fixes for recent Reduce Reads additions plus solution implemented for low MQ reads.

2013-04-24 18:18:50 -04:00

scala/test/org/broadinstitute/sting/queue/pipeline/examples

Fix improper dependencies in QScripts used by pipeline tests, and attempt to fix the flawed MisencodedBaseQualityUnitTest

2013-02-27 04:45:53 -05:00