We use a "manager" to keep track of observed splits and previous reads. This can be extended/modified in the
future to try to salvage those overhangs instead of hard-clipping them and/or try other possible strategies.
Added unit tests and more integration tests.
The GATK now fails with a user error if you try to run with a reduced bam.
(I added a unit test for that; everything else here is just the removal of all traces of RR)
PairHMMLikelihoodCalculationEngine.java to fall back to LOGLESS_CACHING
in case the native library could not be loaded. Made
VECTOR_LOGLESS_CACHING as the default implementation.
2. Updated the README with Mauricio's comments
3. baseline.cc is used within the library - if the machine supports
neither AVX nor SSE4.1, the native library falls back to un-vectorized
C++ in baseline.cc.
4. pairhmm-1-base.cc: This is not part of the library, but is being
heavily used for debugging/profiling. Can I request that we keep it
there for now? In the next release, we can delete it from the
repository.
5. I agree with Mauricio about the ifdefs. I am sure you already know,
but just to reassure you the debug code is not compiled into the library
(because of the ifdefs) and will not affect performance.
-- Keep a list of processed files in ArgumentTypeDescriptor.getRodBindingsCollection
-- Throw user exception if a file name duplicates one that was previously parsed
-- Throw user exception if the ROD list is empty
-- Added two unit tests to RodBindingCollectionUnitTest
public/VectorPairHMM/src/main/c++ as per Khalid's suggestion
2. Use java.home in public/VectorPairHMM/pom.xml to pass environment
variable JRE_HOME to the make process. This is needed because the
Makefile needs to compile JNI code with the flag -I<JRE_HOME>/../include (among
others). Assuming that the Maven build process uses a JDK (and not just
a JRE), the variable java.home points to the JRE inside maven.
3. Dropped all pretense at cross-platform compatibility. Removed Mac
profile from pom.xml for VectorPairHMM
Re-added import java.io.File for BamGatherFunction.
Other cleanup to resolve scala syntax warnings from intellij.
Moved Example UG script to from protected to public.
- This change means that BamGatherFunction will now have an @Output field for the BAM index, which will allow the bai to be deleted for intermediate functions
Signed-off-by: Khalid Shakir <kshakir@broadinstitute.org>
This commit consists of 2 main changes:
1. When the strand table gets too large, we normalize it down to values that are more reasonable.
2. We don't include a particular sample's contribution unless the total ref and alt counts are at least 2 each;
this is a heuristic method for dealing only with hets.
MD5s change as expected.
Hopefully we'll have a more robust implementation for GATK 3.1.
The slicePrefix method functionality was broken.
Story:
https://www.pivotaltracker.com/story/show/64595624
Changes:
1. Fixed the bug.
2. Added unit test to check on the method functionality.
3. Added a integration test to verify the bug has been fixed in a empirical data reprudible case.
Story:
https://www.pivotaltracker.com/s/projects/1007536
Changes:
1. HC's GenotypingEngine now invokes reverseAlleleTrimming on GVCF variant output lines.
2. GenotypeGVCFs also reverse trim after regenotyping as some alt. alleles are dropped (observed in real-data).
The writer was never resetting the pointer to the end of the last non-ref VariantContext that it saw.
This was fine except when it jumped to a new contig - and a lower position on that contig - where it
thought that it was still part of that previous non-ref VariantContext so wouldn't emit a reference
block. Therefore, ref blocks were missing from the beginnings of all chromosomes (except chr1).
Added unit test to cover this case.