Merge branch 'master' of /home/mghodrat/PairHMM/shared-repository into intel_pairhmm
Conflicts:
PairHMM_JNI/org_broadinstitute_sting_utils_pairhmm_VectorLoglessPairHMM.cc
2. Split into DebugJNILoglessPairHMM and VectorLoglessPairHMM with base
class JNILoglessPairHMM. DebugJNILoglessPairHMM can, in principle,
invoke any other child class of JNILoglessPairHMM.
3. Added more profiling code for Java parts of LoglessPairHMM
2. Wrapped _mm_empty() with ifdef SIMD_TYPE_SSE
3. OpenMP disabled
4. Added code for initializing PairHMM's data inside initializePairHMM -
not used yet
SSE compilation warning.
2. Added code to dynamically select between AVX, SSE4.2 and normal C++ (in
that order)
3. Created multiple files to compile with different compilation flags:
avx_function_prototypes.cc is compiled with -xAVX while
sse_function_instantiations.cc is compiled with -xSSE4.2 flag.
4. Added jniClose() and support in Java (HaplotypeCaller,
PairHMMLikelihoodCalculationEngine) to call this function at the end of
the program.
5. Removed debug code, kept assertions and profiling in C++
6. Disabled OpenMP for now.
1. Moved computeLikelihoods from PairHMM to native implementation
2. Disabled debug - debug code still left (hopefully, not part of
bytecode)
3. Added directory PairHMM_JNI in the root which holds the C++
library that contains the PairHMM AVX implementation. See
PairHMM_JNI/JNI_README first
2. Disabled debug - debug code still left (hopefully, not part of
bytecode)
3. Added directory PairHMM_JNI in the root which holds the C++ library
that contains the PairHMM AVX implementation. See PairHMM_JNI/JNI_README
first
-- New -a argument in the VQSR for specifying additional data to be used in the clustering
-- New NA12878KB walker which creates ROC curves by partitioning the data along VQSLOD and calculating how many KB TP/FP's are called.
For example, this tool can be used for processing bowtie RNA-seq data.
Each read with k N-cigar elemments is plit to k+1 reads. The split is done by hard clipping the bases rest of the bases.
In order to do it, few changes were introduced to some other clipping methods:
- make a segnificant change in ClippingOp.hardClip() that prevent the spliting of read with cigar: 1M2I1N1M3I.
- change getReadCoordinateForReferenceCoordinate in ReadUtil to recognize Ns
create unitTests for that walker:
- change ReadClipperTestUtils to be more general in order to use its code and avoid code duplication
- move some useful methods from ReadClipperTestUtils to CigarUtils
create integration test for that class
small change in a comment in FullProcessingPipeline
last commit:
Address review comments:
- move to protected under walkers/rnaseq
- change the read splitting methods to be more readable and more efficiant
- change (minor changes) some methods in ReadClipper to allow the changes in split reads
- add (minor change) one method to CigarUtils to allow the changes in split reads
- change ReadUtils.getReadCoordinateForReferenceCoordinate to include possible N in the cigar
- address the rest of the review comments (minor changes)
- fix ReadUtilsUnitTest.testReadWithNs acoording to the defult behaviour of getReadCoordinateForReferenceCoordinate (in case of refernce index that fall into deletion, return the read index of the base before the deletion).
- add another test to ReadUtilsUnitTest.testReadWithNs
- Allow the user to print the split positions (not working proparly currently)
This is a tool that we use internally validate the ReduceReads development. I think it should be
private. There is no need to improve docs.
[delivers #54703398]
* add overall walker GATKDocs
* add explanation for skip parameter and make it advanced
* reverse the logic on exculding unmapped reads for clarity
* fix read length calculation to no longer include indels
ps: I am not sure how useful this walker is (I didn't write it) but the skip logic is poor and
calculates the entire statistic for the reads it is eventually going to skip. This would be an easy
fix, but only worth our time if people actually use this.