For example, this tool can be used for processing Bowtie RNA-seq data. Each read with k N CIGAR elements is split into k+1 reads. The split is done by hard clipping the rest of the bases.

In order to do this, a few changes were introduced to some other clipping methods:
- make a significant change in ClippingOp.hardClip(), which prevented the splitting of a read with the CIGAR 1M2I1N1M3I
- change getReadCoordinateForReferenceCoordinate in ReadUtils to recognize Ns

Create unit tests for that walker:
- change ReadClipperTestUtils to be more general, in order to reuse its code and avoid code duplication
- move some useful methods from ReadClipperTestUtils to CigarUtils

Create an integration test for that class.

Small change in a comment in FullProcessingPipeline.

Last commit: address review comments:
- move to protected under walkers/rnaseq
- change the read splitting methods to be more readable and more efficient
- change (minor changes) some methods in ReadClipper to allow the changes in split reads
- add (minor change) one method to CigarUtils to allow the changes in split reads
- change ReadUtils.getReadCoordinateForReferenceCoordinate to include a possible N in the CIGAR
- address the rest of the review comments (minor changes)
- fix ReadUtilsUnitTest.testReadWithNs according to the default behaviour of getReadCoordinateForReferenceCoordinate (when the reference index falls into a deletion, return the read index of the base before the deletion)
- add another test to ReadUtilsUnitTest.testReadWithNs
- allow the user to print the split positions (currently not working properly)

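
The splitting rule in the commit message (a read with k N elements becomes k+1 reads, each with the remaining bases hard clipped) can be illustrated on CIGAR strings alone. The following is a minimal sketch, not the walker's code: the class `SplitNCigarSketch` and all of its methods are hypothetical names, it works on plain strings instead of GATK read objects, and it assumes the input CIGAR carries no hard clips of its own.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the GATK walker or its ReadClipper API. */
final class SplitNCigarSketch {

    /** One CIGAR element, e.g. "3M" is length 3 with operator 'M'. */
    static final class Element {
        final int length;
        final char op;
        Element(int length, char op) { this.length = length; this.op = op; }
    }

    /** Parses a CIGAR string such as "3M2N4M" into its elements. */
    static List<Element> parse(String cigar) {
        List<Element> elements = new ArrayList<>();
        int length = 0;
        for (char c : cigar.toCharArray()) {
            if (Character.isDigit(c)) {
                length = length * 10 + (c - '0');
            } else {
                elements.add(new Element(length, c));
                length = 0;
            }
        }
        return elements;
    }

    /** True for operators that consume bases stored in the read (M, I, S, =, X). */
    static boolean consumesReadBases(char op) {
        return op == 'M' || op == 'I' || op == 'S' || op == '=' || op == 'X';
    }

    /** Counts the read bases consumed by elements in the range [from, to). */
    static int readBases(List<Element> elements, int from, int to) {
        int total = 0;
        for (int i = from; i < to; i++) {
            Element e = elements.get(i);
            if (consumesReadBases(e.op)) total += e.length;
        }
        return total;
    }

    /** Splits at every N element: k N elements yield k+1 piece CIGARs. */
    static List<String> splitAtNs(String cigar) {
        List<Element> elements = parse(cigar);
        int totalBases = readBases(elements, 0, elements.size());
        List<String> pieces = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= elements.size(); i++) {
            boolean atEnd = i == elements.size();
            if (atEnd || elements.get(i).op == 'N') {
                pieces.add(pieceCigar(elements, start, i, totalBases));
                start = i + 1; // skip the N element itself
            }
        }
        return pieces;
    }

    /** Keeps elements [from, to) and hard clips the read bases on both sides. */
    static String pieceCigar(List<Element> elements, int from, int to, int totalBases) {
        int before = readBases(elements, 0, from);
        int kept = readBases(elements, from, to);
        int after = totalBases - before - kept;
        StringBuilder sb = new StringBuilder();
        if (before > 0) sb.append(before).append('H');
        for (int i = from; i < to; i++) {
            sb.append(elements.get(i).length).append(elements.get(i).op);
        }
        if (after > 0) sb.append(after).append('H');
        return sb.toString();
    }
}
```

Under these assumptions, `SplitNCigarSketch.splitAtNs("1M2I1N1M3I")` yields the two pieces `1M2I4H` and `3H1M3I`, and a CIGAR without any N comes back unchanged as a single piece. A real implementation would also have to move the alignment start of every piece after the first, which this sketch leaves out.
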
||
|---|---|---|
| AlignmentUtilsUnitTest.java | ||
| ArtificialBAMBuilderUnitTest.java | ||
| ArtificialPatternedSAMIteratorUnitTest.java | ||
| ArtificialSAMFileWriterUnitTest.java | ||
| ArtificialSAMQueryIteratorUnitTest.java | ||
| ArtificialSAMUtilsUnitTest.java | ||
| ArtificialSingleSampleReadStreamUnitTest.java | ||
| GATKSAMRecordUnitTest.java | ||
| MisencodedBaseQualityUnitTest.java | ||
| ReadUtilsUnitTest.java | ||