The previous behavior is to process reads with N CIGAR operators as they are despite that many of the tools do not actually support such operator and results become unpredictible. Now if the there is some read with the N operator, the engine returns a user exception. The error message indicates what is the problem (including the offending read and mapping position) and give a couple of alternatives that the user can take in order to move forward: a) ask for those reads to be filtered out (with --filter_reads_with_N_cigar or -filterRNC) b) keep them in as before (with -U ALLOW_N_CIGAR_READS or -U ALL) Notice that (b) does not have any effect if (a) is enacted; i.e. filtering overrides ignoring. Implementation: * Added filterReadsWithMCigar argument to MalformedReadFilter with the corresponding changes in the code to get it to work. * Added ALLOW_N_CIGAR_READS unsafe flag so that N cigar containing reads can be processed as they are if that is what the user wants. * Added ReadFilterTest class commont parent for ReadFilter test cases. * Refactor ReadGroupBlackListFilterUnitTest to extend ReadFilterTest and push up some functionality to that class. * Modified MalformedReadFilterUnitTest to extend ReadFilterTest and to test the new filter functionality. * Added AllowNCigarMalformedReadFilterUnittest to check on the behavior when the unsafe ALLOW_N_CIGAR_READS flag is used. * Added UnsafeNCigarMalformedReadFilterUnittest to check on the behavior when the unsafe ALL flag is used. * Updated a broken test case in UnifiedGenotyperIntegrationTest resulting from the new behavior. * Updated EngineFeaturesIntegrationTest testdata to be compliant with new behavior |
||
|---|---|---|
| .. | ||
| net/sf/samtools | ||
| org/broadinstitute/sting | ||