-- For the high -nt tests, the total runtime may be too short to really assess -nt efficiency vs. startup costs. Reworked the underlying test data and intervals so that most tests run in 10-20 hrs for -nt 1.
-- Now uses new tagging capabilities so that 2.x runs will tag their logs as GATKPerformanceOverTime
-- Update bamboo runs, reverting back to gsa4 (it's slower but the results are less variable -- you were right david!).
-- Queue now incrementally writes out its jobReport.txt file whenever a job finishes running (FAIL or DONE)
-- This makes it far easier to track what's going on, or to incrementally analyze performance results coming out of Queue
-- Generally cleaned up the QJobsReporting code, creating a new clean class QJobsReporter that holds all of the information on what to log and where to write it, which was previously scattered in QCommandLine and QJobReport
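
A minimal sketch of the incremental-report idea, in Python for illustration only (this is not the Queue/Scala code). The class name `IncrementalJobReporter`, the tab-separated column layout, and the temp-file swap are all assumptions; the real jobReport.txt format is not shown here.

```python
import os


class IncrementalJobReporter:
    """Hypothetical stand-in for a reporter that rewrites its report per finished job."""

    def __init__(self, report_path):
        self.report_path = report_path
        self.rows = []  # one (name, status, runtime_sec) tuple per finished job

    def job_finished(self, name, status, runtime_sec):
        # Only terminal states are reported, mirroring "FAIL or DONE" above.
        if status not in ("DONE", "FAIL"):
            raise ValueError("unexpected terminal status: %r" % (status,))
        self.rows.append((name, status, runtime_sec))
        self._write()

    def _write(self):
        # Write the full report to a temp file, then swap it into place, so a
        # reader never sees a half-written report (an assumed design choice).
        tmp_path = self.report_path + ".tmp"
        with open(tmp_path, "w") as out:
            out.write("jobName\tstatus\truntimeSec\n")  # assumed column layout
            for name, status, runtime_sec in self.rows:
                out.write("%s\t%s\t%s\n" % (name, status, runtime_sec))
        os.replace(tmp_path, self.report_path)
```

Rewriting the whole file on every completion keeps the report readable at any moment, at the cost of some extra I/O for large runs.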
- Fix for M_Trieb's error report on the forum, and addition of integration tests to cover the walker.
- Addition of StructuralIndel as a class of variation within the VariantContext. These are for variants with a full alt allele that's >150bp in length.
- Adaptation of the MVLikelihoodRatio to work for a set of trios (takes the max over the trios of the MVLR)
- InsertSizeDistribution changed to use the new gatk report output (it was previously broken)
- RetrogeneDiscovery changed to be compatible with the new gatk report
- A maxIndelSize argument added to SelectVariants
- ByTranscriptEvaluator rewritten for cleanliness
- VariantRecalibrator modified to not exclude structural indels from recalibration if the mode is INDEL
- Documentation added to DepthOfCoverageIntegrationTest (no, don't yell at chartl ;_; )
Also sorry for the long commit history behind this that is the result of fixing merge conflicts. Because this *also* fixes a conflict (from git stash apply), for some reason I can't rebase all of them away. I'm pretty sure some of the commit notes say "this note isn't important because I'm going to rebase it anyway".
-- When merging multiple VCF records at a site, the combined VCF record now takes the QUAL of the first VCF record with a non-MISSING QUAL value. The previous behavior was to take the max QUAL, which sometimes caused strange downstream confusion.
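
A minimal sketch of the QUAL rule, in Python for illustration only (not the GATK code). The function names and the use of `None` to stand for a MISSING QUAL are assumptions.

```python
from typing import Optional, Sequence


def merged_qual(quals: Sequence[Optional[float]]) -> Optional[float]:
    """New rule: QUAL of the first record (in merge order) whose QUAL is not missing."""
    for qual in quals:
        if qual is not None:
            return qual
    return None  # every input record had a missing QUAL


def merged_qual_old(quals: Sequence[Optional[float]]) -> Optional[float]:
    """Previous rule, for comparison: the maximum non-missing QUAL."""
    present = [qual for qual in quals if qual is not None]
    return max(present) if present else None
```

For inputs with QUALs `[None, 30.0, 99.0]`, the new rule gives 30.0 where the old rule gave 99.0, so the result now depends on record order rather than on the largest value.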
* No reads with Hard/Soft clips in the middle of the cigar
* No reads starting with deletions (with or without preceding clips)
* No reads ending in deletions (with or without follow-up clips)
* No reads that are fully hard or soft clipped
* No reads that have consecutive indels in the cigar (II, DD, ID or DI)
Also added a systematic test for good cigars and an iterative test for bad cigars.
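
The five rules above can be sketched as a single check over a CIGAR string. This is Python for illustration only, not the GATK validator: `parse_cigar` and `is_good_cigar` are hypothetical names, and the sketch does not check the relative order of H and S within the clipped ends.

```python
import re

_CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string such as '3S10M2D5M' into (length, operator) pairs."""
    elements = _CIGAR_RE.findall(cigar)
    if not elements or "".join(n + op for n, op in elements) != cigar:
        raise ValueError("malformed CIGAR: %r" % (cigar,))
    return [(int(n), op) for n, op in elements]


def is_good_cigar(cigar):
    ops = [op for _, op in parse_cigar(cigar)]

    # Peel hard/soft clips off both ends; what is left is the core of the read.
    start, end = 0, len(ops)
    while start < end and ops[start] in "HS":
        start += 1
    while end > start and ops[end - 1] in "HS":
        end -= 1
    core = ops[start:end]

    if not core:
        return False  # fully hard or soft clipped
    if any(op in "HS" for op in core):
        return False  # clip in the middle of the cigar
    if core[0] == "D" or core[-1] == "D":
        return False  # starts or ends with a deletion, ignoring clips
    for left, right in zip(core, core[1:]):
        if left in "ID" and right in "ID":
            return False  # consecutive indels: II, DD, ID or DI
    return True
```

For example, `is_good_cigar("2H3S10M3S2H")` is True, while `"5M2S5M"`, `"3S2D10M"`, `"10M2D3S"`, `"10S"` and `"5M2I3D5M"` are each rejected by one of the rules.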
-- Removed the REFERENCE_BASES option; only REFERENCE remains. REFERENCE_BASES no longer offers any efficiency savings, because the reference bases are now loaded lazily: if you don't use them, there's effectively no cost to making the RefContext that could load them.
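
A minimal sketch of why lazy loading makes the separate option unnecessary, in Python for illustration only (not the GATK RefContext). `LazyRefContext` and its `loader` callback are hypothetical.

```python
from functools import cached_property


class LazyRefContext:
    """Hypothetical context whose bases are fetched only on first access."""

    def __init__(self, loader):
        # loader is any zero-argument callable that returns the reference bases;
        # constructing the context does not call it.
        self._loader = loader

    @cached_property
    def bases(self):
        # Runs at most once per context; the result is cached on the instance.
        return self._loader()
```

A caller that never touches `bases` never pays for the load, and a caller that reads it repeatedly pays only once.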