Exclude MQ0BySample
Move SD and TRA to new StandardUGAnnotation interface
There is now annotation interface (StandardUGAnnotation) holding annots that are standard in UG but should't be used as they are now with HC. This allows us to not have to exclude these annotations explicitly in HC, but still be able to use them for development purposes.
Fairly minor if plentiful fixes to various gatkdocs. Merging this without formal review since all tests pass, the gatkdocs build, and no one really wants to review corrections to grammar, typos and layout for 120+ documents. Review will be done by users in production ;-)
Build a ReferenceContext in ActiveRegionWalkers to pass in to annotation engine so we can call the TandemRepeatAnnotator from M2
Make TandemRepeatAnnotator default annotation for M2.
Setup (but don't use yet) HC-style contamination downsampling.
New HC integration test with TandemRepeatAnnotator
- ASEReadCounter (public tool) replce Tuuli's script to produce the input to Manny's tool.
It count the number of reads that support the ref allele and the alt allele, filtereing low qual reads and bases and keep only properPaired reads
- ASECaller (private tool) take both RNA and DNA, and produce ontingencyTables ** still under development **
minor changes in other tools:
- update RNA HC variant calling scala script
- expose FS method pValueForContingencyTable to be able to call it from ASEcaller
In ASEReadCounter:
- allow different option to deal with overlaping read from the same fragment
- add option to ignore or include indels in the pileups
- add option to disabled DuplicateRead
add ASEReadCounterIntegrationTest.java and files for the test
* Removed unused annotations (CCC and HWP)
* Renamed one of the two GC annotations to "IGC" (for Interval GC)
* Revved picard & htsjdk (GATK constants are now removed from htsjdk)
* PT 82046038
Add multi-allele test for info field annotations
Fix to process all types of INFO annotations
roll back to previous version, removes INFO and FORMAT
Correct @return for VariantAnnotatorEngine.getNonReferenceAlleles()
Enhance comments and clean up multi-allelic logic, handle header info number = R
only parse counts of A & R
Add INFO for AC
update MD5
Performance enhancement, only parse multiallelic with a count A or R
Make argument final in getNonReferenceAlleles()
Code cleanup, add exceptions for bad expression/allele size mismatch and missing header info for an expression
Change exception to warning for expression value/number of alleles check
remove adevertised exceptions
* PT 84242218
* Note that FORMAT fields behave the same as INFO fields - if the annotation has a count of A (one entry per Alt Allele), it is split across the multiple output lines. Otherwise, the entire list is output with each field
Add more logging to annotators, change loggers from info to warn
Add comments to testStrandBiasBySample()
Clarify comments in testStrandBiasBySample
remove logic for not prcossing an indel if strand bias (SB) was not computed
remove per variant warnings in annotate()
Log warnings if using the wrong annotator or missing a pedgree file
Log test failures once in annotate(), because HaplotypeCaller does not call initialize(). Avoid using exceptions
Fix so only log once in annotate(), Hardey-Weinberg does not require pedigree files, fix test MD5s so pass
Check if founderIds == null
Update MD5s from HaplotypeCaller integrations tests and clean up code
Change logic so SnpEff does not throw excpetions, change engine to utils in imports
Update test MD5s, return immediately if cannot annotate in SnpEff.initialization()
Post peer review, add more logging warnings
Update MD5 for testHaplotypeCallerMultiSampleComplex1, return null if PossibleDeNovo.annotate() is not called by VariantAnnotator
Fixed off by one error in size calculation IntervalUtils.scatterContigIntervals().
In test for fewer files than intervals, adjusted expected intervals.
In test for more files than intervals, adjusted expected exception.
remove TODO comment after activeProbThreshold
recover static ACTIVE_PROB_THRESHOLD for unit tests
Add min/max values for active_probability_threshold parameter
Move activeProbThreshold parameter to GATKArguemtnCollection
define ACTIVE_PROB_THRESHOLD in unit tests
add construction of argCollection in in ctor
Move arguments from GATKArgumentCollection to ActiveRegionWalker
Throw exception if threshold < 0 or > 1 in ActivityProfile ctor
max propogation distance parameter to ActiveRegionWalker for AcrtivityProfile
Use polymorphic getMaxProbPropagationDistance() so BandPassActivityProfile computes the crrect region size cutoff
Get the maxProbPropagationDistance from the super class's method, instead of directly, this is safer
Removed extraneous command line imports and make maxProbPropagationDistance a hidden argument
remove limit check for activeProbThreshold, not necessary because the check is made when imput as a command line arg
Remove extra 'region' in the doxygen param description for maxProbPropagationDistance
* This argument forces GATK to always write every record in the VCF format field, even if some records at the end are missing and could be removed
* Revved htsjdk and picard
* PT 70993484
Changes:
-------
* Updated current unit and integration test to use the new API components.
* Added unit tests for new classes AFPriorProvider and AFCalculatorProviders.
* Added integration test for mixed ploidy GenotypeGVCFs and CombineGVCFs
Changes:
-------
* GenotypingEngine uses now a AFCalc provider instead of
its own thread-local with one-time initialized and fixed
AF calculator.
* All walkers that use a GenotypingEngine now are passing
the appropiate AF calculator provider. For now most
just use a fix calculator (FixedAFCalculatorProvider)
except GenotypeGVCFs as this one now can cope with
mixture of ploidies failing-over to a general-ploidy
calculator when the preferred implementation is not
capable to handle a site's analysis.