Story:
-----
- https://www.pivotaltracker.com/story/show/83800586
Changes:
-------
- In GVCFWriter, GQ is now recalculated from the final PL array for the block.
Testing:
-------
- Updated affected integration test md5s
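
A minimal sketch of deriving GQ from a PL array, assuming the usual convention that GQ is the difference between the two smallest PL values, capped at 99. The class and method names are hypothetical and are not taken from GVCFWriter.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not the GVCFWriter code itself. */
final class GqFromPlSketch {
    /** Assumed upper bound for GQ. */
    static final int MAX_GQ = 99;

    /** Returns the second-smallest PL minus the smallest PL, capped at MAX_GQ. */
    static int gqFromPl(int[] pl) {
        if (pl == null || pl.length < 2) {
            throw new IllegalArgumentException("need at least two PL values");
        }
        int[] sorted = pl.clone(); // do not reorder the caller's array
        Arrays.sort(sorted);
        return Math.min(sorted[1] - sorted[0], MAX_GQ);
    }

    public static void main(String[] args) {
        System.out.println(gqFromPl(new int[] {0, 30, 400}));   // prints 30
        System.out.println(gqFromPl(new int[] {0, 120, 1800})); // prints 99
    }
}
```

Recomputing GQ from the block's final PL array keeps the two fields consistent with each other, which is the stated intent of this change.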
Add more logging to annotators, change loggers from info to warn
Add comments to testStrandBiasBySample()
Clarify comments in testStrandBiasBySample
remove logic for not processing an indel if strand bias (SB) was not computed
remove per variant warnings in annotate()
Log warnings if using the wrong annotator or missing a pedigree file
Log test failures once in annotate(), because HaplotypeCaller does not call initialize(). Avoid using exceptions
Fix so annotate() logs only once; Hardy-Weinberg does not require pedigree files; fix test MD5s so they pass
Check if founderIds == null
Update MD5s from HaplotypeCaller integration tests and clean up code
Change logic so SnpEff does not throw exceptions, change engine to utils in imports
Update test MD5s, return immediately if cannot annotate in SnpEff.initialization()
Post peer review, add more logging warnings
Update MD5 for testHaplotypeCallerMultiSampleComplex1, return null if PossibleDeNovo.annotate() is not called by VariantAnnotator
Story:
-----
https://www.pivotaltracker.com/story/show/80684230
Changes:
-------
- Corrected the bug: AlignmentUtils#createReadAlignedToRef was
realigning against the best haplotype for the read rather than
against the reference.
Test:
----
- Added integration test in HaplotypeCallerIntegrationTest to check
that the bug has been fixed.
- Fixed md5s modified by this change; these are caused by small
changes in the state of the random-number generator and in read vs.
variant-site overlap.
CombineGVCFs now outputs ref conf for the duration of deletions so that SNPs occurring in other samples aligned with those deletions will be genotyped correctly
Reading the multiple GATKText files as a single stream, especially with new top level target executable jar files pointing to a lib folder.
Don't dirty the build with a new GATKText.properties if input files are unmodified.
Stop warning on undocumented abstract classes.
Fixed ClassNotFoundException/NoClassDefFoundError by fixing ResourceBundleExtractorDoclet artifact.
Excluding Exceptions from documentation.
Removed custom log4j dependency from ResourceBundleExtractorDoclet.
Stop generating the dependency reduced pom during shade.
Stop regenerating gsalib when the files are already up to date.
Disabled mvn site generation from external-example.
Moved top level target symlinks to package jar files to under target/package.
Executable jar files are placed under target/executable with the new target[/lib] directories.
Under top level target, symlinks to *either* the package *or* the executable jars replace what was a symlink to the package jar path.
Allow disabling of the shade package.
ant-bridge.sh by default only builds executable jars, and doesn't package by default, as did the old ant build.xml.
Added a new package_path.sh utility script for other scripts to use instead of anything in the target folder.
remove final keyword before refMap and altMap, constructHaplotype() changes their values
return ArtificialHaplotype from constructHaplotype instead of passing as an argument
Add logic so arraycopy does not throw an IndexOutOfBoundsException, add test for a long insert
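
The commit does not say how the arraycopy guard works. One possible approach, shown here as a sketch only, is to clamp the copy length to what fits in both arrays; the class and method names are hypothetical.

```java
/** Hypothetical helper; the actual guard in constructHaplotype() may differ. */
final class ClampedCopySketch {
    /**
     * Copies at most `length` bytes from src to dest, never reading or writing
     * past the end of either array. Returns the number of bytes copied.
     */
    static int copyClamped(byte[] src, int srcPos, byte[] dest, int destPos, int length) {
        if (srcPos < 0 || destPos < 0 || length < 0
                || srcPos > src.length || destPos > dest.length) {
            return 0; // nothing can be copied safely
        }
        int n = Math.min(length, Math.min(src.length - srcPos, dest.length - destPos));
        System.arraycopy(src, srcPos, dest, destPos, n);
        return n;
    }

    public static void main(String[] args) {
        byte[] longInsert = "ACGTACGTAC".getBytes();
        byte[] haplotype = new byte[8];
        // The insert is longer than the space left in the destination.
        int copied = copyClamped(longInsert, 0, haplotype, 5, longInsert.length);
        System.out.println(copied); // prints 3
    }
}
```

Clamping silently truncates a long insert, so code that needs the full insert would have to grow the destination instead.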
* This argument is intended to be used in conjunction with -bamout, and disables early-exit optimizations to allow reference regions to be contained in the output bam
* Also forcibly includes the reference haplotype in the set of haplotypes given to the BAMWriter
* Made -dontTrimActiveRegions visible, as it is likely also desirable in this use case
* Addresses PT 77731660
remove TODO comment after activeProbThreshold
recover static ACTIVE_PROB_THRESHOLD for unit tests
Add min/max values for active_probability_threshold parameter
Move activeProbThreshold parameter to GATKArgumentCollection
define ACTIVE_PROB_THRESHOLD in unit tests
add construction of argCollection in ctor
Move arguments from GATKArgumentCollection to ActiveRegionWalker
Throw exception if threshold < 0 or > 1 in ActivityProfile ctor
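
A range check of this kind might look like the following sketch. The helper name is hypothetical, and the exception type is an assumption, since the commit does not name the one thrown by the ActivityProfile ctor.

```java
/** Hypothetical validation helper; not the ActivityProfile constructor itself. */
final class ProbabilityCheckSketch {
    /** Returns value unchanged if it lies in [0, 1]; otherwise throws. */
    static double requireProbability(double value, String name) {
        if (Double.isNaN(value) || value < 0.0 || value > 1.0) {
            throw new IllegalArgumentException(
                    name + " must be between 0 and 1 but was " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(requireProbability(0.002, "activeProbThreshold")); // prints 0.002
    }
}
```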
Add max propagation distance parameter to ActiveRegionWalker for ActivityProfile
Use polymorphic getMaxProbPropagationDistance() so BandPassActivityProfile computes the correct region size cutoff
Get the maxProbPropagationDistance from the super class's method, instead of directly, this is safer
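
The point of the two commits above can be sketched as follows. The class names, the cutoff formula, and the way the subclass extends the distance are all assumptions made for illustration; only the getter name comes from the commit messages.

```java
/** Hypothetical stand-in for a base activity profile. */
class BaseProfileSketch {
    private final int maxProbPropagationDistance;

    BaseProfileSketch(int maxProbPropagationDistance) {
        this.maxProbPropagationDistance = maxProbPropagationDistance;
    }

    int getMaxProbPropagationDistance() {
        return maxProbPropagationDistance;
    }

    /** Calls the getter rather than reading the field, so overrides take effect. */
    int regionSizeCutoff(int maxRegionSize) {
        return maxRegionSize + getMaxProbPropagationDistance();
    }
}

/** Hypothetical stand-in for a band-pass profile that propagates further. */
class BandPassProfileSketch extends BaseProfileSketch {
    private final int filterSize;

    BandPassProfileSketch(int maxProbPropagationDistance, int filterSize) {
        super(maxProbPropagationDistance);
        this.filterSize = filterSize;
    }

    @Override
    int getMaxProbPropagationDistance() {
        // Builds on the superclass method instead of touching its field directly.
        return super.getMaxProbPropagationDistance() + filterSize;
    }
}
```

Had regionSizeCutoff() read the private field directly, the subclass would silently get the base-class cutoff.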
Removed extraneous command line imports and make maxProbPropagationDistance a hidden argument
remove limit check for activeProbThreshold, not necessary because the check is made when input as a command line arg
Remove extra 'region' in the doxygen param description for maxProbPropagationDistance
Rename parameters using camel case and add to integration test
Correct documentation for maxReadsInRegionPerSample and minReadsPerAlignmentStart
Change the argument --minReadsPerAlignmentStart in the integration test from 50 to 5
'each genomic location' only pertains to minReadsPerAlignmentStart, not maxReadsInRegionPerSample
The QUAL value calculated by this Exact AF Calculator is greatly underestimated when
there is more than one alternative allele (non-biallelic sites). The reason is
that the QUAL was roughly calculated by adding the QUALs resulting from each alternative
allele vs. all other alleles, reference and alts, collapsed. This is OK for MLEAC
calculations but not for QUAL.
Now, for calculating the QUAL we collapse all the alternatives into a single one. This change
improves sensitivity at the cost of additional false positives, which is to be expected.
The resulting QUAL column is much closer to the one returned by the reference implementation.
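
To illustrate what collapsing all alternatives into one can mean at the genotype level, here is a sketch for diploid samples only. It assumes VCF genotype ordering (genotype j/k with j <= k sits at index k*(k+1)/2 + j) and takes the minimum PL within each collapsed class. Both the class name and the choice of the minimum are assumptions; the calculator's actual collapsing rule is not described in this message.

```java
/** Hypothetical illustration; not the Exact AF Calculator's implementation. */
final class CollapseAltsSketch {
    /**
     * Collapses diploid PLs over numAlleles alleles (allele 0 is the reference)
     * into three PLs: hom-ref, ref/non-ref, and non-ref/non-ref.
     */
    static int[] collapseDiploidPls(int[] pls, int numAlleles) {
        int expected = numAlleles * (numAlleles + 1) / 2;
        if (numAlleles < 2 || pls.length != expected) {
            throw new IllegalArgumentException("expected " + expected + " PL values");
        }
        int het = Integer.MAX_VALUE;
        int homAlt = Integer.MAX_VALUE;
        for (int k = 1; k < numAlleles; k++) {
            for (int j = 0; j <= k; j++) {
                int pl = pls[k * (k + 1) / 2 + j];
                if (j == 0) {
                    het = Math.min(het, pl);       // genotype ref/alt_k
                } else {
                    homAlt = Math.min(homAlt, pl); // genotype alt_j/alt_k
                }
            }
        }
        return new int[] {pls[0], het, homAlt};
    }
}
```

For example, with three alleles and PLs ordered 0/0, 0/1, 1/1, 0/2, 1/2, 2/2 as {0, 40, 300, 25, 90, 200}, the collapsed array is {0, 25, 90}.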
Story:
https://www.pivotaltracker.com/story/show/75926368.
Changes:
Changed the QUAL calculation as described above.
Updated MD5s.
Fixed MD5s
The problem was that the MLE table calculation aborted "unlikely"
genotype combinations too aggressively.
This also uncovered another bug where GeneralPloidyExactAFCalculation
makes a slightly different use of StateTracker
compared to DiploidExactAFCalculation. We have generalized StateTracker
so that it works with the behavior of both callers.
Story:
-----
* https://www.pivotaltracker.com/story/show/78920568
Changes:
-------
* Fixes in GeneralPloidyExactAFCalculator.
* Needed changes in StateTracker API and its consequences in DiploidExactAFCalculation.
* Updated affected integration tests' MD5s after fixing the GeneralPloidyExactAF.
Changes:
-------
* Updated current unit and integration tests to use the new API components.
* Added unit tests for new classes AFPriorProvider and AFCalculatorProviders.
* Added integration test for mixed ploidy GenotypeGVCFs and CombineGVCFs
Changes:
-------
* GenotypingEngine now uses an AFCalc provider instead of
its own thread-local with a one-time-initialized, fixed
AF calculator.
* All walkers that use a GenotypingEngine now pass
the appropriate AF calculator provider. For now most
just use a fixed calculator (FixedAFCalculatorProvider),
except GenotypeGVCFs, which can now cope with a
mixture of ploidies by failing over to a general-ploidy
calculator when the preferred implementation is not
able to handle a site's analysis.
to the total-ploidy (added ploidy across samples).
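
The provider idea described above can be sketched roughly like this. All type and method names here are hypothetical simplifications for illustration and do not reproduce the real provider API.

```java
/** Hypothetical calculator abstraction. */
interface SketchCalculator {
    String name();
    boolean supports(int ploidy, int maxAltAlleles);
}

/** Hypothetical provider abstraction: hands out a calculator per site. */
interface SketchCalculatorProvider {
    SketchCalculator getInstance(int ploidy, int maxAltAlleles);
}

/** Calculator restricted to one ploidy; requiredPloidy == 0 means any ploidy. */
final class PloidyLimitedCalculator implements SketchCalculator {
    private final String name;
    private final int requiredPloidy;

    PloidyLimitedCalculator(String name, int requiredPloidy) {
        this.name = name;
        this.requiredPloidy = requiredPloidy;
    }

    @Override public String name() { return name; }

    @Override public boolean supports(int ploidy, int maxAltAlleles) {
        return requiredPloidy == 0 || ploidy == requiredPloidy;
    }
}

/** Always returns the same calculator, whatever the site looks like. */
final class FixedSketchProvider implements SketchCalculatorProvider {
    private final SketchCalculator calculator;

    FixedSketchProvider(SketchCalculator calculator) { this.calculator = calculator; }

    @Override public SketchCalculator getInstance(int ploidy, int maxAltAlleles) {
        return calculator;
    }
}

/** Uses the preferred calculator when it can handle the site, else the fallback. */
final class FailOverSketchProvider implements SketchCalculatorProvider {
    private final SketchCalculator preferred;
    private final SketchCalculator fallback;

    FailOverSketchProvider(SketchCalculator preferred, SketchCalculator fallback) {
        this.preferred = preferred;
        this.fallback = fallback;
    }

    @Override public SketchCalculator getInstance(int ploidy, int maxAltAlleles) {
        return preferred.supports(ploidy, maxAltAlleles) ? preferred : fallback;
    }
}
```

With a diploid-only preferred calculator and a general-ploidy fallback, a triploid site would be routed to the fallback while diploid sites keep using the preferred implementation.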
Changes:
--------
* Instead of calculating a fixed log10 prior array with a fixed
total likelihood, we use a new component, the AFPriorProvider,
to generate the priors for different total ploidies on
demand; these are cached, however, so there is no unnecessary
recomputation involved.
with mixed ploidies and max-alt-allele number changes dynamically.
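
A minimal sketch of on-demand, cached priors keyed by total ploidy. The class name is hypothetical, and the prior formula (heterozygosity / AC for AC >= 1, with the remaining mass on AC = 0) is an assumption for illustration; the message does not state which formula AFPriorProvider uses.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cached prior provider; not the actual AFPriorProvider. */
final class CachedPriorProviderSketch {
    private final double heterozygosity;
    private final Map<Integer, double[]> cache = new ConcurrentHashMap<>();

    CachedPriorProviderSketch(double heterozygosity) {
        if (!(heterozygosity > 0.0 && heterozygosity < 1.0)) {
            throw new IllegalArgumentException("heterozygosity must be in (0, 1)");
        }
        this.heterozygosity = heterozygosity;
    }

    /**
     * Returns log10 priors indexed by allele count 0..totalPloidy.
     * The array is shared with the cache, so callers must not modify it.
     */
    double[] forTotalPloidy(int totalPloidy) {
        if (totalPloidy < 1) {
            throw new IllegalArgumentException("total ploidy must be positive");
        }
        return cache.computeIfAbsent(totalPloidy, this::compute);
    }

    private double[] compute(int totalPloidy) {
        double[] log10Priors = new double[totalPloidy + 1];
        double sum = 0.0;
        for (int ac = 1; ac <= totalPloidy; ac++) {
            double p = heterozygosity / ac; // assumed prior for allele count ac
            sum += p;
            log10Priors[ac] = Math.log10(p);
        }
        if (sum >= 1.0) {
            throw new IllegalArgumentException("heterozygosity too large for this ploidy");
        }
        log10Priors[0] = Math.log10(1.0 - sum); // remaining mass goes to AC = 0
        return log10Priors;
    }
}
```

Repeated calls with the same total ploidy return the cached array, so sites that share a total ploidy do not trigger a recompute.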
Changes:
--------
* Moved the AFCalcFactory.Calculation enum into a top-level class,
AFCalculatorImplementation.
* Gave the enum more responsibilities, such as resolving the constructor
method once per implementation and the best-model selection algorithm.
* Removed test-only fields and methods from AFCalc; they were used only for
unit testing and not for any actual functionality of this component.
* Removed the fixed ploidy constraint of the GeneralPloidyExactAFCalc
implementation; it can now deal with mixed ploidies that may change
per site and sample.
* Removed the fixed maxAltAllele restriction by allowing resizing of
the stateTracker structures.
* Due to the previous two points, calls to the AFCalc object are now passed
the default ploidy to assume in case some genotype in the input
VC does not have it, and the max-alt-allele.
* Also due to those changes, removed the now totally useless 3 int
parameters from all AFCalc constructors.
* Cleaned up the code a bit by removing components and methods that are no longer used.