For example, when the input is haploid, a FN is considered acceptable if the true genotype is 0/1, since there is a 50% chance that the variant is not called at all.
A genotype call is also considered concordant as long as its AC is as close as it can be to 50% given the ploidy. So for a true 0/1 call it is OK
to have a 0 or 1 call in haploids, 0/0/1/1 in tetraploids, and 0/0/1 or 0/1/1 with triploid input, but not 0/0/0/1 in tetraploids or 0/0/0/0/1/1 with hexaploid input.
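The concordance rule above can be sketched in a few lines. This is an illustrative Python sketch, not the AssessNA12878 code: the function names are hypothetical, and it assumes biallelic genotypes written as "/"-separated strings where any non-"0" allele counts towards AC.

```python
def concordant_alt_counts(truth_ac, ploidy, truth_ploidy=2):
    """Return the ALT allele counts at `ploidy` closest to the truth ALT fraction."""
    # Compare k/ploidy against truth_ac/truth_ploidy using integers only:
    # |k * truth_ploidy - truth_ac * ploidy| is proportional to the distance.
    distances = {
        k: abs(k * truth_ploidy - truth_ac * ploidy) for k in range(ploidy + 1)
    }
    best = min(distances.values())
    return {k for k, d in distances.items() if d == best}


def is_concordant(call, truth="0/1"):
    """True if the call's ALT count is as close to the truth fraction as ploidy allows."""
    call_alleles = call.split("/")
    truth_alleles = truth.split("/")
    call_ac = sum(a != "0" for a in call_alleles)
    truth_ac = sum(a != "0" for a in truth_alleles)
    allowed = concordant_alt_counts(truth_ac, len(call_alleles), len(truth_alleles))
    return call_ac in allowed
```

With a 0/1 truth, is_concordant accepts "0", "1", "0/0/1", "0/1/1" and "0/0/1/1", and rejects "0/0/0/1" and "0/0/0/0/1/1", matching the cases listed above.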
Story:
http://www.pivotaltracker.com/story/show/72090992
Changes:
AssessNA12878 has a new argument (-ploidy / --inputPloidy) to indicate the expected ploidy of the input.
By default this is the obvious choice of 2 as NA12878 is human.
If the input has calls with a different ploidy, it fails with a user exception.
Also some refactoring has been done to make the code a bit more concise in some parts.
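The ploidy check described above amounts to something like the following. This is a hedged Python sketch, not the actual implementation: UserException here is a hypothetical stand-in class, and check_ploidy and its dictionary input are assumptions made for illustration.

```python
class UserException(Exception):
    """Hypothetical stand-in for a user-facing error type."""


def check_ploidy(genotypes, expected_ploidy=2):
    """Raise UserException if any genotype does not have the expected ploidy.

    `genotypes` maps a sample name to a genotype string such as "0/1" or "0|1".
    """
    for sample, gt in genotypes.items():
        # Treat phased ("|") and unphased ("/") separators the same way.
        alleles = gt.replace("|", "/").split("/")
        if len(alleles) != expected_ploidy:
            raise UserException(
                f"sample {sample} has ploidy {len(alleles)}, "
                f"expected {expected_ploidy}"
            )
```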
-- Global mismapping penalty was only applied to the reference haplotype. This led to problems with overlapping events, mostly STR haplotypes. Now the penalty is applied to every haplotype.
-- We subset the reads down to only those which overlap the event (after assembly-based realignment) for likelihood calculations.
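The read subsetting step can be pictured as a plain interval-overlap filter. The sketch below is an assumption-laden illustration: the Read type, its 1-based inclusive coordinates, and reads_overlapping are hypothetical names, and the coordinates are taken to be those of the realigned reads.

```python
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical minimal read: 1-based inclusive span after realignment."""
    name: str
    start: int
    end: int


def reads_overlapping(reads, event_start, event_end):
    """Keep only the reads whose span intersects [event_start, event_end]."""
    return [r for r in reads if r.start <= event_end and r.end >= event_start]
```

For an event spanning 100 to 105, a read covering 90 to 100 is kept, while one covering 106 to 150 is dropped.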
In these cases, where the alignment contains multiple indels, we output a single complex
variant instead of the multiple partial indels.
We also re-enable dangling tail recovery by default.
-- AD,DP will now correspond directly to the reads that were used to construct the PLs
-- RankSumTests, etc. will use the bases from the realigned reads instead of the original alignments
-- There is now no additional runtime cost to realign the reads when using bamout or GVCF mode
-- bamout mode no longer sets the mapping quality to zero for uninformative reads; instead, the read is not given an HC tag
(Right now it only works if all members of the trio are called.)
Takes posteriors as input, defaulting to PLs
Added annotations for possible de novos for use in the full genotype refinement pipeline
Added family priors to CGP integration test.
Changed CGP to use PP tag instead of GP tag because posteriors are Phred-scaled. Updated CGP integration test md5s to reflect change.
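To illustrate why a Phred-scaled tag fits here, the sketch below turns PLs plus per-genotype priors into Phred-scaled posteriors. It is a minimal Python illustration of Bayes' rule in Phred space, not the CGP implementation: the function name is hypothetical, and the priors are supplied directly rather than derived from family or population data.

```python
import math


def phred_scaled_posteriors(pls, priors=None):
    """Combine Phred-scaled likelihoods (PLs) with priors into Phred-scaled posteriors.

    The result is normalized so that the most likely genotype has value 0,
    which is the same convention PLs use.
    """
    if priors is None:
        priors = [1.0] * len(pls)  # flat prior: posteriors equal the PLs
    if len(priors) != len(pls):
        raise ValueError("pls and priors must have the same length")
    if any(p <= 0 for p in priors):
        raise ValueError("priors must be strictly positive")
    # Work in log10 space: log10 posterior = log10 likelihood + log10 prior (+ const).
    log_post = [-pl / 10.0 + math.log10(p) for pl, p in zip(pls, priors)]
    best = max(log_post)
    # Subtracting the maximum makes the normalizing constant cancel out.
    return [int(round(-10.0 * (lp - best))) for lp in log_post]
```

With a flat prior the output equals the input PLs, which is consistent with PLs being the default input when no posteriors are present.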
- New arguments are nda, hets, indelHeterozygosity, stand_call_conf, stand_emit_conf, ploidy, and maxAltAlleles
- Addresses PT 70110918
- To do this, moved those arguments out of the StandardCallerArgumentCollection into a new GenotypeCalculationArgumentCollection, which is now included as a member of SCAC
-When parental genotypes are available, implements an HMM on genotype observations in the quartet.
-Outputs IBD regions as well as per-site posterior probabilities of being in each IBD state.
-Includes an experimental heuristic based mode for when parental genotypes are not available.
-Made a method in MendelianViolation public static to reuse code.
-Added the mockito library to private/gatk-tools-private/pom.xml
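The per-site posterior probabilities of each IBD state can be obtained from an HMM with the forward-backward algorithm. The sketch below is a generic Python illustration under stated assumptions, not the tool's code: it assumes three IBD states, a single hypothetical switch_prob shared by all transitions, a uniform initial distribution, and emission probabilities that the caller has already computed from the genotype observations.

```python
def ibd_posteriors(emissions, n_states=3, switch_prob=0.01):
    """Per-site posterior state probabilities via forward-backward.

    emissions[t][s] is P(observation at site t | state s); n_states must be >= 2.
    """
    stay = 1.0 - switch_prob
    move = switch_prob / (n_states - 1)
    trans = [[stay if i == j else move for j in range(n_states)]
             for i in range(n_states)]
    states = range(n_states)

    # Forward pass, rescaled at every site to avoid underflow.
    fwd = []
    prev = [1.0 / n_states] * n_states
    for t, emit in enumerate(emissions):
        if t == 0:
            cur = [prev[s] * emit[s] for s in states]
        else:
            cur = [emit[s] * sum(prev[r] * trans[r][s] for r in states)
                   for s in states]
        total = sum(cur)
        prev = [c / total for c in cur]
        fwd.append(prev)

    # Backward pass, also rescaled; the last site starts at all ones.
    bwd = [[1.0] * n_states for _ in emissions]
    for t in range(len(emissions) - 2, -1, -1):
        nxt = [sum(trans[s][r] * emissions[t + 1][r] * bwd[t + 1][r]
                   for r in states) for s in states]
        total = sum(nxt)
        bwd[t] = [x / total for x in nxt]

    # Combine and renormalize so each site's posteriors sum to one.
    posteriors = []
    for f, b in zip(fwd, bwd):
        raw = [fi * bi for fi, bi in zip(f, b)]
        total = sum(raw)
        posteriors.append([x / total for x in raw])
    return posteriors
```

IBD regions could then be reported as maximal runs of sites that share the same most probable state, although the actual region-calling rule is not specified here.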