ValidationStringency was moved from htsjdk.samtools.SAMFileReader to htsjdk.samtools
The samtools method for finding the BAM index file was also moved (and made public!)
Story:
https://www.pivotaltracker.com/story/show/73440292
Changes:
- Just add the conditional in HaplotypeCaller#initialize
Testing:
- Nothing added, checked locally, trivial change that would eventually be removed anyway.
Story:
https://www.pivotaltracker.com/story/show/75028590
Changes:
Added the possibility of indicating the genotype type to consider (with argument [-gtType TYPE]*, where TYPE is HET, HOM_REF or HOM_VAR)
Removed conditional Het evaluation based on input ploidy; now you need to use -gtType explicitly.
Tests:
Added integration tests to check on new argument (-gtType) behaviour in AssessNA12878KnowledgeBaseTest
Don't expand out source nodes for tail merging, since that's a head merging action only.
This shows up as a bug only because we now allow merging tails against non-reference paths.
- Edited intervals merging docs for correctness & clarity
- Edited VQSR arg docs and made mode required (+added -mode SNP to VQSR tests)
- Moved PaperGenotyper to Toy Walkers to declutter the actually useful docs
- Moved GenotypeGVCFs to Variant Discovery category and clarified a few points
- Clarified that the -resource argument depends on using the -V:tag format
- Clarified how the pcr indel model works
- Added caveat for -U ALLOW_N_CIGAR_READS
- Added MathJax support for displaying equations in GATKDocs
- Updated HC example commands and caveats
This is useful for e.g. cases where there are SNPs on insertions. Previously, tails were forced to be merged
(incorrectly) only to a reference node, but now they can be merged to any path in the graph from which they
directly branch.
Also, I've transferred over Ryan's code to refuse to process any kmer size for which the reference sequence
contains non-unique kmers.
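A minimal sketch of the kind of check described above, assuming the reference is available as a plain string. The class and method names are hypothetical and this is not the code that was transferred; it only illustrates rejecting a kmer size when some kmer of that size occurs more than once in the reference.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration; not the actual assembler code.
final class KmerUniquenessSketch {
    // Returns true if any kmer of the given size occurs more than once in the reference.
    static boolean hasNonUniqueKmers(String reference, int kmerSize) {
        if (kmerSize <= 0) {
            throw new IllegalArgumentException("kmerSize must be positive: " + kmerSize);
        }
        Set<String> seen = new HashSet<>();
        for (int start = 0; start + kmerSize <= reference.length(); start++) {
            // Set.add returns false when the kmer was already present.
            if (!seen.add(reference.substring(start, start + kmerSize))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String reference = "ACGTACGA";
        for (int kmerSize : new int[] {3, 5}) {
            String verdict = hasNonUniqueKmers(reference, kmerSize) ? "refuse" : "ok";
            System.out.println("kmer size " + kmerSize + ": " + verdict);
        }
    }
}
```

A caller would try the next kmer size whenever the check returns true.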
For example, when the input is haploid it is considered ok to have a FN if the actual genotype is 0/1, as there is a 50% chance of not calling it at all.
Also it considers that the genotype call is concordant as long as the AC is as close as it can be to 50% given the ploidy. So for a 0/1 true call it is ok
to have a 0 or 1 call in haploids, 0/0/1/1 in tetraploids, and 0/0/1 or 0/1/1 with triploid input, but not 0/0/0/1 in tetraploids or 0/0/0/0/1/1 with hexaploid input.
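The concordance rule above can be sketched as follows. This is an illustration under the assumption that only the alternate allele count and the ploidy matter; the class and method names are hypothetical and not taken from AssessNA12878.

```java
// Hypothetical illustration of the "AC as close as possible to 50%" rule for a true 0/1 call.
final class HetConcordanceSketch {
    // A call is concordant with a true het when its alt allele count is as near to ploidy/2 as possible.
    static boolean isConcordantWithTrueHet(int altAlleleCount, int ploidy) {
        if (ploidy < 1 || altAlleleCount < 0 || altAlleleCount > ploidy) {
            throw new IllegalArgumentException("invalid AC/ploidy: " + altAlleleCount + "/" + ploidy);
        }
        // |2*AC - ploidy| is twice the distance from ploidy/2.
        // The smallest reachable value is 0 for even ploidy and 1 for odd ploidy, i.e. ploidy % 2.
        return Math.abs(2 * altAlleleCount - ploidy) == ploidy % 2;
    }

    public static void main(String[] args) {
        int[][] cases = { {0, 1}, {1, 1}, {1, 3}, {2, 3}, {2, 4}, {1, 4}, {2, 6} };
        for (int[] c : cases) {
            System.out.println("AC=" + c[0] + " ploidy=" + c[1] + " concordant="
                    + isConcordantWithTrueHet(c[0], c[1]));
        }
    }
}
```

Doubling the distance keeps the comparison in integers, so odd ploidies need no rounding.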
Story:
http://www.pivotaltracker.com/story/show/72090992
Changes:
AssessNA12878 has a new argument (-ploidy / --inputPloidy) to indicate the expected ploidy of the input.
By default this is the obvious choice of 2 as NA12878 is human.
If the input has calls with a different ploidy it will complain with a user exception.
Also some refactoring has been done to make the code a bit more concise in some parts.
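A rough sketch of the ploidy check described above, assuming genotypes are given as VCF-style strings such as "0/1". All names are hypothetical, and IllegalArgumentException stands in for the tool's user exception.

```java
// Hypothetical illustration; the real tool works on parsed genotypes and throws a user exception.
final class InputPloidySketch {
    // Default mirrors the commit message: NA12878 is human, hence diploid.
    static final int DEFAULT_PLOIDY = 2;

    // Counts alleles in a genotype string such as "0/1" or "0|0|1".
    static int ploidyOf(String genotype) {
        return genotype.split("[/|]").length;
    }

    // Rejects any call whose ploidy differs from the expected input ploidy.
    static void requirePloidy(String genotype, int expectedPloidy) {
        int actual = ploidyOf(genotype);
        if (actual != expectedPloidy) {
            throw new IllegalArgumentException("Call " + genotype + " has ploidy " + actual
                    + " but the expected input ploidy is " + expectedPloidy);
        }
    }

    public static void main(String[] args) {
        requirePloidy("0/1", DEFAULT_PLOIDY); // passes silently
        try {
            requirePloidy("0/0/1", DEFAULT_PLOIDY);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```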
-- Global mismapping penalty was only applied to the reference haplotype. This led to problems with overlapping events, mostly STR haplotypes. Now the penalty is applied to every haplotype.
-- We subset the reads down to only those which overlap the event (after assembly based realignment) for likelihood calculations.
In these cases, where the alignment contains multiple indels, we output a single complex
variant instead of the multiple partial indels.
We also re-enable dangling tail recovery by default.