Fixing empty group case
Fixing MD5s
First comments addressed
Added permutation test
Adding new RankSum to AS_RankSum
Speeding up permutation algorithm and updating MD5s
Missed a few tests
Addressing comments
Changing md5s
Intermediate commit for tests
Adding tests
Fixing tests after rebase
Fixing one MD5
Fixing documentation
Removing annotation from standard group
Adding documentation
Renamed M2 to MuTect2
Renamed ContaminationWalker to ContEst
Refactored related tests and usages (including in Queue scripts)
Moved M2 and ContEst + accompanying classes from private to protected
Made QSS a StandardSomaticAnnotation (new annotation group/interface) to prevent it from being sucked in with the rest of the StandardAnnotation group
ReadAdaptorTrimmer (unsound and untested)
BaseCoverageDistribution (redundant with DiagnoseTargets)
CoveredByNSamplesSites (redundant with DiagnoseTargets)
FindCoveredIntervals (redundant with DiagnoseTargets)
VariantValidationAssessor (has a scary TODO -- REWRITE THIS TO WORK WITH VARIANT CONTEXT comment and zero tests)
LiftOverVariants, FilterLiftedVariants and liftOverVCF.pl (in #1106) (use Picard liftover tool)
sortByRef.pl (use Picard SortVCF)
ListAnnotations (useless)
Also deleted the java archive from the private repository (old junk we never use)
Grouped default output annotations to keep them from getting dropped when -A is specified; addresses #918
Also refactored code shared by ExcessHet and InbreedingCoeff
Integration Tests
Updated test
Changed method
Minor changes
Changed whitespace
Fixed uncalled counts and 0 in R
Fixed ReadBackedPileUp
Removed imports and changed MD5
Fixed failing test
Adding vqslod color
Updating script to create KB
Fixing integration test now that the KB is bigger
Addressing comments
The ParallelShell job runner will run jobs locally on one node concurrently as specified by the DAG, with the option to limit the maximum number of concurrently running jobs using the flag `maximumNumberOfJobsToRunConcurrently`.
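The concurrency cap described above can be sketched with a fixed-size thread pool; this is an illustrative Java analogue only (the real ParallelShell runner is Queue Scala code), with all names here hypothetical:

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class CappedShellRunner {
    // Runs jobCount dummy jobs with at most maxConcurrent running at once
    // (the role of maximumNumberOfJobsToRunConcurrently); returns the peak
    // number of jobs observed running simultaneously.
    static int run(int jobCount, int maxConcurrent) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < jobCount; i++) {
            pool.submit(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try { Thread.sleep(50); } catch (InterruptedException ignored) { }
                running.decrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // The pool never runs more than the cap at once.
        System.out.println(run(8, 3) <= 3); // true
    }
}
```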
Signed-off-by: Khalid Shakir <kshakir@broadinstitute.org>
Updated other IntelliJ IDEA warnings in GATKBAMIndex.
Updated example .cram files to match versions generated by current GATK/HTSJDK.
Bumped HTSJDK and Picard to 1.139 releases.
Added support for using `-SNAPSHOT` of HTSJDK in the future.
This change doesn't affect the performance of the Indel Realigner at all (as per tests).
This is just a request from the Picard side (where further testing is happening).
Make MQ threshold a parameter (compare to M1 by setting to zero)
Add logic for multiple alternate alleles in tumor
Exclude MQ0 normal reads from normal LOD calculation
Fix path errors in Dream_Evaluations.md
Move M2 eval scripts out of walkers package so they run
Previously, OverclippedReadFilter would only filter a read if both ends of the read had a soft-clipped block.
This adds a boolean option relaxing that requirement to a single soft-clipped block, while also filtering on read length minus soft-clipped length.
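The relaxed filter condition above can be sketched as follows; the method name, parameters, and threshold are illustrative stand-ins, not the GATK implementation:

```java
public class OverclippedCheck {
    // Returns true if the read should be filtered out. requireBothEnds toggles
    // between the original behavior (both ends soft-clipped) and the relaxed
    // behavior (one soft-clipped block suffices); in either case the read is
    // only dropped when too few bases remain after soft clipping.
    static boolean filterOut(boolean leftSoftClipped, boolean rightSoftClipped,
                             int readLength, int softClippedLength,
                             boolean requireBothEnds, int minUnclippedBases) {
        boolean clipCondition = requireBothEnds
                ? (leftSoftClipped && rightSoftClipped)   // original behavior
                : (leftSoftClipped || rightSoftClipped);  // relaxed: one end suffices
        return clipCondition && (readLength - softClippedLength) < minUnclippedBases;
    }

    public static void main(String[] args) {
        // Both ends clipped, short unclipped span: filtered under either mode.
        System.out.println(filterOut(true, true, 50, 30, true, 30));   // true
        // Only one end clipped: kept under the strict mode...
        System.out.println(filterOut(true, false, 50, 30, true, 30));  // false
        // ...but filtered under the relaxed mode.
        System.out.println(filterOut(true, false, 50, 30, false, 30)); // true
    }
}
```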
CRAM now requires .bai index, just like BAM.
Test updates:
- Updated existing MD5s, as TLEN has changed.
- Tests multiple contigs.
- Tests several intervals per contig.
- Tests when `.cram.bai` is missing, even when `.cram.crai` is present.
Updated gatk docs for CRAM support, including:
- Arguments that work for both BAM and CRAM listed as such.
- Arguments that don't work for CRAM either explicitly say "BAM" or "doesn't work for CRAM".
- Instructions on how to recreate a `.cram.bai` using cramtools.
Cleaned up IntelliJ IDEA warnings regarding `Arrays.asList()` -> `Collections.singletonList()`.
Changed a division by -10.0 to a multiplication by -0.1 in QualUtils (multiplication is typically faster than division).
Addresses performance issue #1081.
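A minimal sketch of the micro-optimization above, converting a Phred-scaled quality to a base-10 exponent; method names are illustrative, and note the two forms are numerically close but not bit-identical, since 0.1 is not exactly representable in binary floating point:

```java
public class QualOpt {
    // Original form: divide by -10.0.
    static double oldWay(double qual) { return qual / -10.0; }

    // Optimized form: multiply by -0.1 (multiplication is typically
    // cheaper than division on most hardware).
    static double newWay(double qual) { return qual * -0.1; }

    public static void main(String[] args) {
        double q = 30.0;
        System.out.println(oldWay(q));                        // -3.0
        System.out.println(Math.abs(oldWay(q) - newWay(q)));  // tiny (< 1e-12)
    }
}
```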
When using CatVariants, VCF files were being sorted solely on the base
pair position of the first record, ignoring the chromosome. This can
become problematic when merging files from different chromosomes,
especially if you have multiple VCFs per chromosome.
As an example, assume the following 3 lines are all in separate files:
1 10
1 100
2 20
The merged VCF from CatVariants (without -assumeSorted) would read:
1 10
2 20
1 100
This has the potential to break tools that expect chromosomes to be
contiguous within a VCF file.
This commit changes the comparator from one of Pair<Integer, File> to
one of Pair<VariantContext, File>. We construct a
VariantContextComparator from the provided reference, which sorts
the first records properly by chromosome and position. Additionally, if
-assumeSorted is given, we simply use a null VariantContext for each first
record, so all entries compare as equal (since all are null).
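The corrected ordering can be sketched without htsjdk types as follows; the record and method names here are simplified stand-ins for the actual Pair<VariantContext, File> comparator, and the contig order is assumed to come from the reference dictionary:

```java
import java.util.*;

public class FirstRecordSort {
    // Simplified stand-in for a file's first variant record.
    record FirstRecord(String contig, int pos, String file) { }

    // Sorts files by their first record's contig (in reference order),
    // then by position -- the corrected CatVariants behavior.
    static List<String> sortedFiles() {
        List<String> contigOrder = List.of("1", "2"); // from the reference dict
        Map<String, Integer> rank = new HashMap<>();
        for (int i = 0; i < contigOrder.size(); i++) rank.put(contigOrder.get(i), i);

        List<FirstRecord> files = new ArrayList<>(List.of(
                new FirstRecord("1", 10, "a.vcf"),
                new FirstRecord("2", 20, "c.vcf"),
                new FirstRecord("1", 100, "b.vcf")));

        // The old comparator looked at position only, yielding a, c, b;
        // comparing contig rank first restores a, b, c.
        files.sort(Comparator.comparingInt((FirstRecord r) -> rank.get(r.contig()))
                             .thenComparingInt(FirstRecord::pos));

        List<String> out = new ArrayList<>();
        for (FirstRecord r : files) out.add(r.file());
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sortedFiles()); // [a.vcf, b.vcf, c.vcf]
    }
}
```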