Commit Graph

4696 Commits (bc3b3ac0ec4b4fd72a9e856470edaeb4c7566a06)

Author SHA1 Message Date
Yossi Farjoun b471884d97 Added run-time information to log output
Also updated copyright date
2016-03-11 16:43:03 -05:00
Geraldine Van der Auwera 4990ed706a Fixup for licensing update 2016-03-11 16:23:02 -05:00
ldgauthier d0432713e0 Merge pull request #1311 from broadinstitute/ldg_BetaTestAnnotationsGroup
Add classes from "annotation party" to BetaTesting group
2016-03-11 08:13:32 -05:00
Geraldine Van der Auwera 16ef36088e Merge pull request #1308 from broadinstitute/gvda_fix_license_quotes_#1307
Update licenses
2016-03-10 12:37:53 -05:00
Laura Gauthier d9f9bd1d56 Add classes from "annotation party" to BetaTesting group 2016-03-09 08:17:44 -05:00
Ron Levine 244a217ee7 Fix sample_gene_summary reports header order 2016-03-08 22:21:51 -05:00
ldgauthier dcc6c0f2aa Merge pull request #1306 from broadinstitute/rhl_doc_overlapping_genes
Output coverage for all overlapping genes in DepthOfCoverage
2016-03-08 13:28:30 -05:00
Geraldine Van der Auwera 9a306ca221 Update licenses 2016-03-05 01:09:43 -08:00
Geraldine Van der Auwera 2b70f14740 Misc documentation improvements
Added caveat to VariantFiltration documentation
Fixed PON creation example in M2 doc
Improved MalformedReadFilter doc
Updated N CIGAR error message
2016-03-03 15:48:54 -08:00
seru71 4d203b895a added support for overlapping exons/genes in DepthOfCoverage 2016-03-03 15:09:54 -05:00
Ron Levine 5e2ffc188b Merge pull request #1295 from broadinstitute/rhl_sv_error_output_1194
Correct error messages and error handling in multiple tools
2016-02-29 17:05:24 -05:00
Ron Levine 40a5adf767 Change error output to use the correct argument 2016-02-29 13:21:03 -05:00
meganshand c7e0f5b225 Removes Dithering from Rank Sum Test
Fixing empty group case

Fixing MD5s

First comments addressed

Added permutation test

Adding new RankSum to AS_RankSum

Speeding up permutation algorithm and updating MD5s

Missed a few tests

Addressing comments

Changing md5s
2016-02-29 11:45:27 -05:00
Yossi Farjoun 7896055be3 - Fixed bug in GenomeLoc parser
- Added a warning when two contigs are so similar that it might cause problems with parsing
- Added tests of modified parser and of warning.
2016-02-02 06:53:22 -05:00
ldgauthier 8ab2eef6f0 Merge pull request #1282 from broadinstitute/ts_annotation
Allele specific insert size ranks sum annotation
2016-01-29 18:13:46 -05:00
Takuto Sato 243a0fcb74 Allele-specific insert size ranksum annotation 2016-01-28 16:03:57 -05:00
Geraldine Van der Auwera 1e4a98827c Merge pull request #1278 from broadinstitute/gvda-expose_queue_setting
Expose time between checks as CLI argument (Queue)
2016-01-28 11:00:36 -05:00
Laura Gauthier 5592e4ead0 Add new -AS mode to run VQSR (both VariantRecalibrator and ApplyRecalibration) in an allele-specific manner 2016-01-22 13:18:21 -05:00
Geraldine Van der Auwera a46a4a6175 Expose time between checks for whether new jobs can be submitted as a user-settable parameter on CLI. Useful when testing pipelines to make idle time shorter. Contributed by @dakl (Daniel Klevebring on GATK forum). 2016-01-22 12:13:07 -05:00
Geraldine Van der Auwera c93a611ea3 Remove unneeded dependency
Addresses https://github.com/broadgsa/gatk/pull/15 for Guillermo
2016-01-21 16:51:01 -05:00
Ron Levine ed933013fe Remove variant contig order check 2016-01-16 19:32:28 -05:00
Laura Gauthier 593c9ddf01 Allow VariantsToTable to evaluate the type of each split variant when -F TYPE and -SMA are specified 2016-01-12 08:12:29 -05:00
Ron Levine d16ed98c9e Backport maxNoCall functionality from GATK4 2016-01-06 11:09:38 -05:00
meganshand eb6bdb2a62 MQ of Mate RankSum annotation
Intermediate commit for tests

Adding tests

Fixing tests after rebase

Fixing one MD5

Fixing documentation

Removing annotation from standard group

Adding documentation
2015-12-23 10:24:40 -05:00
Ron Levine 9c8f035780 LeftAlignAndTrimVariants --splitMultiallelics keeps GT if valid 2015-12-14 10:42:32 -05:00
Geraldine Van der Auwera 4767a83d8a Update pom versions to mark the start of GATK 3.6 development 2015-11-25 01:52:51 -05:00
Geraldine Van der Auwera 9749adf22a Merge pull request #1236 from broadinstitute/gvda_prep_M2_release_1201
Prep MuTect2 for release
2015-11-24 20:12:57 -05:00
Geraldine Van der Auwera bf875974d1 Prep MuTect2 and ContEst for release
Renamed M2 to MuTect2
Renamed ContaminationWalker to ContEst
Refactored related tests and usages (including in Queue scripts)
Moved M2 and ContEst + accompanying classes from private to protected
Made QSS a StandardSomaticAnnotation (new annotation group/interface) to prevent it from being sucked in with the rest of the StandardAnnotation group
2015-11-24 16:43:20 -05:00
Geraldine Van der Auwera 88a0514ec7 Fix bug where gatkdocs of RodWalkers reported default LocusWalker downsampling settings 2015-11-23 17:53:19 -05:00
Geraldine Van der Auwera 46ba0e519e Restore FindCoveredIntervals + add docs 2015-11-22 10:19:04 -05:00
Ron Levine 08a9c80559 Make the header sequence dictionary match reference 2015-11-21 19:12:37 -05:00
Geraldine Van der Auwera 22fa1511be Merge pull request #1235 from broadinstitute/gvda_deprecate_useless_tools_1192
Deprecate tools that were outdated or redundant
2015-11-21 14:58:00 -05:00
Geraldine Van der Auwera 1cf66addaa Deprecate tools that were outdated or redundant
ReadAdaptorTrimmer (unsound and untested)
BaseCoverageDistribution (redundant with DiagnoseTargets)
CoveredByNSamplesSites (redundant with DiagnoseTargets)
FindCoveredIntervals (redundant with DiagnoseTargets)
VariantValidationAssessor (has a scary TODO -- REWRITE THIS TO WORK WITH VARIANT CONTEXT comment and zero tests)
LiftOverVariants, FilterLiftedVariants and liftOverVCF.pl (in #1106) (use Picard liftover tool)
sortByRef.pl (use Picard SortVCF)
ListAnnotations (useless)

Also deleted the java archive from the private repository (old junk we never use)
2015-11-20 22:49:40 -05:00
meganshand 2570cab24c Assorted documentation fixes, enhancements and reorganization.
See issues referenced by the pull request for details.
2015-11-20 22:44:46 -05:00
Ron Levine 0b3b09e3ff Move htsjdk & picard to version 1.141 2015-11-20 16:26:26 -05:00
Ron Levine ccaddefa19 Validate VCF with sequence dictionary 2015-11-20 09:23:24 -05:00
Yossi Farjoun 4da0d1300c adding fraction informative reads annotation. 2015-11-18 08:39:47 -05:00
Laura Gauthier 25b8ba45f4 More allele-specific annotations: AS_QD and AS_InbreedingCoeff
Grouped default output annotations to keep them from getting dropped when -A is specified; addresses #918
Also refactored code shared by ExcessHet and InbreedingCoeff
2015-11-09 16:38:31 -05:00
meganshand e4627ed5c3 Addressing comments 2015-11-04 11:00:01 -05:00
meganshand b5165b8d30 Fix for out of date VCF version output 2015-11-03 17:35:47 -05:00
ldgauthier 3d1dc303b3 Merge pull request #1197 from broadinstitute/ts_ve_nullPointer
Prevent null pointer exception in PrintMissingComp module
2015-11-02 14:42:50 -05:00
Takuto Sato 33462c7b50 Removed the line that caused a null pointer, as the information it logged was not useful. Updated docs and added an integration test to ensure the code no longer throws the exception. 2015-11-02 12:45:09 -05:00
Laura Gauthier f7eb5d3082 Enable family-level stratification (if a ped file is provided) 2015-10-28 09:55:04 -04:00
ldgauthier 4fbcfc2e36 Merge pull request #1173 from broadinstitute/ldg_AS_annotations
New map-combine-finalize annotation framework
2015-10-27 12:04:56 -04:00
Mark Fleharty 19af5724c5 Fixed an NA12878 Knowledgebase test, and made the RDQ option for BQSR binning hidden 2015-10-27 09:44:26 -04:00
Laura Gauthier fcaf37279c Finished draft of code for new map-combine-reduce annotation framework
All VQSR annotations can be generated in allele-specific mode
Pull out allele-specific annotations in AS_Standard annotation group
2015-10-27 09:23:29 -04:00
Ron Levine 36ca9fe898 Allow LeftAlignAndTrimVariants to handle alleles longer than the default processing window 2015-10-25 20:33:56 -04:00
meganshand 0d936b28c4 Merge pull request #1178 from broadinstitute/ms_ROCCurve
ROCCurve High Confidence Mode
2015-10-22 09:19:13 -04:00
Ron Levine 795fe75886 Update doc for multiallelics, trimming is the default behavior 2015-10-22 04:04:09 -04:00
meganshand a57500b2fc ROCCurve High Confidence Mode
Integration Tests

Updated test

Changed method

Minor changes

Changed whitespace

Fixed uncalled counts and 0 in R

Fixed ReadBackedPileUp

Removed imports and changed MD5

Fixed failing test

Adding vqslod color

Updating script to create KB

Fixing integration test now that the KB is bigger

Addressing comments
2015-10-21 21:30:54 -04:00
Takuto Sato df7a482335 VariantAnnotator now supports annotating FILTER field from an external resource.
Updated the docs.
2015-10-14 14:26:21 -04:00
Chris Norman e776502c49 Fix Sample mergeValues failure when merging identical string values (#1156). 2015-10-12 09:56:05 -04:00
Ron Levine 2bcded11cb VariantAnnotator checks alleles when annotating with external resource 2015-10-08 17:01:30 -04:00
Ron Levine 033115eae0 Move htsjdk & picard to version 1.140 2015-10-08 10:42:05 -04:00
Eric Banks 622ec352bb Fix for combining records in which one has a spanning deletion and needs a padded reference allele.
This was erroring out and not working.
2015-10-02 16:28:16 -04:00
Kate Noblett 506958a0b7 Implemented a new VariantEval evaluation module, MetricsCollection. Fixed null pointer exception, updated tests. 2015-09-30 17:21:30 -04:00
Ron Levine 792142ec50 Implement BaseCounts per-sample 2015-09-30 08:59:11 -04:00
Khalid Shakir 384a09e991 Minor updates to previous ParallelShell commit.
Changed `--maximumNumberOfJobsToRunConcurrently`/`-maxConcurrentRun` to `Option[Int]`.
Updated licenses.
Added basic tests.
Removed some IntelliJ warnings.
2015-09-29 09:36:37 -03:00
Johan Dahlberg b045f2d4aa ParallelShell added as a new JobRunner
The ParallelShell job runner will run jobs locally on one node concurrently as specified by the DAG, with the option to limit the maximum number of concurrently running jobs using the flag `maximumNumberOfJobsToRunConcurrently`.

Signed-off-by: Khalid Shakir <kshakir@broadinstitute.org>
2015-09-29 09:36:37 -03:00
samuelklee 302a69d685 Merge pull request #1165 from broadinstitute/sl_fix_no_calls
Changed calls for RGQ=0 from 0/0 to ./. in output of GenotypeGVCFs.
2015-09-28 12:26:18 -04:00
Geraldine Van der Auwera 118c559278 Trivial doc typo fix 2015-09-25 18:15:29 -04:00
Samuel Lee 0dacf60012 Changed calls for RGQ=0 from 0/0 to ./. in output of GenotypeGVCFs. 2015-09-23 15:35:09 -04:00
Ami Levy Moonshine 1ad00cc9d4 fix typo in the ASEReadCounter document 2015-09-21 15:30:06 -04:00
meganshand cdfe0d7b7c Adding PER_TARGET_COVERAGE option
Comments addressed
2015-09-18 09:34:51 -04:00
Ron Levine 3ecabf7e45 Allow overriding ValidateVariants' hard-coded cutoff for allele length 2015-09-17 10:49:14 -04:00
ldgauthier 5870225f83 Merge pull request #1153 from broadinstitute/ms_excess_het
Excess Het P-value
2015-09-15 11:52:25 -04:00
Khalid Shakir 24e24b9468 Using `SamIndexes.asBaiSeekableStreamOrNull()` to support `.cram.crai`.
Updated other IntelliJ IDEA warnings in GATKBAMIndex.
Updated example .cram files to match versions generated by current GATK/HTSJDK.
Bumped HTSJDK and Picard to 1.139 releases.
Added support for using `-SNAPSHOT` of HTSJDK in the future.
2015-09-14 12:20:36 -04:00
meganshand d767e1722e Excess Het P-value
Added input exception

Added header line

Updated MD5s

Changing more MD5s

Made edge case clearer

Fixed formatting

Changed mid-point to mode
2015-09-14 12:00:44 -04:00
Laura Gauthier 53b506a0b8 Make sure inputPriors get used if they are specified
Fix usage of AF prior (i.e. theta) in probability of non-reference calculation
Refactored duplicate functions
Updated docs for heterozygosity
2015-09-10 10:08:03 -04:00
Ron Levine 83a7012d69 Mask snps with --snpmask 2015-09-09 16:20:48 -04:00
Eric Banks 5f76ae6a37 Don't have the Indel Realigner change IUPAC reference bases.
This change doesn't affect the performance of the Indel Realigner at all (as per tests).
This is just a request from the Picard side (where further testing is happening).
2015-09-04 13:42:23 -04:00
Ron Levine 29ac64f6ce Calculate GenotypeAnnotations before InfoFieldAnnotations 2015-09-03 09:22:46 -04:00
Laura Gauthier a86f3909ca Update md5s for BAM header version change in Queue test output 2015-08-28 14:19:25 -04:00
Laura Gauthier 3dc68732fb Little changes to M2 code and docs
Make MQ threshold a parameter (compare to M1 by setting to zero)
Add logic for multiple alternate alleles in tumor
Exclude MQ0 normal reads from normal LOD calculation
Fix path errors in Dream_Evaluations.md
Move M2 eval scripts out of walkers package so they run
2015-08-27 15:31:27 -04:00
Mark Fleharty daeb55429e Adding Static Binning to BQSR 2015-08-24 13:36:17 -04:00
Ron Levine 2afe3f7a21 Make GenotypeGVCFs subset Strand Allele Counts intelligently 2015-08-22 08:33:09 -04:00
Bertrand Haas 158477ea6c Re-ran the updateAllLicenses.sh script 2015-08-21 11:32:51 -04:00
Ron Levine 900fe3f675 Merge pull request #1132 from broadinstitute/rhl_rev_htsjdk
Move htsjdk & picard to rev 1.138
2015-08-20 11:58:41 -04:00
Bertrand Haas eae4c875a9 Logistic transform of MQ + jitter to capped MQ in VariantDataManager 2015-08-20 11:10:45 -04:00
Ron Levine beec624a63 Move htsjdk & picard to rev 1.138 2015-08-20 10:42:25 -04:00
meganshand 5c9935ba10 Adding CollectWgsMetrics wrapper for queue
Fix license

Fixed IncludeBQHistogram
2015-08-14 10:18:12 -04:00
Yossi Farjoun 69fd4af15a Merge pull request #1111 from jsilter/overclippedreadfilter_endsoption
Add additional option to OverclippedReadFilter
2015-08-12 10:43:52 -04:00
Jacob Silterra 62625b4bc6 Add option to not require soft-clips on both ends
Previous version of OverclippedReadFilter would only filter a read if both ends of a read had a soft-clipped block.
This adds a boolean option to relax that requirement, and only require 1 soft-clipped block, while also filtering on read length - softclipped length
2015-08-12 10:38:27 -04:00
Khalid Shakir 9bee183f6c Switched to using CRAM's SamReader.Indexing implementation.
CRAM now requires .bai index, just like BAM.
Test updates:
- Updated existing MD5s, as TLEN has changed.
- Tests multiple contigs.
- Tests several intervals per contig.
- Tests when `.cram.bai` is missing, even when `.cram.crai` is present.
Updated gatk docs for CRAM support, including:
- Arguments that work for both BAM and CRAM listed as such.
- Arguments that don't work for CRAM either explicitly say "BAM" or "doesn't work for CRAM".
- Instructions on how to recreate a `.cram.bai` using cramtools.
Cleaned up IntelliJ IDEA warnings regarding `Arrays.asList()` -> `Collections.singletonList()`.
2015-08-11 17:52:49 -03:00
Geraldine Van der Auwera 19bbe45cbc Updated licenses for 2015 2015-08-06 15:23:11 -04:00
David Benjamin ddb01058d3 moved DiffObjects 2015-08-05 21:19:02 -04:00
Geraldine Van der Auwera 875c7ffa1a Fixed typos and made some argument docs improvements 2015-07-29 23:06:19 -04:00
Louis Bergelson 9d9827f176 Merge pull request #1031 from broadinstitute/lb_update_for_java8
Updated gatk so it compiles with java 8
2015-07-28 11:09:19 -04:00
vruano 8f6daf70db Refactoring of ReferenceConfidenceModel likelihood calculation in non variant sites
Changed a division by -10.0 to a multiplication by -.1 in QualUtils (typically multiplication is faster than division).

Addresses performance issue #1081.
2015-07-26 08:33:46 -04:00
David Roazen 5fd3d2be76 Move swapExt() methods to QScriptUtils, have versions in QScript class call into the util versions 2015-07-23 10:23:55 -04:00
Valentin Ruano Rubio 66cf22b28f Merge pull request #1069 from broadinstitute/vrr_ad_genotype_gvcfs_bugfix
Fix AD propagation when subsetting alleles in non-diploid GenotypeGVCF.
2015-07-22 18:53:43 -04:00
vruano 315e193e51 Fix AD propagation when subsetting alleles in non-diploid GenotypeGVCF.
Addresses issue #913.

Also remove some commented out code and toxic debugging code that uses System.out/err.println.
2015-07-22 17:08:13 -04:00
Joseph White 3bd988825f Removed walkers for handling Beagle data
Added deprecation statements to DeprecatedToolChecks.java
Removed integration test for Beagle walker
Added URL for Beagle documentation
2015-07-21 18:36:08 -04:00
Eric Banks 178bf12b27 Merge pull request #1046 from broadinstitute/rhl_catvariants_sort
Fix for mis-sorted VCF files in CatVariants
2015-07-21 17:37:27 -04:00
Valentin Ruano Rubio 9360e1d293 Merge pull request #1059 from broadinstitute/vrr_true_false_list_removal
More efficient implementation of the indel read qualities recalculati…
2015-07-21 17:13:45 -04:00
vruano 82f1236633 More efficient implementation of the indel read qualities recalculation for the PCR error model.
Addresses #1054.
2015-07-21 14:25:11 -04:00
Ron Levine 6e46b3696e Merge contiguous intervals properly 2015-07-14 15:23:37 -04:00
John Wallace 8fc631b7ae Fix for mis-sorted VCF files in CatVariants
When using CatVariants, VCF files were being sorted solely on the base
pair position of the first record, ignoring the chromosome.  This can
become problematic when merging files from different chromosomes,
especially if you have multiple VCFs per chromosome.

As an example, assume the following 3 lines are all in separate files:
1       10
1       100
2       20

The merged VCF from CatVariants (without -assumeSorted) would read:
1       10
2       20
1       100

This has the potential to break tools that expect chromosomes to be
contiguous within a VCF file.

This commit changes the comparator from one of Pair<Integer, File> to
one of Pair<VariantContext, File>.  We construct a
VariantContextComparator from the provided reference, which will sort
the first record by chromosome and position properly.  Additionally, if
-assumeSorted is given, we simply use a null VariantContext as the first
record, which will all be equal (as all will be null)
2015-07-14 14:12:31 -04:00
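
The ordering problem described in the CatVariants commit above can be illustrated with a small, self-contained Java sketch. This is not the GATK code: `FirstRecordOrdering`, `FirstRecord`, `byPositionOnly`, `byContigThenPosition` and `orderFiles` are hypothetical names introduced here, and a plain list of contig names stands in for the reference sequence dictionary. It only shows why comparing the first record of each file on position alone interleaves chromosomes, while comparing on contig order first and position second does not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class FirstRecordOrdering {

    /** Hypothetical stand-in for the first record of one input VCF. */
    public static final class FirstRecord {
        final String contig;
        final int position;
        final String file;

        public FirstRecord(String contig, int position, String file) {
            this.contig = contig;
            this.position = position;
            this.file = file;
        }
    }

    /** Behaviour the commit message describes as broken: position only. */
    public static Comparator<FirstRecord> byPositionOnly() {
        return Comparator.comparingInt((FirstRecord r) -> r.position);
    }

    /**
     * Contig order first, then position. contigOrder is assumed to list
     * contigs in reference order; unknown contigs get index -1 and sort first.
     */
    public static Comparator<FirstRecord> byContigThenPosition(List<String> contigOrder) {
        return Comparator
                .comparingInt((FirstRecord r) -> contigOrder.indexOf(r.contig))
                .thenComparingInt(r -> r.position);
    }

    /** Returns the file names in the order the comparator puts them. */
    public static List<String> orderFiles(List<FirstRecord> records, Comparator<FirstRecord> cmp) {
        List<FirstRecord> sorted = new ArrayList<>(records);
        sorted.sort(cmp);
        List<String> files = new ArrayList<>();
        for (FirstRecord r : sorted) {
            files.add(r.file);
        }
        return files;
    }

    public static void main(String[] args) {
        // The three single-record files from the commit message example.
        List<FirstRecord> records = Arrays.asList(
                new FirstRecord("1", 10, "a.vcf"),
                new FirstRecord("1", 100, "b.vcf"),
                new FirstRecord("2", 20, "c.vcf"));

        // Position only: chromosome 2 lands between the chromosome 1 files.
        System.out.println(orderFiles(records, byPositionOnly()));
        // Contig then position: chromosomes stay contiguous.
        System.out.println(orderFiles(records, byContigThenPosition(Arrays.asList("1", "2"))));
    }
}
```

Running `main` prints `[a.vcf, c.vcf, b.vcf]` for the position-only comparator and `[a.vcf, b.vcf, c.vcf]` for the contig-aware one, matching the example in the commit message.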
Louis Bergelson e1c41b2c38 Updated gatk so it compiles on java 8
updated cofoja to 1.2 from 1.0
added explicit type casts in places that java 8 required them
2015-06-26 15:59:46 -04:00
Ron Levine 09686f4595 Make VQSLOD definition accurate 2015-06-25 16:47:50 -04:00