Commit Graph

1556 Commits (00a434a38bd7d74484bfbb858deecc091ee928f2)

Author SHA1 Message Date
Ron Levine abc4d5b7b3 Bypass spanning deletions in Rank Sum tests 2016-08-17 14:02:22 -04:00
Peter Fan 3510906c7f Addresses issue #1280: interval padding now works for exclude intervals 2016-08-10 13:45:45 -04:00
Samuel Lee 49507faaa3 Changed maximum allowed GQB value to 100. 2016-08-05 13:06:31 -04:00
Andrii Nikitiuk a465c87ff8 Added support for directly reading SRA runs 2016-08-02 15:21:14 -04:00
Samuel Lee 832a383acd Fixed MD5 broken by PR #1440. 2016-07-27 13:51:20 -04:00
samuelklee 9a6ce7a347 Merge pull request #1440 from broadinstitute/sl_issue_1345
Added exception for GQB values greater than MAX_GENOTYPE_QUAL and tests.
2016-07-26 14:55:59 -04:00
Valentin Ruano Rubio fef63ce6a8 Make sure that multi-allelic uninformative PLs (0,0,...,0) stay uninformative after biallelization.
Addresses issue #1439 (thus #1437).

Fixes a bug where non-informative PLs were not handled appropriately when calculating multi-allelic site QUAL values.
This was resulting in long execution times for very large datasets (~200,000 samples in the case of ExAC2).
2016-07-25 17:19:03 -04:00
Samuel Lee 3daed9e5a1 Added exception for GQB values greater than MAX_GENOTYPE_QUAL and tests. 2016-07-20 16:48:59 -04:00
Ron Levine 7392c4d1b0 Removed spanning deletions if the deletion was removed 2016-07-19 12:23:49 -04:00
Laura Gauthier 641382eb8b Fix BetaTestingAnnotation group
Add test
2016-07-13 16:05:21 -04:00
Takuto Sato d6d0678b50 Build on Laura's code and finish porting MuTect1 clustered read position filter. 2016-07-11 17:33:08 -04:00
Samuel Lee 9b32cf5291 Fixed merging of GVCF blocks by fixing rounding of GQ values in ReferenceConfidenceModel. 2016-07-06 10:08:08 -04:00
Takuto Sato 2c94f74a95 Merge pull request #1404 from broadinstitute/ldg_M2_addM1filters
MuTect 2: port strand artifact filter from MuTect 1
2016-07-05 13:27:35 -04:00
Valentin Ruano Rubio 45607d1b30 RCM Variant sites merger won't output PL when there are too many alleles in order to avoid memory issues with large cohort runs.
Small additional "cosmetic" changes to the code
Addresses issue #1419.
2016-07-01 11:33:07 -04:00
Steve Huang 1ff234e7dd Remove alt alleles when the genotype count is explosively large, based on the alleles' highest supporting haplotype score; the max tolerable genotype count is controlled by a default value overridable by the user
2016-06-30 22:36:49 -04:00
Takuto Sato 63e0865491 Built on Laura's code to port the strand bias filter from M1 and refactored code around SomaticGenotypingEngine. Added a new integration test. 2016-06-29 22:46:40 -04:00
Laura Gauthier 4066bcd75c Add new annotator for M1 clustered read position filter and M1 strand bias filter. 2016-06-29 22:46:37 -04:00
meganshand 1b921666a7 Change to max value of ExcessHet 2016-06-29 16:33:50 -04:00
meganshand 556cc69185 Fix for int overflow in RankSum calculation 2016-06-29 12:02:13 -04:00
Valentin Ruano Rubio 07052ba8ea Changes to use the median rather than the second best likelihood for the NON_REF allele
Addresses issue #1378 following the first proposal using the 'median' rather than the 'mean'.
2016-06-28 13:10:22 -04:00
Samuel Lee 76bb8fd9e5 Allows GatherBqsrReports to accept a .list file as input. 2016-06-22 12:39:18 -04:00
Ron Levine 427645162b SelectVariants works with non-diploids 2016-06-21 12:26:13 -04:00
Valentin Ruano Rubio 857459e420 Silly mistake: '<' used instead of '<='. It caused an exception when the number of alleles to drop matched MAX_DROPPED_ALTERNATIVE_ALLELES_TO_LOG exactly (fixed to 20).
I changed the code to impose a maximum allele list message length instead, and in the process fixed the bug.
2016-06-17 15:22:59 -04:00
Samuel Lee e119feee61 Added regression test for genotyping of spanning deletions in GenotypeGVCFs. 2016-06-15 09:48:26 -04:00
Ron Levine ba2e7be05b Add integration test using -maxNumPLValues for GenotypeGVCFs 2016-06-07 14:38:12 -04:00
Geraldine Van der Auwera 85dce75f3f Update pom versions to mark the start of GATK 3.7 development 2016-06-01 17:21:48 -04:00
Geraldine Van der Auwera f185a75e1c Update pom versions for the 3.6 release 2016-06-01 17:08:17 -04:00
Geraldine Van der Auwera b95b76b0e2 Merge pull request #1394 from broadinstitute/gvda_add_colt_dependency
Add colt > cern.jet.normal dependency
2016-06-01 14:16:57 -04:00
Geraldine Van der Auwera bd2626bea2 Add colt > cern.jet.normal dependency 2016-06-01 13:24:50 -04:00
Ron Levine 30665c7dbc Move htsjdk and picard to version 2.4.1 2016-05-31 22:36:38 -04:00
Geraldine Van der Auwera a76cb052e2 Ability to retry building VQSR model (contributed by mdp) 2016-05-31 18:57:55 -04:00
Geraldine Van der Auwera d87345cd1d GATKDocs overhaul
- Fixed displaying of default values
- Removed code cruft
- Reorganized tooldoc categories and improved names
- Reorganized tools within categories where applicable
- Touched up various tool docs
- Switched default gatkdocs output to html
- Added parameter in aggregator pom to control output type
- Set gatkdocs publishing script to output php
- Deprecated GenotypeAndValidate walker
- Added back PhoneHome arguments with @Deprecated annotations
2016-05-29 16:35:08 -04:00
Geraldine Van der Auwera efbbbb1bd9 Add M2 to the HC annotations check 2016-05-27 13:49:31 -04:00
Geraldine Van der Auwera c4a06ad20a Move indel realignment to public 2016-05-27 12:39:58 -04:00
Valentin Ruano Rubio 9d32dec9cd Fix for the sum(AD) > DP bug.
Closes issue #1340
2016-05-26 15:04:52 -04:00
Yossi Farjoun 25fa25b618 Added option to validate gvcf (for ValidateVariants) (#1379)
* with option --gvcf CLP will now put extra checks that a gvcf must adhere to (existence of <NON_REF> allele at every variant, and that the variants in total cover the entire requested intervals, or the whole genome if no intervals have been specified)
* works on gvcf produced by HC when using either GVCF or BP_RESOLUTION mode
* added positive and negative tests
2016-05-26 06:42:45 -04:00
Steve Huang e1fadae139 Fix error in InfiniteRandomMatingPopulationModel.getLikelihoodsCalculator
Same issue noticed in GATK4 [here](https://github.com/broadinstitute/gatk/issues/1856)
2016-05-25 17:23:26 -04:00
Geraldine Van der Auwera 2c8356519c Merge pull request #1375 from broadinstitute/gvda_fix_offbyone_maxAltAlls_M2_#1297
Fixed M2 max alt alleles threshold evaluation error
2016-05-19 13:59:02 -04:00
samuelklee 7fdd3c2a0c Merge pull request #1358 from broadinstitute/sl_issue_1327
Changed calls with GQ=0 to no-call for HaplotypeCaller in normal mode.
2016-05-19 12:26:35 -04:00
Laura Gauthier 644076b1e1 Add fix and test for finalizing MQ annotation at BP resolution for variant and ref samples
Addresses issue #1356
2016-05-19 08:15:30 -04:00
Geraldine Van der Auwera f5456a3761 Fixed M2 max alt alleles threshold evaluation error
Also clarified some argument docs
2016-05-18 21:54:30 -04:00
Samuel Lee bf4b1a5421 Changed calls for GQ=0 from 0/0 to ./. for HaplotypeCaller in normal mode. 2016-05-18 13:17:27 -04:00
Ron Levine 35a06879f1 Move htsjdk and picard to version 2.3.0 2016-05-16 14:50:00 -04:00
David Benjamin 8623830267 Fixed bug in which consecutive SPAN_DELS were merged into a ** MNP. 2016-05-06 01:36:14 -04:00
samuelklee aa0c76a166 Merge pull request #1326 from broadinstitute/sl_issue_1293
Added maxNumPLValues argument to allow users to set maximum number of PL values in output.
2016-04-28 10:52:27 -04:00
David Benjamin aecaa6d38e Allow GenotypeGVCFs to emit ref sites. 2016-04-27 15:53:44 -04:00
Geraldine Van der Auwera 14fe8b1e0e Moved BQSRGatherer and dependencies to the public module 2016-04-27 07:15:28 -04:00
Samuel Lee e08940a5a8 Added maxNumPLValues argument to allow users to set maximum number of PL values in output. 2016-04-26 23:30:25 -04:00
Ron Levine f337b45724 Move htsjdk and picard to version 2.0.0
Conflicts:
	protected/gatk-tools-protected/src/test/java/org/broadinstitute/gatk/tools/walkers/genotyper/UnifiedGenotyperGeneralPloidySuite1IntegrationTest.java
	protected/gatk-tools-protected/src/test/java/org/broadinstitute/gatk/tools/walkers/genotyper/UnifiedGenotyperGeneralPloidySuite2IntegrationTest.java
	protected/gatk-tools-protected/src/test/java/org/broadinstitute/gatk/tools/walkers/genotyper/UnifiedGenotyperIndelCallingIntegrationTest.java
	protected/gatk-tools-protected/src/test/java/org/broadinstitute/gatk/tools/walkers/haplotypecaller/HaplotypeCallerIntegrationTest.java
2016-04-25 14:51:25 -04:00
meganshand 509400495b Changes edge case calculation for RankSumTest #1341 2016-04-22 14:41:05 -04:00
David Benjamin c040da427d Replace string literals for annotation groups. Closes #1216. 2016-04-19 15:54:44 -04:00
Ron Levine e2828104b1 SelectVariants and VariantFiltration not updating AC, AN and AF for --setFilteredGtToNocall 2016-04-17 10:24:05 -04:00
Ron Levine 0eba8822e2 Change HashMap to LinkedHashMap for predictable iteration 2016-04-10 20:10:38 -04:00
Laura Gauthier d573fc4adf Add some comments to AFCalculationResult
Add note that VC may be null (in this case because there are too many alts)
Add todo for possible inefficient code
2016-04-04 09:35:54 -04:00
Geraldine Van der Auwera 68b068d2b3 Merge pull request #1329 from broadinstitute/cn_murmur_hash
Use older version of murmur hash implementation included in gatk…
2016-03-31 13:00:43 -04:00
Ron Levine edc1b20132 Output a summary of WARN messages 2016-03-29 11:39:18 -04:00
Chris Norman 017828e4b7 Use the older murmur hash implementation that is included in the gatk jar. 2016-03-29 10:20:44 -04:00
Chris Norman c81795cf50 Fix license in DroppedReadsTracker. 2016-03-22 13:39:13 -04:00
Chris Norman a8ebf21ac3 Implementation of dropped read tracking. 2016-03-21 18:36:29 -04:00
Ron Levine b873467756 Merge pull request #1321 from broadinstitute/rhl_fix_hr_logger_name
Change logger class to HomopolymerRun
2016-03-16 11:44:40 -04:00
Ron Levine 5c0e97def4 Change logger class to HomopolymerRun 2016-03-16 10:01:21 -04:00
Laura Gauthier 9ffdfeccd5 Add test case for fix 2016-03-15 11:06:48 -04:00
Laura Gauthier 31fc64f82c To address weird case with all hom-refs, but alt allele is present, skip AS_QD for all-ref sites and remove raw annotations whether or not they can be successfully finalized 2016-03-15 09:34:47 -04:00
Ron Levine 6d7ec7377c Set --reference_window_stop if homopolymer is greater than window size 2016-03-14 18:09:41 -04:00
Geraldine Van der Auwera c81d5b898e Clarify VQSR inputs documentation 2016-03-11 16:38:33 -05:00
Geraldine Van der Auwera 4990ed706a Fixup for licensing update 2016-03-11 16:23:02 -05:00
Takuto Sato 6308b0f036 Add MuTect2 Tumor-only test. 2016-03-11 09:10:43 -05:00
ldgauthier d0432713e0 Merge pull request #1311 from broadinstitute/ldg_BetaTestAnnotationsGroup
Add classes from "annotation party" to BetaTesting group
2016-03-11 08:13:32 -05:00
Geraldine Van der Auwera 16ef36088e Merge pull request #1308 from broadinstitute/gvda_fix_license_quotes_#1307
Update licenses
2016-03-10 12:37:53 -05:00
Laura Gauthier 28cfb06513 Add AS_culprit and AS_VQSLOD to VCF header in ApplyRecalibration so output passes VCF validation 2016-03-09 08:22:58 -05:00
Laura Gauthier d9f9bd1d56 Add classes from "annotation party" to BetaTesting group 2016-03-09 08:17:44 -05:00
Geraldine Van der Auwera 9a306ca221 Update licenses 2016-03-05 01:09:43 -08:00
Geraldine Van der Auwera 2b70f14740 Misc documentation improvements
Added caveat to VariantFiltration documentation
Fixed PON creation example in M2 doc
Improved MalformedReadFilter doc
Updated N CIGAR error message
2016-03-03 15:48:54 -08:00
Ron Levine e5c5804141 Modify MD5s to correct RankSum annotations 2016-03-01 14:05:51 -05:00
Ron Levine 625941dc50 Merge pull request #1290 from broadinstitute/rhl_sac_nonref
StrandAlleleCountsBySample selects most likely only from VCF alleles
2016-02-29 17:05:57 -05:00
Ron Levine 5e2ffc188b Merge pull request #1295 from broadinstitute/rhl_sv_error_output_1194
Correct error messages and error handling in multiple tools
2016-02-29 17:05:24 -05:00
Ron Levine 40a5adf767 Change error output to use the correct argument 2016-02-29 13:21:03 -05:00
Ron Levine 80a22aad77 StrandAlleleCountsBySample selects most likely only from VCF alleles 2016-02-29 12:41:55 -05:00
meganshand c7e0f5b225 Removes Dithering from Rank Sum Test
Fixing empty group case

Fixing MD5s

First comments addressed

Added permutation test

Adding new RankSum to AS_RankSum

Speeding up permutation algorithm and updating MD5s

Missed a few tests

Addressing comments

Changing md5s
2016-02-29 11:45:27 -05:00
Takuto Sato 243a0fcb74 Allele-specific insert size ranksum annotation 2016-01-28 16:03:57 -05:00
Laura Gauthier 5592e4ead0 Add new -AS mode to run VQSR (both VariantRecalibrator and ApplyRecalibration) in an allele-specific manner 2016-01-22 13:18:21 -05:00
Ron Levine ed933013fe Remove variant contig order check 2016-01-16 19:32:28 -05:00
Eric Banks c57c32b915 Merge pull request #1270 from broadinstitute/eb_small_genotyping_optimizations
Small optimizations to the joint calling code.
2016-01-13 08:46:57 -05:00
Eric Banks ab2f541d1f Small optimizations to the joint calling code.
Thanks to profiling I noticed that the determineCoefficient() method was being called too often.
Because it returns a constant result in half of the invocations, its value should be cached when possible.
Also, the various calls to getLog10Likelihoods() showed up in the profiler, so I pulled those out too.

All told, it speeds up the genotyping by about 10 percent according to the profiler.
2016-01-12 21:48:20 -05:00
Laura Gauthier 593c9ddf01 Allow VariantsToTable to evaluate the type of each split variant when -F TYPE and -SMA are specified 2016-01-12 08:12:29 -05:00
Laura Gauthier 204cad3646 Remove "mem_free" from resident memory request params for Queue because it doesn't work and wouldn't actually reserve memory anyway 2016-01-08 10:27:56 -05:00
Ron Levine d16ed98c9e Backport maxNoCall functionality from GATK4 2016-01-06 11:09:38 -05:00
ldgauthier 1d72ab099c Merge pull request #1247 from broadinstitute/ldg_VQSRmodelOutput
Add optional argument for VQSR to output the model to a file as a GAT…
2016-01-05 08:35:18 -05:00
Ron Levine fa1d90d236 Merge consecutive SNPs on the same read 2016-01-04 13:48:59 -05:00
Laura Gauthier 2ddc48914e Add optional argument for VQSR to output the model to a file as a GATKReport
GATKReport output also has mean and variance for annotation normalization info
2016-01-04 08:37:08 -05:00
ldgauthier 71c6709765 Merge pull request #1145 from broadinstitute/ldg_M2_HapMapSensitivity
Fix no-normal bug; add HapMap sensitivity benchmarking
2016-01-04 08:27:37 -05:00
Ron Levine aa5e88a393 Fix exception when writing gVCF to stdout 2015-12-29 15:30:53 -05:00
meganshand eb6bdb2a62 MQ of Mate RankSum annotation
Intermediate commit for tests

Adding tests

Fixing tests after rebase

Fixing one MD5

Fixing documentation

Removing annotation from standard group

Adding documentation
2015-12-23 10:24:40 -05:00
Laura Gauthier f9e9d2e273 Fix no-normal bug; add HapMap sensitivity benchmarking 2015-12-22 08:29:01 -05:00
Ron Levine 9c8f035780 LeftAlignAndTrimVariants --splitMultiallelics keeps GT if valid 2015-12-14 10:42:32 -05:00
Geraldine Van der Auwera 4767a83d8a Update pom versions to mark the start of GATK 3.6 development 2015-11-25 01:52:51 -05:00
Geraldine Van der Auwera bf875974d1 Prep MuTect2 and ContEst for release
Renamed M2 to MuTect2
Renamed ContaminationWalker to ContEst
Refactored related tests and usages (including in Queue scripts)
Moved M2 and ContEst + accompanying classes from private to protected
Made QSS a StandardSomaticAnnotation (new annotation group/interface) to prevent it from being sucked in with the rest of the StandardAnnotation group
2015-11-24 16:43:20 -05:00
Mark Fleharty 5a3756410c Merge pull request #1231 from broadinstitute/mf_fixBQSRIntegrationTest
Fixes testPRWithConflictingArguments_qqAndSQQ to use -ql rather than -q1
2015-11-23 17:08:30 -05:00
Geraldine Van der Auwera b0730c2b81 Merge pull request #1239 from broadinstitute/gvda_straggler_doc_fixes_1237
Improve doc block of GatherBqsrReports
Annotation doc enhancements (QD, InbreedingCoeff, ExcessHet and AS versions where applicable)
2015-11-22 13:58:20 -05:00
Geraldine Van der Auwera a7748368f8 Yet more doc improvements prior to 3.5 release
Improve doc block of GatherBqsrReports
Annotation doc enhancements (QD, InbreedingCoeff, ExcessHet and AS versions where applicable)
2015-11-22 10:59:24 -05:00