Geraldine Van der Auwera
68b068d2b3
Merge pull request #1329 from broadinstitute/cn_murmur_hash
...
Use older version of murmur hash implementation included in gatk…
2016-03-31 13:00:43 -04:00
Ron Levine
edc1b20132
Output a summary of WARN messages
2016-03-29 11:39:18 -04:00
Chris Norman
017828e4b7
Use the older murmur hash implementation that is included in the gatk jar.
2016-03-29 10:20:44 -04:00
Chris Norman
c81795cf50
Fix license in DroppedReadsTracker.
2016-03-22 13:39:13 -04:00
Chris Norman
a8ebf21ac3
Implementation of dropped read tracking.
2016-03-21 18:36:29 -04:00
Ron Levine
b873467756
Merge pull request #1321 from broadinstitute/rhl_fix_hr_logger_name
...
Change logger class to HomopolymerRun
2016-03-16 11:44:40 -04:00
Ron Levine
5c0e97def4
Change logger class to HomopolymerRun
2016-03-16 10:01:21 -04:00
Laura Gauthier
9ffdfeccd5
Add test case for fix
2016-03-15 11:06:48 -04:00
Laura Gauthier
31fc64f82c
To address the odd case where all genotypes are hom-ref but an alt allele is present, skip AS_QD for all-ref sites and remove raw annotations whether or not they can be successfully finalized
2016-03-15 09:34:47 -04:00
Ron Levine
6d7ec7377c
Set --reference_window_stop if homopolymer is greater than window size
2016-03-14 18:09:41 -04:00
Geraldine Van der Auwera
c81d5b898e
Clarify VQSR inputs documentation
2016-03-11 16:38:33 -05:00
Geraldine Van der Auwera
4990ed706a
Fixup for licensing update
2016-03-11 16:23:02 -05:00
Takuto Sato
6308b0f036
Add MuTect2 Tumor-only test.
2016-03-11 09:10:43 -05:00
ldgauthier
d0432713e0
Merge pull request #1311 from broadinstitute/ldg_BetaTestAnnotationsGroup
...
Add classes from "annotation party" to BetaTesting group
2016-03-11 08:13:32 -05:00
Geraldine Van der Auwera
16ef36088e
Merge pull request #1308 from broadinstitute/gvda_fix_license_quotes_#1307
...
Update licenses
2016-03-10 12:37:53 -05:00
Laura Gauthier
28cfb06513
Add AS_culprit and AS_VQSLOD to VCF header in ApplyRecalibration so output passes VCF validation
2016-03-09 08:22:58 -05:00
Laura Gauthier
d9f9bd1d56
Add classes from "annotation party" to BetaTesting group
2016-03-09 08:17:44 -05:00
Geraldine Van der Auwera
9a306ca221
Update licenses
2016-03-05 01:09:43 -08:00
Geraldine Van der Auwera
2b70f14740
Misc documentation improvements
...
Added caveat to VariantFiltration documentation
Fixed PON creation example in M2 doc
Improved MalformedReadFilter doc
Updated N CIGAR error message
2016-03-03 15:48:54 -08:00
Ron Levine
e5c5804141
Modify MD5s to correct RankSum annotations
2016-03-01 14:05:51 -05:00
Ron Levine
625941dc50
Merge pull request #1290 from broadinstitute/rhl_sac_nonref
...
StrandAlleleCountsBySample selects most likely only from VCF alleles
2016-02-29 17:05:57 -05:00
Ron Levine
5e2ffc188b
Merge pull request #1295 from broadinstitute/rhl_sv_error_output_1194
...
Correct error messages and error handling in multiple tools
2016-02-29 17:05:24 -05:00
Ron Levine
40a5adf767
Change error output to use the correct argument
2016-02-29 13:21:03 -05:00
Ron Levine
80a22aad77
StrandAlleleCountsBySample selects most likely only from VCF alleles
2016-02-29 12:41:55 -05:00
meganshand
c7e0f5b225
Removes Dithering from Rank Sum Test
...
Fixing empty group case
Fixing MD5s
First comments addressed
Added permutation test
Adding new RankSum to AS_RankSum
Speeding up permutation algorithm and updating MD5s
Missed a few tests
Addressing comments
Changing md5s
2016-02-29 11:45:27 -05:00
Takuto Sato
243a0fcb74
Allele-specific insert size ranksum annotation
2016-01-28 16:03:57 -05:00
Laura Gauthier
5592e4ead0
Add new -AS mode to run VQSR (both VariantRecalibrator and ApplyRecalibration) in an allele-specific manner
2016-01-22 13:18:21 -05:00
Ron Levine
ed933013fe
Remove variant contig order check
2016-01-16 19:32:28 -05:00
Eric Banks
c57c32b915
Merge pull request #1270 from broadinstitute/eb_small_genotyping_optimizations
...
Small optimizations to the joint calling code.
2016-01-13 08:46:57 -05:00
Eric Banks
ab2f541d1f
Small optimizations to the joint calling code.
...
Thanks to profiling I noticed that the determineCoefficient() method was being called too often.
Because it returns a constant result in half of the invocations, its value should be cached when possible.
Also, the various calls to getLog10Likelihoods() showed up in the profiler, so I pulled those out too.
All told, it speeds up the genotyping by about 10 percent according to the profiler.
2016-01-12 21:48:20 -05:00
Laura Gauthier
593c9ddf01
Allow VariantsToTable to evaluate the type of each split variant when -F TYPE and -SMA are specified
2016-01-12 08:12:29 -05:00
Laura Gauthier
204cad3646
Remove "mem_free" from resident memory request params for Queue because it doesn't work and wouldn't actually reserve memory anyway
2016-01-08 10:27:56 -05:00
Ron Levine
d16ed98c9e
Backport maxNoCall functionality from GATK4
2016-01-06 11:09:38 -05:00
ldgauthier
1d72ab099c
Merge pull request #1247 from broadinstitute/ldg_VQSRmodelOutput
...
Add optional argument for VQSR to output the model to a file as a GAT…
2016-01-05 08:35:18 -05:00
Ron Levine
fa1d90d236
Merge consecutive SNPs on the same read
2016-01-04 13:48:59 -05:00
Laura Gauthier
2ddc48914e
Add optional argument for VQSR to output the model to a file as a GATKReport
...
GATKReport output also has mean and variance for annotation normalization info
2016-01-04 08:37:08 -05:00
ldgauthier
71c6709765
Merge pull request #1145 from broadinstitute/ldg_M2_HapMapSensitivity
...
Fix no-normal bug; add HapMap sensitivity benchmarking
2016-01-04 08:27:37 -05:00
Ron Levine
aa5e88a393
Fix exception when writing gVCF to stdout
2015-12-29 15:30:53 -05:00
meganshand
eb6bdb2a62
MQ of Mate RankSum annotation
...
Intermediate commit for tests
Adding tests
Fixing tests after rebase
Fixing one MD5
Fixing documentation
Removing annotation from standard group
Adding documentation
2015-12-23 10:24:40 -05:00
Laura Gauthier
f9e9d2e273
Fix no-normal bug; add HapMap sensitivity benchmarking
2015-12-22 08:29:01 -05:00
Ron Levine
9c8f035780
LeftAlignAndTrimVariants --splitMultiallelics keeps GT if valid
2015-12-14 10:42:32 -05:00
Geraldine Van der Auwera
4767a83d8a
Update pom versions to mark the start of GATK 3.6 development
2015-11-25 01:52:51 -05:00
Geraldine Van der Auwera
bf875974d1
Prep MuTect2 and ContEst for release
...
Renamed M2 to MuTect2
Renamed ContaminationWalker to ContEst
Refactored related tests and usages (including in Queue scripts)
Moved M2 and ContEst + accompanying classes from private to protected
Made QSS a StandardSomaticAnnotation (new annotation group/interface) to prevent it from being sucked in with the rest of the StandardAnnotation group
2015-11-24 16:43:20 -05:00
Mark Fleharty
5a3756410c
Merge pull request #1231 from broadinstitute/mf_fixBQSRIntegrationTest
...
Fixes testPRWithConflictingArguments_qqAndSQQ to use -ql rather than -q1
2015-11-23 17:08:30 -05:00
Geraldine Van der Auwera
b0730c2b81
Merge pull request #1239 from broadinstitute/gvda_straggler_doc_fixes_1237
...
Improve doc block of GatherBqsrReports
Annotation doc enhancements (QD, InbreedingCoeff, ExcessHet and AS versions where applicable)
2015-11-22 13:58:20 -05:00
Geraldine Van der Auwera
a7748368f8
Yet more doc improvements prior to 3.5 release
...
Improve doc block of GatherBqsrReports
Annotation doc enhancements (QD, InbreedingCoeff, ExcessHet and AS versions where applicable)
2015-11-22 10:59:24 -05:00
Geraldine Van der Auwera
46ba0e519e
Restore FindCoveredIntervals + add docs
2015-11-22 10:19:04 -05:00
Geraldine Van der Auwera
22fa1511be
Merge pull request #1235 from broadinstitute/gvda_deprecate_useless_tools_1192
...
Deprecate tools that were outdated or redundant
2015-11-21 14:58:00 -05:00
Geraldine Van der Auwera
1cf66addaa
Deprecate tools that were outdated or redundant
...
ReadAdaptorTrimmer (unsound and untested)
BaseCoverageDistribution (redundant with DiagnoseTargets)
CoveredByNSamplesSites (redundant with DiagnoseTargets)
FindCoveredIntervals (redundant with DiagnoseTargets)
VariantValidationAssessor (has a scary TODO -- REWRITE THIS TO WORK WITH VARIANT CONTEXT comment and zero tests)
LiftOverVariants, FilterLiftedVariants and liftOverVCF.pl (in #1106) (use Picard liftover tool)
sortByRef.pl (use Picard SortVCF)
ListAnnotations (useless)
Also deleted the java archive from the private repository (old junk we never use)
2015-11-20 22:49:40 -05:00
meganshand
2570cab24c
Assorted documentation fixes, enhancements and reorganization.
...
See issues referenced by the pull request for details.
2015-11-20 22:44:46 -05:00
Mark Fleharty
1443ee8c7f
Fixes testPRWithConflictingArguments_qqAndSQQ to use -ql rather than -q1
2015-11-20 11:23:02 -05:00
Ron Levine
ccaddefa19
Validate VCF with sequence dictionary
2015-11-20 09:23:24 -05:00
Yossi Farjoun
4da0d1300c
adding fraction informative reads annotation.
2015-11-18 08:39:47 -05:00
David Roazen
9d5be24778
Move GatherBqsrReports from private to protected
2015-11-10 17:40:58 -05:00
Laura Gauthier
25b8ba45f4
More allele-specific annotations: AS_QD and AS_InbreedingCoeff
...
Grouped default output annotations to keep them from getting dropped when -A is specified; addresses #918
Also refactored code shared by ExcessHet and InbreedingCoeff
2015-11-09 16:38:31 -05:00
vruano
e3d5d96076
Added the AF independent calculator for any ploidy, but it seems that it does not do a good job for haploids
...
Addresses issue #1078 by implementing an any-ploidy version of the independent-allele-exact-ac-calculator already available for diploids.
Note that this will change results somewhat when dealing with noisy data (low GQs).
2015-11-07 16:17:30 -05:00
Mark Fleharty
8857bc9b3f
Resolves issue #1061 to use testid1 rather than testid in two integration tests.
2015-11-05 22:20:20 -05:00
Eric Banks
975f8a8502
Merge pull request #1206 from broadinstitute/eb_suppress_alt_allele_warnings
...
Suppress emission of the scary warning message from genotyping to no …
2015-11-05 16:12:50 -05:00
Eric Banks
2cc7de4886
Suppress emission of the scary warning message from genotyping to no more than once in
...
anything but DEBUG logging mode. Otherwise it fills up our output logs.
2015-11-05 14:19:21 -05:00
meganshand
e4627ed5c3
Addressing comments
2015-11-04 11:00:01 -05:00
meganshand
b5165b8d30
Fix for out of date VCF version output
2015-11-03 17:35:47 -05:00
ldgauthier
3d1dc303b3
Merge pull request #1197 from broadinstitute/ts_ve_nullPointer
...
Prevent null pointer exception in PrintMissingComp module
2015-11-02 14:42:50 -05:00
Takuto Sato
33462c7b50
Removed the line that caused a null pointer, as the information it logged was not useful. Updated docs and added an integration test to ensure the code no longer throws the exception.
2015-11-02 12:45:09 -05:00
Laura Gauthier
f7eb5d3082
Enable family-level stratification (if a ped file is provided)
2015-10-28 09:55:04 -04:00
Laura Gauthier
68a2f1243d
Finished draft of code for new map-combine-reduce annotation framework
...
All VQSR annotations can be generated in allele-specific mode
Pull out allele-specific annotations in AS_Standard annotation group
2015-10-27 09:44:49 -04:00
Laura Gauthier
fcaf37279c
Finished draft of code for new map-combine-reduce annotation framework
...
All VQSR annotations can be generated in allele-specific mode
Pull out allele-specific annotations in AS_Standard annotation group
2015-10-27 09:23:29 -04:00
Ron Levine
36ca9fe898
Allow LeftAlignAndTrimVariants to handle alleles longer than the default processing window
2015-10-25 20:33:56 -04:00
Ron Levine
795fe75886
Update doc for multiallelics, trimming is the default behavior
2015-10-22 04:04:09 -04:00
Takuto Sato
df7a482335
VariantAnnotator now supports annotating FILTER field from an external resource.
...
Updated the docs.
2015-10-14 14:26:21 -04:00
Ron Levine
2bcded11cb
VariantAnnotator checks alleles when annotating with external resource
2015-10-08 17:01:30 -04:00
Eric Banks
622ec352bb
Fix for combining records in which one has a spanning deletion and needs a padded reference allele.
...
This was erroring out and not working.
2015-10-02 16:28:16 -04:00
Kate Noblett
506958a0b7
Implemented a new VariantEval evaluation module, MetricsCollection. Fixed null pointer exception, updated tests.
2015-09-30 17:21:30 -04:00
Ron Levine
792142ec50
Implement BaseCounts per-sample
2015-09-30 08:59:11 -04:00
Samuel Lee
c7f76b945e
addressing PR comments
2015-09-24 15:42:51 -04:00
Samuel Lee
0dacf60012
Changed calls for RGQ=0 from 0/0 to ./. in output of GenotypeGVCFs.
2015-09-23 15:35:09 -04:00
Ron Levine
3ecabf7e45
Allow overriding ValidateVariants' hard-coded cutoff for allele length
2015-09-17 10:49:14 -04:00
meganshand
2507bf8d17
Fixed 7-PL genotypes in InbreedingCoeff tests
2015-09-14 12:00:45 -04:00
meganshand
d767e1722e
Excess Het P-value
...
Added input exception
Added header line
Updated MD5s
Changing more MD5s
Made edge case clearer
Fixed formatting
Changed mid-point to mode
2015-09-14 12:00:44 -04:00
Laura Gauthier
53b506a0b8
Make sure inputPriors get used if they are specified
...
Fix usage of AF prior (i.e. theta) in probability of non-reference calculation
Refactored duplicate functions
Updated docs for heterozygosity
2015-09-10 10:08:03 -04:00
Ron Levine
83a7012d69
Mask snps with --snpmask
2015-09-09 16:20:48 -04:00
Eric Banks
b0dea2ccca
Merge pull request #1150 from broadinstitute/eb_keep_iupac_in_IR
...
Don't have the Indel Realigner change IUPAC reference bases.
2015-09-04 13:43:34 -04:00
Eric Banks
5f76ae6a37
Don't have the Indel Realigner change IUPAC reference bases.
...
This change doesn't affect the performance of the Indel Realigner at all (as per tests).
This is just a request from the Picard side (where further testing is happening).
2015-09-04 13:42:23 -04:00
ldgauthier
cad81a6181
Merge pull request #1149 from broadinstitute/ldg_fixCGPbugForAndrea
...
Fix bug when using --ignoreInputSamples
2015-09-04 11:17:15 -04:00
Ron Levine
29ac64f6ce
Calculate GenotypeAnnotations before InfoFieldAnnotations
2015-09-03 09:22:46 -04:00
Laura Gauthier
4769ef8dad
Fix bug when using --ignoreInputSamples
2015-09-02 09:27:06 -04:00
Samuel Lee
41256e1405
Added file-extension--dependent interval-list output to RealignerTargetCreator.
2015-08-31 11:22:18 -04:00
Mark Fleharty
daeb55429e
Adding Static Binning to BQSR
2015-08-24 13:36:17 -04:00
Ron Levine
2afe3f7a21
Make GenotypeGVCFs subset Strand Allele Counts intelligently
2015-08-22 08:33:09 -04:00
Bertrand Haas
f61529d254
Logit transform to MQ + jitter MQ capped improves VQSR
2015-08-20 17:53:01 -04:00
Ron Levine
900fe3f675
Merge pull request #1132 from broadinstitute/rhl_rev_htsjdk
...
Move htsjdk & picard to rev 1.138
2015-08-20 11:58:41 -04:00
Bertrand Haas
eae4c875a9
Logistic transform of MQ + jitter to capped MQ in VariantDataManager
2015-08-20 11:10:45 -04:00
Ron Levine
beec624a63
Move htsjdk & picard to rev 1.138
2015-08-20 10:42:25 -04:00
Geraldine Van der Auwera
5a875cb841
Fixed missing code tag
2015-08-14 14:58:28 -04:00
Geraldine Van der Auwera
19bbe45cbc
Updated licenses for 2015
2015-08-06 15:23:11 -04:00
David Benjamin
5fcc3788bd
UnifiedGenotypingEngine queries VariantContext for model if not given
2015-08-05 15:30:37 -04:00
Eric Banks
df033f674d
Patch for the incorrect "fixing" of mates when supplementary alignments are present.
...
Note that this patch involves ignoring supplementary alignments. Ideally we would want
to fix their mates properly but that would require a major refactoring of this soon-to-be
deprecated tool.
2015-08-05 12:55:39 -04:00
vruano
604fb7aaf8
Faster implementation of the active state profile value calculation when running HC with a single sample.
...
Found a dev-bug and added TODOs (reported in #1096).
Addresses issue #1095.
Conflicts:
protected/gatk-tools-protected/src/main/java/org/broadinstitute/gatk/tools/walkers/haplotypecaller/HaplotypeCaller.java
2015-07-30 10:56:05 -04:00
Valentin Ruano Rubio
bb4c9fa1d3
Merge pull request #1099 from broadinstitute/vrr_magic_numbers
...
Extracted some constant expressions involved HC variation discovery a…
2015-07-29 13:38:23 -04:00
vruano
02c7876c72
Extracted some constant expressions involved in HC variation discovery and genotyping.
...
Addresses issue #1092.
2015-07-29 11:58:13 -04:00
meganshand
4d4de27ba3
Removes unique(int maxSize) from KBestHaplotypeFinder
2015-07-28 15:54:21 -04:00
Louis Bergelson
9d9827f176
Merge pull request #1031 from broadinstitute/lb_update_for_java8
...
Updated gatk so it compiles with java 8
2015-07-28 11:09:19 -04:00
Valentin Ruano Rubio
3a3ff558c4
Merge pull request #1085 from broadinstitute/vrr_path_builder
...
ReferenceConfidenceModel likelihood calculation in non…
2015-07-28 10:48:03 -04:00
Geraldine Van der Auwera
43a37fc746
Merge pull request #1075 from broadinstitute/ldg_bamoutDocs
...
Add info about multiple input samples (as relevant for M2)
2015-07-27 16:56:36 -04:00
Geraldine Van der Auwera
5939b4c100
Merge pull request #1073 from broadinstitute/ldg_SV-MVtestNameFix
...
Fix logging name on SelectVariantsIntegrationTest::testInvertMendelia…
2015-07-27 16:54:59 -04:00
vruano
8f6daf70db
Refactoring of ReferenceConfidenceModel likelihood calculation in non variant sites
...
Changed a division by -10.0 to a multiplication by -.1 in QualUtils (typically multiplication is faster than division).
Addresses performance issue #1081 .
2015-07-26 08:33:46 -04:00
vruano
047aea9707
Address performance issue #1077
2015-07-23 13:44:10 -04:00
Laura Gauthier
4fefedfb0b
Fix logging name on SelectVariantsIntegrationTest::testInvertMendelianViolationSelection()
2015-07-23 09:48:15 -04:00
Laura Gauthier
85b340caed
Add info about multiple input samples (as relevant for M2)
...
Also generalize references to the tool/caller since this code is now shared by HC and M2
2015-07-23 09:46:10 -04:00
Valentin Ruano Rubio
66cf22b28f
Merge pull request #1069 from broadinstitute/vrr_ad_genotype_gvcfs_bugfix
...
Fix AD propagation when subsetting alleles in non-diploid GenotypeGVCF.
2015-07-22 18:53:43 -04:00
vruano
315e193e51
Fix AD propagation when subsetting alleles in non-diploid GenotypeGVCF.
...
Addresses issue #913.
Also remove some commented out code and toxic debugging code that uses System.out/err.println.
2015-07-22 17:08:13 -04:00
Geraldine Van der Auwera
75081bee2b
Merge pull request #1068 from broadinstitute/gvda_remove_beagle_walkers_971
...
Removed walkers for handling Beagle data
2015-07-22 15:47:19 -04:00
Joseph White
3bd988825f
Removed walkers for handling Beagle data
...
Added deprecation statements to DeprecatedToolChecks.java
Removed integration test for Beagle walker
Added URL for Beagle documentation
2015-07-21 18:36:08 -04:00
Geraldine Van der Auwera
ca082bfb76
Updated license text and fixed a couple of typos in doc block
2015-07-21 17:55:48 -04:00
Valentin Ruano Rubio
9360e1d293
Merge pull request #1059 from broadinstitute/vrr_true_false_list_removal
...
More efficient implementation of the indel read qualities recalculati…
2015-07-21 17:13:45 -04:00
vruano
82f1236633
More efficient implementation of the indel read qualities recalculation for the PCR error model.
...
Addresses #1054.
2015-07-21 14:25:11 -04:00
Geraldine Van der Auwera
a4dde8f500
Merge pull request #1040 from broadinstitute/rhl_fasta_ref_maker
...
Merge contiguous intervals properly, closes #1035
2015-07-21 14:19:09 -04:00
Geraldine Van der Auwera
da0c8c73fb
Merge pull request #1055 from broadinstitute/ldg_TRAdocs
...
Updated TandemRepeatAnnotator docs
2015-07-21 14:16:20 -04:00
Laura Gauthier
8c18ead5e4
Clarify VCF version for supporting population alleles files
...
Clarify DeNovoPrior definition on PbyT
2015-07-20 13:42:57 -04:00
Laura Gauthier
7b29c55eb6
Updated TandemRepeatAnnotator docs
2015-07-17 17:26:56 -04:00
vruano
7f74303f2b
Removes a very inefficient way to iterate in ReferenceConfidenceModel.isReadInformativeAboutIndelsOfSize(...)
...
Addresses performance issue #1048.
2015-07-16 12:04:12 -04:00
Ron Levine
6e46b3696e
Merge contiguous intervals properly
2015-07-14 15:23:37 -04:00
Geraldine Van der Auwera
c109a953f8
Merge pull request #1029 from broadinstitute/rhl_vqslod_definition
...
Make VQSLOD definition accurate
2015-07-06 19:52:15 -04:00
Ron Levine
1a7e83fa50
Merge if both GT are phased
2015-06-30 13:03:16 -04:00
Eric Banks
f994220617
Update the allele remapping code to handle the new spanning deletion allele.
...
Now that Ron updated the GATK so that we use star to represent spanning
deletions, we need to catch those cases in the code that remaps alleles.
Otherwise, we try to pad the stars and that's just bad.
Added test from actual failing data.
2015-06-29 17:58:22 -04:00
Louis Bergelson
e1c41b2c38
Updated gatk so it compiles on java 8
...
updated cofoja to 1.2 from 1.0
added explicit type casts in places that java 8 required them
2015-06-26 15:59:46 -04:00
Ron Levine
09686f4595
Make VQSLOD definition accurate
2015-06-25 16:47:50 -04:00
Geraldine Van der Auwera
719bb15340
Merge pull request #1019 from broadinstitute/rhl_var_index_param_gz
...
Indexing parameters not required if output file has the g.vcf.gz exte…
2015-06-17 14:30:20 -04:00
Geraldine Van der Auwera
697c4b0cf1
Added else clause to handle symbolic alleles
...
Add test for createAlleleMapping
2015-06-17 10:52:56 -04:00
Eric Banks
29ebfc32c3
Merge pull request #1020 from broadinstitute/eb_handle_multiple_spanning_dels
...
Handle cases where a given sample has multiple spanning deletions.
2015-06-16 14:20:46 -04:00
Eric Banks
fe0b5e0fbe
Handle cases where a given sample has multiple spanning deletions.
...
When a sample has multiple spanning deletions and we are asked to assign
likelihoods to the spanning deletion allele, we currently choose the first
deletion. Valentin pointed out that this isn't desired behavior. I
promised Valentin that I would address this issue, so here it is.
I do not believe that the correct thing to do is to sum the likelihoods
over all spanning deletions (I came up with problematic cases where this
breaks down).
So instead I'm using a simple heuristic approach: using the hom alt PLs, find
the most likely spanning deletion for this position and use its likelihoods.
In the 10K-sample VCF from Monkol there were only 2 cases in which this problem
popped up. In both cases the heuristic approach works well.
2015-06-16 12:20:43 -04:00
Laura Gauthier
ce5ecf1383
Enable contamination correction via downsampling (as for HaplotypeCaller), added test
...
Add oxoG read count annotation and add as default annotation
Add ##SAMPLE VCF header line in accordance with TCGA VCF spec, specifying "File" line in sample header with BAM file name and "SampleName" with BAM sample name (Don't print sample file path if --no_cmdline_in_header is specified to help with test consistency)
Turn on active region assembly-based physical phasing for M2
Clean up M2-related annotations so UG doesn't crash if M2 annotations are called
2015-06-15 07:59:15 -04:00
Ron Levine
b35085ca28
Indexing parameters not required if output file has the g.vcf.gz extension
2015-06-13 11:46:56 -04:00
Ron Levine
dbed660183
Add spanning deletions allele
2015-06-12 16:43:06 -04:00
Geraldine Van der Auwera
526f7c0d07
Merge pull request #985 from broadinstitute/sa_refactor_cleansing_hack_negative_zeros_973_depends_on_841
...
removed in-line conditional (hack) that changed the result from 0.0 to -0.0; see issue #841
2015-05-23 00:02:52 -04:00
Sheila Chandran
dac0b8ddfc
Added QD calculation
2015-05-22 11:59:10 -04:00
Ron Levine
a6ca97ef14
Site-level selection based on genotype filter status
2015-05-21 11:27:20 -04:00
melonistic
8d25b2ba40
removed in-line conditional (hack) that changed the result from 0.0 to -0.0; see issue #841
...
removed irrelevant -0 comments as specified in issue #841 but committed in #973
2015-05-16 23:12:09 -04:00
Geraldine Van der Auwera
d1a7edd796
Update pom versions to mark the start of GATK 3.5 development
2015-05-15 00:44:54 -04:00
Geraldine Van der Auwera
f19618653a
Update pom versions for the 3.4 release
2015-05-15 00:40:39 -04:00
David Roazen
caafe84e74
Rev htsjdk to version 1.132 and picard to version 1.131, and switch to using the versions in maven central
...
-We now pull htsjdk and picard from maven central.
-Updated the GATK codebase as necessary to adapt to changes in the Feature
interface.
-Since VCFHeader now requires that all header lines have unique keys, uniquified
the keys of GVCFBlock header lines by including the min/max GQ in the key.
Updated MD5s accordingly.
-Other MD5s changed as a result of an htsjdk fix to eliminate "-0" in VCF output.
2015-05-14 15:26:23 -04:00
Geraldine Van der Auwera
f6b3d8e862
Merge pull request #947 from broadinstitute/rhl_invert_selection
...
Added --invert_selection flag for variant selection queries
2015-05-13 13:40:32 -04:00
Eric Banks
c752b9bca6
Fixed a small feature/bug that I introduced with the spanning deletions genotyping.
...
In the case where there's a low quality SNP under a spanning deletion in the gvcfs:
if the SNP is not genotyped by GenotypeGVCFs (because it's just noise) we were still
emitting a record with just the symbolic DEL allele (because that allele is high quality).
We no longer do that.
2015-05-13 11:19:40 -04:00
Ron Levine
4a75d54e65
Added invert and exclude flags for variant selection queries
2015-05-12 15:08:28 -04:00
Geraldine Van der Auwera
7a75f4ae79
Merge pull request #974 from broadinstitute/jw_Var2BinPEDSwap
...
Correct errant array element swap in FAM file output.
2015-05-12 08:49:16 -04:00
Eric Banks
53a34cea4a
Merge pull request #938 from broadinstitute/eb_fix_spanning_deletions_in_genotyping
...
Added a fix for genotyping positions over spanning deletions.
2015-05-11 23:11:47 -04:00
Joseph White
abb6bc6f57
Correct errant array element swap in FAM file output.
...
dad and mom are swapped; paternal first, then maternal
updated MD5 chksums for test files
remove commented lines
2015-05-11 20:45:50 -04:00
Eric Banks
530e0e5ea6
Added a fix for combining/genotyping positions over spanning deletions.
...
Previously, if a SNP occurred in sample A at a position that was in the middle of a deletion for sample B,
sample B would be genotyped as homozygous reference there (but it's NOT reference - there's a deletion).
Now, sample B is genotyped as having a symbolic DEL allele.
Minor cleanup added. Note that I also removed Laura's previous fix for this problem.
Existing integration tests change because I've added a new header line to the VCF being output.
I also added several tests for the new functionality showing:
1. genotyping from separate and already combined gvcfs give the same output
2. genotyping over multiple spanning deletions works
3. combining works too
Existing unit tests also cover this case.
2015-05-11 15:11:16 -04:00
Joseph White
5be8bc5dfc
Deprecate --mergeVariantsViaLD in HC
...
New unit test for deprecated mergeVariantsViaLD
Update HaplotypeCallerIntegrationTest.java
Delete duplicate testHaplotypeCallerMergeVariantsViaLDException test.
2015-05-08 17:50:25 -04:00
Geraldine Van der Auwera
5d8b9a7c20
Moved MQ0 out of HC exclusion and into StandardUGAnnotation
2015-05-03 01:04:49 +02:00
Geraldine Van der Auwera
071d82d1bf
Un-exclude SD and TRA from HC annotators; resolves #966
...
Exclude MQ0BySample
Move SD and TRA to new StandardUGAnnotation interface
There is now an annotation interface (StandardUGAnnotation) holding annotations that are standard in UG but shouldn't be used as they are now with HC. This allows us to avoid excluding these annotations explicitly in HC while still being able to use them for development purposes.
2015-05-03 00:45:53 +02:00