93 Commits (2baf7d8c4e5f8a9e975be8dd4f0b71f7e118bd7e)

Author SHA1 Message Date
Geraldine Van der Auwera 9a306ca221 Update licenses 2016-03-05 01:09:43 -08:00
Ron Levine 5e2ffc188b Merge pull request #1295 from broadinstitute/rhl_sv_error_output_1194
Correct error messages and error handling in multiple tools
2016-02-29 17:05:24 -05:00
Ron Levine 40a5adf767 Change error output to use the correct argument 2016-02-29 13:21:03 -05:00
meganshand c7e0f5b225 Removes Dithering from Rank Sum Test
Fixing empty group case

Fixing MD5s

First comments addressed

Added permutation test

Adding new RankSum to AS_RankSum

Speeding up permutation algorithm and updating MD5s

Missed a few tests

Addressing comments

Changing md5s
2016-02-29 11:45:27 -05:00
Yossi Farjoun 7896055be3 - Fixed bug in GenomeLoc parser
- Added a warning when two contigs are so similar that they might cause parsing problems
- Added tests of modified parser and of warning.
2016-02-02 06:53:22 -05:00
Takuto Sato 243a0fcb74 Allele-specific insert size ranksum annotation 2016-01-28 16:03:57 -05:00
Laura Gauthier 5592e4ead0 Add new -AS mode to run VQSR (both VariantRecalibrator and ApplyRecalibration) in an allele-specific manner 2016-01-22 13:18:21 -05:00
Geraldine Van der Auwera c93a611ea3 Remove unneeded dependency
Addresses https://github.com/broadgsa/gatk/pull/15 for Guillermo
2016-01-21 16:51:01 -05:00
Ron Levine ed933013fe Remove variant contig order check 2016-01-16 19:32:28 -05:00
meganshand eb6bdb2a62 MQ of Mate RankSum annotation
Intermediate commit for tests

Adding tests

Fixing tests after rebase

Fixing one MD5

Fixing documentation

Removing annotation from standard group

Adding documentation
2015-12-23 10:24:40 -05:00
Ron Levine 9c8f035780 LeftAlignAndTrimVariants --splitMultiallelics keeps GT if valid 2015-12-14 10:42:32 -05:00
Geraldine Van der Auwera 4767a83d8a Update pom versions to mark the start of GATK 3.6 development 2015-11-25 01:52:51 -05:00
Geraldine Van der Auwera 46ba0e519e Restore FindCoveredIntervals + add docs 2015-11-22 10:19:04 -05:00
Ron Levine 08a9c80559 Make the header sequence dictionary match reference 2015-11-21 19:12:37 -05:00
Geraldine Van der Auwera 22fa1511be Merge pull request #1235 from broadinstitute/gvda_deprecate_useless_tools_1192
Deprecate tools that were outdated or redundant
2015-11-21 14:58:00 -05:00
Geraldine Van der Auwera 1cf66addaa Deprecate tools that were outdated or redundant
ReadAdaptorTrimmer (unsound and untested)
BaseCoverageDistribution (redundant with DiagnoseTargets)
CoveredByNSamplesSites (redundant with DiagnoseTargets)
FindCoveredIntervals (redundant with DiagnoseTargets)
VariantValidationAssessor (has a scary TODO -- REWRITE THIS TO WORK WITH VARIANT CONTEXT comment and zero tests)
LiftOverVariants, FilterLiftedVariants and liftOverVCF.pl (in #1106) (use Picard liftover tool)
sortByRef.pl (use Picard SortVCF)
ListAnnotations (useless)

Also deleted the java archive from the private repository (old junk we never use)
2015-11-20 22:49:40 -05:00
meganshand 2570cab24c Assorted documentation fixes, enhancements and reorganization.
See issues referenced by the pull request for details.
2015-11-20 22:44:46 -05:00
Ron Levine ccaddefa19 Validate VCF with sequence dictionary 2015-11-20 09:23:24 -05:00
Yossi Farjoun 4da0d1300c adding fraction informative reads annotation. 2015-11-18 08:39:47 -05:00
Laura Gauthier 25b8ba45f4 More allele-specific annotations: AS_QD and AS_InbreedingCoeff
Grouped default output annotations to keep them from getting dropped when -A is specified; addresses #918
Also refactored code shared by ExcessHet and InbreedingCoeff
2015-11-09 16:38:31 -05:00
Laura Gauthier fcaf37279c Finished draft of code for new map-combine-reduce annotation framework
All VQSR annotations can be generated in allele-specific mode
Pull out allele-specific annotations in AS_Standard annotation group
2015-10-27 09:23:29 -04:00
meganshand a57500b2fc ROCCurve High Confidence Mode
Integration Tests

Updated test

Changed method

Minor changes

Changed whitespace

Fixed uncalled counts and 0 in R

Fixed ReadBackedPileUp

Removed imports and changed MD5

Fixed failing test

Adding vqslod color

Updating script to create KB

Fixing integration test now that the KB is bigger

Addressing comments
2015-10-21 21:30:54 -04:00
Ron Levine 2bcded11cb VariantAnnotator checks alleles when annotating with external resource 2015-10-08 17:01:30 -04:00
Eric Banks 622ec352bb Fix for combining records in which one has a spanning deletion and needs a padded reference allele.
This was erroring out and not working.
2015-10-02 16:28:16 -04:00
Ron Levine 792142ec50 Implement BaseCounts per-sample 2015-09-30 08:59:11 -04:00
Samuel Lee 0dacf60012 Changed calls for RGQ=0 from 0/0 to ./. in output of GenotypeGVCFs. 2015-09-23 15:35:09 -04:00
ldgauthier 5870225f83 Merge pull request #1153 from broadinstitute/ms_excess_het
Excess Het P-value
2015-09-15 11:52:25 -04:00
Khalid Shakir 24e24b9468 Using `SamIndexes.asBaiSeekableStreamOrNull()` to support `.cram.crai`.
Updated other IntelliJ IDEA warnings in GATKBAMIndex.
Updated example .cram files to match versions generated by current GATK/HTSJDK.
Bumped HTSJDK and Picard to 1.139 releases.
Added support for using `-SNAPSHOT` of HTSJDK in the future.
2015-09-14 12:20:36 -04:00
meganshand d767e1722e Excess Het P-value
Added input exception

Added header line

Updated MD5s

Changing more MD5s

Made edge case clearer

Fixed formatting

Changed mid-point to mode
2015-09-14 12:00:44 -04:00
Laura Gauthier 53b506a0b8 Make sure inputPriors get used if they are specified
Fix usage of AF prior (i.e. theta) in probability of non-reference calculation
Refactored duplicate functions
Updated docs for heterozygosity
2015-09-10 10:08:03 -04:00
Eric Banks 5f76ae6a37 Don't have the Indel Realigner change IUPAC reference bases.
This change doesn't affect the performance of the Indel Realigner at all (as per tests).
This is just a request from the Picard side (where further testing is happening).
2015-09-04 13:42:23 -04:00
Laura Gauthier 3dc68732fb Little changes to M2 code and docs
Make MQ threshold a parameter (compare to M1 by setting to zero)
Add logic for multiple alternate alleles in tumor
Exclude MQ0 normal reads from normal LOD calculation
Fix path errors in Dream_Evaluations.md
Move M2 eval scripts out of walkers package so they run
2015-08-27 15:31:27 -04:00
Ron Levine 2afe3f7a21 Make GenotypeGVCFs subset Strand Allele Counts intelligently 2015-08-22 08:33:09 -04:00
Bertrand Haas 158477ea6c Re-ran the updateAllLicenses.sh script 2015-08-21 11:32:51 -04:00
Ron Levine beec624a63 Move htsjdk & picard to rev 1.138 2015-08-20 10:42:25 -04:00
Khalid Shakir 9bee183f6c Switched to using CRAM's SamReader.Indexing implementation.
CRAM now requires .bai index, just like BAM.
Test updates:
- Updated existing MD5s, as TLEN has changed.
- Tests multiple contigs.
- Tests several intervals per contig.
- Tests when `.cram.bai` is missing, even when `.cram.crai` is present.
Updated gatk docs for CRAM support, including:
- Arguments that work for both BAM and CRAM listed as such.
- Arguments that don't work for CRAM either explicitly say "BAM" or "doesn't work for CRAM".
- Instructions on how to recreate a `.cram.bai` using cramtools.
Cleaned up IntelliJ IDEA warnings regarding `Arrays.asList()` -> `Collections.singletonList()`.
2015-08-11 17:52:49 -03:00
Geraldine Van der Auwera 19bbe45cbc Updated licenses for 2015 2015-08-06 15:23:11 -04:00
vruano 8f6daf70db Refactoring of ReferenceConfidenceModel likelihood calculation in non variant sites
Changed a division by -10.0 to a multiplication by -.1 in QualUtils (typically multiplication is faster than division).

Addresses performance issue #1081.
2015-07-26 08:33:46 -04:00
Valentin Ruano Rubio 66cf22b28f Merge pull request #1069 from broadinstitute/vrr_ad_genotype_gvcfs_bugfix
Fix AD propagation when subsetting alleles in non-diploid GenotypeGVCFs.
2015-07-22 18:53:43 -04:00
vruano 315e193e51 Fix AD propagation when subsetting alleles in non-diploid GenotypeGVCFs.
Addresses issue #913.

Also remove some commented out code and toxic debugging code that uses System.out/err.println.
2015-07-22 17:08:13 -04:00
Joseph White 3bd988825f Removed walkers for handling Beagle data
Added deprecation statements to DeprecatedToolChecks.java
Removed integration test for Beagle walker
Added URL for Beagle documentation
2015-07-21 18:36:08 -04:00
vruano 82f1236633 More efficient implementation of the indel read qualities recalculation for the PCR error model.
Addresses #1054.
2015-07-21 14:25:11 -04:00
Ron Levine 09686f4595 Make VQSLOD definition accurate 2015-06-25 16:47:50 -04:00
Geraldine Van der Auwera 697c4b0cf1 Added else clause to handle symbolic alleles
Add test for createAlleleMapping
2015-06-17 10:52:56 -04:00
Laura Gauthier ce5ecf1383 Enable contamination correction via downsampling (as for HaplotypeCaller), added test
Add oxoG read count annotation and add as default annotation
Add ##SAMPLE VCF header line in accordance with TCGA VCF spec, specifying "File" line in sample header with BAM file name and "SampleName" with BAM sample name (Don't print sample file path if --no_cmdline_in_header is specified to help with test consistency)
Turn on active region assembly-based physical phasing for M2
Clean up M2-related annotations so UG doesn't crash if M2 annotations are called
2015-06-15 07:59:15 -04:00
Ron Levine dbed660183 Add spanning deletions allele 2015-06-12 16:43:06 -04:00
Joseph White 398dc7a123 Changed error message for Contigs Out of Order
Changed confusing error message for out of order contigs

Updated Exception message.
2015-06-11 21:46:06 -04:00
Ron Levine 40d8fb99a3 Built VectorLoglessPairHMM lib with icc with gcc 4.4.7 2015-06-05 15:38:25 -04:00
Ron Levine 3b0cb028e6 Fix loading of VectorLoglessPairHMM by rolling back to Intel's lib version 2015-05-22 14:16:00 -04:00
Kristian Cibulskis 3b1ee17727 added "artifact detection mode" for PON creation
added "str_contraction" artifact filter (improves specificity, especially in exomes)
refactored out VCF constants and added descriptions

added new dream evaluation markdown

added results for SMC 4

fixed up documentation, moved location to /dsde/working/mutect/dream_smc, and checked in scala script

fixed bug which would overwrite germline_risk filter errors
updated "how to" documents and records

fixed license text

thinned down FP regression test from 700 sites to 100.  we have better ways (DREAM, NN) to check accuracy of the method and 100 is good enough to catch regressions

why oh why do the MD5-based unit tests produce different results on different machine architectures?  I hate that :/

Thanks to GG, LDG and DR -- test should now produce the same results regardless of machine architecture

disabled downsampling... hopefully in the final attempt to make this work cross architecture!

enforced LOGLESS_CACHING... hopefully in the final final attempt to make this work cross architecture!
2015-05-15 07:14:33 -04:00