gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Ron Levine	ccaddefa19	Validate VCF with sequence dictionary	2015-11-20 09:23:24 -05:00
Yossi Farjoun	4da0d1300c	adding fraction informative reads annotation.	2015-11-18 08:39:47 -05:00
Laura Gauthier	25b8ba45f4	More allele-specific annotations: AS_QD and AS_InbreedingCoeff Grouped default output annotations to keep them from getting dropped when -A is specified; addresses #918 Also refactored code shared by ExcessHet and InbreedingCoeff	2015-11-09 16:38:31 -05:00
Laura Gauthier	fcaf37279c	Finished draft of code for new map-combine-reduce annotation framework All VQSR annotations can be generated in allele-specific mode Pull out allele-specific annotations in AS_Standard annotation group	2015-10-27 09:23:29 -04:00
meganshand	a57500b2fc	ROCCurve High Confidence Mode Integration Tests Updated test Changed method Minor changes Changed whitespace Fixed uncalled counts and 0 in R Fixed ReadBackedPileUp Removed imports and changed MD5 Fixed failing test Adding vqslod color Updating script to create KB Fixing integration test now that the KB is bigger Addressing comments	2015-10-21 21:30:54 -04:00
Ron Levine	2bcded11cb	VariantAnnotator checks alleles when annotating with external resource	2015-10-08 17:01:30 -04:00
Eric Banks	622ec352bb	Fix for combining records in which one has a spanning deletion and needs a padded reference allele. This was erroring out and not working.	2015-10-02 16:28:16 -04:00
Ron Levine	792142ec50	Implement BaseCounts per-sample	2015-09-30 08:59:11 -04:00
Samuel Lee	0dacf60012	Changed calls for RGQ=0 from 0/0 to ./. in output of GenotypeGVCFs.	2015-09-23 15:35:09 -04:00
ldgauthier	5870225f83	Merge pull request #1153 from broadinstitute/ms_excess_het Excess Het P-value	2015-09-15 11:52:25 -04:00
Khalid Shakir	24e24b9468	Using `SamIndexes.asBaiSeekableStreamOrNull()` to support `.cram.crai`. Updated other IntelliJ IDEA warnings in GATKBAMIndex. Updated example .cram files to match versions generated by current GATK/HTSJDK. Bumped HTSJDK and Picard to 1.139 releases. Added support for using `-SNAPSHOT` of HTSJDK in the future.	2015-09-14 12:20:36 -04:00
meganshand	d767e1722e	Excess Het P-value Added input exception Added header line Updated MD5s Changing more MD5s Made edge case clearer Fixed formatting Changed mid-point to mode	2015-09-14 12:00:44 -04:00
Laura Gauthier	53b506a0b8	Make sure inputPriors get used if they are specified Fix usage of AF prior (i.e. theta) in probability of non-reference calculation Refactored duplicate functions Updated docs for heterozygosity	2015-09-10 10:08:03 -04:00
Eric Banks	5f76ae6a37	Don't have the Indel Realigner change IUPAC reference bases. This change doesn't affect the performance of the Indel Realigner at all (as per tests). This is just a request from the Picard side (where further testing is happening).	2015-09-04 13:42:23 -04:00
Laura Gauthier	3dc68732fb	Little changes to M2 code and docs Make MQ threshold a parameter (compare to M1 by setting to zero) Add logic for multiple alternate alleles in tumor Exclude MQ0 normal reads from normal LOD calculation Fix path errors in Dream_Evaluations.md Move M2 eval scripts out of walkers package so they run	2015-08-27 15:31:27 -04:00
Ron Levine	2afe3f7a21	Make GenotypeGVCFs subset Strand Allele Counts intelligently	2015-08-22 08:33:09 -04:00
Bertrand Haas	158477ea6c	Re-ran the updateAllLicenses.sh script	2015-08-21 11:32:51 -04:00
Ron Levine	beec624a63	Move htsjdk & picard to rev 1.138	2015-08-20 10:42:25 -04:00
Khalid Shakir	9bee183f6c	Switched to using CRAM's SamReader.Indexing implementation. CRAM now requires .bai index, just like BAM. Test updates: - Updated existing MD5s, as TLEN has changed. - Tests multiple contigs. - Tests several intervals per contig. - Tests when `.cram.bai` is missing, even when `.cram.crai` is present. Updated gatk docs for CRAM support, including: - Arguments that work for both BAM and CRAM listed as such. - Arguments that don't work for CRAM either explicitly say "BAM" or "doesn't work for CRAM". - Instructions on how to recreate a `.cram.bai` using cramtools. Cleaned up IntelliJ IDEA warnings regarding `Arrays.asList()` -> `Collections.singletonList()`.	2015-08-11 17:52:49 -03:00
Geraldine Van der Auwera	19bbe45cbc	Updated licenses for 2015	2015-08-06 15:23:11 -04:00
vruano	8f6daf70db	Refactoring of ReferenceConfidenceModel likelihood calculation in non variant sites Changed a division by -10.0 to a multiplication by -.1 in QualUtils (typically multiplication is faster than division). Addresses performance issue #1081.	2015-07-26 08:33:46 -04:00
Valentin Ruano Rubio	66cf22b28f	Merge pull request #1069 from broadinstitute/vrr_ad_genotype_gvcfs_bugfix Fix AD propagation when subsetting alleles in non-diploid GenotypeGVCF.	2015-07-22 18:53:43 -04:00
vruano	315e193e51	Fix AD propagation when subsetting alleles in non-diploid GenotypeGVCF. Addresses issue #913. Also remove some commented out code and toxic debugging code that uses System.out/err.println.	2015-07-22 17:08:13 -04:00
Joseph White	3bd988825f	Removed walkers for handling Beagle data Added deprecation statements to DeprecatedToolChecks.java Removed integration test for Beagle walker Added URL for Beagle documentation	2015-07-21 18:36:08 -04:00
vruano	82f1236633	More efficient implementation of the indel read qualities recalculation for the PCR error model. Addresses #1054.	2015-07-21 14:25:11 -04:00
Ron Levine	09686f4595	Make VQSLOD definition accurate	2015-06-25 16:47:50 -04:00
Geraldine Van der Auwera	697c4b0cf1	Added else clause to handle symbolic alleles Add test for createAlleleMapping	2015-06-17 10:52:56 -04:00
Laura Gauthier	ce5ecf1383	Enable contamination correction via downsampling (as for HaplotypeCaller), added test Add oxoG read count annotation and add as default annotation Add ##SAMPLE VCF header line in accordance with TCGA VCF spec, specifying "File" line in sample header with BAM file name and "SampleName" with BAM sample name (Don't print sample file path if --no_cmdline_in_header is specified to help with test consistency) Turn on active region assembly-based physical phasing for M2 Clean up M2-related annotations so UG doesn't crash if M2 annotations are called	2015-06-15 07:59:15 -04:00
Ron Levine	dbed660183	Add spanning deletions allele	2015-06-12 16:43:06 -04:00
Joseph White	398dc7a123	Changed error message for Contigs Out of Order Changed confusing error message for out of order contigs Updated Exception message.	2015-06-11 21:46:06 -04:00
Ron Levine	40d8fb99a3	Built VectorLoglessPairHMM lib with icc with gcc 4.4.7	2015-06-05 15:38:25 -04:00
Ron Levine	3b0cb028e6	Fix loading of VectorLoglessPairHMM by rolling back to Intel's lib version	2015-05-22 14:16:00 -04:00
Kristian Cibulskis	3b1ee17727	added "artifact detection mode" for PON creation added "str_contraction" artifact filter (improves specificity, especially in exomes) refactored out VCF constants and added descriptions added "artifact detection mode" for PON creation added "str_contraction" artifact filter (improves specificity, especially in exomes) added new dream evaluation markdown added results for SMC 4 fixed up documentation, moved location to /dsde/working/mutect/dream_smc, and checked in scala script added "artifact detection mode" for PON creation added "str_contraction" artifact filter (improves specificity, especially in exomes) fixed bug which would overwrite germline_risk filter errors updated "how to" documents and records fixed license text thinned down FP regression test from 700 sites to 100. we have better ways (DREAM, NN) to check accuracy of the method and 100 is good enough to catch regressions why oh why do the MD5-based unit tests produce different results on different machine architectures? I hate that :/ Thanks to GG, LDG and DR -- test should now produce the same results regardless of machine architecture disabled downsampling... hopefully in the final attempt to make this work cross architecture! enforced LOGLESS_CACHING... hopefully in the final final attempt to make this work cross architecture! refactored out VCF constants and added descriptions	2015-05-15 07:14:33 -04:00
Geraldine Van der Auwera	d1a7edd796	Update pom versions to mark the start of GATK 3.5 development	2015-05-15 00:44:54 -04:00
Geraldine Van der Auwera	f19618653a	Update pom versions for the 3.4 release	2015-05-15 00:40:39 -04:00
David Roazen	caafe84e74	Rev htsjdk to version 1.132 and picard to version 1.131, and switch to using the versions in maven central -We now pull htsjdk and picard from maven central. -Updated the GATK codebase as necessary to adapt to changes in the Feature interface. -Since VCFHeader now requires that all header lines have unique keys, uniquified the keys of GVCFBlock header lines by including the min/max GQ in the key. Updated MD5s accordingly. -Other MD5s changed as a result of an htsjdk fix to eliminate "-0" in VCF output.	2015-05-14 15:26:23 -04:00
Ron Levine	4a75d54e65	Added invert and exclude flags for variant selection queries	2015-05-12 15:08:28 -04:00
Eric Banks	53a34cea4a	Merge pull request #938 from broadinstitute/eb_fix_spanning_deletions_in_genotyping Added a fix for genotyping positions over spanning deletions.	2015-05-11 23:11:47 -04:00
Eric Banks	530e0e5ea6	Added a fix for combining/genotyping positions over spanning deletions. Previously, if a SNP occurred in sample A at a position that was in the middle of a deletion for sample B, sample B would be genotyped as homozygous reference there (but it's NOT reference - there's a deletion). Now, sample B is genotyped as having a symbolic DEL allele. Minor cleanup added. Note that I also removed Laura's previous fix for this problem. Existing integration tests change because I've added a new header line to the VCF being output. I also added several tests for the new functionality showing: 1. genotyping from separate and already combined gvcfs give the same output 2. genotyping over multiple spanning deletions works 3. combining works too Existing unit tests also cover this case.	2015-05-11 15:11:16 -04:00
Geraldine Van der Auwera	e49f6dfd0f	Merge pull request #970 from broadinstitute/gg_minor_docfixes Fairly minor if plentiful fixes to various gatkdocs. Merging this without formal review since all tests pass, the gatkdocs build, and no one really wants to review corrections to grammar, typos and layout for 120+ documents. Review will be done by users in production ;-)	2015-05-03 00:36:12 +02:00
Geraldine Van der Auwera	919c3eaa2e	Numerous doc fixes; mostly formatting and clarifications	2015-05-03 00:28:46 +02:00
Geraldine Van der Auwera	f2b34d0823	Clamp the HMM window starting coordinate to 1 instead of 0	2015-04-30 01:37:20 +02:00
Ron Levine	d5f98e99f0	Bypass reads with a bad CIGAR length	2015-04-21 11:55:56 -04:00
Yossi Farjoun	d30a6258bc	added the missing file to the error message	2015-04-06 08:21:55 -04:00
Phillip Dexheimer	c97c253ec8	Added keepOriginalDP argument to SelectVariants Fixes #830	2015-03-25 22:45:31 -04:00
Eric Banks	1ff9463285	Added the RGQ format annotation to monomorphic sites in the VCF output of GenotypeGVCFs. Now, instead of stripping out the GQs for mono sites, we transfer them to the RGQ. This is extremely useful for people who want to know how confident the hom ref genotype calls are. Perhaps this is just what CRSP needs for pertinent negatives. Note that I also changed the tool to no longer use the GenotypeSummaries annotation by default since it was adding some seemingly unnecessary annotations (like mean GQ now that we keep the GQ around and number of no-calls). Let me know if this was a mistake (although Laura gave me a thumbs up).	2015-03-13 10:27:20 -04:00
Ron Levine	44e5965a4b	Change GC Content value type from Integer to Float	2015-02-25 13:56:42 -05:00
rpoplin	b8b23b931e	Merge pull request #807 from broadinstitute/rhl_handle_cigar Process X and = CIGAR operators	2015-02-01 11:09:52 -05:00
Phillip Dexheimer	3354c07b1c	Added optional element "includeUnmapped" to the PartitionBy annotation * The value of this element (default true) determines whether Queue will explicitly run this walker over unmapped reads * This patch fixes a runtime error when FindCoveredIntervals was used with Queue * PT 81777160	2015-01-31 15:47:57 -05:00
Ron Levine	9d4b876ccd	Process X and = CIGAR operators Add simple BaseRecalibrator integration test for CIGAR = and X operators	2015-01-29 17:00:00 -05:00


76 Commits (dc0e32e3aa1ea823bd5fe4e07a518f3f65df70de)