gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Takuto Sato	df7a482335	VariantAnnotator now supports annotating FILTER field from an external resource. Updated the docs.	2015-10-14 14:26:21 -04:00
Chris Norman	e776502c49	Fix Sample mergeValues failure when merging identical string values (#1156).	2015-10-12 09:56:05 -04:00
Ron Levine	2bcded11cb	VariantAnnotator checks alleles when annotating with external resource	2015-10-08 17:01:30 -04:00
Ron Levine	033115eae0	Move htsjdk & picard to version 1.140	2015-10-08 10:42:05 -04:00
Eric Banks	622ec352bb	Fix for combining records in which one has a spanning deletion and needs a padded reference allele. This was erroring out and not working.	2015-10-02 16:28:16 -04:00
Kate Noblett	506958a0b7	Implemented a new VariantEval evaluation module, MetricsCollection. Fixed null pointer exception, updated tests.	2015-09-30 17:21:30 -04:00
Ron Levine	792142ec50	Implement BaseCounts per-sample	2015-09-30 08:59:11 -04:00
Khalid Shakir	384a09e991	Minor updates to previous ParallelShell commit. Changed `--maximumNumberOfJobsToRunConcurrently`/`-maxConcurrentRun` to `Option[Int]`. Updated licenses. Added basic tests. Removed some IntelliJ warnings.	2015-09-29 09:36:37 -03:00
Johan Dahlberg	b045f2d4aa	ParallelShell added as a new JobRunner The ParallelShell job runner will run jobs locally on one node concurrently as specified by the DAG, with the option to limit the maximum number of concurrently running jobs using the flag `maximumNumberOfJobsToRunConcurrently`. Signed-off-by: Khalid Shakir <kshakir@broadinstitute.org>	2015-09-29 09:36:37 -03:00
samuelklee	302a69d685	Merge pull request #1165 from broadinstitute/sl_fix_no_calls Changed calls for RGQ=0 from 0/0 to ./. in output of GenotypeGVCFs.	2015-09-28 12:26:18 -04:00
Geraldine Van der Auwera	118c559278	Trivial doc typo fix	2015-09-25 18:15:29 -04:00
Samuel Lee	0dacf60012	Changed calls for RGQ=0 from 0/0 to ./. in output of GenotypeGVCFs.	2015-09-23 15:35:09 -04:00
Ami Levy Moonshine	1ad00cc9d4	fix typo in the ASEReadCounter document	2015-09-21 15:30:06 -04:00
meganshand	cdfe0d7b7c	Adding PER_TARGET_COVERAGE option Comments addressed	2015-09-18 09:34:51 -04:00
Ron Levine	3ecabf7e45	Allow overriding ValidateVariants' hard-coded cutoff for allele length	2015-09-17 10:49:14 -04:00
ldgauthier	5870225f83	Merge pull request #1153 from broadinstitute/ms_excess_het Excess Het P-value	2015-09-15 11:52:25 -04:00
Khalid Shakir	24e24b9468	Using `SamIndexes.asBaiSeekableStreamOrNull()` to support `.cram.crai`. Updated other IntelliJ IDEA warnings in GATKBAMIndex. Updated example .cram files to match versions generated by current GATK/HTSJDK. Bumped HTSJDK and Picard to 1.139 releases. Added support for using `-SNAPSHOT` of HTSJDK in the future.	2015-09-14 12:20:36 -04:00
meganshand	d767e1722e	Excess Het P-value Added input exception Added header line Updated MD5s Changing more MD5s Made edge case clearer Fixed formatting Changed mid-point to mode	2015-09-14 12:00:44 -04:00
Laura Gauthier	53b506a0b8	Make sure inputPriors get used if they are specified Fix usage of AF prior (i.e. theta) in probability of non-reference calculation Refactored duplicate functions Updated docs for heterozygosity	2015-09-10 10:08:03 -04:00
Ron Levine	83a7012d69	Mask snps with --snpmask	2015-09-09 16:20:48 -04:00
Eric Banks	5f76ae6a37	Don't have the Indel Realigner change IUPAC reference bases. This change doesn't affect the performance of the Indel Realigner at all (as per tests). This is just a request from the Picard side (where further testing is happening).	2015-09-04 13:42:23 -04:00
Ron Levine	29ac64f6ce	Calculate GenotypeAnnotations before InfoFieldAnnotations	2015-09-03 09:22:46 -04:00
Laura Gauthier	a86f3909ca	Update md5s for BAM header version change in Queue test output	2015-08-28 14:19:25 -04:00
Laura Gauthier	3dc68732fb	Little changes to M2 code and docs Make MQ threshold a parameter (compare to M1 by setting to zero) Add logic for multiple alternate alleles in tumor Exclude MQ0 normal reads from normal LOD calculation Fix path errors in Dream_Evaluations.md Move M2 eval scripts out of walkers package so they run	2015-08-27 15:31:27 -04:00
Mark Fleharty	daeb55429e	Adding Static Binning to BQSR	2015-08-24 13:36:17 -04:00
Ron Levine	2afe3f7a21	Make GenotypeGVCFs subset Strand Allele Counts intelligently	2015-08-22 08:33:09 -04:00
Bertrand Haas	158477ea6c	Re-ran the updateAllLicenses.sh script	2015-08-21 11:32:51 -04:00
Ron Levine	900fe3f675	Merge pull request #1132 from broadinstitute/rhl_rev_htsjdk Move htsjdk & picard to rev 1.138	2015-08-20 11:58:41 -04:00
Bertrand Haas	eae4c875a9	Logistic transform of MQ + jitter to capped MQ in VariantDataManager	2015-08-20 11:10:45 -04:00
Ron Levine	beec624a63	Move htsjdk & picard to rev 1.138	2015-08-20 10:42:25 -04:00
meganshand	5c9935ba10	Adding CollectWgsMetrics wrapper for queue Fix license Fixed IncludeBQHistogram	2015-08-14 10:18:12 -04:00
Yossi Farjoun	69fd4af15a	Merge pull request #1111 from jsilter/overclippedreadfilter_endsoption Add additional option to OverclippedReadFilter	2015-08-12 10:43:52 -04:00
Jacob Silterra	62625b4bc6	Add option to not require soft-clips on both ends Previous version of OverclippedReadFilter would only filter a read if both ends of a read had a soft-clipped block. This adds a boolean option to relax that requirement, and only require 1 soft-clipped block, while also filtering on read length - softclipped length	2015-08-12 10:38:27 -04:00
Khalid Shakir	9bee183f6c	Switched to using CRAM's SamReader.Indexing implementation. CRAM now requires .bai index, just like BAM. Test updates: - Updated existing MD5s, as TLEN has changed. - Tests multiple contigs. - Tests several intervals per contig. - Tests when `.cram.bai` is missing, even when `.cram.crai` is present. Updated gatk docs for CRAM support, including: - Arguments that work for both BAM and CRAM listed as such. - Arguments that don't work for CRAM either explicitly say "BAM" or "doesn't work for CRAM". - Instructions on how to recreate a `.cram.bai` using cramtools. Cleaned up IntelliJ IDEA warnings regarding `Arrays.asList()` -> `Collections.singletonList()`.	2015-08-11 17:52:49 -03:00
Geraldine Van der Auwera	19bbe45cbc	Updated licenses for 2015	2015-08-06 15:23:11 -04:00
David Benjamin	ddb01058d3	moved DiffObjects	2015-08-05 21:19:02 -04:00
Geraldine Van der Auwera	875c7ffa1a	Fixed typos and made some argument docs improvements	2015-07-29 23:06:19 -04:00
Louis Bergelson	9d9827f176	Merge pull request #1031 from broadinstitute/lb_update_for_java8 Updated gatk so it compiles with java 8	2015-07-28 11:09:19 -04:00
vruano	8f6daf70db	Refactoring of ReferenceConfidenceModel likelihood calculation in non variant sites Changed a division by -10.0 to a multiplication by -.1 in QualUtils (typically multiplication is faster than division). Addresses performance issue #1081.	2015-07-26 08:33:46 -04:00
David Roazen	5fd3d2be76	Move swapExt() methods to QScriptUtils, have versions in QScript class call into the util versions	2015-07-23 10:23:55 -04:00
Valentin Ruano Rubio	66cf22b28f	Merge pull request #1069 from broadinstitute/vrr_ad_genotype_gvcfs_bugfix Fix AD propagation when subsetting alleles in non-diploid GenotypeGVCF.	2015-07-22 18:53:43 -04:00
vruano	315e193e51	Fix AD propagation when subsetting alleles in non-diploid GenotypeGVCF. Addresses issue #913. Also remove some commented out code and toxic debugging code that uses System.out/err.println.	2015-07-22 17:08:13 -04:00
Joseph White	3bd988825f	Removed walkers for handling Beagle data Added deprecation statements to DeprecatedToolChecks.java Removed integration test for Beagle walker Added URL for Beagle documentation	2015-07-21 18:36:08 -04:00
Eric Banks	178bf12b27	Merge pull request #1046 from broadinstitute/rhl_catvariants_sort Fix for mis-sorted VCF files in CatVariants	2015-07-21 17:37:27 -04:00
Valentin Ruano Rubio	9360e1d293	Merge pull request #1059 from broadinstitute/vrr_true_false_list_removal More efficient implementation of the indel read qualities recalculati…	2015-07-21 17:13:45 -04:00
vruano	82f1236633	More efficient implementation of the indel read qualities recalculation for the PCR error model. Addresses #1054.	2015-07-21 14:25:11 -04:00
Ron Levine	6e46b3696e	Merge contiguous intervals properly	2015-07-14 15:23:37 -04:00
John Wallace	8fc631b7ae	Fix for mis-sorted VCF files in CatVariants When using CatVariants, VCF files were being sorted solely on the base pair position of the first record, ignoring the chromosome. This can become problematic when merging files from different chromosomes, especially if you have multiple VCFs per chromosome. As an example, assume the following 3 lines are all in separate files: 1 10 1 100 2 20 The merged VCF from CatVariants (without -assumeSorted) would read: 1 10 2 20 1 100 This has the potential to break tools that expect chromosomes to be contiguous within a VCF file. This commit changes the comparator from one of Pair<Integer, File> to one of Pair<VariantContext, File>. We construct a VariantContextComparator from the provided reference, which will sort the first record by chromosome and position properly. Additionally, if -assumeSorted is given, we simply use a null VariantContext as the first record, which will all be equal (as all will be null)	2015-07-14 14:12:31 -04:00
Louis Bergelson	e1c41b2c38	Updated gatk so it compiles on java 8 updated cofoja to 1.2 from 1.0 added explicit type casts in places that java 8 required them	2015-06-26 15:59:46 -04:00
Ron Levine	09686f4595	Make VQSLOD definition accurate	2015-06-25 16:47:50 -04:00
Geraldine Van der Auwera	719bb15340	Merge pull request #1019 from broadinstitute/rhl_var_index_param_gz Indexing parameters not required if output file has the g.vcf.gz exte…	2015-06-17 14:30:20 -04:00
Geraldine Van der Auwera	697c4b0cf1	Added else clause to handle symbolic alleles Add test for createAlleleMapping	2015-06-17 10:52:56 -04:00
Laura Gauthier	ce5ecf1383	Enable contamination correction via downsampling (as for HaplotypeCaller), added test Add oxoG read count annotation and add as default annotation Add ##SAMPLE VCF header line in accordance with TCGA VCF spec, specifying "File" line in sample header with BAM file name and "SampleName" with BAM sample name (Don't print sample file path if --no_cmdline_in_header is specified to help with test consistency) Turn on active region assembly-based physical phasing for M2 Clean up M2-related annotations so UG doesn't crash if M2 annotations are called	2015-06-15 07:59:15 -04:00
Ron Levine	b35085ca28	Indexing parameters not required if output file has the g.vcf.gz extension	2015-06-13 11:46:56 -04:00
Ron Levine	dbed660183	Add spanning deletions allele	2015-06-12 16:43:06 -04:00
Joseph White	398dc7a123	Changed error message for Contigs Out of Order Changed confusing error message for out of order contigs Updated Exception message.	2015-06-11 21:46:06 -04:00
Geraldine Van der Auwera	2a7f95eddb	Merge pull request #1009 from broadinstitute/gg_patch_depthofcoverage_#1002 User (mnw21cam) patch to fix DoC slowdown in 3.4	2015-06-10 11:16:08 -04:00
droazen	5e3f3d69db	Merge pull request #1012 from broadinstitute/rhl_build_vec_pairhmm_lib Built VectorLoglessPairHMM lib with icc with gcc 4.4.7	2015-06-08 15:25:57 -04:00
Geraldine Van der Auwera	95f2899f05	User (mnw21cam) patch to fix DoC slowdown in 3.4	2015-06-05 21:12:46 -04:00
Louis Bergelson	ebdda72c88	fix typo in queue arguments	2015-06-05 17:06:23 -04:00
Ron Levine	40d8fb99a3	Built VectorLoglessPairHMM lib with icc with gcc 4.4.7	2015-06-05 15:38:25 -04:00
droazen	847c832ef9	Merge pull request #999 from broadinstitute/rhl_load_vector_pair_hmm Fix loading of VectorLoglessPairHMM by rolling back to Intel's lib version	2015-06-04 12:54:59 -04:00
Eric Banks	27d3bafcbd	Merge pull request #997 from broadinstitute/eb_add_foreign_read_filter Added a new filter that can be used to remove reads that are too smal…	2015-05-22 14:34:28 -04:00
Eric Banks	8c81e7df95	Added a new filter that can be used to remove reads that are too small and overly clipped.	2015-05-22 14:33:35 -04:00
Ron Levine	3b0cb028e6	Fix loading of VectorLoglessPairHMM by rolling back to Intel's lib version	2015-05-22 14:16:00 -04:00
Ron Levine	a6ca97ef14	Site-level selection based on genotype filter status	2015-05-21 11:27:20 -04:00
Kristian Cibulskis	3b1ee17727	added "artifact detection mode" for PON creation added "str_contraction" artifact filter (improves specificity, especially in exomes) refactored out VCF constants and added descriptions added "artifact detection mode" for PON creation added "str_contraction" artifact filter (improves specificity, especially in exomes) added new dream evaluation markdown added results for SMC 4 fixed up documentation, moved location to /dsde/working/mutect/dream_smc, and checked in scala script added "artifact detection mode" for PON creation added "str_contraction" artifact filter (improves specificity, especially in exomes) fixed bug which would overwrite germline_risk filter errors updated "how to" documents and records fixed license text thinned down FP regression test from 700 sites to 100. we have better ways (DREAM, NN) to check accuracy of the method and 100 is good enough to catch regressions why oh why do the MD5-based unit tests produce different results on different machine architectures? I hate that :/ Thanks to GG, LDG and DR -- test should now produce the same results regardless of machine architecture disabled downsampling... hopefully in the final attempt to make this work cross architecture! enforced LOGLESS_CACHING... hopefully in the final final attempt to make this work cross architecture! refactored out VCF constants and added descriptions	2015-05-15 07:14:33 -04:00
Geraldine Van der Auwera	d1a7edd796	Update pom versions to mark the start of GATK 3.5 development	2015-05-15 00:44:54 -04:00
Geraldine Van der Auwera	f19618653a	Update pom versions for the 3.4 release	2015-05-15 00:40:39 -04:00
Geraldine Van der Auwera	8b20523f5e	Merge pull request #979 from broadinstitute/ami-fixASE-bug Fix bug: now also works when reads do not have a mate	2015-05-14 21:09:52 -04:00
David Roazen	caafe84e74	Rev htsjdk to version 1.132 and picard to version 1.131, and switch to using the versions in maven central -We now pull htsjdk and picard from maven central. -Updated the GATK codebase as necessary to adapt to changes in the Feature interface. -Since VCFHeader now requires that all header lines have unique keys, uniquified the keys of GVCFBlock header lines by including the min/max GQ in the key. Updated MD5s accordingly. -Other MD5s changed as a result of an htsjdk fix to eliminate "-0" in VCF output.	2015-05-14 15:26:23 -04:00
Ami Levy-Moonshine	536d550794	Fix bug: now also works when reads do not have a mate; reads with no mate will be counted as valid reads	2015-05-12 17:51:01 -04:00
Ron Levine	4a75d54e65	Added invert and exclude flags for variant selection queries	2015-05-12 15:08:28 -04:00
Geraldine Van der Auwera	7a75f4ae79	Merge pull request #974 from broadinstitute/jw_Var2BinPEDSwap Correct errant array element swap in FAM file output.	2015-05-12 08:49:16 -04:00
Eric Banks	53a34cea4a	Merge pull request #938 from broadinstitute/eb_fix_spanning_deletions_in_genotyping Added a fix for genotyping positions over spanning deletions.	2015-05-11 23:11:47 -04:00
Joseph White	abb6bc6f57	Correct errant array element swap in FAM file output. dad and mom are swapped; paternal first, then maternal updated MD5 chksums for test files remove commented lines	2015-05-11 20:45:50 -04:00
Eric Banks	530e0e5ea6	Added a fix for combining/genotyping positions over spanning deletions. Previously, if a SNP occurred in sample A at a position that was in the middle of a deletion for sample B, sample B would be genotyped as homozygous reference there (but it's NOT reference - there's a deletion). Now, sample B is genotyped as having a symbolic DEL allele. Minor cleanup added. Note that I also removed Laura's previous fix for this problem. Existing integration tests change because I've added a new header line to the VCF being output. I also added several tests for the new functionality showing: 1. genotyping from separate and already combined gvcfs give the same output 2. genotyping over multiple spanning deletions works 3. combining works too Existing unit tests also cover this case.	2015-05-11 15:11:16 -04:00
Geraldine Van der Auwera	5d8b9a7c20	Moved MQ0 out of HC exclusion and into StandardUGAnnotation	2015-05-03 01:04:49 +02:00
Geraldine Van der Auwera	071d82d1bf	Un-exclude SD and TRA from HC annotators; resolves #966 Exclude MQ0BySample Move SD and TRA to new StandardUGAnnotation interface There is now an annotation interface (StandardUGAnnotation) holding annots that are standard in UG but shouldn't be used as they are now with HC. This allows us to not have to exclude these annotations explicitly in HC, but still be able to use them for development purposes.	2015-05-03 00:45:53 +02:00
Geraldine Van der Auwera	e49f6dfd0f	Merge pull request #970 from broadinstitute/gg_minor_docfixes Fairly minor if plentiful fixes to various gatkdocs. Merging this without formal review since all tests pass, the gatkdocs build, and no one really wants to review corrections to grammar, typos and layout for 120+ documents. Review will be done by users in production ;-)	2015-05-03 00:36:12 +02:00
Geraldine Van der Auwera	919c3eaa2e	Numerous doc fixes; mostly formatting and clarifications	2015-05-03 00:28:46 +02:00
Geraldine Van der Auwera	fddc5331e1	Merge pull request #965 from broadinstitute/gg_nsubtil_clamp_hmm_fix Clamp the HMM window starting coordinate to 1 instead of 0	2015-05-01 22:18:20 +02:00
Ron Levine	9ff827c83a	More allele trimming for VariantAnnotator	2015-04-29 21:11:49 -04:00
Geraldine Van der Auwera	f2b34d0823	Clamp the HMM window starting coordinate to 1 instead of 0	2015-04-30 01:37:20 +02:00
David Roazen	19ceca5e86	Queue: add -qsub-broad argument When -qsub-broad is specified instead of -qsub, use the "h_vmem" parameter instead of "h_rss" to specify memory limit requests. Also cause the GridEngine native arguments to be output by default to the logger, instead of only when in debug mode.	2015-04-27 17:43:25 -04:00
Ron Levine	d5f98e99f0	Bypass reads with a bad CIGAR length	2015-04-21 11:55:56 -04:00
Geraldine Van der Auwera	bfcac455c9	Merge pull request #932 from broadinstitute/yf_fix_picard_md Fix the scala wrapper for Picard MarkDuplicates	2015-04-16 12:08:39 -04:00
Khalid Shakir	90b579c78e	CatVariants now allows different input / output file types. Escaping the CatVariantsIntegrationTest classpaths for possible spaces in the directory names.	2015-04-13 14:39:46 -03:00
Yossi Farjoun	a7487e282a	since Picard mark duplicates moved to a different package, this class was broken. here's the fix. it would be good to have tests for all the scala picard-wrappers, but that is out of scope for this commit.	2015-04-13 08:44:30 -04:00
Yossi Farjoun	d30a6258bc	added the missing file to the error message	2015-04-06 08:21:55 -04:00
Alex Baumann	024ec69e97	Modify GATK command line header for unique keys The GATK command line header keys were being repeated in the VCF and subsequently lost to a single key value by HTSJDK. This resolves the issue by appending the name of the walker after the text "GATKCommandLine" and a number after that if the same walker was used more than once in the form: GATKCommandLine.(walker name) for the first occurrence of the walker, and GATKCommandLine.(walker name).# where # is the number of the occurrence of the walker (e.g. GATKCommandLine.SomeWalker.2 for the second occurrence of SomeWalker). Integration test added to EngineFeaturesIntegrationTest to verify two runs of same walker follow expected form. Resolves #909 See also: HTSJDK #43	2015-04-02 13:56:11 -04:00
Ron Levine	fe87484074	Update -mv example documentation Made general doc fixes	2015-04-01 02:37:42 -04:00
Geraldine Van der Auwera	d7f7022dce	Merge pull request #904 from broadinstitute/pd_orig_dp Added keepOriginalDP argument to SelectVariants	2015-03-30 09:01:33 -04:00
ldgauthier	0101003138	Merge pull request #899 from broadinstitute/ldg_M2_tandemRepeatsAndContamination Lots of changes to M2:	2015-03-30 07:58:35 -04:00
Geraldine Van der Auwera	87b3dddb39	Merge pull request #894 from broadinstitute/gg_ami_docs_license Edited ASEReadCounter documentation	2015-03-28 13:15:24 -04:00
Laura Gauthier	5a10758e2e	Annotation changes for M2: Build a ReferenceContext in ActiveRegionWalkers to pass in to annotation engine so we can call the TandemRepeatAnnotator from M2 Make TandemRepeatAnnotator default annotation for M2. Setup (but don't use yet) HC-style contamination downsampling. New HC integration test with TandemRepeatAnnotator	2015-03-27 18:25:23 -04:00
Ron Levine	aef0a83c52	Automatically choose indexing strategy by file extension	2015-03-27 11:10:35 -04:00
Geraldine Van der Auwera	9b812308b1	Edited ASEReadCounter documentation Also changed output file variable type from String to Enum	2015-03-26 02:43:53 -04:00
Phillip Dexheimer	c97c253ec8	Added keepOriginalDP argument to SelectVariants Fixes #830	2015-03-25 22:45:31 -04:00
Geraldine Van der Auwera	dfa18a8fc6	Merge pull request #887 from broadinstitute/pd_vcf_cmdline_hdr Fixed logging of 'out' command line parameter in VCF headers	2015-03-25 00:48:55 -04:00
Ami Levy-Moonshine	c5fc5c4f8c	create 2 new tools: - ASEReadCounter (public tool) replaces Tuuli's script to produce the input to Manny's tool. It counts the number of reads that support the ref allele and the alt allele, filtering low qual reads and bases and keeping only properPaired reads - ASECaller (private tool) takes both RNA and DNA, and produces contingencyTables; still under development minor changes in other tools: - update RNA HC variant calling scala script - expose FS method pValueForContingencyTable to be able to call it from ASEcaller In ASEReadCounter: - allow different options to deal with overlapping reads from the same fragment - add option to ignore or include indels in the pileups - add option to disable DuplicateRead add ASEReadCounterIntegrationTest.java and files for the test	2015-03-21 16:56:00 -04:00
Phillip Dexheimer	3b567d7a98	Fixed logging of 'out' command line parameter in VCF headers	2015-03-18 23:12:13 -04:00
Geraldine Van der Auwera	a75e1d4ce4	Fixes the test that was failing due to gsalib build failure	2015-03-17 04:26:03 -04:00
Phillip Dexheimer	4d4d33404e	Added gsa.reshape.concordance.table function to gsalib	2015-03-16 22:52:27 -04:00
Geraldine Van der Auwera	1d39ed9156	Merge pull request #814 from broadinstitute/biocyberman_maven_patches Biocyberman maven patches	2015-03-13 16:26:02 -04:00
Geraldine Van der Auwera	39a972f348	Merge pull request #872 from broadinstitute/eb_create_rgq_format_field Added the RGQ format annotation to monomorphic sites in the VCF output of GenotypeGVCFs. Fixes #870	2015-03-13 13:59:53 -04:00
Geraldine Van der Auwera	7681e89454	Merge pull request #869 from broadinstitute/gg_fix_vqsr_plots_GSA-860 Switched VQSR tranches plot ordering rule	2015-03-13 10:46:55 -04:00
Eric Banks	1ff9463285	Added the RGQ format annotation to monomorphic sites in the VCF output of GenotypeGVCFs. Now, instead of stripping out the GQs for mono sites, we transfer them to the RGQ. This is extremely useful for people who want to know how confident the hom ref genotype calls are. Perhaps this is just what CRSP needs for pertinent negatives. Note that I also changed the tool to no longer use the GenotypeSummaries annotation by default since it was adding some seemingly unnecessary annotations (like mean GQ now that we keep the GQ around and number of no-calls). Let me know if this was a mistake (although Laura gave me a thumbs up).	2015-03-13 10:27:20 -04:00
Phillip Dexheimer	6ffa295963	Regression: The new 'includeUnmapped' PartitionBy annotation was incorrectly set for HC Fixes #828	2015-03-13 00:24:57 -04:00
Geraldine Van der Auwera	aa4084d42f	Switched VQSR tranches plot ordering rule	2015-03-12 19:57:03 -04:00
Geraldine Van der Auwera	f8a081a262	Updated readme in public/doc to just point to the website	2015-03-12 11:52:48 -04:00
Ron Levine	bee7f655b7	Log a warning if using incompatible arguments in DepthOfCoverage Add reference gene list file	2015-03-10 18:14:21 -04:00
Ron Levine	71d68c3d93	Fix NotPrimaryAlignmentFilter documentation	2015-03-05 20:30:46 -05:00
biocyberman	ff6e288241	Upgrade SLF4J to allow new convenient logging syntaxes Signed-off-by: David Roazen <droazen@broadinstitute.org>	2015-03-02 17:01:10 -05:00
Ron Levine	44e5965a4b	Change GC Content value type from Integer to Float	2015-02-25 13:56:42 -05:00
Geraldine Van der Auwera	f3a57a6b07	Merge pull request #811 from broadinstitute/seru71_fix_MateSameStrandFilter Corrected logical expression in MateSameStrandFilter	2015-02-23 17:57:10 -05:00
Ron Levine	2cbaef2fb2	Throw exception for -dcov argument given to ActiveRegionWalkers	2015-02-19 08:24:39 -05:00
seru71	3ee0311fdb	corrected logical expression in MateSameStrandFilter Signed-off-by: David Roazen <droazen@broadinstitute.org>	2015-02-12 12:21:44 -05:00
Phillip Dexheimer	92c7c103c1	GenotypeConcordance: monomorphic sites in truth are no longer called "Mismatching Alleles" when the comp genotype has an alternate allele * PT 84700606	2015-02-07 15:54:38 -05:00
rpoplin	b8b23b931e	Merge pull request #807 from broadinstitute/rhl_handle_cigar Process X and = CIGAR operators	2015-02-01 11:09:52 -05:00
Phillip Dexheimer	3354c07b1c	Added optional element "includeUnmapped" to the PartitionBy annotation * The value of this element (default true) determines whether Queue will explicitly run this walker over unmapped reads * This patch fixes a runtime error when FindCoveredIntervals was used with Queue * PT 81777160	2015-01-31 15:47:57 -05:00
Ron Levine	9d4b876ccd	Process X and = CIGAR operators Add simple BaseRecalibrator integration test for CIGAR = and X operators	2015-01-29 17:00:00 -05:00
Khalid Shakir	1808c90d2a	Added introductory CRAM support. Replaced usage of GATKSamRecordFactory with calls to wrapper GATKSAMRecord extending SAMRecord. Minor other updates for test changes. Added exampleCRAM.cram generated by GATK, with .bai and .crai indexes generated by CRAMTools. CRAM-to-CRAM test disabled due to https://github.com/samtools/htsjdk/issues/148 Using exampleBAM.bam input, outputs of GATK's generated CRAM match CRAMTools generated CRAM, but not samtools/PrintReads SAM output, as things like insert sizes are different. If required for other tools, CRAM indexes must be generated via CRAMTools until we can generate them via CRAMFileWriter. Generation of exampleCRAM.cram: * java -jar target/executable/GenomeAnalysisTK.jar -T PrintReads -R public/gatk-utils/src/test/resources/exampleFASTA.fasta -I public/gatk-utils/src/test/resources/exampleBAM.bam -o public/gatk-utils/src/test/resources/exampleCRAM.cram * java -jar cramtools-2.1.jar index -I public/gatk-utils/src/test/resources/exampleCRAM.cram * java -jar cramtools-2.1.jar index -I public/gatk-utils/src/test/resources/exampleCRAM.cram --bam-style-index CRAM generation by existing tools: * samtools view -C -T public/gatk-utils/src/test/resources/exampleFASTA.fasta -o testSamtools.cram public/gatk-utils/src/test/resources/exampleBAM.bam * java -jar cramtools-2.1.jar cram --ignore-md5-mismatch --capture-all-tags -Q -n -R public/gatk-utils/src/test/resources/exampleFASTA.fasta -I public/gatk-utils/src/test/resources/exampleBAM.bam -O testCRAMTools.cram * java -jar target/executable/GenomeAnalysisTK.jar -T PrintReads -R public/gatk-utils/src/test/resources/exampleFASTA.fasta -I public/gatk-utils/src/test/resources/exampleBAM.bam -o testGATK.cram CRAMTools view of the above: * java -jar cramtools-2.1.jar bam --skip-md5-check -R public/gatk-utils/src/test/resources/exampleFASTA.fasta -I public/gatk-utils/src/test/resources/exampleCRAM.cram | tail -n 1 * java -jar cramtools-2.1.jar bam --skip-md5-check -R public/gatk-utils/src/test/resources/exampleFASTA.fasta -I testSamtools.cram | tail -n 1 * java -jar cramtools-2.1.jar bam --skip-md5-check -R public/gatk-utils/src/test/resources/exampleFASTA.fasta -I testCRAMTools.cram | tail -n 1 * java -jar cramtools-2.1.jar bam --skip-md5-check -R public/gatk-utils/src/test/resources/exampleFASTA.fasta -I testGATK.cram | tail -n 1	2015-01-26 14:47:39 -03:00
Khalid Shakir	de3ca65232	Bumping HTSJDK version to pickup a bug fix for CRAM.	2015-01-26 14:47:39 -03:00
Phillip Dexheimer	72f76add71	Added -trimAlternates argument to SelectVariants * PT 84021222 * -trimAlternates removes all unused alternate alleles from variants. Note that this is pretty aggressive for monomorphic sites	2015-01-21 21:33:35 -05:00
Joel Thibault	5ce34d81b8	Allows users to disable specific read filters from the command line - enable this for DuplicateReadFilter only - enable the @DisabledReadFilters annotation to do this at the Walker level	2015-01-21 13:17:29 -05:00
Ron Levine	804b2a36b7	Fix SplitNCigar reads exception by making the list of RNAReadTransformer non-abstract, add test for -fixNDN Includes documentation changes for -fixNDN argument and the read transformer documentation. Documentation changes to CombineVariants	2015-01-14 22:22:05 -05:00
Phillip Dexheimer	6190d660e0	Edits to work with the latest htsjdk release: * TextCigarCodec.decode() is now static, and the getSingleton() method is gone * MergingSamRecordIterator now wants a Collection<SamReader> rather than Collection<SAMFileReader> in the constructor * SeekableBufferedStream now correctly reads the requested number of bytes, removed workaround in GATKBAMIndex	2015-01-13 21:32:10 -05:00
Phillip Dexheimer	b73e9d506a	Added GATKVCFConstants and GATKVCFHeaderLines to consolidate the GATK-specific VCF annotations * Removed unused annotations (CCC and HWP) * Renamed one of the two GC annotations to "IGC" (for Interval GC) * Revved picard & htsjdk (GATK constants are now removed from htsjdk) * PT 82046038	2015-01-13 21:32:09 -05:00
Ryan Poplin	2e5f9db758	Raising per-sample limits on the number of reads in ART and HC. -- Active Region Traversal was using per sample limits on the number of reads that were too low, especially now that we are running one sample at a time. This caused issues with high confidence variants being dropped in high coverage data. -- HaplotypeCallerGVCFIntegrationTest PL/annotation changes due to using more reads in those tests -- Removed a CountReadsInActiveRegionsIntegrationTest test for excessive coverage because the read coverage no longer goes over the limits in ART	2015-01-09 11:21:42 -05:00
rpoplin	03203e249e	Merge pull request #792 from broadinstitute/rhl_pairhmm_log_stderr Rhl pairhmm log stderr	2015-01-07 12:41:10 -05:00
Ron Levine	7d58544f17	Do not use logger, write to stderr, could not get the correct logger dependency in pom.xml	2015-01-06 10:32:11 -05:00
Ryan Poplin	10b23bfb04	Adding Axiom_Exome_Plus.sites_only.all_populations.poly.vcf to the resource bundle because it is used in the v3.3 best practices	2015-01-05 14:52:31 -05:00
Ron Levine	26c46ae05e	Change logger.info to logger.error	2015-01-05 14:14:02 -05:00
Ron Levine	b4fda38922	Use logging system instead of stderr	2015-01-05 14:04:10 -05:00
rpoplin	3240b3538a	Merge pull request #794 from broadinstitute/rhl_read_backed_phasing Rhl read backed phasing	2015-01-05 09:47:25 -05:00
Ron Levine	64375f6341	Messages that were going to stdout now going to stderr Make PairHMM outputs go to stderr instead of stdout Change output from stdout to stderr in close() Updated lib with output going to stderr	2014-12-23 11:03:29 -05:00
Menachem Fromer	11cd0080c3	Add option to genotype additional user-defined interval lists Add Qscript 'ONLY_GENOTYPE_xhmmCNVpipeline' to genotype additional user-defined interval lists Add Qscript 'ONLY_GENOTYPE_xhmmCNVpipeline' to genotype additional user-defined interval lists (and similar option to Qscript 'xhmmCNVpipeline')	2014-12-21 13:02:17 -05:00
Ron Levine	069398ad46	Added more tests and documentation	2014-12-19 12:57:43 -05:00
Ron Levine	08790e1dab	Fix multiallelic info field annotation for VariantAnnotator Add multi-allele test for info field annotations Fix to process all types of INFO annotations roll back to previous version, removes INFO and FORMAT Correct @return for VariantAnnotatorEngine.getNonReferenceAlleles() Enhance comments and clean up multi-allelic logic, handle header info number = R only parse counts of A & R Add INFO for AC update MD5 Performance enhancement, only parse multiallelic with a count A or R Make argument final in getNonReferenceAlleles() Code cleanup, add exceptions for bad expression/allele size mismatch and missing header info for an expression Change exception to warning for expression value/number of alleles check remove advertised exceptions	2014-12-17 22:21:00 -05:00
Phillip Dexheimer	71bdfbe465	Fix VariantsToTable output of FORMAT record lists when -SMA is specified * PT 84242218 * Note that FORMAT fields behave the same as INFO fields - if the annotation has a count of A (one entry per Alt Allele), it is split across the multiple output lines. Otherwise, the entire list is output with each field	2014-12-10 21:41:15 -05:00
Geraldine Van der Auwera	45eddb4ecb	Updated gsalib version to 2.1 for resubmitting with updated license to CRAN	2014-12-09 17:07:48 -05:00
Phillip Dexheimer	a5dee8a42e	Fix NPE in SplitSamFile * PT 82892316 * Added integration test * Fixed similar error in debug output of HC	2014-12-07 10:37:30 -05:00
Alec Wysoker	4fe6ccec98	Add -output-file-extension option to GATKDoclet to produce html instead of php.	2014-12-01 18:06:36 -05:00
Alec Wysoker	62e5d42380	Fix code to filter current directory from paths passed to Reflection library.	2014-12-01 17:45:46 -05:00
Ron Levine	386aeda022	Add HaplotypeCaller argument so integration tests can specify the hardware dependent PairHMM sub-implementation	2014-11-25 21:53:53 -05:00
rpoplin	00027e1555	Merge pull request #774 from broadinstitute/ldg_makeSelectVariantsTrimAlleles Add -trim argument to SelectVariants to trim alleles to minimal represen...	2014-11-13 13:58:13 -05:00
Ron Levine	67656bab23	Resolved conflict during rebasing Add more logging to annotators, change loggers from info to warn Add comments to testStrandBiasBySample() Clarify comments in testStrandBiasBySample remove logic for not processing an indel if strand bias (SB) was not computed remove per variant warnings in annotate() Log warnings if using the wrong annotator or missing a pedigree file Log test failures once in annotate(), because HaplotypeCaller does not call initialize(). Avoid using exceptions Fix so only log once in annotate(), Hardy-Weinberg does not require pedigree files, fix test MD5s so pass Check if founderIds == null Update MD5s from HaplotypeCaller integration tests and clean up code Change logic so SnpEff does not throw exceptions, change engine to utils in imports Update test MD5s, return immediately if cannot annotate in SnpEff.initialization() Post peer review, add more logging warnings Update MD5 for testHaplotypeCallerMultiSampleComplex1, return null if PossibleDeNovo.annotate() is not called by VariantAnnotator	2014-11-12 02:45:49 -05:00
Laura Gauthier	783a4fd651	Change default behavior of SelectVariants to trim remaining alleles when samples are subset. -noTrim argument preserves original alleles. Add test for trimming.	2014-11-11 16:32:25 -05:00
Valentin Ruano-Rubio	c5977e5c8f	Correct wrong left-alignment of reads in HC bamout Story: ----- https://www.pivotaltracker.com/story/show/80684230 Changes: ------- - Corrected the bug: AlignmentUtils#createReadAlignedToRef was not realigning against the reference but the best haplotype for the read. Test: ---- - Added integration test in HaplotypeCallerIntegrationTest to check that the bug has been fixed. - Fixed md5s modified by this change; these are caused by small changes in the state of the random-number generator and read vs variant site overlapping.	2014-11-10 10:09:58 -05:00

4696 Commits (bc3b3ac0ec4b4fd72a9e856470edaeb4c7566a06)