Commit Graph

4562 Commits (19bbe45cbc3960bcb70b6bfb74bdb4e73e762851)

Author SHA1 Message Date
Geraldine Van der Auwera 19bbe45cbc Updated licenses for 2015 2015-08-06 15:23:11 -04:00
David Benjamin ddb01058d3 moved DiffObjects 2015-08-05 21:19:02 -04:00
Geraldine Van der Auwera 875c7ffa1a Fixed typos and made some argument docs improvements 2015-07-29 23:06:19 -04:00
Louis Bergelson 9d9827f176 Merge pull request #1031 from broadinstitute/lb_update_for_java8
Updated gatk so it compiles with java 8
2015-07-28 11:09:19 -04:00
vruano 8f6daf70db Refactoring of ReferenceConfidenceModel likelihood calculation in non variant sites
Changed a division by -10.0 to a multiplication by -.1 in QualUtils (typically multiplication is faster than division).

Addresses performance issue #1081.
2015-07-26 08:33:46 -04:00
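Note on the QualUtils change in 8f6daf70db: a minimal sketch of the two arithmetic forms, assuming a Phred-scaled quality Q maps to a log10 error probability of -Q/10. The class and method names are hypothetical and are not the actual QualUtils code.

```java
// Illustrative sketch only; names are hypothetical, not the GATK QualUtils API.
final class QualToLog10Sketch {

    // Original form: one floating-point division per call.
    static double log10ErrorProbByDivision(final double qual) {
        return qual / -10.0;
    }

    // Rewritten form: one floating-point multiplication per call.
    static double log10ErrorProbByMultiplication(final double qual) {
        return qual * -0.1;
    }

    public static void main(final String[] args) {
        // Print both forms side by side for a few typical base qualities.
        for (final int qual : new int[] {0, 10, 20, 30, 40}) {
            System.out.println(qual + "\t"
                    + log10ErrorProbByDivision(qual) + "\t"
                    + log10ErrorProbByMultiplication(qual));
        }
    }
}
```

Because 0.1 is not exactly representable in binary floating point, the two forms can differ in the last bit for some inputs, so any comparison between them is best made with a small tolerance.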
David Roazen 5fd3d2be76 Move swapExt() methods to QScriptUtils, have versions in QScript class call into the util versions 2015-07-23 10:23:55 -04:00
Valentin Ruano Rubio 66cf22b28f Merge pull request #1069 from broadinstitute/vrr_ad_genotype_gvcfs_bugfix
Fix AD propagation when subsetting alleles in non-diploid GenotypeGVCF.
2015-07-22 18:53:43 -04:00
vruano 315e193e51 Fix AD propagation when subsetting alleles in non-diploid GenotypeGVCF.
Addresses issue #913.

Also remove some commented out code and toxic debugging code that uses System.out/err.println.
2015-07-22 17:08:13 -04:00
Joseph White 3bd988825f Removed walkers for handling Beagle data
Added deprecation statements to DeprecatedToolChecks.java
Removed integration test for Beagle walker
Added URL for Beagle documentation
2015-07-21 18:36:08 -04:00
Eric Banks 178bf12b27 Merge pull request #1046 from broadinstitute/rhl_catvariants_sort
Fix for mis-sorted VCF files in CatVariants
2015-07-21 17:37:27 -04:00
Valentin Ruano Rubio 9360e1d293 Merge pull request #1059 from broadinstitute/vrr_true_false_list_removal
More efficient implementation of the indel read qualities recalculati…
2015-07-21 17:13:45 -04:00
vruano 82f1236633 More efficient implementation of the indel read qualities recalculation for the PCR error model.
Addresses #1054.
2015-07-21 14:25:11 -04:00
Ron Levine 6e46b3696e Merge contiguous intervals properly 2015-07-14 15:23:37 -04:00
John Wallace 8fc631b7ae Fix for mis-sorted VCF files in CatVariants
When using CatVariants, VCF files were being sorted solely on the base
pair position of the first record, ignoring the chromosome.  This can
become problematic when merging files from different chromosomes,
especially if you have multiple VCFs per chromosome.

As an example, assume the following 3 lines are all in separate files:
1       10
1       100
2       20

The merged VCF from CatVariants (without -assumeSorted) would read:
1       10
2       20
1       100

This has the potential to break tools that expect chromosomes to be
contiguous within a VCF file.

This commit changes the comparator from one of Pair<Integer, File> to
one of Pair<VariantContext, File>.  We construct a
VariantContextComparator from the provided reference, which will sort
the first record by chromosome and position properly.  Additionally, if
-assumeSorted is given, we simply use a null VariantContext as the first
record, which will all be equal (as all will be null)
2015-07-14 14:12:31 -04:00
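Note on the CatVariants fix in 8fc631b7ae: the ordering problem described above can be reproduced with a small sketch. It assumes a simplified stand-in for the first record of each file (contig name plus start position) and a contig order taken from the reference. `FirstRecord`, `byPositionOnly` and `byContigThenPosition` are hypothetical names, and the real fix uses a VariantContextComparator rather than this code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only; not the CatVariants implementation.
final class FirstRecordSortSketch {

    // Hypothetical stand-in for the first record of one VCF file.
    static final class FirstRecord {
        final String contig;
        final int start;

        FirstRecord(final String contig, final int start) {
            this.contig = contig;
            this.start = start;
        }

        @Override
        public String toString() {
            return contig + ":" + start;
        }
    }

    // Old behaviour: compare on base pair position alone, ignoring the contig.
    static Comparator<FirstRecord> byPositionOnly() {
        return Comparator.comparingInt((FirstRecord r) -> r.start);
    }

    // New behaviour: compare on contig order in the reference first, then on position.
    static Comparator<FirstRecord> byContigThenPosition(final List<String> contigOrder) {
        return Comparator.comparingInt((FirstRecord r) -> contigOrder.indexOf(r.contig))
                .thenComparingInt(r -> r.start);
    }

    // Returns a sorted copy so the input list is left untouched.
    static List<FirstRecord> sorted(final List<FirstRecord> records, final Comparator<FirstRecord> order) {
        final List<FirstRecord> copy = new ArrayList<>(records);
        copy.sort(order);
        return copy;
    }

    public static void main(final String[] args) {
        final List<FirstRecord> records = Arrays.asList(
                new FirstRecord("1", 10), new FirstRecord("1", 100), new FirstRecord("2", 20));
        System.out.println(sorted(records, byPositionOnly()));
        System.out.println(sorted(records, byContigThenPosition(Arrays.asList("1", "2"))));
    }
}
```

With the three example records from the commit message, the position-only comparator interleaves the chromosomes, while the contig-aware comparator keeps each chromosome contiguous.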
Louis Bergelson e1c41b2c38 Updated gatk so it compiles on java 8
updated cofoja to 1.2 from 1.0
added explicit type casts in places that java 8 required them
2015-06-26 15:59:46 -04:00
Ron Levine 09686f4595 Make VQSLOD definition accurate 2015-06-25 16:47:50 -04:00
Geraldine Van der Auwera 719bb15340 Merge pull request #1019 from broadinstitute/rhl_var_index_param_gz
Indexing parameters not required if output file has the g.vcf.gz exte…
2015-06-17 14:30:20 -04:00
Geraldine Van der Auwera 697c4b0cf1 Added else clause to handle symbolic alleles
Add test for createAlleleMapping
2015-06-17 10:52:56 -04:00
Laura Gauthier ce5ecf1383 Enable contamination correction via downsampling (as for HaplotypeCaller), added test
Add oxoG read count annotation and add as default annotation
Add ##SAMPLE VCF header line in accordance with TCGA VCF spec, specifying "File" line in sample header with BAM file name and "SampleName" with BAM sample name (Don't print sample file path if --no_cmdline_in_header is specified to help with test consistency)
Turn on active region assembly-based physical phasing for M2
Clean up M2-related annotations so UG doesn't crash if M2 annotations are called
2015-06-15 07:59:15 -04:00
Ron Levine b35085ca28 Indexing parameters not required if output file has the g.vcf.gz extension 2015-06-13 11:46:56 -04:00
Ron Levine dbed660183 Add spanning deletions allele 2015-06-12 16:43:06 -04:00
Joseph White 398dc7a123 Changed error message for Contigs Out of Order
Changed confusing error message for out of order contigs

Updated Exception message.
2015-06-11 21:46:06 -04:00
Geraldine Van der Auwera 2a7f95eddb Merge pull request #1009 from broadinstitute/gg_patch_depthofcoverage_#1002
User (mnw21cam) patch to fix DoC slowdown in 3.4
2015-06-10 11:16:08 -04:00
droazen 5e3f3d69db Merge pull request #1012 from broadinstitute/rhl_build_vec_pairhmm_lib
Built VectorLoglessPairHMM lib with icc with gcc 4.4.7
2015-06-08 15:25:57 -04:00
Geraldine Van der Auwera 95f2899f05 User (mnw21cam) patch to fix DoC slowdown in 3.4 2015-06-05 21:12:46 -04:00
Louis Bergelson ebdda72c88 fix typo in queue arguments 2015-06-05 17:06:23 -04:00
Ron Levine 40d8fb99a3 Built VectorLoglessPairHMM lib with icc with gcc 4.4.7 2015-06-05 15:38:25 -04:00
droazen 847c832ef9 Merge pull request #999 from broadinstitute/rhl_load_vector_pair_hmm
Fix loading of VectorLoglessPairHMM by rolling back to Intel's lib version
2015-06-04 12:54:59 -04:00
Eric Banks 27d3bafcbd Merge pull request #997 from broadinstitute/eb_add_foreign_read_filter
Added a new filter that can be used to remove reads that are too smal…
2015-05-22 14:34:28 -04:00
Eric Banks 8c81e7df95 Added a new filter that can be used to remove reads that are too small and overly clipped. 2015-05-22 14:33:35 -04:00
Ron Levine 3b0cb028e6 Fix loading of VectorLoglessPairHMM by rolling back to Intel's lib version 2015-05-22 14:16:00 -04:00
Ron Levine a6ca97ef14 Site-level selection based on genotype filter status 2015-05-21 11:27:20 -04:00
Kristian Cibulskis 3b1ee17727 added "artifact detection mode" for PON creation
added "str_contraction" artifact filter (improves specificity, especially in exomes)
refactored out VCF constants and added descriptions

added new dream evaluation markdown

added results for SMC 4

fixed up documentation, moved location to /dsde/working/mutect/dream_smc, and checked in scala script

fixed bug which would overwrite germline_risk filter errors
updated "how to" documents and records

fixed license text

thinned down FP regression test from 700 sites to 100.  we have better ways (DREAM, NN) to check accuracy of the method and 100 is good enough to catch regressions

why oh why do the MD5-based unit tests produce different results on different machine architectures?  I hate that :/

Thanks to GG, LDG and DR -- test should now produce the same results regardless of machine architecture

disabled downsampling... hopefully in the final attempt to make this work cross architecture!

enforced LOGLESS_CACHING... hopefully in the final final attempt to make this work cross architecture!
2015-05-15 07:14:33 -04:00
Geraldine Van der Auwera d1a7edd796 Update pom versions to mark the start of GATK 3.5 development 2015-05-15 00:44:54 -04:00
Geraldine Van der Auwera f19618653a Update pom versions for the 3.4 release 2015-05-15 00:40:39 -04:00
Geraldine Van der Auwera 8b20523f5e Merge pull request #979 from broadinstitute/ami-fixASE-bug
Fix bug: now also works when the reads do not have a mate
2015-05-14 21:09:52 -04:00
David Roazen caafe84e74 Rev htsjdk to version 1.132 and picard to version 1.131, and switch to using the versions in maven central
-We now pull htsjdk and picard from maven central.

-Updated the GATK codebase as necessary to adapt to changes in the Feature
 interface.

-Since VCFHeader now requires that all header lines have unique keys, uniquified
 the keys of GVCFBlock header lines by including the min/max GQ in the key.
 Updated MD5s accordingly.

-Other MD5s changed as a result of an htsjdk fix to eliminate "-0" in VCF output.
2015-05-14 15:26:23 -04:00
Ami Levy-Moonshine 536d550794 Fix bug: now also works when the reads do not have a mate
reads with no mate will be counted as valid reads
2015-05-12 17:51:01 -04:00
Ron Levine 4a75d54e65 Added invert and exclude flags for variant selection queries 2015-05-12 15:08:28 -04:00
Geraldine Van der Auwera 7a75f4ae79 Merge pull request #974 from broadinstitute/jw_Var2BinPEDSwap
Correct errant array element swap in FAM file output.
2015-05-12 08:49:16 -04:00
Eric Banks 53a34cea4a Merge pull request #938 from broadinstitute/eb_fix_spanning_deletions_in_genotyping
Added a fix for genotyping positions over spanning deletions.
2015-05-11 23:11:47 -04:00
Joseph White abb6bc6f57 Correct errant array element swap in FAM file output.
dad and mom are swapped; paternal first, then maternal

updated MD5 chksums for test files

remove commented lines
2015-05-11 20:45:50 -04:00
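Note on the FAM output fix in abb6bc6f57: a minimal sketch of the column order, assuming the standard PLINK .fam layout (family ID, individual ID, paternal ID, maternal ID, sex, phenotype). `famLine` is a hypothetical helper, not the code changed by this commit.

```java
// Illustrative sketch only; the helper below is hypothetical.
final class FamLineSketch {

    // Builds one .fam line; the paternal ID comes before the maternal ID.
    static String famLine(final String familyId, final String individualId,
                          final String paternalId, final String maternalId,
                          final int sex, final String phenotype) {
        return String.join("\t",
                familyId, individualId, paternalId, maternalId,
                Integer.toString(sex), phenotype);
    }

    public static void main(final String[] args) {
        // Sex code 1 is male in PLINK; -9 marks a missing phenotype.
        System.out.println(famLine("FAM1", "child1", "dad1", "mom1", 1, "-9"));
    }
}
```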
Eric Banks 530e0e5ea6 Added a fix for combining/genotyping positions over spanning deletions.
Previously, if a SNP occurred in sample A at a position that was in the middle of a deletion for sample B,
sample B would be genotyped as homozygous reference there (but it's NOT reference - there's a deletion).
Now, sample B is genotyped as having a symbolic DEL allele.

Minor cleanup added.  Note that I also removed Laura's previous fix for this problem.

Existing integration tests change because I've added a new header line to the VCF being output.
I also added several tests for the new functionality showing:
1. genotyping from separate and already combined gvcfs give the same output
2. genotyping over multiple spanning deletions works
3. combining works too

Existing unit tests also cover this case.
2015-05-11 15:11:16 -04:00
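Note on the spanning deletion fix in 530e0e5ea6: a rough sketch of the decision described above, assuming the usual VCF convention that a deletion record's POS is the anchor base and the deleted bases follow it. All names are hypothetical, the `<DEL>` spelling of the symbolic allele is a placeholder assumption, and the real genotyping code is considerably more involved.

```java
// Illustrative sketch only; not the GATK genotyping code.
final class SpanningDeletionSketch {

    // Placeholder for the symbolic deletion allele (spelling is an assumption).
    static final String SYMBOLIC_DEL = "<DEL>";

    // True when 'site' falls on a deleted base of a deletion whose record starts at
    // 'deletionPos' with a REF allele of 'refAlleleLength' bases (anchor base included).
    static boolean isSpannedByDeletion(final int site, final int deletionPos, final int refAlleleLength) {
        final int lastDeletedBase = deletionPos + refAlleleLength - 1;
        return site > deletionPos && site <= lastDeletedBase;
    }

    // Allele reported for a sample with a homozygous deletion when another sample has a SNP at 'site'.
    static String alleleAtSite(final int site, final int deletionPos,
                               final int refAlleleLength, final String referenceBase) {
        return isSpannedByDeletion(site, deletionPos, refAlleleLength) ? SYMBOLIC_DEL : referenceBase;
    }

    public static void main(final String[] args) {
        // Deletion record at position 100 with a 4-base REF allele deletes bases 101..103.
        System.out.println(alleleAtSite(102, 100, 4, "G"));
        System.out.println(alleleAtSite(104, 100, 4, "G"));
    }
}
```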
Geraldine Van der Auwera 5d8b9a7c20 Moved MQ0 out of HC exclusion and into StandardUGAnnotation 2015-05-03 01:04:49 +02:00
Geraldine Van der Auwera 071d82d1bf Un-exclude SD and TRA from HC annotators; resolves #966
Exclude MQ0BySample
Move SD and TRA to new StandardUGAnnotation interface
There is now an annotation interface (StandardUGAnnotation) holding annotations that are standard in UG but shouldn't be used as they are now with HC. This allows us to not have to exclude these annotations explicitly in HC, but still be able to use them for development purposes.
2015-05-03 00:45:53 +02:00
Geraldine Van der Auwera e49f6dfd0f Merge pull request #970 from broadinstitute/gg_minor_docfixes
Fairly minor if plentiful fixes to various gatkdocs. Merging this without formal review since all tests pass, the gatkdocs build, and no one really wants to review corrections to grammar, typos and layout for 120+ documents. Review will be done by users in production ;-)
2015-05-03 00:36:12 +02:00
Geraldine Van der Auwera 919c3eaa2e Numerous doc fixes; mostly formatting and clarifications 2015-05-03 00:28:46 +02:00
Geraldine Van der Auwera fddc5331e1 Merge pull request #965 from broadinstitute/gg_nsubtil_clamp_hmm_fix
Clamp the HMM window starting coordinate to 1 instead of 0
2015-05-01 22:18:20 +02:00
Ron Levine 9ff827c83a More allele trimming for VariantAnnotator 2015-04-29 21:11:49 -04:00
Geraldine Van der Auwera f2b34d0823 Clamp the HMM window starting coordinate to 1 instead of 0 2015-04-30 01:37:20 +02:00
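Note on the clamp fix in f2b34d0823: a minimal sketch of clamping a window start, assuming 1-based coordinates in which 0 is not a valid position. `clampedWindowStart` and its parameters are hypothetical names, not the code touched by this commit.

```java
// Illustrative sketch only; names are hypothetical.
final class WindowClampSketch {

    // Start of a window extending 'padding' bases to the left of 'position',
    // never going below the first valid 1-based coordinate.
    static int clampedWindowStart(final int position, final int padding) {
        return Math.max(1, position - padding);
    }

    public static void main(final String[] args) {
        System.out.println(clampedWindowStart(5, 10));   // window would start before the contig
        System.out.println(clampedWindowStart(100, 10)); // window fits entirely on the contig
    }
}
```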