13964 Commits (6e46b3696eb7e64bb15cc81620a414d19d2d5a95)

Author SHA1 Message Date
Ron Levine 6e46b3696e Merge contiguous intervals properly 2015-07-14 15:23:37 -04:00
ldgauthier 45a1d82305 Merge pull request #1041 from broadinstitute/ldg_ContEst
Ported latest (not-yet-public) ContEst into GATK-private
2015-07-10 19:42:03 -04:00
Laura Gauthier 1159cb3aa9 Ported latest (not-yet-public) ContEst into GATK; verified results against Firehose version
Change file paths to put ContEst stuff in cancer directory
2015-07-09 17:06:06 -04:00
Geraldine Van der Auwera 8ea4dcab8d Merge remote-tracking branch 'unstable/master' 2015-07-09 15:17:03 -04:00
kcibul 00526d4624 Merge pull request #1022 from broadinstitute/kc_m2_pon
update results of NA12878 using official ICE PON (same git hash for t…
2015-07-07 16:37:46 -04:00
kcibul e6dff9cc4e Merge pull request #1037 from broadinstitute/ldg_M2_contaminationAnalysis
Document contamination downsampling analysis
2015-07-07 16:36:52 -04:00
Geraldine Van der Auwera c109a953f8 Merge pull request #1029 from broadinstitute/rhl_vqslod_definition
Make VQSLOD definition accurate
2015-07-06 19:52:15 -04:00
Laura Gauthier b6da9366a6 Document contamination downsampling analysis
Add Yossi's Queue script to create synthetic contamination data
2015-07-06 12:42:13 -04:00
kcibul aaf4e33e15 Merge pull request #1028 from broadinstitute/kc_oxog_fixes
Fix Foxog NaN output and add read stats for indels
2015-07-04 10:24:21 -04:00
Kristian Cibulskis fa04024303 fixes NaN output in Foxog (GitHub issue 1025) and also emits read direction stats for indels (issue 1024)
fixed docs
2015-07-03 08:44:17 -04:00
Eric Banks d8e5d663fd Merge pull request #1030 from broadinstitute/rhl_incorrect_rbp
Merge if both GT are phased
2015-06-30 21:13:45 -04:00
Ron Levine 1a7e83fa50 Merge if both GT are phased 2015-06-30 13:03:16 -04:00
Eric Banks 5ea2aff379 Merge pull request #1033 from broadinstitute/eb_fix_spanning_dels_with_new_allele
Update the allele remapping code to handle the new spanning deletion allele.
2015-06-29 22:57:14 -04:00
Eric Banks f994220617 Update the allele remapping code to handle the new spanning deletion allele.
Now that Ron updated the GATK so that we use star to represent spanning
deletions, we need to catch those cases in the code that remaps alleles.
Otherwise, we try to pad the stars and that's just bad.

Added test from actual failing data.
2015-06-29 17:58:22 -04:00
Ron Levine 09686f4595 Make VQSLOD definition accurate 2015-06-25 16:47:50 -04:00
Geraldine Van der Auwera 719bb15340 Merge pull request #1019 from broadinstitute/rhl_var_index_param_gz
Indexing parameters not required if output file has the g.vcf.gz exte…
2015-06-17 14:30:20 -04:00
Eric Banks a4987310ae Merge pull request #1014 from broadinstitute/gg_fix_combinevariants_del_allele_1000
Added else clause to handle symbolic alleles
2015-06-17 12:52:18 -04:00
Geraldine Van der Auwera 697c4b0cf1 Added else clause to handle symbolic alleles
Add test for createAlleleMapping
2015-06-17 10:52:56 -04:00
Eric Banks 29ebfc32c3 Merge pull request #1020 from broadinstitute/eb_handle_multiple_spanning_dels
Handle cases where a given sample has multiple spanning deletions.
2015-06-16 14:20:46 -04:00
Eric Banks fe0b5e0fbe Handle cases where a given sample has multiple spanning deletions.
When a sample has multiple spanning deletions and we are asked to assign
likelihoods to the spanning deletion allele, we currently choose the first
deletion.  Valentin pointed out that this isn't desired behavior.  I
promised Valentin that I would address this issue, so here it is.

I do not believe that the correct thing to do is to sum the likelihoods
over all spanning deletions (I came up with problematic cases where this
breaks down).

So instead I'm using a simple heuristic approach: using the hom alt PLs, find
the most likely spanning deletion for this position and use its likelihoods.

In the 10K-sample VCF from Monkol there were only 2 cases where this problem
popped up.  In both cases the heuristic approach works well.
2015-06-16 12:20:43 -04:00
Kristian Cibulskis 7018fd7203 update results of NA12878 using official ICE PON (same git hash for the caller) 2015-06-16 10:09:36 -04:00
kcibul 578d429348 Merge pull request #1017 from broadinstitute/ldg_contaminationDS
Enable contamination correction via downsampling (as for HaplotypeCal…
2015-06-15 14:37:10 -04:00
Laura Gauthier ce5ecf1383 Enable contamination correction via downsampling (as for HaplotypeCaller), added test
Add oxoG read count annotation and add as default annotation
Add ##SAMPLE VCF header line in accordance with TCGA VCF spec, specifying "File" line in sample header with BAM file name and "SampleName" with BAM sample name (Don't print sample file path if --no_cmdline_in_header is specified to help with test consistency)
Turn on active region assembly-based physical phasing for M2
Clean up M2-related annotations so UG doesn't crash if M2 annotations are called
2015-06-15 07:59:15 -04:00
Eric Banks 9522be8762 Merge pull request #1016 from broadinstitute/rhl_allele_rep_span_dels
Add spanning deletions allele
2015-06-13 22:12:23 -04:00
Ron Levine b35085ca28 Indexing parameters not required if output file has the g.vcf.gz extension 2015-06-13 11:46:56 -04:00
Ron Levine dbed660183 Add spanning deletions allele 2015-06-12 16:43:06 -04:00
Geraldine Van der Auwera 456fefa860 Merge pull request #1001 from broadinstitute/jw_clarify_overlaping_contigs
Changed error message for Contigs Out of Order
2015-06-12 15:03:10 -04:00
Joseph White 398dc7a123 Changed error message for Contigs Out of Order
Changed confusing error message for out of order contigs

Updated Exception message.
2015-06-11 21:46:06 -04:00
Geraldine Van der Auwera 2a7f95eddb Merge pull request #1009 from broadinstitute/gg_patch_depthofcoverage_#1002
User (mnw21cam) patch to fix DoC slowdown in 3.4
2015-06-10 11:16:08 -04:00
kcibul aad89cd653 Merge pull request #1005 from broadinstitute/kc_m2_pon
created panel of normals queue creation script and instructions
2015-06-09 11:10:17 -04:00
droazen 5e3f3d69db Merge pull request #1012 from broadinstitute/rhl_build_vec_pairhmm_lib
Built VectorLoglessPairHMM lib with icc with gcc 4.4.7
2015-06-08 15:25:57 -04:00
Geraldine Van der Auwera 95f2899f05 User (mnw21cam) patch to fix DoC slowdown in 3.4 2015-06-05 21:12:46 -04:00
Louis Bergelson 588d6f1180 Merge pull request #1013 from lbergelson/patch-1
fix typo in queue arguments
2015-06-05 19:27:51 -04:00
Louis Bergelson ebdda72c88 fix typo in queue arguments 2015-06-05 17:06:23 -04:00
Ron Levine 40d8fb99a3 Built VectorLoglessPairHMM lib with icc with gcc 4.4.7 2015-06-05 15:38:25 -04:00
droazen 847c832ef9 Merge pull request #999 from broadinstitute/rhl_load_vector_pair_hmm
Fix loading of VectorLoglessPairHMM by rolling back to Intel's lib version
2015-06-04 12:54:59 -04:00
Kristian Cibulskis 5ceb63cc35 created panel of normals queue creation script and instructions
increased runtime java memory, changed default PON for NN to be new ICE PON

updated FP rates, when using new default PON.  SNPs up by ~3%, INDELs down by 40%

updated git hash reference
2015-06-02 16:23:10 -04:00
Geraldine Van der Auwera 526f7c0d07 Merge pull request #985 from broadinstitute/sa_refactor_cleansing_hack_negative_zeros_973_depends_on_841
removed in-line conditional (hack) that changed the result from 0.0 to -0.0; see issue #841
2015-05-23 00:02:52 -04:00
Eric Banks 27d3bafcbd Merge pull request #997 from broadinstitute/eb_add_foreign_read_filter
Added a new filter that can be used to remove reads that are too smal…
2015-05-22 14:34:28 -04:00
Eric Banks 8c81e7df95 Added a new filter that can be used to remove reads that are too small and overly clipped. 2015-05-22 14:33:35 -04:00
Ron Levine 3b0cb028e6 Fix loading of VectorLoglessPairHMM by rolling back to Intel's lib version 2015-05-22 14:16:00 -04:00
Geraldine Van der Auwera 7f306bc4b6 Merge pull request #980 from broadinstitute/Sheila_QD_Update
Sheila qd update
2015-05-22 12:04:43 -04:00
Sheila Chandran dac0b8ddfc Added QD calculation 2015-05-22 11:59:10 -04:00
Geraldine Van der Auwera e96e52ee9d Merge pull request #986 from broadinstitute/rhl_select_genotype_filter_status
Site-level selection based on genotype filter status
2015-05-22 09:59:00 -04:00
Ron Levine a6ca97ef14 Site-level selection based on genotype filter status 2015-05-21 11:27:20 -04:00
melonistic 8d25b2ba40 removed in-line conditional (hack) that changed the result from 0.0 to -0.0; see issue #841
removed irrelevant -0 comments as specified in issue #841 but committed in #973
2015-05-16 23:12:09 -04:00
kcibul 28a7ea43ec Merge pull request #982 from broadinstitute/kc_fp_analysis
added "artifact detection mode" for PON creation
2015-05-15 07:45:20 -04:00
Kristian Cibulskis 3b1ee17727 added "artifact detection mode" for PON creation
added "str_contraction" artifact filter (improves specificity, especially in exomes)
refactored out VCF constants and added descriptions

added new dream evaluation markdown

added results for SMC 4

fixed up documentation, moved location to /dsde/working/mutect/dream_smc, and checked in scala script

fixed bug which would overwrite germline_risk filter errors
updated "how to" documents and records

fixed license text

thinned down FP regression test from 700 sites to 100.  we have better ways (DREAM, NN) to check accuracy of the method and 100 is good enough to catch regressions

why oh why do the MD5-based unit tests produce different results on different machine architectures?  I hate that :/

Thanks to GG, LDG and DR -- test should now produce the same results regardless of machine architecture

disabled downsampling... hopefully in the final attempt to make this work cross architecture!

enforced LOGLESS_CACHING... hopefully in the final final attempt to make this work cross architecture!
2015-05-15 07:14:33 -04:00
Geraldine Van der Auwera d1a7edd796 Update pom versions to mark the start of GATK 3.5 development 2015-05-15 00:44:54 -04:00
Geraldine Van der Auwera f19618653a Update pom versions for the 3.4 release 2015-05-15 00:40:39 -04:00