13518 Commits (4f4ff5c3277b53ecb60badbd5c0681d8f589ca72)

Author SHA1 Message Date
kshakir 4f4ff5c327 Merge pull request #682 from broadinstitute/ks_revert_md5_db_per_test
Reverting md5 db per test
2014-07-11 03:42:18 +08:00
Ryan Poplin 193e389b41 Merge pull request #679 from broadinstitute/eb_better_tail_merging_PT74222522
Improved tail merging: now tails can be merged to branches that are not ...
2014-07-10 13:54:33 -04:00
Valentin Ruano Rubio 598b481733 Merge pull request #671 from broadinstitute/vrr_omniploidy_assesment_na12878
Now AssessNA12878 can handle different input ploidy for omniploidy asses...

Story:

  https://www.pivotaltracker.com/story/show/74469346
2014-07-10 13:19:32 -04:00
Khalid Shakir 18f6d56b4c Revert "Using the base directory for each test run when outputting MD5DB mismatches."
This reverts commit f192f032a153755a84b1d682f6e652a7c6787fb9.
2014-07-11 01:11:25 +08:00
Khalid Shakir cc09ef9190 Revert "Appending to md5db in the gatkdir, with additional logging."
This reverts commit 0aa2884f7b006f5d48c325bf942b92c183e45074.
2014-07-11 01:11:20 +08:00
droazen a105cc4c0a Merge pull request #681 from broadinstitute/dr_md5mismatch_generator_script
Shell script to generate md5 mismatches file from a Bamboo log of an integration test run
2014-07-10 10:40:09 -04:00
David Roazen af0aab8791 Shell script to generate md5 mismatches file from a Bamboo log of an integration test run 2014-07-10 10:32:10 -04:00
Eric Banks 1d97b4a191 Improved tail merging: now tails can be merged to branches that are not entirely reference.
This is useful for e.g. cases where there are SNPs on insertions. Previously, tails were forced to be merged
(incorrectly) only to a reference node, but now they can be merged to any path in the graph from which they
directly branch.

Also, I've transferred over Ryan's code to refuse to process kmer sizes for which the reference sequence
contains non-unique kmers.
2014-07-10 08:57:01 -04:00
kshakir aecd34d274 Merge pull request #677 from broadinstitute/ks_md5_db_per_test_type
Appending to md5db in the gatkdir, with additional logging.
2014-07-10 17:53:24 +08:00
Ryan Poplin 5eee065133 Merge pull request #674 from broadinstitute/rp_improve_genotyping
Improvements to genotyping accuracy.
2014-07-09 16:03:09 -04:00
Khalid Shakir a7d1904c63 Appending to md5db in the gatkdir, with additional logging. 2014-07-10 03:58:47 +08:00
Valentin Ruano-Rubio cd2c5ce1b2 Now AssessNA12878 can handle different input ploidy for omniploidy assessment where we may run HC as if it was a haploid, tetraploid (and so forth) organism.
For example, when the input is haploid it is considered ok to have a FN if the actual genotype is 0/1, as there is a 50% chance of not calling it at all.

It also considers the genotype call concordant as long as the AC is as close as it can be to 50% given the ploidy. So for a 0/1 true call it is ok
to have a 0 or 1 call in haploids, 0/0/1/1 in tetraploids, and 0/0/1 or 0/1/1 with triploid input, but it is not ok to have 0/0/0/1 in tetraploids or 0/0/0/0/1/1 with hexaploid input.

Story:

  http://www.pivotaltracker.com/story/show/72090992

Changes:

  AssessNA12878 has a new argument (-ploidy / --inputPloidy) to indicate the expected ploidy of the input.
  By default this is the obvious choice of 2 as NA12878 is human.

  If the input has calls with a different ploidy, it will complain with a user exception.

  Also some refactoring has been done to make the code a bit more concise in some parts.
2014-07-09 14:42:33 -04:00
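The concordance rule described in the commit above (the called AC must be as close as possible to the true allele fraction given the input ploidy) can be sketched as follows. This is a minimal illustration, not the AssessNA12878 code: the class and method names are hypothetical, and the integer cross-multiplication is just one assumed way of comparing fractions without floating-point ties.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative and not taken from the GATK source.
public final class PloidyConcordanceSketch {
    private PloidyConcordanceSketch() {}

    /**
     * Returns every alt allele count (AC) at the input ploidy whose allele
     * fraction is closest to trueAltCount / truePloidy. Ties are all kept.
     */
    public static List<Integer> closestAltCounts(int trueAltCount, int truePloidy, int inputPloidy) {
        List<Integer> best = new ArrayList<>();
        int bestDistance = Integer.MAX_VALUE;
        for (int ac = 0; ac <= inputPloidy; ac++) {
            // |ac/inputPloidy - trueAltCount/truePloidy| compared via cross-multiplication,
            // so that exact ties (e.g. 1/3 vs 2/3 around 1/2) are detected reliably.
            int distance = Math.abs(ac * truePloidy - trueAltCount * inputPloidy);
            if (distance < bestDistance) {
                bestDistance = distance;
                best.clear();
                best.add(ac);
            } else if (distance == bestDistance) {
                best.add(ac);
            }
        }
        return best;
    }

    /** A call is treated as concordant if its AC is one of the closest achievable counts. */
    public static boolean isConcordant(int calledAltCount, int trueAltCount, int truePloidy, int inputPloidy) {
        return closestAltCounts(trueAltCount, truePloidy, inputPloidy).contains(calledAltCount);
    }
}
```

For a diploid 0/1 truth (trueAltCount = 1, truePloidy = 2), this accepts AC 0 or 1 for haploid input, AC 1 or 2 for triploid input, and only AC 2 for tetraploid input, matching the cases listed in the commit message.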
droazen a85bc0b577 Merge pull request #678 from broadinstitute/dr_remove_junit_imports
Remove junit imports in the test suite
2014-07-09 14:40:38 -04:00
Ryan Poplin 74a7674d70 Improvements to genotyping accuracy.
-- Global mismapping penalty was only applied to the reference haplotype. This led to problems with overlapping events, mostly STR haplotypes. Now the penalty is applied to every haplotype.
-- We subset the reads down to only those which overlap the event (after assembly based realignment) for likelihood calculations.
2014-07-09 13:11:07 -04:00
David Roazen 719e685759 Remove junit imports in the test suite 2014-07-09 12:09:27 -04:00
droazen 465c5ed9b0 Merge pull request #676 from broadinstitute/cww_junit_assert
Replace improper use of org.junit imports with org.testng.
2014-07-08 16:12:34 -04:00
Chris Whelan 635b9bd1af Replace improper use of org.junit imports with org.testng. 2014-07-08 16:07:39 -04:00
droazen b2e51838d7 Merge pull request #673 from broadinstitute/ks_md5_db_per_test_run
Creating an MD5DB for each test, instead of overwriting in the top dir
2014-07-08 10:58:59 -04:00
Khalid Shakir 2129aa05d8 Bug fix for poms missing package test artifacts. 2014-07-08 06:34:26 +08:00
Khalid Shakir e5be9c7073 Using the base directory for each test run when outputting MD5DB mismatches. 2014-07-08 06:34:25 +08:00
Ryan Poplin f084fa5ab2 Merge pull request #669 from broadinstitute/eb_improve_decomposition_of_haplotypes_PT71966916
When converting a haplotype to a set of variants we now check for cases that are overly complex.
2014-07-02 10:08:32 -04:00
Eric Banks bad7865078 When converting a haplotype to a set of variants we now check for cases that are overly complex.
In these cases, where the alignment contains multiple indels, we output a single complex
variant instead of the multiple partial indels.

We also re-enable dangling tail recovery by default.
2014-07-01 14:18:59 -04:00
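The check described in the commit above could look roughly like the sketch below. It assumes the haplotype alignment is available as a CIGAR string and that "multiple indels" means more than one insertion or deletion element; both are assumptions for illustration, and the class and method names are hypothetical rather than the actual GATK implementation.

```java
// Hypothetical sketch; the real decomposition logic is not shown in this log.
public final class ComplexAlignmentSketch {
    private ComplexAlignmentSketch() {}

    /** Counts insertion (I) and deletion (D) elements in a CIGAR string such as "10M2I5M3D8M". */
    public static int countIndelElements(String cigar) {
        int count = 0;
        for (int i = 0; i < cigar.length(); i++) {
            char op = cigar.charAt(i);
            if (op == 'I' || op == 'D') {
                count++;
            }
        }
        return count;
    }

    /**
     * Assumed rule: when the alignment contains more than one indel element,
     * report one complex variant instead of several partial indels.
     */
    public static boolean shouldEmitSingleComplexVariant(String cigar) {
        return countIndelElements(cigar) > 1;
    }
}
```

Under this assumed rule, an alignment like "10M2I5M3D8M" would be reported as one complex variant, while "10M2I5M" would still yield a simple insertion.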
ldgauthier 297c2b0651 Merge pull request #670 from broadinstitute/rp_SBBS_min_count
SB tables should be created even if the ref or alt columns have no count...
2014-07-01 08:34:23 -04:00
Ryan Poplin e14bff212d SB tables should be created even if the ref or alt columns have no counts. This is so that FS/SOR will still be calculated when the variant is extremely high or low frequency.
-- Removed long running HC integration test... sorry
2014-06-30 15:19:15 -04:00
Ryan Poplin 5c45641051 Merge pull request #668 from broadinstitute/rp_AD_realigned_reads_8b9d1c1
Reads are now realigned to the most likely haplotype before being used b...
2014-06-30 12:35:40 -04:00
Ryan Poplin 0127799cba Reads are now realigned to the most likely haplotype before being used by the annotations.
-- AD,DP will now correspond directly to the reads that were used to construct the PLs
-- RankSumTests, etc. will use the bases from the realigned reads instead of the original alignments
-- There is now no additional runtime cost to realign the reads when using bamout or GVCF mode
-- bamout mode no longer sets the mapping quality to zero for uninformative reads, instead the read will not be given an HC tag
2014-06-30 10:35:50 -04:00
jmthibault79 dc93507023 Merge pull request #667 from broadinstitute/ks_refactor_doc_scala_package
Refactored DoC scala package
2014-06-25 16:05:30 -04:00
Khalid Shakir f5345e903a Fixed minor pom.xml line that was out of place after sortpom. 2014-06-26 01:01:19 +08:00
Khalid Shakir 7b5f88a49c Refactored DoC custom Queue wrappers to a non-package object.
Now, "mvn verify && mvn verify" should work again.
2014-06-26 00:59:18 +08:00
droazen b935ed0df1 Merge pull request #665 from broadinstitute/ks_force_delete_bad_symlinks
Executing a version of the delete_maven_links.sh
2014-06-25 00:13:05 -04:00
Eric Banks 92028027c9 Merge pull request #666 from broadinstitute/pd_merge_selectvariants_test
Removed redundant SelectVariantsIntegrationTest, merged its only test i...
2014-06-24 21:19:46 -04:00
Phillip Dexheimer 06d619e9aa Removed redundant SelectVariantsIntegrationTest, merged its only test into protected version 2014-06-24 18:59:59 -04:00
Khalid Shakir 45d819a00e For now, executing the delete_maven_links.sh just ahead of creating the symbolic links during the process-test-resources phase.
Better than running it during the "clean" phase, since these users may not run "mvn clean" before attempting to build.
2014-06-25 02:32:15 +08:00
kshakir 42cfa3b53b Merge pull request #664 from broadinstitute/ks_picard_maven_gatk_root
Fixed gatk-root path in private/picard-maven/pom.xml
2014-06-24 18:48:40 +08:00
Khalid Shakir bffc9fbabd Fixed gatk-root path in private/picard-maven/pom.xml 2014-06-24 18:45:51 +08:00
Eric Banks c191103326 Merge pull request #663 from broadinstitute/pd_jexl_user_exception
Recast the "Invalid JEXL expression detected" error in SelectVariants fr...
2014-06-20 17:17:54 -04:00
Phillip Dexheimer 65eeb4a7ab Recast the "Invalid JEXL expression detected" error in SelectVariants from a RuntimeException to a UserException
- PT 68931448
2014-06-20 00:05:23 -04:00
Eric Banks db7dc8ab5f Merge pull request #660 from broadinstitute/pd_catvariants_list
Added functionality to CatVariants to process .list files with -V
2014-06-19 23:54:03 -04:00
Phillip Dexheimer da5e567b73 Added functionality to CatVariants to process .list files with -V
- Pivotal 70305712
2014-06-19 21:46:13 -04:00
Ryan Poplin da1dab6c32 Merge pull request #661 from broadinstitute/jw_allele_balance_gvcf
Enable AB annotation in reference model pipeline. Incorporates patches f...
2014-06-19 13:10:41 -04:00
Eric Banks 41d2a793f0 Merge pull request #662 from broadinstitute/eb_Carlos_Borroto_commits
Eb carlos borroto commits
2014-06-19 12:49:14 -04:00
Eric Banks 1092dd6e25 From Carlos Borroto: switch outputRoot in SplitSamFile to an empty string instead of null. 2014-06-19 11:06:55 -04:00
Eric Banks 9212edba41 From Carlos Borroto: made 'level' in Picard's CalculateHsMetrics Scala Queue extension an argument. 2014-06-19 11:06:50 -04:00
Ryan Poplin 8b75428a90 Enable AB annotation in reference model pipeline. Incorporates patches from John Wallace to public github account 2014-06-19 09:35:04 -04:00
Eric Banks 2df2a153e6 Merge pull request #658 from broadinstitute/ldg_PbyTwithPriors
Updated CalculateGenotypePosteriors to compute genotype posteriors using...
2014-06-18 15:04:39 -04:00
Eric Banks 9640d5d745 Merge pull request #649 from broadinstitute/pd_genotypegvcf_args
Refactored StandardCallerArgumentCollection to expose args to GenotypeGVCFs
2014-06-18 11:36:13 -04:00
Laura Gauthier 2356d5d63f Updated CalculateGenotypePosteriors to compute genotype posteriors using likelihoods from all members of the trio.
(Right now it only works if all members of the trio are called.)
Takes posteriors as input, defaulting to PLs
Added annotations for possible de novos for use in the full genotype refinement pipeline
Added family priors to CGP integration test.
Changed CGP to use PP tag instead of GP tag because posteriors are Phred-scaled. Updated CGP integration test md5s to reflect change.
2014-06-18 11:17:15 -04:00
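The Phred scaling mentioned in the commit above follows the standard definition Q = -10 * log10(p). The sketch below only illustrates that conversion; the class name is hypothetical, and any normalization or rounding that CalculateGenotypePosteriors applies before writing the PP tag is not shown in this log and is not modeled here.

```java
// Hypothetical sketch of the standard Phred conversion; not the CGP code.
public final class PhredScaleSketch {
    private PhredScaleSketch() {}

    /** Converts a probability in (0, 1] to the Phred scale: Q = -10 * log10(p). */
    public static double toPhred(double probability) {
        if (!(probability > 0.0 && probability <= 1.0)) {
            throw new IllegalArgumentException("probability must be in (0, 1]: " + probability);
        }
        // Adding 0.0 turns a possible -0.0 (for p = 1) into plain 0.0.
        return -10.0 * Math.log10(probability) + 0.0;
    }

    /** Converts a non-negative Phred-scaled value back to a probability: p = 10^(-Q/10). */
    public static double fromPhred(double phred) {
        return Math.pow(10.0, -phred / 10.0);
    }
}
```

On this scale a probability of 0.1 corresponds to 10 and 0.001 to 30, so larger values mean less likely genotypes.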
Nigel Delaney 7570666f2a Merge pull request #655 from broadinstitute/nfd_mathutil_opts
Optimization of function to calculate the logged sum of exponentiated values
2014-06-17 17:07:42 -04:00
Nigel Delaney 5e258bfeff Minor optimization to the function that calculates the log of a sum of exponentials.
* Avoids calling Math.pow whenever possible (skips -Inf and 0 values),
which leads to better performance.
2014-06-17 15:26:10 -04:00
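The kind of optimization described in the commit above can be sketched as below, assuming base-10 log values. The class and method names are hypothetical, and reading "0 values" as the zero offset of the maximum element is an interpretation, not something this log confirms.

```java
// Hypothetical sketch; not the actual MathUtils code.
public final class Log10SumSketch {
    private Log10SumSketch() {}

    /** Returns log10(sum over i of 10^log10Values[i]), skipping terms that need no Math.pow call. */
    public static double log10SumLog10(double[] log10Values) {
        double max = Double.NEGATIVE_INFINITY;
        int maxIndex = -1;
        for (int i = 0; i < log10Values.length; i++) {
            if (log10Values[i] > max) {
                max = log10Values[i];
                maxIndex = i;
            }
        }
        if (maxIndex == -1) {
            return Double.NEGATIVE_INFINITY; // empty input, or every value is -Inf
        }
        if (max == Double.POSITIVE_INFINITY) {
            return max;
        }
        // The maximum element has offset 0 and contributes 10^0 = 1 without calling Math.pow.
        double sum = 1.0;
        for (int i = 0; i < log10Values.length; i++) {
            if (i == maxIndex || log10Values[i] == Double.NEGATIVE_INFINITY) {
                continue; // -Inf terms contribute exactly 0, so Math.pow is skipped
            }
            sum += Math.pow(10.0, log10Values[i] - max);
        }
        return max + Math.log10(sum);
    }
}
```

Subtracting the maximum before exponentiating keeps the intermediate values in range, and the two skipped cases avoid Math.pow calls whose results are already known.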
Ryan Poplin 0ce28b78e0 Merge pull request #657 from broadinstitute/rp_ref_model_pipeline_noMQ_indels
Using MQRanksum for indels was a bad idea according to the knowledgebase...
2014-06-16 11:29:11 -04:00