Commit Graph

13117 Commits (cecdd2f2c5c03083d877af4eb0f8739fbdabb43e)

Author SHA1 Message Date
amilev cecdd2f2c5 Merge pull request #539 from broadinstitute/eb_hard_clip_exon_overhangs_for_ami
Add the capability to the N-cigar splitter to also hard-clip off overhan...
2014-03-03 12:23:11 -05:00
MauricioCarneiro f7d10b9781 Merge pull request #544 from broadinstitute/eb_archive_reduce_reads
Moving Reduce Reads to the archive.
2014-03-03 11:29:12 +09:00
Eric Banks 6c872308d8 Add the capability to the N-cigar splitter to also hard-clip off overhangs based on observed split positions.
We use a "manager" to keep track of observed splits and previous reads.  This can be extended/modified in the
future to try to salvage those overhangs instead of hard-clipping them and/or try other possible strategies.

Added unit tests and more integration tests.
2014-03-02 21:10:34 -05:00
Eric Banks 22ad18b919 Moving Reduce Reads to the archive.
The GATK now fails with a user error if you try to run with a reduced bam.
(I added a unit test for that; everything else here is just the removal of all traces of RR)
2014-03-02 02:03:14 -05:00
Eric Banks 293234a8dc Merge pull request #540 from broadinstitute/eb_add_ability_to_ignore_individual_filters
Add an option to AssessNA12878 to be able to ignore one or more specific...
2014-03-01 22:27:10 -05:00
Eric Banks db85dc6fc0 Add an option to AssessNA12878 to be able to ignore one or more specific filters (instead of either all or none).
Useful in conjunction with ROCCurveNA12878 in determining a good VQSR cut.
2014-03-01 22:25:46 -05:00
kshakir e16996d881 Merge pull request #543 from broadinstitute/ks_mvn_gc_config
Attempting to limit GC during Maven tests
2014-03-01 23:09:21 +07:00
Khalid Shakir 387188e5bb Attempting to limit gc during Maven tests, using defaults found in JavaCommandLineFunction
2014-03-01 15:24:45 +08:00
cwhelan 523eeecc15 Merge pull request #537 from broadinstitute/cw_duplicatevcfcheck_66084436
Added command line checks for duplicate files in ROD lists
2014-02-27 13:39:09 -05:00
Chris Whelan e61ba8b340 Added command line checks for duplicate files in ROD lists
-- Keep a list of processed files in ArgumentTypeDescriptor.getRodBindingsCollection
  -- Throw user exception if a file name duplicates one that was previously parsed
  -- Throw user exception if the ROD list is empty
  -- Added two unit tests to RodBindingCollectionUnitTest
2014-02-27 13:32:18 -05:00
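The checks listed in commit e61ba8b340 can be illustrated with a minimal Python sketch. This is not the GATK code in ArgumentTypeDescriptor.getRodBindingsCollection; the names `UserError` and `check_rod_files` are hypothetical and exist only for this illustration.

```python
class UserError(Exception):
    """Stand-in for a user-facing error; the name is hypothetical."""


def check_rod_files(file_names):
    """Return the file names if the list is non-empty and free of duplicates."""
    if not file_names:
        raise UserError("The ROD list is empty")
    seen = set()  # file names processed so far
    for name in file_names:
        if name in seen:
            raise UserError(f"Duplicate file in ROD list: {name}")
        seen.add(name)
    return list(file_names)
```

Keeping a set of already processed names lets the first repeated file be reported as soon as it is encountered.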
Eric Banks 4395d25726 Merge pull request #538 from broadinstitute/ks_integration_test_fix
Fixes test counts, and full paths of diff commands
2014-02-26 21:04:06 -05:00
Khalid Shakir da587d48ed Using absolute paths in generated diff commands, to ease running them from any directory.
2014-02-27 04:43:39 +08:00
Khalid Shakir c163e6d0d2 Separate failsafe directories for each of the integration test types [#66515572]
2014-02-27 04:43:39 +08:00
Eric Banks 84d8b0e9a1 Merge pull request #535 from broadinstitute/ks_pd_queuelogdir_gatherbam_patches
Ks pd queuelogdir gatherbam patches
2014-02-26 08:55:10 -05:00
Khalid Shakir f02ce6eca7 Added tests for cleaning up scattered .bai files, and using the log directory.
Re-added import java.io.File for BamGatherFunction.
Other cleanup to resolve scala syntax warnings from intellij.
Moved Example UG script from protected to public.
2014-02-26 02:11:28 +08:00
pdexheimer 0405afeab2 Inherit BamGatherFunction from MergeSamFiles rather than PicardBamFunction
- This change means that BamGatherFunction will now have an @Output field for the BAM index, which will allow the bai to be deleted for intermediate functions

Signed-off-by: Khalid Shakir <kshakir@broadinstitute.org>
2014-02-26 02:11:28 +08:00
pdexheimer 504c125c26 Ensure .out files are saved into logDirectory
Signed-off-by: Khalid Shakir <kshakir@broadinstitute.org>
2014-02-26 02:11:28 +08:00
pdexheimer 51dcd364a5 Added logDirectory argument
Signed-off-by: Khalid Shakir <kshakir@broadinstitute.org>
2014-02-26 02:11:28 +08:00
kshakir e340b6237a Merge pull request #534 from broadinstitute/ks_queue_private_link_qscripts
Added missing private qscripts symbolic links to queue-private module.
2014-02-25 06:16:50 -05:00
Khalid Shakir a90745bbe5 Added missing private qscripts symbolic links to queue-private module.
2014-02-25 17:46:47 +08:00
Eric Banks b1885d449b Merge pull request #533 from broadinstitute/eb_normalize_FS_contingency_table
Stopgap procedure to rescue Fisher Strand for cases where there's lots of data.
2014-02-25 02:18:01 -05:00
Eric Banks 0f30df0356 Stopgap procedure to rescue Fisher Strand for cases where there's lots of data.
This commit consists of 2 main changes:
1. When the strand table gets too large, we normalize it down to values that are more reasonable.
2. We don't include a particular sample's contribution unless the total ref and alt counts are at least 2 each;
this is a heuristic method for dealing only with hets.

MD5s change as expected.
Hopefully we'll have a more robust implementation for GATK 3.1.
2014-02-25 01:04:27 -05:00
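The two changes described in commit 0f30df0356 can be sketched roughly as follows, in Python rather than GATK's Java. The commit message does not give the size limit or the exact scaling rule, so `target_size=200` and the proportional integer scaling are assumptions made for illustration; only the "at least 2 each" rule comes from the message. The function names are hypothetical.

```python
def normalize_table(table, target_size=200):
    """Scale a 2x2 strand table down when its total exceeds target_size.

    target_size is an assumed value, not the one used in the commit.
    """
    total = sum(sum(row) for row in table)
    if total <= target_size:
        return [list(row) for row in table]
    # Integer arithmetic keeps the counts whole and preserves their proportions.
    return [[count * target_size // total for count in row] for row in table]


def add_sample(table, ref_fwd, ref_rev, alt_fwd, alt_rev, min_count=2):
    """Add one sample's counts only if ref and alt totals are each >= min_count."""
    if ref_fwd + ref_rev < min_count or alt_fwd + alt_rev < min_count:
        return False  # sample does not look heterozygous; skip it
    table[0][0] += ref_fwd
    table[0][1] += ref_rev
    table[1][0] += alt_fwd
    table[1][1] += alt_rev
    return True
```

Scaling all four cells by the same factor leaves the strand proportions roughly intact while keeping the totals small.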
droazen e8ea9f58d3 Merge pull request #531 from broadinstitute/ks_build_patches
Build patches
2014-02-24 15:13:16 -05:00
Valentin Ruano Rubio 1c7eac50fc Merge pull request #532 from broadinstitute/vrr_graphbased_there_is_no_such_edge_fix
Fix for a bug in (Assembly Graph) Routes.
2014-02-24 12:08:47 -05:00
Valentin Ruano-Rubio 0b3a70b8c1 Fix for a bug in (Assembly Graph) Routes.
The slicePrefix method functionality was broken.

Story:

https://www.pivotaltracker.com/story/show/64595624

Changes:

1. Fixed the bug.
2. Added unit test to check on the method functionality.
3. Added an integration test to verify the bug has been fixed in a reproducible case from empirical data.
2014-02-24 10:54:39 -05:00
Khalid Shakir 7e516b294f Replaced local drmaa and Jama artifacts with versions from maven central.
Removed unused caliper binary from local repo.
2014-02-22 01:21:35 +08:00
Khalid Shakir 9b7fc37b14 Moved private/scala/test to private/queue-private/src/test/scala
Added junction/symbolic links so that queue-private tests will run.
2014-02-22 01:21:35 +08:00
Khalid Shakir a75043b207 When git describe fails use "exported" instead of "unknown".
2014-02-22 01:21:35 +08:00
Khalid Shakir 4670c87313 Fixed mvn run for packagetests over external-example.
2014-02-22 01:21:34 +08:00
Khalid Shakir 70ecce2a0f Fixed scope for test-jar dependencies.
2014-02-22 01:21:34 +08:00
Valentin Ruano Rubio a567a4d42c Merge pull request #530 from broadinstitute/vrr_gvcf_enabling_allele_trimming
Activate reverse allele trimming in GVCF
2014-02-20 04:26:25 -05:00
Valentin Ruano-Rubio 463af7143f Activate reverse allele trimming in GVCF
Story:

https://www.pivotaltracker.com/s/projects/1007536

Changes:

1. HC's GenotypingEngine now invokes reverseAlleleTrimming on GVCF variant output lines.
2. GenotypeGVCFs also reverse trims after regenotyping, as some alt. alleles are dropped (observed in real data).
2014-02-20 03:17:24 -05:00
Eric Banks 132e2429c8 Merge pull request #529 from broadinstitute/eb_fix_gvcf_writer_missing_blocks
Fixing a bug in the GVCF writer.
2014-02-20 02:37:32 -05:00
Eric Banks 53a7d5cbae Fixing a bug in the GVCF writer.
The writer was never resetting the pointer to the end of the last non-ref VariantContext that it saw.
This was fine except when it jumped to a new contig - and a lower position on that contig - where it
thought that it was still part of that previous non-ref VariantContext so wouldn't emit a reference
block.  Therefore, ref blocks were missing from the beginnings of all chromosomes (except chr1).

Added unit test to cover this case.
2014-02-20 02:33:43 -05:00
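The bug described in commit 53a7d5cbae comes down to stale per-contig state. A minimal Python sketch of the corrected bookkeeping is shown here; it is not the GVCF writer's actual code, and the class and method names are hypothetical.

```python
class RefBlockTracker:
    """Tracks the end of the last non-ref variant seen on the current contig."""

    def __init__(self):
        self.contig = None
        self.non_ref_end = -1

    def _move_to(self, contig):
        # The fix: forget the previous non-ref end when the contig changes.
        if contig != self.contig:
            self.contig = contig
            self.non_ref_end = -1

    def add_non_ref(self, contig, end):
        """Record a non-ref variant that ends at position end."""
        self._move_to(contig)
        self.non_ref_end = max(self.non_ref_end, end)

    def should_emit_ref(self, contig, pos):
        """True if pos lies past the last non-ref variant on this contig."""
        self._move_to(contig)
        return pos > self.non_ref_end
```

Without the reset in `_move_to`, a low position on a new contig would compare against the end position left over from the previous contig and the reference block would be suppressed.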
Eric Banks 235f0c6fa0 Merge pull request #528 from broadinstitute/eb_fix_cat_variants_usage_message
Fix the usage message for CatVariants to make it accurate.
2014-02-19 22:45:22 -05:00
Eric Banks 341d1bf2dd Fix the usage message for CatVariants to make it accurate.
It just hit a user on our forum...
2014-02-19 20:42:08 -05:00
Valentin Ruano Rubio 6edebcb4ce Merge pull request #526 from broadinstitute/vrr_genotype_gvcf_untrimmed_alleles
Fixing GenotypeGVCFs.
2014-02-19 19:22:48 -05:00
Valentin Ruano-Rubio c167fb5fdf Fixing GenotypeGVCFs.
Bug uncovered by some untrimmed alleles in the single sample pipeline output.

Note, however, that this does not fix the untrimmed alleles in general.

Story:

https://www.pivotaltracker.com/story/show/65481104

Changes:

1. Fixed the bug itself.
2. Fixed non-working tests (silently skipped due to exception in dataProvider).
2014-02-19 14:20:39 -05:00
droazen 6963bf6c91 Merge pull request #527 from broadinstitute/dr_update_internal_build_script_for_maven
Update script to release jars for internal use for maven, and add Queue jar
2014-02-18 16:16:17 -05:00
David Roazen a3110b17a7 Update script to release jars for internal use for maven, and add Queue jar
This script publishes GATK/Queue jars for internal GSA use to the following locations
whenever tests pass:

/humgen/gsa-hpprojects/GATK/private_unstable_builds/GenomeAnalysisTK_latest_unstable.jar
/humgen/gsa-hpprojects/Queue/private_unstable_builds/Queue_latest_unstable.jar

These jars include private code, and so are for internal use only.
2014-02-18 15:54:20 -05:00
Eric Banks 95c85b8105 Merge pull request #523 from broadinstitute/rp_random_forest_vqsr
Initial commit of the random forest classifier.
2014-02-17 15:07:34 -05:00
Ryan Poplin 43c20264b0 Initial commit of the random forest classifier.
2014-02-17 13:07:27 -05:00
kshakir cb2d937a34 Merge pull request #525 from broadinstitute/ks_external_test_fix
Fixed build bug in ./ant-bridge.sh unittest -Dsingle=...
2014-02-15 16:17:26 +08:00
Khalid Shakir a505db79f5 Fixed build bug in ./ant-bridge.sh unittest -Dsingle=..., due to external-example.
pipeline.run property no longer required to be passed by test executor.
2014-02-15 13:52:20 +08:00
droazen 688792c5b0 Merge pull request #520 from broadinstitute/jt_fix_failing_tests_post_maven
Fix for the Array Out of Bounds test error
2014-02-14 14:02:17 -05:00
droazen 1e82f117ad Merge pull request #518 from broadinstitute/ks_skashin_gatkdocs_arguments
Ks skashin gatkdocs arguments
2014-02-14 13:57:19 -05:00
Eric Banks f6022a944b Merge pull request #513 from broadinstitute/eb_clean_up_genotype_posteriors
Various small fixes for CalculateGenotypePosteriors based on feedback fr...
2014-02-14 13:50:46 -05:00
Eric Banks 3724d4e5f3 Various small fixes for CalculateGenotypePosteriors based on feedback from guys in Ben Neale's group.
Note that this tool is still a work in progress and very experimental, so isn't 100% stable.  Most of
the features are untested (both by people and by unit/integration tests) because Chris Hartl implemented
it right before he left, and we're going to need to add tests at some point soon.  I added a first
integration test in this commit, but it's just a start.

The fixes include:

1. Stop having the genotyping code strip out AD values.  It doesn't make sense that it should do this so
I don't know why it was doing that at all.
Updated GenotypeGVCFs so that it doesn't need to manually recover them anymore.
This also helps CalculateGenotypePosteriors which was losing the AD values.
Updated code in LeftAlignAndTrimVariants to strip out PLs and AD, since it wasn't doing that before.
Updated the integration test for that walker to include such data.

2. Chris was calling Math.pow directly on the normalized posteriors which isn't safe.
Instead, the normalization routine itself can revert back to log scale in a safe manner so let's use it.
Also, renamed the variable to posteriorProbabilities (and not likelihoods).

3. Have CGP update the AC/AF/AN counts after fixing GTs.
2014-02-14 13:48:14 -05:00
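Item 2 of commit 3724d4e5f3 concerns exponentiating log-scaled values safely. The commit message does not show the normalization routine it refers to, so the following Python sketch only illustrates the general technique (subtract the maximum before exponentiating), assuming log10-scaled inputs. The function names are hypothetical.

```python
import math


def normalize_from_log10(log10_values):
    """Convert log10-scaled values to probabilities that sum to 1."""
    max_value = max(log10_values)
    # Shifting by the maximum keeps the largest term at 10**0 == 1.0,
    # so the sum cannot underflow to zero.
    shifted = [10.0 ** (v - max_value) for v in log10_values]
    total = sum(shifted)
    return [s / total for s in shifted]


def normalize_log10(log10_values):
    """Same normalization, with the result kept in log10 space."""
    max_value = max(log10_values)
    log10_total = max_value + math.log10(
        sum(10.0 ** (v - max_value) for v in log10_values)
    )
    return [v - log10_total for v in log10_values]
```

Raising 10 to a very negative log10 value directly underflows to 0.0 in double precision, which is why the shift is applied first.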
kshakir 8b136d53b9 Merge pull request #524 from broadinstitute/ks_symlink_bin_jar
Create symlinks target/GenomeAnalysisTK.jar and target/Queue.jar
2014-02-15 02:32:59 +08:00
Khalid Shakir c64131e9fd Updated nightly tar of gatkdocs based on comments by droazen.
2014-02-15 02:27:53 +08:00