12072 Commits (6b4d88ebe96d3383a0778c6f8a3bbf6bd88ccaee)

Author SHA1 Message Date
Geraldine Van der Auwera 6b4d88ebe9 Created ListAnnotations utility (extends CommandLineProgram)
--Refactored listAnnotations basic method out of VA into HelpUtils
	--HelpUtils.listAnnotations() is now called by both VA and the new ListAnnotations utility (lives in sting.tools)
	--This way we keep the VA --list option but we also offer a way to list annotations without a full valid VA command-line, which was a pain users continually complained about
	--We could get rid of the VA --list option altogether ...?
2013-03-20 06:15:27 -04:00
Geraldine Van der Auwera 95a9ed853d Made some documentation updates & fixes
--Mostly doc block tweaks
	--Added @DocumentedGATKFeature to some walkers that were undocumented because they were ending up in "uncategorized". Very important for GSA: if a walker is in public or protected, it HAS to be properly tagged-in. If it's not ready for the public, it should be in private.
2013-03-20 06:15:20 -04:00
Ryan Poplin c813259283 Merge pull request #119 from broadinstitute/md_assessn12878_bugfixes
AssessNA12878 bugfixes
2013-03-19 05:11:50 -07:00
David Roazen d4f873f664 Revert "github webhook handler: convert from daemon to cron job"
Turns out the email script doesn't work correctly from cron.
Converting the webhook script back to a daemon for now until
it can be made to work as a cron job.

This reverts commit 9679accb641537f5c637cce0aeb63f3925521b42.
2013-03-19 03:50:39 -04:00
David Roazen ff79118379 github webhook handler: convert from daemon to cron job
-having this as a daemon was annoying because we had to be sure to
 re-spawn the daemon whenever it got killed

-now it will be run as a cron job once per minute

-delete now-unnecessary spawn script
2013-03-19 02:47:13 -04:00
David Roazen f9ad8d4325 Merged bug fix from Stable into Unstable
Conflicts:
	private/gsa-engineering/pdfgen/trigger_pdfgen.sh
2013-03-19 01:23:58 -04:00
David Roazen 532efad8cd Release scripts: small changes to reduce intermittent failures
-don't check exit status of wget in the trigger_pdfgen script;
 it was exiting with non-0 status even though the pdf generation
 was being triggered correctly

-introduce a delay after filtering the git history to allow HEAD
 to be properly reset

-re-enable sanity checks in filter_stable and source_release scripts
 that had temporarily been disabled while the new protected repository
 was being set up
2013-03-19 01:09:30 -04:00
Mark DePristo d7bec9eb6e AssessNA12878 bugfixes
-- @Output isn't required for AssessNA12878
-- Previous version would count non-variant sites in NA12878 that resulted from subsetting a multi-sample VC to NA12878 as CALLED_BUT_NOT_IN_DB sites.  Now they are properly skipped.
-- Bugfix for subsetting samples to NA12878.  Previous version wouldn't trim the alleles when subsetting down a multi-sample VCF, so we'd have false FN/FP sites at indels when the multi-sample VCF has alleles that result in the subset for NA12878 having non-trimmed alleles.  Fixed and unit tested now.
2013-03-18 15:48:08 -04:00
Eric Banks a36e2b8f9d Merge pull request #118 from broadinstitute/ami-typoInCoveredByNSamplesSites
fix typos in argument docs in CoveredByNSamplesSites and rewrite an unac...
2013-03-18 11:10:10 -07:00
Ami Levy-Moonshine 0e9c1913ff fix typos in argument docs and in printed output in CoveredByNSamplesSites and rewrite an inaccurate comment 2013-03-18 13:54:21 -04:00
Mark DePristo 2b80068164 Merged bug fix from Stable into Unstable 2013-03-18 12:36:21 -04:00
Mark DePristo 7ab7c873a1 Temp. fix to PairHMM to avoid bad likelihoods
-- Simply caps PairHMM likelihoods at 0 by taking the min of the likelihood and 0.  Will be properly fixed in GATK 2.5 with a better PairHMM implementation.
2013-03-18 12:34:51 -04:00
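Note on 7ab7c873a1: the cap described in that commit amounts to a single min() call. The sketch below is illustrative only, not the actual PairHMM code. The class and method names are hypothetical, and it assumes the likelihood is on a log scale, where values above 0 are invalid.

```java
// Illustrative sketch only; LikelihoodCapSketch and capAtZero are hypothetical names.
final class LikelihoodCapSketch {
    private LikelihoodCapSketch() {}

    /** Returns the likelihood unchanged if it is <= 0, otherwise returns 0. */
    static double capAtZero(final double logLikelihood) {
        // Assumption: the value is log-scaled, so anything above 0 is out of range.
        return Math.min(logLikelihood, 0.0);
    }
}
```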
David Roazen a67d8c8dd6 Bump timeout for MaxRuntimeIntegrationTest
Looks like returning this timeout to its original value was a
bit too aggressive -- adding 40 seconds to the tolerance limit.
2013-03-17 16:17:29 -04:00
droazen a67aae0261 Merge pull request #114 from broadinstitute/dr_tweak_test_timeouts
Further tweaking of test timeouts
2013-03-15 15:43:55 -07:00
Mark DePristo d86a1242d1 Merge pull request #115 from broadinstitute/md_kb_unstable_server_GSA-778
NA12878 KB startup script takes full path to GATK.jar
2013-03-15 13:34:10 -07:00
Mark DePristo 2f27e5682a NA12878 KB startup script takes full path to GATK.jar 2013-03-15 16:33:29 -04:00
David Roazen 236eb54abd Trivial script to publish private unstable jars for group use
-Jars will get updated every time the "Serial Commit Tests" plan in
 Bamboo passes on the master branch

-Differs from the nightly builds in that it includes "private" and
 has actually passed the test suite

-latest jar is always located at:
 /humgen/gsa-hpprojects/GATK/private_unstable_builds/GenomeAnalysisTK_latest_unstable.jar
2013-03-15 16:00:59 -04:00
Mark DePristo 090db06793 Merge pull request #110 from broadinstitute/rp_fix_extending_partial_haplotype_bug_GSA-840
Bug fix in assembly for edge case in which the extendPartialHaplotype fu...
2013-03-15 11:53:31 -07:00
David Roazen 742a7651e9 Further tweaking of test timeouts
Increase one timeout, restore others that were only timing out due to the
Java crypto lib bug to their original values.

-DOUBLE timeout for NanoSchedulerUnitTest.testNanoSchedulerInLoop()

-REDUCE timeout for EngineFeaturesIntegrationTest to its original value

-REDUCE timeout for MaxRuntimeIntegrationTest to its original value

-REDUCE timeout for GATKRunReportUnitTest to its original value
2013-03-15 14:49:21 -04:00
droazen e681df68c9 Merge pull request #113 from broadinstitute/dr_parallel_tests_print_exited_classes
parallel tests: print names of test classes that had an error in real time
2013-03-15 11:41:40 -07:00
David Roazen 68c6ebd93f parallel tests: print names of test classes that had an error in real time 2013-03-15 14:28:20 -04:00
Ryan Poplin 0cf5d30dac Bug fix in assembly for edge case in which the extendPartialHaplotype function was filling in deletions in the middle of haplotypes. 2013-03-15 14:20:25 -04:00
droazen 9d6d1f94b0 Merge pull request #112 from broadinstitute/dr_parallel_tests_print_unfinished_classes
parallel tests: start printing the names of unfinished test classes once...
2013-03-15 10:57:59 -07:00
Mark DePristo 4a042e9bff Merge pull request #111 from broadinstitute/rp_no_ref_padding_bug_GSA-860
Fix for edge case bug of trying to create insertions/deletions on the ed...
2013-03-15 10:34:45 -07:00
David Roazen f42a52c090 parallel tests: start printing the names of unfinished test classes once there are < 10 jobs left
This will let us see in real time in Bamboo which classes are preventing
our runs from finishing
2013-03-15 13:34:30 -04:00
Ryan Poplin b8991f5e98 Fix for edge case bug of trying to create insertions/deletions on the edge of contigs.
-- Added integration test using MT that previously failed
2013-03-15 12:32:13 -04:00
David Roazen 0fd40dbde9 parallel tests: use experimental Class A storage
(We were previously using Class C storage)
2013-03-15 10:20:27 -04:00
Ryan Poplin daa0f8b551 Merge pull request #109 from broadinstitute/md_qd_fix_for_high_depth
QualityByDepth remaps QD values > 40 to a gaussian around 30
2013-03-15 07:05:32 -07:00
Mark DePristo 8317cc155e Merge pull request #108 from broadinstitute/eb_bqsr_out_of_bounds_fix
Added check in the MalformedReadFilter for reads without stored bases (i...
2013-03-14 17:29:35 -07:00
MauricioCarneiro 6f0269df2c Merge pull request #107 from broadinstitute/eb_fix_bqsr_clip_exception 2013-03-14 14:40:06 -07:00
Eric Banks 232afdcbea Added check in the MalformedReadFilter for reads without stored bases (i.e. that use '*').
* We now throw a User Error for such reads
  * User can override this to filter instead with --filter_bases_not_stored
  * Added appropriate unit test
2013-03-14 17:17:26 -04:00
Mark DePristo 2d35065238 QualityByDepth remaps QD values > 40 to a gaussian around 30
-- This is a temporary fix / hack to deal with the very high QD values that are generated by the haplotype caller when nearby events occur within reads.  In that case, the QUAL field can be many-fold higher than normal, resulting in an inflated QD value.  This hack projects such high QD values back into the good range (as these are good variants in general) so they aren't filtered away by VQSR.
-- The long-term solution to this problem is to move the HaplotypeCaller to the full bubble calling algorithm
-- Update md5s
2013-03-14 16:09:41 -04:00
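Note on 2d35065238: a rough sketch of the remapping idea, not the actual QualityByDepth code. The threshold of 40 and the mean of 30 come from the commit message. The standard deviation, the class name, and the method name are assumptions made up for illustration.

```java
import java.util.Random;

// Illustrative sketch only; names and ASSUMED_SD are hypothetical.
final class QdRemapSketch {
    static final double MAX_QD = 40.0;      // threshold named in the commit message
    static final double TARGET_MEAN = 30.0; // "a gaussian around 30"
    static final double ASSUMED_SD = 3.0;   // assumption: the commit does not state a spread

    private QdRemapSketch() {}

    /** Leaves QD values <= 40 alone; redraws larger ones from a Gaussian centered on 30. */
    static double remap(final double qd, final Random rng) {
        if (qd <= MAX_QD) {
            return qd;
        }
        return TARGET_MEAN + ASSUMED_SD * rng.nextGaussian();
    }
}
```

Passing in a seeded Random keeps the remapped values reproducible, which matters for any output that is checked against stored md5s.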
droazen 0fd9f0e77c Merge pull request #104 from broadinstitute/eb_fix_output_annotation_GSA-837
Fixed the logic of the @Output annotation and its interaction with 'required'
2013-03-14 12:52:00 -07:00
David Roazen c3b5f66386 run_parallel_tests: further attempts to work around git issues in bamboo 2013-03-14 15:35:55 -04:00
Mark DePristo 5d6faef50e Merge pull request #106 from broadinstitute/rp_unknown_sites_assess_as_tp_in_kb
Changing CALLED_IN_DB_UNKNOWN_STATUS to count as TRUE_POSITIVEs in the s...
2013-03-14 11:50:12 -07:00
Ryan Poplin 38914384d1 Changing CALLED_IN_DB_UNKNOWN_STATUS to count as TRUE_POSITIVEs in the simplified stats for AssessNA12878. 2013-03-14 14:44:18 -04:00
Eric Banks 6d6264b108 Merge pull request #105 from broadinstitute/gg_annotations_cleanup_45802765
Cleaned up annotations
2013-03-14 11:35:00 -07:00
delangel ec43112d28 Merge pull request #100 from broadinstitute/eb_maxIndelSize_SV_fix
Fixed bug in SelectVariants where maxIndelSize argument wasn't getting a...
2013-03-14 11:32:56 -07:00
Geraldine Van der Auwera 61349ecefa Cleaned up annotations
- Moved AverageAltAlleleLength, MappingQualityZeroFraction and TechnologyComposition to Private
  - VariantType, TransmissionDisequilibriumTest, MVLikelihoodRatio and GCContent are no longer Experimental
  - AlleleBalanceBySample, HardyWeinberg and HomopolymerRun are Experimental and available to users with a big bold caveat message
  - Refactored getMeanAltAlleleLength() out of AverageAltAlleleLength into GATKVariantContextUtils in order to make QualByDepth independent of where AverageAltAlleleLength lives
  - Unrelated change, bundled in for convenience: made HC argument includeUnmappedreads @Hidden
  - Removed unnecessary check in AverageAltAlleleLength
2013-03-14 14:26:48 -04:00
Eric Banks 7cab709a88 Fixed the logic of the @Output annotation and its interaction with 'required'.
ALL GATK DEVELOPERS PLEASE READ NOTES BELOW:

I have updated the @Output annotation to behave differently and to include a 'defaultToStdout' tag.
  * The 'defaultToStdout' tag lets walkers specify whether to default to stdout if -o is not provided.
  * The logic for @Output is now:
    * if required==true then -o MUST be provided or a User Error is generated.
    * if required==false and defaultToStdout==true then the output is assigned to stdout if no -o is provided.
      * this is the default behavior (i.e. @Output with no modifiers).
    * if required==false and defaultToStdout==false then the output object is null.
      * use this combination for truly optional outputs (e.g. the -badSites option in AssessNA12878).

  * I have updated walkers so that previous behavior has been maintained (as best I could).
    * In general, all @Outputs with default long/short names have required=false.
    * Walkers with nWayOut options must have required==false and defaultToStdout==false (I added checks for this)
  * I added unit tests for @Output changes with David's help (thanks!).
  * #resolve GSA-837
2013-03-14 11:58:51 -04:00
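Note on 7cab709a88: the three cases listed in that commit can be written as one small decision function. This is a sketch of the rules as stated in the message, not the real argument-parsing code. The class, enum, and method names are hypothetical, and a plain IllegalArgumentException stands in for the GATK User Error.

```java
// Illustrative sketch only; all names here are hypothetical.
final class OutputResolutionSketch {
    enum Resolution { USER_FILE, STDOUT, NULL_OUTPUT }

    private OutputResolutionSketch() {}

    /** Decides where an @Output field ends up, given its tags and whether -o was supplied. */
    static Resolution resolve(final boolean required,
                              final boolean defaultToStdout,
                              final boolean outputProvided) {
        if (outputProvided) {
            return Resolution.USER_FILE;   // -o given: always use it
        }
        if (required) {
            // required==true and no -o: stand-in for the User Error
            throw new IllegalArgumentException("User Error: -o must be provided");
        }
        // required==false: fall back to stdout, or leave the output object null
        return defaultToStdout ? Resolution.STDOUT : Resolution.NULL_OUTPUT;
    }
}
```

With no modifiers (required==false, defaultToStdout==true) and no -o, resolve returns STDOUT, matching the default behavior described above.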
Eric Banks 573ed07ad0 Fixed reported bug in BQSR for RNA seq alignments with Ns.
* ClippingOp updated to incorporate Ns in the hard clips.
  * ReadUtils.getReadCoordinateForReferenceCoordinate() updated to account for Ns.
  * Added test that covers the BQSR case we saw.
  * Created GSA-856 (for Mauricio) to add lots of tests to ReadUtils.
    * It will require refactoring code and is not in the scope of what I was willing to do to fix this.
2013-03-14 11:26:52 -04:00
David Roazen acaa96f853 parallel_tests: use a safer method to copy the working dir into an LSF-accessible location
-"git clone" was failing intermittently with disturbing error messages about
 missing certain files. Use cp -r instead.

-Add extra checks and steps to try to ensure we have a complete checkout
 with no missing files.
2013-03-14 11:23:56 -04:00
David Roazen be729410b9 run_parallel_tests: use independent java.io.tmpdir for each run
-Turns out the Java 6 JCE crypto library (used to decrypt our AWS keys)
 uses the current list of files in the java.io.tmpdir as a source of
 entropy. This file list operation was prohibitively slow with a large,
 shared temp directory.

-Starting with an independent, empty temp dir for each run should solve
 this problem, and get rid of all/most of the test timeouts we've been
 seeing.
2013-03-14 08:55:26 -04:00
Eric Banks ff87b62fe3 Fixed bug in SelectVariants where maxIndelSize argument wasn't getting applied to deletions.
Added unit tests and docs.
2013-03-13 15:11:34 -04:00
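Note on ff87b62fe3: the commit does not say how the size check was fixed. One way to make a size limit cover insertions and deletions alike is to compare the absolute length difference between the reference and alternate alleles, as sketched below. The names are hypothetical and this is not the SelectVariants implementation.

```java
// Illustrative sketch only; not the SelectVariants code.
final class IndelSizeSketch {
    private IndelSizeSketch() {}

    /** Size of an indel as the absolute length difference between ref and alt alleles. */
    static int indelSize(final String ref, final String alt) {
        return Math.abs(alt.length() - ref.length());
    }

    /** True if the indel is no larger than maxIndelSize, for insertions and deletions alike. */
    static boolean passesMaxIndelSize(final String ref, final String alt, final int maxIndelSize) {
        return indelSize(ref, alt) <= maxIndelSize;
    }
}
```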
Ryan Poplin 3b4dca1b94 Merge pull request #103 from broadinstitute/md_fragutils
Cleanup FragmentUtils; Add concept of strandless reads
2013-03-13 10:12:40 -07:00
Mark DePristo b5b63eaac7 New GATKSAMRecord concept of a strandless read, update to FS
-- Strandless GATK reads are ones that don't really have a meaningful strand value, such as Reduced Reads or fragment merged reads.  Added GATKSAMRecord support for such reads, along with unit tests
-- The merge overlapping fragments code in FragmentUtils now produces strandless merged fragments
-- FisherStrand annotation generalized to treat strandless as providing 1/2 the representative count for both strands.  This means that merged fragments are properly handled from the HC, so we don't hallucinate fake strand-bias just because we managed to merge a lot of reads together.
-- The previous getReducedCount() wouldn't work if a read was made into a reduced read after getReducedCount() had been called.  Added new GATKSAMRecord method setReducedCounts() that does the right thing.  Updated SlidingWindow and SyntheticRead to explicitly call this function, and so the readTag parameter is now gone.
-- Update MD5s for change to FS calculation.  Differences are just minor updates to the FS
2013-03-13 11:16:36 -04:00
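Note on b5b63eaac7: the half-count rule for strandless reads can be sketched as follows. This only illustrates the counting rule from the commit message. The enum, class, and method names are hypothetical, and using a double[] for the tallies is an assumption rather than what FisherStrand actually does.

```java
// Illustrative sketch only; names and the double[] representation are assumptions.
final class StrandCountSketch {
    enum Strand { FORWARD, REVERSE, STRANDLESS }

    private StrandCountSketch() {}

    /** Adds one read's contribution to counts, where counts[0] is forward and counts[1] is reverse. */
    static void addRead(final double[] counts, final Strand strand, final int representativeCount) {
        switch (strand) {
            case FORWARD:
                counts[0] += representativeCount;
                break;
            case REVERSE:
                counts[1] += representativeCount;
                break;
            case STRANDLESS:
                // A strandless read contributes half its count to each strand.
                counts[0] += representativeCount / 2.0;
                counts[1] += representativeCount / 2.0;
                break;
        }
    }
}
```

Because the strandless contribution is split evenly, a pile of merged fragments shifts both strand tallies by the same amount and so does not create strand bias by itself.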
Mark DePristo 925846c65f Cleanup of FragmentUtils
-- Code was undocumented, big, and not well tested.  All three things fixed.
-- Currently not passing, but the framework works well for testing
-- Added concat(byte[] ... arrays) to utils
2013-03-13 07:36:20 -04:00
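Note on 925846c65f: a varargs byte-array concat helper like the one mentioned in that commit might look like the sketch below. Only the signature concat(byte[] ... arrays) comes from the commit message. The body and the class name are assumptions, not the code that was added to utils.

```java
// Illustrative sketch only; the body is an assumption based on the signature in the commit message.
final class ConcatSketch {
    private ConcatSketch() {}

    /** Joins any number of byte arrays, in order, into one new array. */
    static byte[] concat(final byte[]... arrays) {
        int total = 0;
        for (final byte[] a : arrays) {
            total += a.length;
        }
        final byte[] result = new byte[total];
        int offset = 0;
        for (final byte[] a : arrays) {
            // Copy each input into the next free slice of the result.
            System.arraycopy(a, 0, result, offset, a.length);
            offset += a.length;
        }
        return result;
    }
}
```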
David Roazen 8ed78b453f Increase timeout for a test in the EngineFeaturesIntegrationTest
-This test was intermittently failing when run on the farm
2013-03-12 23:53:26 -04:00
David Roazen 3847de5290 run_parallel_tests: detect farm glitches
-add a function to detect the case where there were no ant test failures,
 but one or more jobs exited with an error
2013-03-12 23:26:33 -04:00
Mark DePristo c289103c7d Merge pull request #102 from broadinstitute/dr_parallel_test_runner_improvements
parallel test runner: support multiple kinds of tests per run, logging, ...
2013-03-12 18:04:55 -07:00