Commit Graph

5164 Commits (1f820d50263eac7e97746b43ffe67bdfbfadeb3c)

Author SHA1 Message Date
kiran 1f820d5026 Added two files from some refactoring changes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5205 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-06 19:20:12 +00:00
kiran 1085bbf303 Fixed issue where all comp tracks were being treated as known tracks. Fixed issue where multiple JEXL expressions were causing an exception because the underlying object did not implement the Comparable interface. Fixed issue where variants being compared to the known track were not being checked for equality of variation type. Fixed issue where functional annotations were not being iterated over properly. Refactored a lot of helper methods into a separate VariantEvalUtils utility class. Significantly expanded the test suite using a small VCF with SNPs, indels, and non-variant loci which makes it much easier to see what the proper answer should be, and included the appropriate grep and awk commands in the comments to confirm the values.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5204 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-06 19:19:20 +00:00
depristo c4707631e2 MethodsDevelopmentPipeline is now the test bed for large scale AWS_S3 logging. Can be disabled from command line if this is necessary
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5203 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-06 17:03:45 +00:00
fromer 8b8b4fced1 Removed explicit memoryLimit, so that memLimit given on the command-line will NOT be ignored...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5202 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-06 01:55:17 +00:00
kshakir cc5d695bcf Renamed the IPFL Test to IPFL PipelineTest so that it'll be picked up by the PipelineTests.
HACK: Turned off JNA autoRead() in the jobInfoEnt LSF structure to try and dodge the SIGSEGV during strlen calls during bmods. 


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5201 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-05 00:06:12 +00:00
depristo ce51ffb56e Oops, old local paths committed on accident.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5200 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-04 23:35:56 +00:00
depristo 29f3ad72f3 SAMFileWriter that allows the user to move reads, but only a bit, in an incoming coordinate-sorted BAM file. Does some local reordering and local mate fixing, under specified constraints. These constraints allow us to make a special model of the realigner (under testing for Eric, who promised to try this out a bit and expand test cases and integration tests, but soon to be the default and only model) that only moves reads with ISIZE < 3000 and directly emits a coordinate-sorted, mate-fixed, validating BAM file without needing FixMates externally. Preliminary testing shows this runs in a totally fine amount of memory and produces equivalent results to the previous version.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5199 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-04 22:27:05 +00:00
depristo 11ea321b39 Trivial header cleanup
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5198 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-04 22:23:15 +00:00
depristo fe4aa58d35 Removing unused class
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5197 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-04 22:22:28 +00:00
depristo 0ad1ea4aa1 Fixed Umapped misspelling
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5196 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-04 22:21:41 +00:00
asivache 03f265d8bd Change DP format field description in the header line (expected count=1)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5195 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-04 21:28:25 +00:00
fromer 4cdc974c5f Preliminary Qscript to run DoC for the purpose of CNV detection
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5194 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-04 21:25:59 +00:00
asivache c0e998621c Computes two format (genotype) level annotations: total read depth in the given sample (DP format field) and fraction of reads supporting alt allele(s) in the given sample (FA format field)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5193 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-04 21:23:55 +00:00
asivache 8700b74640 Now annotates indels as well. Probably can also annotate a mixed VCF with indels + SNPs, but not tested in that mode...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5192 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-04 20:28:03 +00:00
corin cd6ace1b47 Includes UG version of indel genotyping rather than IGV2
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5191 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-04 20:25:46 +00:00
chartl bfc6ef1753 A successful attempt at a queue integration test, ensuring that the InProcessFunction libraries are working as expected.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5190 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-03 21:30:35 +00:00
carneiro 358a400474 made ApplyVariantCut a default part of the pipeline, added the -noCut option if you don't want to use it.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5189 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-03 19:29:36 +00:00
corin ce2866122d Calls the bams pulled down from firehose cleaned by default
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5188 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-03 18:34:07 +00:00
corin a22ea53665 Updated template for the MPG pipeline's queue script runner.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5187 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-03 18:33:29 +00:00
corin 9fc45e1234 Use the yaml as an argument to get out squid numbers
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5186 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-03 18:32:05 +00:00
kshakir b1ff371c8f Peek inside the StingText.properties to make sure the version in the properties matches the build version.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5185 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-03 18:07:55 +00:00
hanna 5c3198520c A few minor modifications masquerading as significant changes according to
svn's logs:
- Copied BAM indexing engine from Picard back into the GATK anticipating
  shard merging algorithm.  Tried to leave most of the building blocks in
  Picard.  If this turns into a logistical nightmare, I'll merge the building
  blocks into the GATK as well.
- Reorganized the org.broadinstitute.sting.gatk.datasources package, giving
  better separation of query and management functionality for reads, ref, rmd,
  and samples.  
- Merged Shard building blocks into org.broadinstitute.sting.gatk.datasources.
  reads package, indicating its current strong relationship with the reads,
  rather than the general unifying element I wish this would be.
- Collapsed BAMFormatAwareShard into Shard.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5184 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-03 17:59:19 +00:00
carneiro 7af003666d added optional argument -cut to apply the variant cut to the ts recalibrated vcf.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5183 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-03 17:34:40 +00:00
chartl 5398cf620a Bug fixes in the in process function (spoiled by python: was not closing my writers). SortByRef now works somewhat like the perl script does, rather than doing a memory-expensive sort. Adding a QTools qscript which is kinda clunky, and will be used mostly for integration tests of these IPFs, pending some better way to construct argument collections and function accessors at compile-time.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5182 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-03 17:32:46 +00:00
kiran 9ddc95c833 NewEvaluationContext needs to be generated in the inner loop. Otherwise, multiple comp tracks end up getting routed to the same row of the output table. Added test to cover multiple comp tracks.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5181 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-03 07:04:53 +00:00
chartl a9d0921529 That variable name could only lead to trouble.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5180 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-03 05:03:48 +00:00
chartl 9515f94242 Committing a simple merge IPF for use with qscripts (currently use a long grep, awk, pipe command, which can be unsafe and is hard to extend). Tests for all these functions coming soon. Also, IntelliJ + intermittent VPN connection = botched repository.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5179 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-03 05:01:21 +00:00
ebanks 918cc09477 Allow multiple records at a position
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5178 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-03 04:19:05 +00:00
kiran cb6454bf98 Multiple eval tracks should be bound with different names, rather than just 'eval'. Added tests to cover usage with multiple tracks.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5177 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-02 22:33:50 +00:00
carneiro cf15819db5 updated to work with the new VariantEval.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5176 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-02 17:46:07 +00:00
rpoplin 47357b726e Fixing import GenotypeCalculationModel since it doesn't exist anymore.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5175 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-02 15:39:43 +00:00
rpoplin 5a8e2c2739 Going through the backlog of emergency hacks I put in for the 1000G release. It is possible to call a site in an analysis panel but when using all samples the site isn't called because of going over the minimum deletion threshold, for example.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5174 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-02 15:12:26 +00:00
fromer 7605f0e6c1 Corrected input/output definitions for Queue
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5173 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-02 07:39:00 +00:00
fromer 3839fd1a25 Updated phasing pipeline to properly read samples from VCF and BAM files
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5172 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-02 07:16:05 +00:00
ebanks 43fb11b923 Removing stray non-ASCII character
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5171 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-02 03:10:08 +00:00
kiran 2732c839d4 Restored parallelism and associated tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5170 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-02 02:04:03 +00:00
kiran fd8dd8fb9b Fixed an issue where a no-call in the eval track would prevent a site from a comparison track from being loaded. Added a new test to cover the use case.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5169 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-02 01:47:53 +00:00
carneiro aab0ec209b small bug fix on chromosome names.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5168 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 20:55:19 +00:00
fromer 798955b006 After discussing with Mark, revert to "Master merging" of phase information from VCFs. This has the advantage of creating minimal phased VCFs from RBP, from which phase info is merged into the original "master VCF". Also, updated Genotype.sameGenotype() to be simpler and NOT REVERSE the ignorePhase flag in comparing Allele lists/sets
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5167 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 19:50:15 +00:00
kiran dac83d21bc Fixes for IndelLengthHistogram for someone on GS. This evaluator apparently doesn't have an integration test. I'll fix that tonight.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5166 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 19:48:09 +00:00
hanna 06b63d8336 Pulled out CpG stratification in test results at Kiran's suggestion.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5165 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 18:36:09 +00:00
hanna 25f045cac6 Changing locking errors to warnings. This will hopefully allow us to diagnose
the mysterious failure in STING_INTEGRATION-3832, the next time it appears.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5164 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 16:29:31 +00:00
corin 027c91871f Commenting out something I meant to leave commented out
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5163 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 15:47:52 +00:00
hanna 91297c138b Update VCFStreamingIntegrationTest to use new variant eval command-line
arguments, output format.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5162 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 15:40:43 +00:00
hanna 7d89ce820b Got tired of waiting for Kiran to fix the build: updated NewVariantEval ->
VariantEval.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5161 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 15:32:39 +00:00
hanna 96241c6637 More testng fallout: fixing another seemingly 'random' issue arising from an
alternate test ordering.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5160 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 15:25:50 +00:00
depristo e510798bc2 Missed one uncomment
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5159 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 13:01:59 +00:00
depristo d9532ecf53 Better run reporting structure. Now text report is attached as well as inline in the email, so you can easily view it in fixed-width fonts!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5158 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 12:58:48 +00:00
depristo 393df46055 updates to handle only reporting on a specific SVN revision. Updated the R script to show the domain name of the runner, now that S3 logging is working
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5157 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 12:02:12 +00:00
chartl e5e65ecfbe Bugfix for GetSatisfaction: ensure that the two statistics objects (the map, and the pair) are actually pointers to the very same object.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5156 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 06:40:42 +00:00