fromer
798955b006
After discussing with Mark, revert to "Master merging" of phase information from VCFs. This has the advantage of creating minimal phased VCFs from RBP, from which phase info is merged into the original "master VCF". Also, updated Genotype.sameGenotype() to be simpler and NOT REVERSE the ignorePhase flag in comparing Allele lists/sets
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5167 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 19:50:15 +00:00
kiran
dac83d21bc
Fixes for IndelLengthHistogram for someone on GS. This evaluator apparently doesn't have an integration test. I'll fix that tonight.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5166 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 19:48:09 +00:00
hanna
06b63d8336
Pulled out CpG stratification in test results at Kiran's suggestion.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5165 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 18:36:09 +00:00
hanna
25f045cac6
Changing locking errors to warnings. This will hopefully allow us to diagnose
the mysterious failure in STING_INTEGRATION-3832, the next time it appears.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5164 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 16:29:31 +00:00
hanna
91297c138b
Update VCFStreamingIntegrationTest to use new variant eval command-line
arguments, output format.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5162 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 15:40:43 +00:00
hanna
7d89ce820b
Got tired of waiting for Kiran to fix the build: updated NewVariantEval ->
VariantEval.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5161 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 15:32:39 +00:00
hanna
96241c6637
More testng fallout: fixing another seemingly 'random' issue arising from an
alternate test ordering.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5160 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 15:25:50 +00:00
chartl
e5e65ecfbe
Bugfix for GetSatisfaction: ensure that the two statistics objects (the map, and the pair) are actually pointers to the very same object.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5156 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 06:40:42 +00:00
ebanks
34f5587f2c
As with the cleaner, don't exception out when trying to get the GATK version after -Ddisable.help=true
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5155 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 06:08:59 +00:00
ebanks
4243f0dea7
1) Fix for Tim et al: HashMaps don't necessarily return objects in a deterministic fashion when keys are pointers; break it apart into a list.
2) Fix for Kiran: when running with -Ddisable.help=true, don't exception out when trying to get the GATK version.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5154 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 06:06:48 +00:00
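The first fix in 4243f0dea7 can be illustrated with a minimal sketch. It assumes a key type that does not override hashCode(), so a HashMap would order entries by identity hash, which may differ between runs. Every class and method name below is hypothetical and not taken from the GATK source; the sketch only shows the general idea of keeping the pairs in a list so that iteration follows insertion order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; none of these names come from the GATK source.
class DeterministicOrderExample {
    // A key that does not override hashCode()/equals(). A HashMap would bucket
    // it by identity hash, so iteration order could change from run to run.
    static final class Key {
        final String label;
        Key(String label) { this.label = label; }
    }

    // A key/value pair stored in a list instead of a HashMap.
    static final class Entry {
        final Key key;
        final int value;
        Entry(Key key, int value) { this.key = key; this.value = value; }
    }

    // Returns the labels in insertion order, which is identical on every run.
    static List<String> labelsInOrder(List<Entry> entries) {
        List<String> labels = new ArrayList<>();
        for (Entry e : entries) {
            labels.add(e.key.label);
        }
        return labels;
    }

    public static void main(String[] args) {
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry(new Key("first"), 1));
        entries.add(new Entry(new Key("second"), 2));
        System.out.println(labelsInOrder(entries));
    }
}
```

A LinkedHashMap would be another way to get a stable order while keeping map lookups; the list form is used here because the commit message mentions breaking the map apart into a list.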
kshakir
e74f28ad89
If there's an LSF queue maximum time limit set and the user hasn't specified one for this job, pass on the queue defined maximum limit with the job.
Updated LibBatIntegrationTest to use proper networked temp directory accessible by local machines and nodes.
Disabling the FCPTest until the VE3 is incorporated into the fullCallingPipeline.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5151 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-31 23:13:09 +00:00
hanna
391f248640
Inserted a dangerous (but hidden) command-line argument for use by the Picard team.
Used to process intervals over BAMs without indices. Tim understands the risks but
wants this anyway, as a temporary solution to a pipeline problem.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5148 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-31 22:10:06 +00:00
kiran
4cb910bc38
Fixed import statements.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5145 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-31 19:26:37 +00:00
kshakir
d4f744a4d4
Checking if the interval files exist before using them to calculate the minimum scatter parts.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5143 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-31 18:07:34 +00:00
kiran
b7aac3b846
Corrected import statement to reflect VE3's new position in core.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5142 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-31 18:01:02 +00:00
kiran
3f387bc8d8
Transitioned over to VE3 architecture.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5141 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-31 17:54:18 +00:00
kiran
401feca90d
Updates to VariantEval 3.0 integration test.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5140 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-31 17:45:06 +00:00
kiran
cab426f86f
VariantEval 3.0 is now in core.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5139 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-31 17:42:08 +00:00
fromer
c59b2a8296
Removed experimental "master merging" from CombineVariants
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5138 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-31 17:13:05 +00:00
kiran
b0432ee1e2
First part of a two-stage commit. Removing old VariantEval to make room for VariantEval 3.0 in core.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5137 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-31 17:03:41 +00:00
ebanks
d406d9b3fc
There's no reason to special case no-calls if they already have PLs associated with them. Just use the PLs!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5136 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-31 15:05:45 +00:00
kiran
83dcca7e82
Added ability to load a GATKReport from disk.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5134 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-31 05:31:49 +00:00
hanna
5e7a5cf924
Quick fix for Danny Lieber: flesh out the additional functionality required
to align to a reference other than what's specified in the header.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5133 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-31 05:28:37 +00:00
depristo
b5d1aab8dc
Scripts to create the GATK IAM user and give him/her rights to PutObject (and only PutObject) into the S3 storage instance. Updated the GATKRunReport to now upload using the GATK user, not mark@depristo.com. Running with -et AWS_S3 sends run reports up to the Amazon S3 cloud now. Going to request that a few external users try this option so we can see it running at scale. I'm sure S3 can handle a few hundred thousand 1 KB uploads per day, though.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5132 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-31 03:48:33 +00:00
kiran
e26da9b047
Changed column-key names to not have spaces, as GATKReport gets very upset about this.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5131 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-31 03:31:54 +00:00
depristo
197c91e2fb
Working implementation of GATKRunReport POSTing to Amazon Web Services S3 storage. Requires users to explicitly provide the secret key to do the upload. Am investigating options to avoid having to do this in the future. Pretty cool little experiment for those who are interested in S3 interaction (extremely trivial)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5130 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-30 21:23:54 +00:00
depristo
8640ca6278
Trivial bug fix so that we don't print the TraversalEngine start-up banner twice when we only process a single locus
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5129 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-30 21:22:16 +00:00
kshakir
2ef66af903
Moved the maximum number of intervals check from FCP to the Queue core so that scatter gather will no longer blow up if you specify a scatter count that is too high.
Moved the BamListWriter from FCP to ListWriterFunction in the Queue core.
Added an ExampleCountLoci QScript along with an example pipeline integration test which checks MD5s.
Added a few more utility methods to PipelineTest including a currentGATK variable that points to the GATK jar.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5121 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-28 23:33:58 +00:00
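The maximum-intervals check described in 2ef66af903 might look something like the sketch below. It assumes that a requested scatter count larger than the number of available intervals is simply capped; the commit message does not say whether Queue caps the value or handles it some other way, and the class and method names are hypothetical.

```java
// Hypothetical sketch; the names and the capping behavior are assumptions,
// not the Queue implementation.
class ScatterCountExample {
    // Returns the number of scatter parts to actually create: never more
    // parts than there are intervals to distribute among them.
    static int effectiveScatterCount(int requested, int intervalCount) {
        if (requested < 1 || intervalCount < 1) {
            throw new IllegalArgumentException("counts must be positive");
        }
        return Math.min(requested, intervalCount);
    }

    public static void main(String[] args) {
        // Asking for 50 parts over 12 intervals yields 12 parts.
        System.out.println(effectiveScatterCount(50, 12));
    }
}
```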
asivache
04d66a7d0d
Updated the integration test's MD5s to reflect the fact that assay sequences were previously designed incorrectly for indels; the bug is now fixed.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5120 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-28 23:00:22 +00:00
scalvo
5934b9cb82
Augment function isChrM by allowing "CRS" in addition to "chrM" or "MT", as a standard contig name indicating the mitochondrial chromosome. CRS stands for Cambridge Reference Sequence and is the standard in the field.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5119 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-28 22:45:45 +00:00
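The isChrM change in 5934b9cb82 amounts to a membership test over the accepted contig names. The sketch below assumes exact, case-sensitive matching and a standalone helper class; both are assumptions, since the commit message only lists the three names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; only the three contig names come from the commit message.
class MitoContigExample {
    private static final Set<String> MITO_NAMES =
            new HashSet<>(Arrays.asList("chrM", "MT", "CRS"));

    // True if the contig name is one of the accepted mitochondrial names.
    // Matching is case-sensitive here, which is an assumption.
    static boolean isChrM(String contig) {
        return contig != null && MITO_NAMES.contains(contig);
    }

    public static void main(String[] args) {
        System.out.println(isChrM("CRS"));
        System.out.println(isChrM("chr1"));
    }
}
```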
asivache
7af0532292
An attempt to have more intelligent sorting of RODs. Tested with maf only so far. Should be able to reference-sort dbsnp, bed and vcf as well, bugs notwithstanding. Very simple, brute-force implementation using SortingCollection. Should I have used tribble indexing machinery instead?
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5118 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-28 22:10:07 +00:00
asivache
fa8963522b
Ignore header line if it happens to be passed to the codec again, instead of crashing on it
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5116 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-28 21:44:33 +00:00
asivache
8d389e149f
Now can deal with input files that contain multiple copies of the same event. Only one assay sequence will be designed for each distinct variant; redundant variants will be discarded. Redundancy is defined as same start, same variant type, same ref and alt alleles (it does not matter, e.g., what the sample was, as we do not record sample information anywhere).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5115 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-28 21:42:29 +00:00
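The redundancy rule stated in 8d389e149f (same start, same variant type, same ref and alt alleles) can be sketched as a first-seen filter over a composite key. The Variant class, its fields, and the string key format are hypothetical; the key uses only the four attributes named in the commit message, and the sample field is deliberately left out of it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the assay-design code itself.
class VariantDedupExample {
    static final class Variant {
        final int start;
        final String type;   // e.g. "SNP", "INS", "DEL"
        final String ref;
        final String alt;
        final String sample; // ignored when deciding redundancy

        Variant(int start, String type, String ref, String alt, String sample) {
            this.start = start;
            this.type = type;
            this.ref = ref;
            this.alt = alt;
            this.sample = sample;
        }

        // Two variants with the same key are treated as redundant.
        String redundancyKey() {
            return start + "|" + type + "|" + ref + "|" + alt;
        }
    }

    // Keeps the first occurrence of each distinct variant, in input order.
    static List<Variant> distinct(List<Variant> input) {
        Map<String, Variant> firstSeen = new LinkedHashMap<>();
        for (Variant v : input) {
            firstSeen.putIfAbsent(v.redundancyKey(), v);
        }
        return new ArrayList<>(firstSeen.values());
    }

    public static void main(String[] args) {
        List<Variant> in = new ArrayList<>();
        in.add(new Variant(100, "SNP", "A", "G", "sampleA"));
        in.add(new Variant(100, "SNP", "A", "G", "sampleB")); // redundant
        in.add(new Variant(100, "SNP", "A", "T", "sampleA")); // different alt
        System.out.println(distinct(in).size());
    }
}
```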
fromer
f2de39d661
Calculates phase concordance rates between trio and RBP-phasing tracks, stratified by trio status (Het3, non-Het3)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5114 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-28 20:50:01 +00:00
fromer
ffd5f407a5
Retain only a single walker to perform calculation of haplotype extents
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5110 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-28 18:33:32 +00:00
depristo
2182b8c7e2
Better query start / stop function that directly parses the cigar string, unlike the previous version. Now properly handles H (hard-clipped) reads. Added -baq OFF and -baq RECALCULATE integration tests on all three 1KG technologies. Please let me know if this new code somehow fails.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5108 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-28 15:08:21 +00:00
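As a rough illustration of what parsing the CIGAR string directly can mean for 2182b8c7e2, the sketch below derives the range of read bases that are not clipped. It relies on the SAM convention that S consumes read bases while H, P, D and N do not; the class name, the half-open return convention, and the exact definition of "query start / stop" are assumptions and may differ from the actual GATK function.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; this is not the GATK implementation.
class CigarQueryRangeExample {
    private static final Pattern CIGAR_OP = Pattern.compile("(\\d+)([MIDNSHP=X])");

    // Returns {start, stop} as a half-open range [start, stop) in read-base
    // coordinates, spanning the first through last non-clipped read base.
    // Returns {-1, -1} if no operator contributes unclipped read bases.
    static int[] queryRange(String cigar) {
        int readPos = 0;
        int start = -1;
        int stop = -1;
        Matcher m = CIGAR_OP.matcher(cigar);
        while (m.find()) {
            int len = Integer.parseInt(m.group(1));
            char op = m.group(2).charAt(0);
            switch (op) {
                case 'S':
                    // Soft clips are present in the read but excluded from the range.
                    readPos += len;
                    break;
                case 'M':
                case 'I':
                case '=':
                case 'X':
                    // These operators consume unclipped read bases.
                    if (start < 0) {
                        start = readPos;
                    }
                    readPos += len;
                    stop = readPos;
                    break;
                default:
                    // H, P, D and N consume no read bases, so they are skipped.
                    break;
            }
        }
        return new int[] {start, stop};
    }

    public static void main(String[] args) {
        int[] r = queryRange("2H3S10M2I5M4S1H");
        System.out.println(r[0] + " " + r[1]);
    }
}
```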
kiran
9cb1ae384c
Constant precision for floating point numbers. Added integration test - carries over tests from VariantEval with the necessary modifications to command-line arguments and md5s. Disabled use of 'synchronized' keyword because I clearly don't get how that keyword is supposed to work yet...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5107 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-28 05:19:18 +00:00
depristo
f29bb0639b
Documentation and cleanup of the distributed GATK implementation. Detailed documentation -- given that Matt will be extending the system in the near future -- about how the locking and processing trackers work. Added error trapping to note that distributed, shared-memory parallelism isn't yet implemented, instead of just not working silently. General utility function for the analysis of distributedGATK operation in the analysis directory
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5106 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-28 03:40:09 +00:00
asivache
f036a178f1
Added support for MAF features. So far works for MAF Lite only, annotated MAF is NOT TESTED yet AT ALL.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5105 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-28 03:20:46 +00:00
fromer
91e4bb0285
Added walker to calculate haplotype lengths for ALL fragments produced by stitching together phased sites (actually, stitching together everything BUT unphased het sites)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5104 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-28 03:09:20 +00:00
asivache
ac3fd567b4
Ugly off-by-one error fixed in building design sequences for indels: the event position is immediately *before* the event, so the ref base at the current locus is the base immediately *before* the [ref/alt] element
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5103 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-28 02:53:03 +00:00
kiran
3e9f185dad
Fixed issue with GenotypeConcordance being initialized incorrectly when the first seen comptrack had no samples.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5102 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-28 01:12:27 +00:00
kiran
58f0ecff89
Fixes to support evaluations with TableType elements - each such object now gets a separate entry in the output table. Added codon degeneracy stratification. Handle null elements in reports (useful for debugging).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5101 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-27 22:09:59 +00:00
hanna
a264b16358
Patch from Brett (with minor tweaking by me) to expose all the relationships
of a particular sample in hash format. Thanks, Brett!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5100 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-27 21:46:13 +00:00
fromer
9c728979cc
In order to calculate haplotype lengths of trio+RBP, I implemented a simple trio phaser as an option to ComparePhasingToTrioPhasingNoRecombination, which already decides if the trio could theoretically phase
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5099 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-27 20:17:48 +00:00
depristo
5ed128f839
Slightly more tolerant timing setting. Main() method in GenomeLocProcessTracker to generate timing data for trackers.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5097 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-27 15:16:07 +00:00
depristo
61c29d550d
Fix for NullPointer where a run starts but there's nothing to do (no shards) and reduceInit() wasn't being called correctly
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5096 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-27 15:15:10 +00:00
depristo
f522eb2848
Previous tests were just too big...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5095 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-27 13:48:38 +00:00
kiran
2901299ff6
Sets the number of samples to all of the samples in the file when it's not specified on the command-line explicitly. GenotypeConcordance no longer a standard evaluation.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5094 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-27 01:38:26 +00:00
hanna
4a33cdacde
Some basic integration tests detecting breakage in OTF BAM index generation.
Doing it manually for the moment so that there's at least something testing
this capability; will follow up eventually with Mark to see whether we can
shape the VCF index generation code in such a way that it supports BAM index
testing as well.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5093 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-26 23:48:04 +00:00