fromer
466f8f8a3c
Compares RBP phasing to a simple trio phasing model that can phase a child het iff both parental genotypes are known and at least one of them is not het [at EACH of the sites in the pair to be phased]
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5092 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-26 23:43:29 +00:00
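A minimal sketch of the simple trio phasing rule described in the commit message above. This is not the walker's actual code: the function names and the genotype encoding (a pair of allele strings, or `None` when a parent's genotype is unknown) are assumptions made for illustration, and Mendelian consistency checks are left out.

```python
from typing import Optional, Tuple

# Hypothetical encoding: the two alleles observed at one site, e.g. ("A", "G").
Genotype = Tuple[str, str]


def is_het(gt: Genotype) -> bool:
    """A genotype is heterozygous when its two alleles differ."""
    return gt[0] != gt[1]


def trio_can_phase_site(mother: Optional[Genotype],
                        father: Optional[Genotype]) -> bool:
    """True iff both parental genotypes are known and at least one is not het."""
    if mother is None or father is None:
        return False
    return not (is_het(mother) and is_het(father))


def trio_can_phase_pair(parents_site1: Tuple[Optional[Genotype], Optional[Genotype]],
                        parents_site2: Tuple[Optional[Genotype], Optional[Genotype]]) -> bool:
    """The rule must hold at EACH of the two sites in the pair to be phased."""
    return (trio_can_phase_site(*parents_site1)
            and trio_can_phase_site(*parents_site2))
```

Each argument to `trio_can_phase_pair` is a `(mother, father)` tuple for one site; the pair counts as trio-phasable only when both sites pass the per-site check.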
ebanks
68729045ca
Always best to use the left-aligned version of the dbsnp vcf
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5091 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-26 20:21:50 +00:00
asivache
43812a28fc
If among all the multiple alignments for the given read we have 'unmapped' ones (can happen with bwa 0.5.7 and maybe later versions), then discard the latter and keep only the mapped ones. Keep 'unmapped' only if it's the only alignment available.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5090 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-26 20:07:08 +00:00
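The selection rule in the commit message above can be sketched as follows. This is an illustration under stated assumptions, not the committed code: the `Alignment` record and the `filter_alignments` name are hypothetical, and the case of several unmapped records with no mapped one is handled here by returning them all unchanged.

```python
from typing import List, NamedTuple


class Alignment(NamedTuple):
    """Hypothetical stand-in for one alignment record of a read."""
    read_name: str
    is_unmapped: bool


def filter_alignments(alignments: List[Alignment]) -> List[Alignment]:
    """Drop unmapped records whenever at least one mapped record exists.

    If nothing is mapped, the input is returned as-is, so an unmapped
    record survives only when no mapped alternative is available.
    """
    mapped = [a for a in alignments if not a.is_unmapped]
    return mapped if mapped else list(alignments)
```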
asivache
63b709d992
When remapping the read, set MAPQ, CIGAR etc to 0/null for unmapped reads. This is not required according to spec but current samtools jdk otherwise dies in STRICT validation mode.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5089 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-26 19:49:07 +00:00
ebanks
d33162145b
Moving the --sites_only argument up into the VCFWriter itself so that any walkers that write VCFs can choose not to emit genotypes
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5088 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-26 19:38:16 +00:00
kiran
a97184fddf
Frick! Changed to refer to the *playground* version of VariantEvaluator.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5087 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-26 19:33:03 +00:00
corin
73e2942c62
Reformatted backdrop; removed the date
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5086 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-26 18:25:59 +00:00
kiran
a9d0772516
When evaluating JEXL expressions, don't blow up if the eval VC is null
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5085 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-26 18:25:03 +00:00
kiran
22e599ec76
Fixed output report to properly handle evaluation modules with TableType objects. Promoted CpG to a standard stratification. Demoted Filter to a non-standard stratification. Now, if the filter stratification is not specified, VariantEval only evaluates PASSing sites.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5084 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-26 17:38:21 +00:00
ebanks
2dcce58279
oneoffs walker to assess GLs at truth sites
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5083 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-26 14:59:05 +00:00
ebanks
dfc5a3d1f3
added integration test for --sites_only option
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5082 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-26 14:58:15 +00:00
ebanks
0429301536
Added ability to output just sites (no genotypes) from UG with the --sites_only argument. Note that we do still genotype in this mode so that the INFO annotations are identical, but we strip the genotypes out of the VC right before writing to output. In other words, this is not designed to make UG go faster; the point here is to allow downstream tools not to have to parse GTs if they don't want to. Here you go, Ryan.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5081 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-26 14:52:38 +00:00
ebanks
01e032e89c
Missorted BAMs are User Exceptions
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5080 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-26 14:09:39 +00:00
depristo
be697d96f9
An apparently robust implementation of the file locking for distributed computation, using Lucene's file creation locking approach. It is worth trying out for those with large-scale, high-cost data sets. Details and discussion at group meeting on Wednesday. Some cleanup still needed.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5079 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-26 13:45:40 +00:00
kshakir
df2e7bd355
Disabled FCPTest whilst we figure out where the C426 bams went.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5078 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-26 05:11:57 +00:00
hanna
862b299b47
Fix Picard OTF index generation issue.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5077 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-26 03:42:46 +00:00
kshakir
ce5b11317b
Moved some shutdown logic from the LSF job runner into the QGraph.
...
Because of Java's type erasure JobManagers must provide runtime access to the runner class to shutdown.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5076 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-25 20:28:54 +00:00
fromer
6ac888d26a
Correct accounting for cases where first het in interval is phased
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5075 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-25 19:48:54 +00:00
fromer
af79fa629f
PROPERLY print out list of intervals and their stats
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5074 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-25 19:20:36 +00:00
delangel
db2e2cb0ff
Another trivial change to make VQSR work with indels
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5073 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-25 19:05:31 +00:00
corin
b22f82d5dd
Minor formatting updates to deal with long bait names, multiple sequencer types, and date formatting
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5072 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-25 19:02:40 +00:00
fromer
17ba75e502
Can now print out list of intervals and their stats
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5071 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-25 18:36:59 +00:00
corin
32cdcc933c
A quick python script to give the status of the projects in the humgen/gsa-pipeline/ directory
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5070 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-25 15:21:50 +00:00
kshakir
b3c9b9bfbe
+1 file that should have been with the last checkin.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5069 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-25 05:31:17 +00:00
kshakir
9923e05e0a
Moved MD5 utils from WalkerTest to BaseTest for use by PipelineTests.
...
Moved VariantEval validation from FCPTest to PipelineTest.
Cleaned up some duplicate code for writing temp files during tests.
Moved FCPTest to playground namespace to match move for FCP.q.
Added a basic HelloWorldPipelineTest for the HelloWorld QScript.
Moved duplicated error handling from JobRunners into the FunctionEdge.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5068 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-25 04:11:49 +00:00
hanna
9db02059ac
Fix for Ryan's issue: reads ending with indel distort the location of the
...
pileup, resulting in two map() calls for the same locus (and no map call for
the locus immediately following).
Fixed bug and added comprehensive unit tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5067 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-24 19:49:39 +00:00
kshakir
76ee57639d
Updated FCPTest to match changes to UG in r5058.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5066 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-24 19:30:02 +00:00
depristo
7b92cd5008
Adding lucene dependency for file locking -- may be removed in the near future
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5065 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-24 18:59:42 +00:00
fromer
61fe409211
Basic walker to count the number of (phased) hets in each exome target
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5064 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-24 17:53:14 +00:00
depristo
c50f39a147
V3 of the distributed GATK. High-efficiency implementation. Support for status tracking for debugging and display. Still not safe for production use due to NFS filelock problem. V4 will use an alternative file locking mechanism.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5063 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-24 16:45:07 +00:00
delangel
fd864e8e3a
Minimal necessary (but most likely not sufficient) changes to run VQSR on indel data: don't fill Ti/Tv fields if non-SNP, request VC only at start of position, check if isSNP() before doing SNP-specific operations.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5062 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-24 02:36:36 +00:00
depristo
a51061fd96
Improved distributed processing analytics. Still not 100% ready for prime-time. More improvements incoming. Iterator claim now supports requests to obtain in a single atomic claim (one lock) multiple sequential shards, which radically reduces overhead. However, deadlocking is still possible...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5061 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-23 16:17:25 +00:00
ebanks
2d4bcb60a1
Don't print out alt alleles for ref calls
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5060 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-23 06:33:31 +00:00
ebanks
2ba35dc7ba
Bad chain files are user errors
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5059 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-23 06:04:36 +00:00
ebanks
2bbcc9275a
Committing the fragment-based calling code. Results look great in all datasets (will show this at 1000G this week with Ryan). Note that this is an intermediate commit. The code needs to be cleaned up and the fragmentation code needs to be moved up into LocusIteratorByState. This should all happen later this week, but I don't want Ryan to have to keep running from my own personal Sting directory. The current crappy implementation adds ~10% to the runtime, but that should all go away in the next iteration.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5058 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-23 05:04:17 +00:00
ebanks
bb6999b032
Better documentation
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5057 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-23 03:36:09 +00:00
corin
1dcdebbc9e
Updating the file path for proper inclusion of the background in the tearsheet.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5056 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-22 19:15:33 +00:00
depristo
c52d2d5f79
Bug fix for SimpleTimer that didn't always convert elapsed times from milliseconds to seconds
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5055 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-22 18:50:59 +00:00
depristo
ff61aeb762
continuing to push to get right answers for long-running jobs
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5054 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-22 15:02:02 +00:00
delangel
a50d7f74fa
Change to support plotting of indel quality as a function of covariates - for now, just call different R calling script.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5053 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-22 14:09:23 +00:00
delangel
fa0c476b82
Script for calling indels in all phase 1 samples - VQSR part still needs work but raw calling is done
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5052 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-22 14:07:10 +00:00
depristo
9b1b8d46aa
Performance tracking of GenomeLocProcessingTrackers, as well as a marker for where to put tracker in HierarchicalMicroScheduler
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5051 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-21 22:24:42 +00:00
rpoplin
95d6ddc38c
lastProgressPrintTime should only be updated when a progress log is printed, not when a performance log is printed
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5050 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-21 22:23:14 +00:00
depristo
8ece2b9230
Distributed GATK analysis scripts
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5049 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-21 22:09:07 +00:00
carneiro
a0731eaa81
updated NA12878 Trio gold standard data.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5048 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-21 18:48:31 +00:00
depristo
94b64ec54a
Moving scala script into analysis directory
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5047 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-21 18:42:18 +00:00
depristo
63e8103c4e
A new top-level directory to hold analysis scripts associated with specific analyses
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5046 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-21 18:40:02 +00:00
depristo
b45566760e
intermediate checkin
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5045 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-21 18:39:25 +00:00
kshakir
6fbd18c759
Cleaning up obsolete code.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5044 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-21 16:27:35 +00:00
kshakir
8d46cf3604
Testing a configuration change for build system.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5043 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-21 14:44:41 +00:00