weisburd
3c24223d02
Script for concatenating 2 AnnotatorInputTables, and writing the result to standard out. Merge-sorts the 2 tables while concatenating them
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3505 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-08 22:44:16 +00:00
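The merge-sort concatenation described in commit 3c24223d02 could look roughly like the sketch below. It is not the committed script: the tab-separated layout, the `sort_key` columns (contig name, then integer position), and the function names are assumptions made for illustration, and header handling is left out.

```python
import heapq
import sys


def sort_key(line):
    """Hypothetical sort key: first column as text, second column as an integer."""
    fields = line.rstrip("\n").split("\t")
    return (fields[0], int(fields[1]))


def merge_sorted_tables(rows_a, rows_b, key=sort_key):
    """Lazily merge two already-sorted iterables of lines into one sorted stream."""
    return heapq.merge(rows_a, rows_b, key=key)


if __name__ == "__main__":
    # Made-up example rows; a real run would read the two table files instead.
    table_a = ["chr1\t100\tA\n", "chr1\t300\tC\n"]
    table_b = ["chr1\t200\tG\n", "chr2\t50\tT\n"]
    sys.stdout.writelines(merge_sorted_tables(table_a, table_b))
```

Because `heapq.merge` is lazy, neither table has to fit in memory as long as each input is already sorted by the same key.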
depristo
f32a32269c
minor change for eric
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3418 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-21 20:13:36 +00:00
depristo
d1098fa77b
Removing unnecessary dependences that were causing problems for Sendu
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3410 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-21 13:07:41 +00:00
depristo
886e9c1297
Now can take a VCF file as input
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3378 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-18 17:06:12 +00:00
depristo
2a212c497f
minor improvements and bug fixes
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3376 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-18 15:54:42 +00:00
depristo
43544cfdf9
remote control of IGV to jump to any number of loci in a file and screenshot the locus to a file
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3375 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-18 12:59:36 +00:00
weisburd
04e14ef85a
Refactored so it could be used for knownGene and CCDS as well as refGene
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3373 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-18 02:45:11 +00:00
depristo
2a803e9044
simple tool that takes two dbSNP files and subsets the second to only include rsID SNPs present in the first. Used to make b129 against b37 by subsetting b131/b37 vs. b129/b36
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3352 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-12 13:39:09 +00:00
depristo
d3c33d4b3f
more powerful management routines for my pipeline
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3351 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-12 13:37:39 +00:00
weisburd
f120a00433
Fixed bug so that the strand, alternate, and reference columns are now moved correctly
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3342 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-10 17:59:39 +00:00
depristo
d6b036cdab
Minor improvements to simple python code
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3330 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-07 21:34:46 +00:00
chartl
d5b675b3e6
Added - Q&D script to gather verbose bed files to a VCF.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3294 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-05 02:49:16 +00:00
weisburd
a462b5e1e7
Changed a default path
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3291 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-03 17:07:21 +00:00
weisburd
28f746b76a
Added option to generate UCSC or NCBI sequence
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3283 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 17:26:00 +00:00
weisburd
c214056d88
Script for concatenating results of GenerateTranscriptToInfo.py into one big file
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3279 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 15:47:08 +00:00
weisburd
0069cb426d
Script for spawning LSF jobs that run the TranscriptToInfo.java walker on each of the 50 contigs.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3277 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 15:27:52 +00:00
weisburd
ba7fe7c4e1
Renamed
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3276 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 15:25:07 +00:00
weisburd
4937295a0b
Renamed
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3275 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 15:24:56 +00:00
depristo
bf3dbd8401
some useful routines for working with project processing. madPipeline contains a bunch of useful routines for building pipelines that I finally put into one file. Let's just say that I'm really looking forward to the new pipeline system...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3260 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-26 12:34:04 +00:00
weisburd
c7b4f78316
Added -m arg
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3233 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-21 22:38:47 +00:00
depristo
7902db616e
Marginally more useful output
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3201 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 14:45:14 +00:00
chartl
4eba9bffc1
Grabs average SNP calls, mismatch rate, aligned reads, and other important lane metrics from a SQUID export and summarizes them across multiple margins (lane numbers, flowcells, samples, libraries)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3193 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-17 03:09:05 +00:00
depristo
7973806716
interim update
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3173 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-15 14:25:02 +00:00
chartl
2e4377b1cf
Awesome: JobDispatcher can now dispatch jobs by gene from a target .design file found in /seq/references.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3170 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-14 18:17:41 +00:00
weisburd
04c22a6640
Added handling of UCSC and NCBI reference sequences
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3165 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-14 14:40:31 +00:00
weisburd
2183f10a1d
Script for validating and converting text files into the tabular format required for GenomicAnnotator -B inputs
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3156 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-13 13:35:10 +00:00
chartl
fab31e1d53
Check in so I don't lose this code -- spawning of jobs by genes
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3137 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-08 16:18:40 +00:00
chartl
27fb6f7594
Make sure to convert non-integer chromosomes (M,X,Y) back from their corresponding integer representations (0,23,24) when writing in .bed format
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3119 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-03 20:01:21 +00:00
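A minimal sketch of the conversion described in commit 27fb6f7594. The integer codes (0, 23, 24) come from the commit message; the function names and the absence of a "chr" prefix in the output are assumptions.

```python
# Integer codes for the non-numeric chromosomes, as listed in the commit message.
INT_TO_CHROM = {0: "M", 23: "X", 24: "Y"}


def chrom_name(chrom_int):
    """Map an integer chromosome code back to its name; autosomes pass through."""
    return INT_TO_CHROM.get(chrom_int, str(chrom_int))


def bed_line(chrom_int, start, end):
    """Format one tab-separated .bed record (no 'chr' prefix assumed)."""
    return "%s\t%d\t%d" % (chrom_name(chrom_int), start, end)
```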
chartl
687fd477ff
Just some code I want to freeze. If you ever need to estimate the % of bases covered by exon, given an interval list, give it to getTargetedGenes. Not the best name for this function, but I don't expect anyone to use it but me.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3111 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-01 20:21:50 +00:00
chartl
ac9c335cd2
This is a python job dispatcher I've been using, which builds on Mark's FarmJob utility, and an example script of how I'm using it. Basically I wrote it to smartly break up analysis over an interval list, given a maximum number of bases per job, a list of available queues, and a limit on each queue. It handles going over these limits in three ways:
...
1) [default]: Fail
- No jobs are actually spawned
2) Space
- User provides a string of the form A:B:C where A is the number of days to wait before scheduling jobs over
the queue limits; B the number of hours, C the number of minutes. Exceeding the queue limits again will
increment the space by another A:B:C
3) Stop-Resume
- Spawns the maximum number of jobs, and writes a file describing the next job, and a hash code of the remaining jobs.
The next time the script is run, it spawns the next set of jobs starting with what's written in the file. If the
hash code (and thus the command string) changes between runs, the dispatcher fails-fast.
The base job dispatcher class is also capable of dealing with dependencies if it is used correctly.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3102 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-31 19:53:13 +00:00
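The behaviour described in commit ac9c335cd2 can be sketched as follows. This is an illustration under stated assumptions, not the JobDispatcher code: all names are hypothetical, SHA-1 stands in for whatever hash the script uses, the queue limits are collapsed into a single `capacity`, and the "Space" mode is omitted.

```python
import hashlib


def split_intervals(intervals, max_bases):
    """Group inclusive (contig, start, stop) intervals into jobs of at most
    max_bases bases each, cutting intervals where necessary."""
    jobs, current, used = [], [], 0
    for contig, start, stop in intervals:
        while start <= stop:
            piece_stop = min(stop, start + (max_bases - used) - 1)
            current.append((contig, start, piece_stop))
            used += piece_stop - start + 1
            start = piece_stop + 1
            if used == max_bases:  # job is full, start a new one
                jobs.append(current)
                current, used = [], 0
    if current:
        jobs.append(current)
    return jobs


def remaining_hash(commands):
    """Hash of the not-yet-spawned command strings (SHA-1 is an assumption)."""
    return hashlib.sha1("\n".join(commands).encode("utf-8")).hexdigest()


def dispatch(commands, capacity, mode="fail", saved=None):
    """Return (batch_to_spawn, state_to_save) for the 'fail' or 'stop-resume' mode."""
    if mode == "fail":
        if len(commands) > capacity:
            raise RuntimeError("over the queue limits; no jobs spawned")
        return list(commands), None
    if mode == "stop-resume":
        start = 0
        if saved is not None:
            start, expected = saved
            # Fail fast if the command list changed since the previous run.
            if remaining_hash(commands[start:]) != expected:
                raise RuntimeError("commands changed between runs")
        batch = commands[start:start + capacity]
        next_index = start + len(batch)
        return batch, (next_index, remaining_hash(commands[next_index:]))
    raise ValueError("unknown mode: %s" % mode)
```

In stop-resume mode the returned state (index of the next job plus the hash of the remaining commands) plays the role of the file the commit message describes; a caller would write it to disk and pass it back as `saved` on the next run.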
chartl
dc802aa26f
Moved CoverageStatistics to core. This will be (soon) renamed DepthOfCoverage; so please use CoverageStatistics
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3090 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-29 13:32:00 +00:00
depristo
08d9ae403d
better farm commands, and simple utility to convert ucsc repeat masks to interval lists
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3040 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-19 13:11:06 +00:00
chartl
4bdc3b2784
automatic generation of individual and individual set import files
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3001 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-15 10:36:33 +00:00
chartl
d9b12b468f
Adding default filter info
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3000 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-15 10:05:46 +00:00
andrewk
196bca6819
Script to split concordance files into their constituent sets and calculate summary stats from a concordance file - SNPs called and number in dbSNP
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2992 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-12 22:20:44 +00:00
andrewk
9298e13201
Make annotated VCF not be broken
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2906 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-01 23:22:41 +00:00
kshakir
5f9c3f3884
Outputting annotated VCF to the current directory instead of attempting to write in the directory next to the original vcf.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2869 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-22 21:31:24 +00:00
kiran
217deb9809
Changed the INFO field delimiter from a comma to a semicolon
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2847 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-16 20:44:57 +00:00
chartl
0e4b5ad9c6
Check to ensure sample status is "Complete" before writing out the bam file
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2844 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-16 15:36:42 +00:00
chartl
e491b42951
Dumb little script that grabs Picard metrics (alignment, hybrid selection, insert size) from picard_aggregation given the path to the bam file; zips them up, and spits them out; for use with Firehose
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2824 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-11 14:09:30 +00:00
chartl
60f05379a7
fix typo. Check explicitly that fingerprint files exist.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2821 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-10 21:29:54 +00:00
chartl
d16a1b5645
Simple python script for generating a firehose-parseable text file from MS-Dos formatted TSV spreadsheets (of the type that we get from project management). Will be deprecated in a few weeks with the advent of direct BSP ID entries, but useful until then.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2820 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-10 20:25:40 +00:00
kshakir
57a168c0db
Added a header crediting the python script as a source. Looping over an arbitrary number of headers.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2804 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-07 03:38:30 +00:00
andrewk
61a67cdce4
Moving file up a directory for dependencies
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2798 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-05 19:36:13 +00:00
andrewk
58456822ab
Two perl scripts (from Kristian Cibulskis) and one python script for annotating VCF files with the information generated by the cancer MAF annotation tool.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2797 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-05 19:25:46 +00:00
depristo
e964660df3
snpSelector now supports min and max q scores.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2751 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-31 19:38:34 +00:00
depristo
7b3c34d210
keeping a backup
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2750 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-31 15:36:25 +00:00
chartl
0fb032a436
Quick script that changes "chr#" to "#" and "chrM" to "MT" and moves mitochondria to the end of the vcf; in accordance with the 1KG reference.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2727 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-28 21:59:33 +00:00
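The renaming described in commit 0fb032a436 might be sketched like this. The function names are hypothetical, and the sketch assumes tab-separated records with the contig in the first column and header lines starting with "#".

```python
def rename_contig(chrom):
    """'chrM' becomes 'MT'; any other 'chr#' loses its 'chr' prefix."""
    if chrom == "chrM":
        return "MT"
    if chrom.startswith("chr"):
        return chrom[3:]
    return chrom


def convert_vcf_lines(lines):
    """Rename contigs and move mitochondrial records after all other records."""
    header, body, mito = [], [], []
    for line in lines:
        if line.startswith("#"):
            header.append(line)
            continue
        chrom, sep, rest = line.partition("\t")
        chrom = rename_contig(chrom)
        (mito if chrom == "MT" else body).append(chrom + sep + rest)
    return header + body + mito
```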
chartl
4990139b60
A collection of python objects that are useful for VCF validation. Use 'em or don't.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2679 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-25 18:44:10 +00:00
depristo
cf46e3c85f
Valuable series of commands
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2641 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-20 13:58:59 +00:00
depristo
8226f4aa12
minor cleanup
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2616 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-16 20:23:20 +00:00
chartl
dfe160ff77
Minor changes (additional info calculated)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2522 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-06 16:41:01 +00:00
depristo
588006ee92
Now supports strings in command line for farm submission
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2507 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-06 13:15:40 +00:00
depristo
9fb6533549
new -a option does fast merging of already sorted files
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2500 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-05 13:55:39 +00:00
depristo
89f3ee614a
minor printing fix
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2492 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-30 22:14:50 +00:00
depristo
fcc80e8632
Completely rewritten duplicate traversal, more free of bugs, with integration tests for count duplicates walker validated on a TCGA hybrid capture lane.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2458 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-28 23:56:49 +00:00
andrewk
4e7e0432a2
Updated SNP calling power from coverage tools to work with new UnifiedGenotyper and DepthOfCoverage tools.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2378 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-16 20:44:30 +00:00
andrewk
f5e547ed6e
Add ability for flat file table parsing module to skip ahead to first occurrence of a regular expression (use case: consistently parsing DepthOfCoverage output for histogram section of file across file format changes)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2377 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-16 20:38:50 +00:00
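The skip-ahead behaviour from commit f5e547ed6e can be approximated with the standard library. This is a sketch, not the parsing module itself: `skip_to_pattern` is a hypothetical name, and returning the matching line as the first line of the result is an assumption.

```python
import re
from itertools import dropwhile


def skip_to_pattern(lines, pattern):
    """Drop lines until the first one matching pattern; keep that line and the rest."""
    regex = re.compile(pattern)
    return dropwhile(lambda line: regex.search(line) is None, lines)
```

A table parser could then be handed the returned iterator instead of the raw file, so that any preamble before the section of interest is ignored regardless of its length.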
andrewk
bf76019f22
Minor change to coverage evaluation script, to update for new file format and add output fields
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2375 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-16 18:06:08 +00:00
depristo
0d2a761460
Bugfix for minBaseQuality to ignore deletion reads. LocusMismatch walker now allows us to skip every nth eligible site
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2357 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 14:38:39 +00:00
depristo
56467df49a
minor improvements to snpSelector to work with hapmap chip VCF files
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2343 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-13 17:59:32 +00:00
depristo
b2dfe85648
Better support for reading truth file
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2307 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-10 12:16:05 +00:00
chartl
6a4118ad3c
grr, ought to actually assign it to the TRUTH_CALLS variable
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2302 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-09 23:31:46 +00:00
chartl
987fced151
Should read truth data from the parser options rather than direct from args
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2301 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-09 23:26:26 +00:00
chartl
8825211fdb
Adding this to subversion so it's protected
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2299 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-09 21:26:17 +00:00
depristo
2632cb6b58
minor improvements to snp selector
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2275 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-07 03:37:14 +00:00
chartl
b817db0962
Syzygy has a default LOD score of 0.91 on bases with no coverage, this is problematic. Set the minimum lod threshold to 1 because I just don't want to see that codswallop.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2268 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-04 23:29:14 +00:00
depristo
0753315156
updates to the python snp selector -- now sorts info fields and we stop printing unnecessary debugging info in vcf2table
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2265 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-04 22:16:02 +00:00
chartl
0f89a38473
forgot to commit this earlier
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2264 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-04 22:10:16 +00:00
chartl
c1263e841c
stop printing the debug info -- hurr
...
Also it turns out that sometimes there can be a call with zero total non-I/non-D bases -- so add one to numerator and denominator to prevent divide by zero error
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2262 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-04 16:17:38 +00:00
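The divide-by-zero guard mentioned in commit c1263e841c amounts to adding one to both sides of a ratio. A minimal sketch, where `smoothed_fraction` and its arguments are hypothetical names and the quantity being divided in the real script is not known from the message:

```python
def smoothed_fraction(count, total):
    """Add one to numerator and denominator so that total == 0 cannot divide by zero."""
    return (count + 1.0) / (total + 1.0)
```

Note that this shifts every ratio slightly, so a site with zero total bases reports 1.0 rather than an error.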
chartl
0c2d6d7e41
A brute-force script to convert Syzygy lod-score calls files into a proper VCF -- with some useful annotations.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2261 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-04 16:07:06 +00:00
depristo
2c7cb912f0
Bug fixes for mixed none/valued attributes. also now assigns fake float values for display, if requested, for covariates using the -plottable flag
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2253 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-03 23:52:35 +00:00
depristo
dbb8b86ed1
Minor updates to correctly handle emitting FN calls
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2231 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-02 22:53:17 +00:00
rpoplin
67179e2412
Initial checkin of AnalyzeCovariates.java which replaces analyzeRecalQuals_1KG.py and is updated to use the new Covariates system. It creates similar plots of residual error for each covariate that was used in the calculation. There is also an option to filter out base qualities below a given threshold.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2215 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-02 16:47:35 +00:00
depristo
8a87d5add1
misc. bug fixes
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2212 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-02 14:36:03 +00:00
depristo
c93d37d9fb
continuing improvements in output of snpSelector
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2198 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-01 15:42:06 +00:00
depristo
2ea93385be
Better support for comparison to truth. Now emits FP rates for each covariate if a truth file is provided. Also now writes out a detailed recal.log file that can be parsed directly into R
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2179 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-29 22:20:40 +00:00
chartl
662bbbd53b
Awful stupid bug. This will use up one of my bad code offsets.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2178 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-29 20:09:33 +00:00
chartl
fa2d564f2c
And the compulsory one-second-later fix -- better handling of arguments (e.g. for calling from outside of /trunk/python/)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2177 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-29 20:02:43 +00:00
chartl
45673d7851
A quick and dirty script that, given a list of input VCF files, will output a new VCF file which looks identical to the first VCF file of the input list, except that the info field has been updated to reflect the union of all the INFO annotations across the VCF files
...
Note: this is primarily for use with two files with mostly disjoint annotations. It views "SB=2.5" as a different info field than "SB=2.2" and so will output as info "SB=2.5;SB=2.2". That is, it compares the full field string, rather than only the field name.
Usage:
./mergeVCFInfoFields I=[comma-delimited list of files] O=[output file]
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2176 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-29 20:01:29 +00:00
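The union behaviour described in commit 45673d7851 compares whole "name=value" strings, which can be sketched as below. `merge_info` is a hypothetical name, and the ";" delimiter and "." placeholder for an empty INFO column are assumptions about the files involved.

```python
def merge_info(info_strings):
    """Union INFO fields across records, comparing the full 'name=value' string
    and keeping first-seen order."""
    merged, seen = [], set()
    for info in info_strings:
        for field in info.split(";"):
            if field and field != "." and field not in seen:
                seen.add(field)
                merged.append(field)
    return ";".join(merged) if merged else "."
```

As the commit message warns, two files that disagree on a value both contribute their field: merging "SB=2.5;DP=10" with "SB=2.2;DP=10" keeps both SB entries and a single DP entry.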
depristo
65da04ca85
Now uses the theoretically correct relationship between SNP FP and TP ratios for Illumina data. maxQ score for a snp is now 60
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2168 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-25 22:08:12 +00:00
depristo
03342c1fdd
Restructuring and interface change to ReadBackedPileup. We no longer support the Pileup interface, the BasicPileup static methods, and the ReadBackedPileup class. Now everything is a ReadBackedPileup and all methods to manipulate pileups are off of it. Also provides the recommended iterable() interface of pileup elements so you can use the syntax for (PileupElement p : pileup) and access directly from p.getBase() and p.getQual() and p.getSecondBase(). Only a few straggler walkers use the old style interface -- but those walkers will be retired soon. Documentation coming in the AM. Please everyone use the new syntax, it's safer, and will be more efficient as soon as the LocusIteratorByState directly emits the ReadBackedPileup for the Alignment context, as opposed to the current interface. In the process of the change over, discovered several bugs in the second-best base code due to things getting out of sync, but these changes were resolved manually. All other integration tests passed without modification.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2154 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-25 03:51:41 +00:00
depristo
bc35a34f60
More informative printing, no longer prints tons of NaN warnings
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2139 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-24 15:45:48 +00:00
depristo
da7de9960b
General bug fixes for snpSelector. More robust error checking and handling of NaN values.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2106 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-21 14:48:29 +00:00
depristo
52494d8176
cleanup of SNP selector -- ready for some additional testing
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2042 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-13 21:46:31 +00:00
depristo
1a4d071d37
Better snpSelector, plus VCFmerge tool
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2022 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-11 22:02:57 +00:00
depristo
3990c6d950
snpSelector v3 -- bootstrapping support and VCF output
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2004 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-09 22:48:51 +00:00
depristo
f777c806d6
snpSelector v2 -- code refactoring and support for comparison with known truth. Looks great.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1986 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-07 19:32:12 +00:00
depristo
7cb51dbc31
snpSelector v1 -- and supporting changes to VCF reader
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1983 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-06 23:00:46 +00:00
chartl
eca0942644
Oops. Let's make sure only to write calls that the pool supports to the auxiliary vcf files.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1974 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-04 17:14:55 +00:00
chartl
fc17e75759
Put this puppy through its paces. Eliminated the sorting and header-handling stuff; that isn't the purview of this script and should be handled downstream or by a script wrapper.
...
I also secretly handled another pesky overflow exception. Occasionally Syzygy could report lods of like -1000; e.g. posterior probabilities of one in one (((googol) googol) googol) googol which of course makes python blow up. Now we safely output an accurate posterior.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1971 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-04 06:05:45 +00:00
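One way to avoid the kind of overflow described in commit fc17e75759 is to stay in log10 space. The sketch below assumes the posterior is 10^lod / (1 + 10^lod); that relationship, and the name `log10_posterior`, are assumptions for illustration and not necessarily what the script computes.

```python
import math


def log10_posterior(lod):
    """log10 of 10**lod / (1 + 10**lod), arranged so that only non-positive
    exponents are ever raised (large positive ones overflow a Python float)."""
    if lod > 0:
        return -math.log10(1.0 + 10.0 ** (-lod))
    return lod - math.log10(1.0 + 10.0 ** lod)
```

With this arrangement a lod of -1000 yields a log10 posterior of about -1000 instead of raising an OverflowError.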
chartl
3d9195f8b6
Added - converter from expanded summary to VCF (beautiful thing, really)
...
Removed - the ugly hackjob that was expanded summary to Geli
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1970 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-03 22:20:47 +00:00
depristo
d60c632099
Minor output improvement
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1965 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-03 13:20:55 +00:00
depristo
44ea55d338
Useful library for parsing VCF files, plus a general VCF->table converter
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1964 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-03 13:14:04 +00:00
chartl
99337df929
Now looks up and propagates Syzygy's LOD scores into the appropriate field (so variantfiltration can adjust lod scores accurately)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1950 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-30 21:13:03 +00:00
chartl
7654051aee
Faster grepping
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1948 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-30 16:59:17 +00:00
chartl
4319ff0610
A python script that will convert pooled expanded summary files (from Jason Flannick's pipeline) into .geli files
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1947 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-30 16:39:57 +00:00
depristo
c1e1d910cb
simple monitor for watching pilot 1 call progress
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1769 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-06 13:04:53 +00:00
depristo
de9f2b11da
Detects unmapped (no bai) bam files and doesn't blow up
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1725 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-25 12:56:28 +00:00
ebanks
8349004414
Generalize the regexp for analysis files
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1714 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-24 03:17:41 +00:00