263 Commits (4d08d3984969ba3fe2a29c3cb50ccc588d76f76c)

Author SHA1 Message Date
corin 74f705d943 fixed silly syntax errors
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3805 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-16 15:08:46 +00:00
corin 917469ef43 This script produces information for a firehose job-finished email
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3804 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-16 14:44:52 +00:00
chartl 19a5830186 Restore "type" annotation (but not genomechange or cDNA change, which are already encoded in the VCF)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3784 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-13 17:33:15 +00:00
chartl 9cc1a411b2 Altering the formatting of the annotation to work better with VariantEval's AminoAcidTransition
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3782 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-13 16:31:14 +00:00
chartl 80a5ddfa2f Memory string added
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3771 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-12 20:43:19 +00:00
chartl 46c39f2d53 Quick python scripts for going from genotype VCFs to site-only VCFs, and one to fix BC vcf files (which had "het" genotypes at non-variant sites)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3768 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-12 19:13:32 +00:00
depristo dd978dd525 misc. changes to python scripts
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3751 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-09 00:13:35 +00:00
depristo cf910d9cc2 misc. useful updates to python library
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3683 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-30 16:33:32 +00:00
weisburd e7939f7036 Fixed error message
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3653 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-28 14:50:28 +00:00
weisburd 1cb8f51f8c Fixed -t arg
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3634 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-24 23:44:10 +00:00
weisburd 3cd0570c1e Now can run with multiple processes, multiple threads, or both
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3633 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-24 23:25:01 +00:00
weisburd dae3ce2c0f changed log dir
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3632 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-24 23:08:13 +00:00
weisburd fea8054e9e Updated long name for -l to --run-locally
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3631 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-24 22:26:45 +00:00
weisburd 72e669538e Updated arg description for -s
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3629 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-24 22:04:01 +00:00
weisburd cef12b45e1 Fixed typo
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3561 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-15 21:17:14 +00:00
weisburd c0370f4d0a Added both inclusive and exclusive filters
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3538 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-11 18:40:41 +00:00
weisburd d1a4c4f0d3 Added -w filter option allowing user to specify chromosomes to be skipped.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3531 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-10 20:58:25 +00:00
weisburd 6fd2d39a7d Modified run_locally mode to use os.system(..) instead of popen
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3515 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-09 17:10:03 +00:00
weisburd a3ccf49f5b Write error to stderr
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3514 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-09 17:09:10 +00:00
weisburd 2b31975cb4 Added more options for coordinate systems - now you can add 1 to either the start coordinates, the end coordinates, or both
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3508 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-08 22:49:19 +00:00
weisburd 410afcdf2c Added parallelization options - when running locally, multiple processes can be spawned, or a -nt arg can be specified to run each TranscriptToInfo instance multi-threaded
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3507 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-08 22:48:07 +00:00
weisburd 92c72d3361 Added back lines that update the *big-table-header.txt file before using it
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3506 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-08 22:45:41 +00:00
weisburd 3c24223d02 Script for concatenating 2 AnnotatorInputTables, and writing the result to standard out. Merge-sorts the 2 tables while concatenating them
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3505 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-08 22:44:16 +00:00
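A minimal sketch of the merge step this commit describes, assuming both tables are already sorted and tab-delimited with the chromosome in the first column and the start position in the second. The column layout and the names `sort_key` and `merge_tables` are assumptions for illustration; the script itself is not shown in this log.

```python
import heapq


def sort_key(line):
    # Assumption: column 0 is the chromosome, column 1 is the start position.
    fields = line.rstrip("\n").split("\t")
    return (fields[0], int(fields[1]))


def merge_tables(rows_a, rows_b):
    """Lazily merge two already-sorted row iterables into one sorted stream."""
    return heapq.merge(rows_a, rows_b, key=sort_key)
```

Chromosome names compare as plain strings here, so the ordering is lexicographic; a real table would likely need a reference-specific contig order.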
depristo f32a32269c minor change for eric
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3418 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-21 20:13:36 +00:00
depristo d1098fa77b Removing unnecessary dependences that were causing problems for Sendu
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3410 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-21 13:07:41 +00:00
depristo 886e9c1297 Now can take a VCF file as input
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3378 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-18 17:06:12 +00:00
depristo 2a212c497f minor improvements and bug fixes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3376 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-18 15:54:42 +00:00
depristo 43544cfdf9 remote control of IGV to jump to any number of loci in a file and screenshot the locus to a file
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3375 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-18 12:59:36 +00:00
weisburd 04e14ef85a Refactored so it could be used for knownGene and CCDS as well as refGene
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3373 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-18 02:45:11 +00:00
depristo 2a803e9044 simple tool that takes two dbSNP files and subsets the second to only include rsID SNPs present in the first. Used to make b129 against b37 by subsetting b131/b37 vs. b129/b36
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3352 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-12 13:39:09 +00:00
depristo d3c33d4b3f more powerful management routines for my pipeline
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3351 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-12 13:37:39 +00:00
weisburd f120a00433 Fixed bug so that the strand, alternate, and reference columns are now moved correctly
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3342 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-10 17:59:39 +00:00
depristo d6b036cdab Minor improvements to simple python code
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3330 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-07 21:34:46 +00:00
chartl d5b675b3e6 Added - Q&D script to gather verbose bed files to a VCF.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3294 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-05 02:49:16 +00:00
weisburd a462b5e1e7 Changed a default path
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3291 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-03 17:07:21 +00:00
weisburd 28f746b76a Added option to generate UCSC or NCBI sequence
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3283 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 17:26:00 +00:00
weisburd c214056d88 Script for concatenating results of GenerateTranscriptToInfo.py into one big file
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3279 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 15:47:08 +00:00
weisburd 0069cb426d Script for spawning LSF jobs that run the TranscriptToInfo.java walker on each of the 50 contigs.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3277 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 15:27:52 +00:00
weisburd ba7fe7c4e1 Renamed
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3276 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 15:25:07 +00:00
weisburd 4937295a0b Renamed
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3275 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 15:24:56 +00:00
depristo bf3dbd8401 some useful routines for working with project processing. madPipeline contains a bunch of useful routines for building pipelines that I finally put into one file. Let's just say that I'm really looking forward to the new pipeline system...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3260 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-26 12:34:04 +00:00
weisburd c7b4f78316 Added -m arg
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3233 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-21 22:38:47 +00:00
depristo 7902db616e Marginally more useful output
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3201 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 14:45:14 +00:00
chartl 4eba9bffc1 Grabs average SNP calls, mismatch rate, aligned reads, and other important lane metrics from a SQUID export and summarizes them across multiple margins (lane numbers, flowcells, samples, libraries)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3193 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-17 03:09:05 +00:00
depristo 7973806716 interim update
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3173 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-15 14:25:02 +00:00
chartl 2e4377b1cf Awesome: JobDispatcher can now dispatch jobs by gene from a target .design file found in /seq/references.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3170 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-14 18:17:41 +00:00
weisburd 04c22a6640 Added handling of UCSC and NCBI reference sequences
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3165 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-14 14:40:31 +00:00
weisburd 2183f10a1d Script for validating and converting text files into the tabular format required for GenomicAnnotator -B inputs
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3156 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-13 13:35:10 +00:00
chartl fab31e1d53 Check in so I don't lose this code -- spawning of jobs by genes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3137 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-08 16:18:40 +00:00
chartl 27fb6f7594 Make sure to convert non-integer chromosomes (M,X,Y) back from their corresponding integer representations (0,23,24) when writing in .bed format
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3119 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-03 20:01:21 +00:00
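The mapping named in this commit can be sketched as follows. Only the pairs (0, M), (23, X), (24, Y) come from the message; the function names and the choice to write no "chr" prefix are assumptions.

```python
# Integer codes used internally for the non-integer chromosomes (from the commit message).
INT_TO_CHROM = {0: "M", 23: "X", 24: "Y"}


def chrom_label(chrom_int):
    """Return the chromosome name to write for an internal integer code."""
    return INT_TO_CHROM.get(chrom_int, str(chrom_int))


def bed_line(chrom_int, start, end):
    """Format one tab-separated .bed record (chromosome, start, end)."""
    return "%s\t%d\t%d" % (chrom_label(chrom_int), start, end)
```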
chartl 687fd477ff Just some code I want to freeze. If you ever need to estimate the % of bases covered by exon, given an interval list, give it to getTargetedGenes. Not the best name for this function, but I don't expect anyone to use it but me.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3111 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-01 20:21:50 +00:00
chartl ac9c335cd2 This is a python job dispatcher I've been using, which builds on Mark's FarmJob utility, and an example script of how I'm using it. Basically I wrote it to smartly break up analysis over an interval list, given a maximum number of bases per job, a list of available queues, and a limit on each queue. It handles going over these limits in three ways:
1) [default]: Fail
  - No jobs are actually spawned

2) Space
  - User provides a string of the form A:B:C where A is the number of days to wait before scheduling jobs over
    the queue limits; B the number of hours, C the number of minutes. Exceeding the queue limits again will
    increment the space by another A:B:C

3) Stop-Resume
  - Spawns the maximum number of jobs, and writes a file describing the next job, and a hash code of the remaining jobs.
    The next time the script is run, it spawns the next set of jobs starting with what's written in the file. If the
    hash code (and thus the command string) changes between runs, the dispatcher fails-fast.

The base job dispatcher class is also capable of dealing with dependencies if it is used correctly.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3102 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-31 19:53:13 +00:00
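A rough sketch of the Space and Stop-Resume behaviours described in this commit message. This is not the dispatcher's code: every name below (`parse_space`, `start_delay`, `remaining_hash`, `stop_resume`) is hypothetical, SHA-1 is an assumed choice of hash, and the caller is assumed to pass the list of not-yet-spawned command strings.

```python
import hashlib
from datetime import timedelta


def parse_space(spec):
    """Parse an 'A:B:C' string (days:hours:minutes) into a timedelta."""
    days, hours, minutes = (int(part) for part in spec.split(":"))
    return timedelta(days=days, hours=hours, minutes=minutes)


def start_delay(job_index, limit, space):
    """Space mode: each time the queue limit is exceeded again, wait one more `space`."""
    return space * (job_index // limit)


def remaining_hash(commands):
    """Hash the pending command strings so a change between runs can be detected."""
    digest = hashlib.sha1()
    for command in commands:
        digest.update(command.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def stop_resume(commands, limit, saved_hash=None):
    """Stop-Resume mode: spawn up to `limit` jobs and return state for the next run.

    Returns (to_spawn, remaining, hash_of_remaining). Fails fast if the pending
    commands no longer match the hash saved by the previous run.
    """
    if saved_hash is not None and remaining_hash(commands) != saved_hash:
        raise ValueError("command list changed since the previous run")
    to_spawn, remaining = commands[:limit], commands[limit:]
    return to_spawn, remaining, remaining_hash(remaining)
```

In this sketch the default Fail mode would simply raise before spawning anything when `len(commands) > limit`.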
chartl dc802aa26f Moved CoverageStatistics to core. This will be (soon) renamed DepthOfCoverage; so please use CoverageStatistics
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3090 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-29 13:32:00 +00:00
depristo 08d9ae403d better farm commands, and simple utility to convert ucsc repeat masks to interval lists
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3040 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-19 13:11:06 +00:00
chartl 4bdc3b2784 automatic generation of individual and individual set import files
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3001 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-15 10:36:33 +00:00
chartl d9b12b468f Adding default filter info
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3000 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-15 10:05:46 +00:00
andrewk 196bca6819 Script to split concordance files into their constituent sets and calculate summary stats from a concordance file - SNPs called and number in dbSNP
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2992 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-12 22:20:44 +00:00
andrewk 9298e13201 Make annotated VCF not be broken
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2906 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-01 23:22:41 +00:00
kshakir 5f9c3f3884 Outputting annotated VCF to the current directory instead of attempting to write in the directory next to the original vcf.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2869 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-22 21:31:24 +00:00
kiran 217deb9809 Changed the INFO field delimiter from a comma to a semicolon
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2847 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-16 20:44:57 +00:00
chartl 0e4b5ad9c6 Check to ensure sample status is "Complete" before writing out the bam file
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2844 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-16 15:36:42 +00:00
chartl e491b42951 Dumb little script that grabs Picard metrics (alignment, hybrid selection, insert size) from picard_aggregation given the path to the bam file; zips them up, and spits them out; for use with Firehose
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2824 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-11 14:09:30 +00:00
chartl 60f05379a7 fix typo. Check explicitly that fingerprint files exist.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2821 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-10 21:29:54 +00:00
chartl d16a1b5645 Simple python script for generating a firehose-parseable text file from MS-DOS formatted TSV spreadsheets (of the type that we get from project management). Will be deprecated in a few weeks with the advent of direct BSP ID entries, but useful until then.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2820 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-10 20:25:40 +00:00
kshakir 57a168c0db Added a header crediting the python script as a source. Looping over an arbitrary number of headers.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2804 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-07 03:38:30 +00:00
andrewk 61a67cdce4 Moving file up a directory for dependencies
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2798 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-05 19:36:13 +00:00
andrewk 58456822ab Two perl scripts (from Kristian Cibulskis) and one python script for annotating VCF files with the information generated by the cancer MAF annotation tool.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2797 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-05 19:25:46 +00:00
depristo e964660df3 snpSelector now supports min and max q scores.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2751 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-31 19:38:34 +00:00
depristo 7b3c34d210 keeping a backup
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2750 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-31 15:36:25 +00:00
chartl 0fb032a436 Quick script that changes "chr#" to "#" and "chrM" to "MT" and moves mitochondria to the end of the vcf; in accordance with the 1KG reference.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2727 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-28 21:59:33 +00:00
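The renaming this commit describes could look roughly like the sketch below. It assumes the VCF is small enough to hold in memory and that only the CHROM column of data lines is rewritten (header lines are passed through untouched); `rename_contig` and `convert_vcf_lines` are hypothetical names.

```python
def rename_contig(chrom):
    """Map 'chrM' to 'MT' and strip the 'chr' prefix from every other contig."""
    if chrom == "chrM":
        return "MT"
    if chrom.startswith("chr"):
        return chrom[3:]
    return chrom


def convert_vcf_lines(lines):
    """Rename contigs and move mitochondrial records after all other records."""
    header, body, mito = [], [], []
    for line in lines:
        if line.startswith("#"):
            header.append(line)  # header lines are kept as-is
            continue
        chrom, rest = line.split("\t", 1)
        new_line = rename_contig(chrom) + "\t" + rest
        if chrom in ("chrM", "MT"):
            mito.append(new_line)
        else:
            body.append(new_line)
    return header + body + mito
```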
chartl 4990139b60 A collection of python objects that are useful for VCF validation. Use 'em or don't.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2679 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-25 18:44:10 +00:00
depristo cf46e3c85f Valuable series of commands
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2641 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-20 13:58:59 +00:00
depristo 8226f4aa12 minor cleanup
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2616 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-16 20:23:20 +00:00
chartl dfe160ff77 Minor changes (additional info calculated)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2522 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-06 16:41:01 +00:00
depristo 588006ee92 Now supports strings in command line for farm submission
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2507 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-06 13:15:40 +00:00
depristo 9fb6533549 new -a option does fast merging of already sorted files
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2500 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-05 13:55:39 +00:00
depristo 89f3ee614a minor printing fix
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2492 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-30 22:14:50 +00:00
depristo fcc80e8632 Completely rewritten duplicate traversal, more free of bugs, with integration tests for count duplicates walker validated on a TCGA hybrid capture lane.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2458 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-28 23:56:49 +00:00
andrewk 4e7e0432a2 Updated SNP calling power from coverage tools to work with new UnifiedGenotyper and DepthOfCoverage tools.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2378 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-16 20:44:30 +00:00
andrewk f5e547ed6e Add ability for flat file table parsing module to skip ahead to first occurrence of a regular expression (use case: consistently parsing DepthOfCoverage output for histogram section of file across file format changes)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2377 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-16 20:38:50 +00:00
andrewk bf76019f22 Minor change to coverage evaluation script, to update for new file format and add output fields
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2375 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-16 18:06:08 +00:00
depristo 0d2a761460 Bugfix for minBaseQuality to ignore deletion reads. LocusMismatch walker now allows us to skip every nth eligible site
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2357 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 14:38:39 +00:00
depristo 56467df49a minor improvements to snpSelector to work with hapmap chip VCF files
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2343 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-13 17:59:32 +00:00
depristo b2dfe85648 Better support for reading truth file
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2307 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-10 12:16:05 +00:00
chartl 6a4118ad3c grr, ought to actually assign it to the TRUTH_CALLS variable
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2302 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-09 23:31:46 +00:00
chartl 987fced151 Should read truth data from the parser options rather than direct from args
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2301 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-09 23:26:26 +00:00
chartl 8825211fdb Adding this to subversion so it's protected
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2299 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-09 21:26:17 +00:00
depristo 2632cb6b58 minor improvements to snp selector
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2275 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-07 03:37:14 +00:00
chartl b817db0962 Syzygy has a default LOD score of 0.91 on bases with no coverage, this is problematic. Set the minimum lod threshold to 1 because I just don't want to see that codswallop.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2268 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-04 23:29:14 +00:00
depristo 0753315156 updates to the python snp selector -- now sorts info fields and we stop printing unnecessary debugging info in vcf2table
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2265 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-04 22:16:02 +00:00
chartl 0f89a38473 forgot to commit this earlier
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2264 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-04 22:10:16 +00:00
chartl c1263e841c stop printing the debug info -- hurr
Also it turns out that sometimes there can be a call with zero total non-I/non-D bases -- so add one to numerator and denominator to prevent divide by zero error

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2262 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-04 16:17:38 +00:00
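The divide-by-zero guard mentioned in this commit amounts to adding one to both sides of the ratio. A minimal sketch, where `smoothed_fraction` is a hypothetical name and the commit does not say which counts form the ratio:

```python
def smoothed_fraction(count, total):
    """Return (count + 1) / (total + 1), which stays defined when total is zero."""
    return (count + 1) / float(total + 1)
```

The trade-off is a small upward bias at low totals, which shrinks as the number of counted bases grows.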
chartl 0c2d6d7e41 A brute-force script to convert Syzygy lod-score calls files into a proper VCF -- with some useful annotations.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2261 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-04 16:07:06 +00:00
depristo 2c7cb912f0 Bug fixes for mixed none/valued attributes. also now assigns fake float values for display, if requested, for covariates using the -plottable flag
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2253 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-03 23:52:35 +00:00
depristo dbb8b86ed1 Minor updates to correctly handle emitting FN calls
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2231 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-02 22:53:17 +00:00
rpoplin 67179e2412 Initial checkin of AnalyzeCovariates.java which replaces analyzeRecalQuals_1KG.py and is updated to use the new Covariates system. It creates similar plots of residual error for each covariate that was used in the calculation. There is also an option to filter out base qualities below a given threshold.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2215 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-02 16:47:35 +00:00
depristo 8a87d5add1 misc. bug fixes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2212 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-02 14:36:03 +00:00
depristo c93d37d9fb continuing improvements in output of snpSelector
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2198 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-01 15:42:06 +00:00
depristo 2ea93385be Better support for comparison to truth. Now emits FP rates for each covariate if a truth file is provided. Also now writes out a detailed recal.log file that can be parsed directly into R
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2179 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-29 22:20:40 +00:00
chartl 662bbbd53b Awful stupid bug. This will use up one of my bad code offsets.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2178 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-29 20:09:33 +00:00