gatk-3.8/python
hanna 52f930d708 Bug fix.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5985 348d0f76-0448-11de-a6fe-93d51630548a
2011-06-13 18:48:55 +00:00
genomicAnnotatorScripts Adding a script to do big table conversion. Removing Ben's script which is totally obsolete and busted. 2010-10-12 01:40:43 +00:00
1kgStatsForCalls.py Better reporting and now with a special mode for listing exceptions 2010-09-01 16:19:51 +00:00
AnnotateVCFwithMAF.py Restore "type" annotation (but not genomechange or cDNA change, which are already encoded in the VCF) 2010-07-13 17:33:15 +00:00
JobDispatcher.py Memory string added 2010-07-12 20:43:19 +00:00
JobDispatcherExample.py This is a python job dispatcher I've been using, which builds on Mark's FarmJob utility, and an example script of how I'm using it. Basically I wrote it to smartly break up analysis over an interval list, given a maximum number of bases per job, a list of available queues, and a limit on each queue. It handles going over these limits in three ways: 2010-03-31 19:53:13 +00:00
MPGQueuePipelineStatus.py updated to work with the new tearsheet 2011-01-28 18:46:38 +00:00
ParseDCCSequenceData.py Added ParseDCCSequenceData.py to repository and made changes that allow an analysis of quantity of sequence data by platform and project, moved table / record system to a new module called FlatFileTable.py and built that into ParseDCCSequenceData and CoverageEval.py; changed lod threshold in CoverageEvalWalker. 2009-07-08 22:04:26 +00:00
RefseqLibrary.py Awesome: JobDispatcher can now dispatch jobs by gene from a target .design file found in /seq/references. 2010-04-14 18:17:41 +00:00
RunPilot2Pipeline.py Fix merger command 2009-09-11 13:13:23 +00:00
SyzygyCallsFileToVCF.py Minor changes (additional info calculated) 2010-01-06 16:41:01 +00:00
VCFValidationAnalysis.py A collection of python objects that are useful for VCF validation. Use 'em or don't. 2010-01-25 18:44:10 +00:00
Verify1KGArchiveBAMs.py Minor utility improvements 2011-02-26 15:36:26 +00:00
analyzeRunReports.py updates to handle only reporting on a specific SVN revision. Updated the R script to show the domain name of the runner, now that S3 logging is working 2011-02-01 12:02:12 +00:00
change_paths.py (Hopefully) short-lived script to rework the directory structure from core / 2011-06-08 19:18:22 +00:00
collectCalls.py Adding this to subversion so it's protected 2009-12-09 21:26:17 +00:00
countCoverageWithSamtools.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
createCaseControlMetaData.py A helper script that will take a list of bams, a list of case sample IDs, and a list of control sample IDs, and generate a sample meta data yaml (which includes the bamfiles) 2011-03-21 16:11:55 +00:00
dataProcessingPaper.py Trivial changes to data processing paper python 2010-12-01 14:57:14 +00:00
expandedSummaryToVCF.py Oops. Let's make sure only to write calls that the pool supports to the auxiliary vcf files. 2009-11-04 17:14:55 +00:00
faiReader.py Better snpSelector, plus VCFmerge tool 2009-11-11 22:02:57 +00:00
farm_commands.py Now supports strings in command line for farm submission 2010-01-06 13:15:40 +00:00
farm_commands2.py Minor improvements to my crappy old python job management system. Mauricio's first task is to retire all of this code and move the DPP pipeline over to Queue 2010-12-09 04:44:16 +00:00
fasta.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
firehose_out_email.py updated to deal with new cleaning pipeline outputs and potentially infinite TI/TV 2010-08-24 16:01:09 +00:00
fixSoapSnp.py Simple pre-processing script for soapsnp files 2010-11-04 20:34:43 +00:00
gatherIndelsToVCF.py Adding this to get around lsf/csh issues (see recent help message). Also seems like a good time to reiterate http://www.faqs.org/faqs/unix-faq/shell/csh-whynot/ 2010-09-19 02:45:16 +00:00
gatherSampleSummaries.py If you scatter depth of coverage and need to do something more sophisticated than gathering up (e.g. concatenating) the interval summary file, and need to smartly gather up a full summary file, modify (stress on MODIFY) this script to do it 2011-02-25 01:23:53 +00:00
gatkConfigParser.py High performance LocusIterator implementation. Now with greatly reduced memory impact and 2x (and more potentially) speed ups of raw locus iteration. General performance improvements to SSG with empirical probs. You can enable high-performance locus iteration with the -LIBS arg. It's still testing but passes validating pileup. 2009-09-03 03:06:25 +00:00
generate1KGHapmapVCF.py Quick script that changes "chr#" to "#" and "chrM" to "MT" and moves mitochondria to the end of the vcf; in accordance with the 1KG reference. 2010-01-28 21:59:33 +00:00
generate_per_sample_metrics.py Bug fix. 2011-06-13 18:48:55 +00:00
getBamFilesFromSpreadsheet.py Moved CoverageStatistics to core. This will be (soon) renamed DepthOfCoverage; so please use CoverageStatistics 2010-03-29 13:32:00 +00:00
getLaneAndSequenceInfo.py Grabs average SNP calls, mismatch rate, aligned reads, and other important lane metrics from a SQUID export and summarizes them across multiple margins (lane numbers, flowcells, samples, libraries) 2010-04-17 03:09:05 +00:00
getRecentBamList.py Updating the bam list is a bit trickier than most of us originally thought. Need to ensure that *3* files exist: the .bam, the .bai, and the finished.txt (or else bad things can happen) 2011-06-08 14:42:31 +00:00
getTargetedGenes.py Just some code I want to freeze. If you ever need to estimate the % of bases covered by exon, given an interval list, give it to getTargetedGenes. Not the best name for this function, but I don't expect anyone to use it but me. 2010-04-01 20:21:50 +00:00
igvController.py Now can take a VCF file as input 2010-05-18 17:06:12 +00:00
indelVerboseStats.py Minor improvements to simple python code 2010-05-07 21:34:46 +00:00
lsf_post_touch.py Do this the right way 2010-09-19 04:30:48 +00:00
madPipelineUtils.py Minor improvements to my crappy old python job management system. Mauricio's first task is to retire all of this code and move the DPP pipeline over to Queue 2010-12-09 04:44:16 +00:00
makeIndelMask.py Removing unnecessary dependencies that were causing problems for Sendu 2010-05-21 13:07:41 +00:00
makeMetricsFilesForFirehose.py Moved CoverageStatistics to core. This will be (soon) renamed DepthOfCoverage; so please use CoverageStatistics 2010-03-29 13:32:00 +00:00
memo.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
mergeVCFInfoFields.py Moved CoverageStatistics to core. This will be (soon) renamed DepthOfCoverage; so please use CoverageStatistics 2010-03-29 13:32:00 +00:00
picard_utils.py misc. changes to python scripts 2010-07-09 00:13:35 +00:00
privateMutationRates.py Private mutation simulator and analysis routines for EOMI paper 2011-01-07 21:23:29 +00:00
pushback_file.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
realignBamByChr.py Improvements to 1KG processing pipeline 2010-07-16 15:33:47 +00:00
recalAssociation.py One last little thing, I swear 2011-03-11 17:37:40 +00:00
sample_lister.py module for listing out samples for data processing and firehose reporting 2010-07-21 15:05:41 +00:00
setFilterGenotypesToRef.py Quick python scripts for going from genotype VCFs to site-only VCFs, and one to fix BC vcf files (which had "het" genotypes at non-variant sites) 2010-07-12 19:13:32 +00:00
splitIntervalsByContig.py Adding this to get around lsf/csh issues (see recent help message). Also seems like a good time to reiterate http://www.faqs.org/faqs/unix-faq/shell/csh-whynot/ 2010-09-19 02:45:16 +00:00
theoPost.py Modifications, bugfix to theoretical posteriors. (Bug fix: eliminated discontinuity in prior distribution) 2011-02-14 19:47:34 +00:00
ucscRepeatMaskToIntervalList.py better farm commands, and simple utility to convert ucsc repeat masks to interval lists 2010-03-19 13:11:06 +00:00
updateTribbleToCorrectSVNVersion.py clean-up the docs a little 2010-09-29 05:02:41 +00:00
validatePosterior.py Committing two pieces of code for exome analysis, in case they need be returned to 2011-01-31 14:13:09 +00:00
vcf2table.py now support -o output option, useful for pipelines 2010-08-06 14:57:04 +00:00
vcfGenotypeToSites.py Quick python scripts for going from genotype VCFs to site-only VCFs, and one to fix BC vcf files (which had "het" genotypes at non-variant sites) 2010-07-12 19:13:32 +00:00
vcfReader.py now support -o output option, useful for pipelines 2010-08-06 14:57:04 +00:00
vcf_b36_to_hg18.py Tired of writing vcf_hg18_to_b36 over and over again when necessary. Added a -r flag to this script that does it. 2010-09-28 14:51:57 +00:00
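
The contig renaming described in the generate1KGHapmapVCF.py entry above ("chr#" to "#", "chrM" to "MT", mitochondria moved to the end) can be sketched as follows. This is only an illustration of that description, not the contents of the script itself; the function names `to_b36_contig` and `convert_vcf_lines` are hypothetical, and the sketch assumes tab-delimited VCF records with the contig in the first column.

```python
def to_b36_contig(name):
    """Map a "chr"-prefixed contig name to the 1KG-style name."""
    if name == "chrM":
        return "MT"
    if name.startswith("chr"):
        return name[len("chr"):]
    return name  # already in the target naming; leave untouched


def convert_vcf_lines(lines):
    """Rename contigs and move mitochondrial records to the end.

    Header lines (starting with '#') are passed through unchanged;
    this sketch does not rewrite contig names inside header metadata.
    """
    header, body, mito = [], [], []
    for line in lines:
        if line.startswith("#"):
            header.append(line)
            continue
        chrom, sep, rest = line.partition("\t")
        renamed = to_b36_contig(chrom) + sep + rest
        # Mitochondrial records are collected separately so they land last.
        if chrom in ("chrM", "MT"):
            mito.append(renamed)
        else:
            body.append(renamed)
    return header + body + mito
```

Within each group the original record order is preserved, so an input that is already sorted stays sorted apart from the relocated mitochondrial block.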