gatk-3.8/python
depristo effcd26977 Shorter outputs, new summary mode
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@4440 348d0f76-0448-11de-a6fe-93d51630548a
2010-10-06 14:34:50 +00:00
1kgScripts update script to put pilot1 bams directly onto hphome 2009-09-08 14:41:35 +00:00
genomicAnnotatorScripts Changes from James Pirruccello: now can handle differences between UCSC and NCBI tables, properly sorting despite the contig prefix differences (presence or absence of 'chr'), and converts NCBI format to UCSC format for use by the GenomicAnnotator. 2010-08-26 19:02:29 +00:00
1kgStatsForCalls.py Better reporting and now with a special mode for listing exceptions 2010-09-01 16:19:51 +00:00
AlignBam.py Give usage message if no arguments provided. 2009-08-31 00:28:43 +00:00
AlignBams.py Short python script that takes paired-end BAMs and aligns them with BWA. Referenced in GSA wiki tutorial 2009-07-31 00:04:10 +00:00
AnnotateVCFwithMAF.py Restore "type" annotation (but not genomechange or cDNA change, which are already encoded in the VCF) 2010-07-13 17:33:15 +00:00
BarcodeAnalysis.py Added command line options to make the barcode analysis script executable by end users. 2009-08-24 21:15:09 +00:00
CoverageEval.py Updated SNP calling power from coverage tools to work with new UnifiedGenotyper and DepthOfCoverage tools. 2009-12-16 20:44:30 +00:00
CoverageMeta.py Updated SNP calling power from coverage tools to work with new UnifiedGenotyper and DepthOfCoverage tools. 2009-12-16 20:44:30 +00:00
DOCParameter.py Checking in Michael's DoC parameterization script; 2009-09-03 15:07:49 +00:00
EvalMapping.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
FastaQuals2Fastq.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
FlatFileTable.py Add ability for flat file table parsing module to skip ahead to first occurrence of a regular expression (use case: consistently parsing DepthOfCoverage output for histogram section of file across file format changes) 2009-12-16 20:38:50 +00:00
Geli2GFF.py Updated converter to reflect change in contig ordering in Geli files 2009-06-03 10:05:28 +00:00
Gelis2PopSNPs.py Fixing odd merge problem with VariantEval -- better cluster analysis (no cumsum), rodVariant is now an AllelicVariant 2009-07-14 18:53:27 +00:00
JobDispatcher.py Memory string added 2010-07-12 20:43:19 +00:00
JobDispatcherExample.py This is a python job dispatcher I've been using, which builds on Mark's FarmJob utility, and an example script of how I'm using it. Basically I wrote it to smartly break up analysis over an interval list, given a maximum number of bases per job, a list of available queues, and a limit on each queue. It handles going over these limits in three ways: 2010-03-31 19:53:13 +00:00
LogRegression.py Merged functionality of two python scripts into LogRegression.py, some clarity updates to covariate and regression java files. 2009-06-02 16:55:05 +00:00
LogisticRegressionByReadGroup.py Revert some debug code in RecalQual.py. Make LogisticRegression easier to Ctrl-C out of. 2009-06-05 01:53:48 +00:00
MergeBAMBatch.py Fixes for VariantEval for genotyping mode 2009-09-18 21:01:43 +00:00
MergeBAMsUtils.py Fixes for VariantEval for genotyping mode 2009-09-18 21:01:43 +00:00
MergeBamsByKey.py Trivial change 2009-06-12 19:11:28 +00:00
MergeEvalMapTabs.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
ParseDCCSequenceData.py Added ParseDCCSequenceData.py to repository and made changes that allow an analysis of quantity of sequence data by platform and project, moved table / record system to a new module called FlatFileTable.py and built that into ParseDCCSequenceData and CoverageEval.py; changed lod threshold in CoverageEvalWalker. 2009-07-08 22:04:26 +00:00
RecalQual.py Updated version of the recalibration tool 2009-06-19 17:45:47 +00:00
RefseqLibrary.py Awesome: JobDispatcher can now dispatch jobs by gene from a target .design file found in /seq/references. 2010-04-14 18:17:41 +00:00
RunPilot2Pipeline.py Fix merger command 2009-09-11 13:13:23 +00:00
SAM.py Better merge support 2009-05-18 21:18:51 +00:00
SamWalk.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
SamWalkTest.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
SimpleSAM.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
SimulateReads.py Actually writes out a good header now 2009-05-18 13:34:52 +00:00
SpawnMapperJobs.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
SpawnValidationJobs.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
StressTestGATK.py General purpose pileup code -- you can use these features to obtain detailed pileup data from reads and offsets. Useful for all pileup based walkers. Expanded support for rodSAMPileup to enable the new ValidatingPileupWalker, which takes a samtools pileup output and checks that GATK gives identical output as samtools on a per base and per qual pileup. It's going to be a very useful validation tool. 2009-04-14 22:13:10 +00:00
SyzygyCallsFileToVCF.py Minor changes (additional info calculated) 2010-01-06 16:41:01 +00:00
VCFValidationAnalysis.py A collection of python objects that are useful for VCF validation. Use 'em or don't. 2010-01-25 18:44:10 +00:00
ValidateGATK.py Actually listens to justPrint now 2009-07-15 16:52:46 +00:00
Verify1KGArchiveBAMs.py Marginally more useful output 2010-04-20 14:45:14 +00:00
WalkLociTest.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
Walker.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
aln_file.nocvs.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
aln_file.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
analyzeRunReports.py Shorter outputs, new summary mode 2010-10-06 14:34:50 +00:00
callingProgress.py simple monitor for watching pilot 1 call progress 2009-10-06 13:04:53 +00:00
collectCalls.py Adding this to subversion so it's protected 2009-12-09 21:26:17 +00:00
compSNPCalls.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
countCoverageWithSamtools.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
create_venn_evals.py Script to split concordance files into their constituent sets and calculate summary stats from a concordance file - SNPs called and number in dbSNP 2010-03-12 22:20:44 +00:00
easyRecalQuals.py Updated python files 2009-07-07 14:15:39 +00:00
expandedSummaryToVCF.py Oops. Let's make sure only to write calls that the pool supports to the auxiliary vcf files. 2009-11-04 17:14:55 +00:00
faiReader.py Better snpSelector, plus VCFmerge tool 2009-11-11 22:02:57 +00:00
farm_commands.py Now supports strings in command line for farm submission 2010-01-06 13:15:40 +00:00
farm_commands2.py misc. changes to python scripts 2010-07-09 00:13:35 +00:00
fasta.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
firehose_out_email.py updated to deal with new cleaning pipeline outputs and potentially infinity TI/TV 2010-08-24 16:01:09 +00:00
gatherIndelsToVCF.py Adding this to get around lsf/csh issues (see recent help message). Also seems like a good time to reiterate http://www.faqs.org/faqs/unix-faq/shell/csh-whynot/ 2010-09-19 02:45:16 +00:00
gatkConfigParser.py High performance LocusIterator implementation. Now with greatly reduced memory impact and 2x (and more potentially) speed ups of raw locus iteration. General performance improvements to SSG with empirical probs. You can enable high-performance locus iteration with the -LIBS arg. It's still testing but passes validating pileup. 2009-09-03 03:06:25 +00:00
generate1KGHapmapVCF.py Quick script that changes "chr#" to "#" and "chrM" to "MT" and moves mitochondria to the end of the vcf; in accordance with the 1KG reference. 2010-01-28 21:59:33 +00:00
getBamFilesFromSpreadsheet.py Moved CoverageStatistics to core. This will be (soon) renamed DepthOfCoverage; so please use CoverageStatistics 2010-03-29 13:32:00 +00:00
getLaneAndSequenceInfo.py Grabs average SNP calls, mismatch rate, aligned reads, and other important lane metrics from a SQUID export and summarizes them across multiple margins (lane numbers, flowcells, samples, libraries) 2010-04-17 03:09:05 +00:00
getTargetedGenes.py Just some code I want to freeze. If you ever need to estimate the % of bases covered by exon, given an interval list, give it to getTargetedGenes. Not the best name for this function, but I don't expect anyone to use it but me. 2010-04-01 20:21:50 +00:00
igvController.py Now can take a VCF file as input 2010-05-18 17:06:12 +00:00
indelVerboseStats.py Minor improvements to simple python code 2010-05-07 21:34:46 +00:00
lsf_post_touch.py Do this the right way 2010-09-19 04:30:48 +00:00
madPipelineUtils.py Improvements to 1KG processing pipeline 2010-07-16 15:33:47 +00:00
makeIGVCategory.py misc. useful updates to python library 2010-06-30 16:33:32 +00:00
makeIndelMask.py Removing unnecessary dependences that were causing problems for Sendu 2010-05-21 13:07:41 +00:00
makeMetricsFilesForFirehose.py Moved CoverageStatistics to core. This will be (soon) renamed DepthOfCoverage; so please use CoverageStatistics 2010-03-29 13:32:00 +00:00
memo.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
mergeVCFInfoFields.py Moved CoverageStatistics to core. This will be (soon) renamed DepthOfCoverage; so please use CoverageStatistics 2010-03-29 13:32:00 +00:00
mergeVCFs.py now supports -o option as well as verbose output mode 2010-08-29 16:00:00 +00:00
picard_utils.py misc. changes to python scripts 2010-07-09 00:13:35 +00:00
pilot2CallingPipeline.py keeping a backup 2010-01-31 15:36:25 +00:00
pushback_file.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
qltout.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
realignBamByChr.py Improvements to 1KG processing pipeline 2010-07-16 15:33:47 +00:00
sample_lister.py module for listing out samples for data processing and firehose reporting 2010-07-21 15:05:41 +00:00
samtooltest.sh Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
setFilterGenotypesToRef.py Quick python scripts for going from genotype VCFs to site-only VCFs, and one to fix BC vcf files (which had "het" genotypes at non-variant sites) 2010-07-12 19:13:32 +00:00
snpSelector.py snpSelector now supports min and max q scores. 2010-01-31 19:38:34 +00:00
splitIntervalsByContig.py Adding this to get around lsf/csh issues (see recent help message). Also seems like a good time to reiterate http://www.faqs.org/faqs/unix-faq/shell/csh-whynot/ 2010-09-19 02:45:16 +00:00
subsetDbSNPWithrsIDs.py simple tool that takes two dbSNP files and subsets the second to include only rsID SNPs present in the first. Used to make b129 against b37 by subsetting b131/b37 vs. b129/b36 2010-05-12 13:39:09 +00:00
tgtc2sam.py Move non-java code out of playground. 2009-03-23 19:31:38 +00:00
ucscRepeatMaskToIntervalList.py better farm commands, and simple utility to convert ucsc repeat masks to interval lists 2010-03-19 13:11:06 +00:00
updateTribbleToCorrectSVNVersion.py clean-up the docs a little 2010-09-29 05:02:41 +00:00
vcf2table.py now support -o output option, useful for pipelines 2010-08-06 14:57:04 +00:00
vcfGenotypeToSites.py Quick python scripts for going from genotype VCFs to site-only VCFs, and one to fix BC vcf files (which had "het" genotypes at non-variant sites) 2010-07-12 19:13:32 +00:00
vcfReader.py now support -o output option, useful for pipelines 2010-08-06 14:57:04 +00:00
vcf_b36_to_hg18.py Tired of writing vcf_hg18_to_b36 over and over again when necessary. Added a -r flag to this script that does it. 2010-09-28 14:51:57 +00:00