
10 Commits (1b1aefc38583010bc0cfab5bd90fc7abfd8fc2c3)

Author SHA1 Message Date
carneiro 260301016a cleaned up the scripts and created an interval library to facilitate future reuse.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5895 348d0f76-0448-11de-a6fe-93d51630548a
2011-05-27 19:35:36 +00:00
carneiro 0048f1f6d3 Lots of interval_list file utility scripts
1. findGenes: parses the Genetic Association Database (from NIH) into an annotated 'genes of interest' interval_list file.
2. sortGenesByCoverage: combines the interval_list of the genes of interest with the report from GATK's DepthOfCoverage, generating an annotated interval_list with the total and average coverage of each gene.
3. hasTheseTargets: give it a list of targets (for example, exon targets) and any interval_list (for example, genes of interest) and it will generate an annotated interval_list of all the exons that are contained in the list of genes.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5894 348d0f76-0448-11de-a6fe-93d51630548a
2011-05-27 18:31:07 +00:00
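
Note on hasTheseTargets: the containment check described in the commit above could look roughly like the sketch below. This is not the original Lua script. The function name `targets_in_genes`, the tuple layouts, and the inclusive coordinates are all assumptions made for illustration, and the nested loop is quadratic, so it only suits small inputs.

```python
def targets_in_genes(targets, genes):
    """Annotate each target that lies fully inside a gene.

    targets: list of (chrom, start, end) tuples (hypothetical layout).
    genes:   list of (chrom, start, end, name) tuples (hypothetical layout).
    Coordinates are assumed to be inclusive on both ends.
    """
    annotated = []
    for t_chrom, t_start, t_end in targets:
        for g_chrom, g_start, g_end, name in genes:
            # A target is "contained" when it is on the same chromosome
            # and both of its ends fall within the gene's span.
            if t_chrom == g_chrom and g_start <= t_start and t_end <= g_end:
                annotated.append((t_chrom, t_start, t_end, name))
    return annotated
```

A target that only partially overlaps a gene is dropped in this sketch; whether the original script does the same is not stated in the commit message.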
carneiro 133712faec Have a list of BAM files, but Picard updated their versions from v1 to v17? This script will update all your v* numbers for you.
PS: don't hate Lua. :-)

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5857 348d0f76-0448-11de-a6fe-93d51630548a
2011-05-23 21:46:14 +00:00
carneiro 7598f5f6a7 forgot to remove a debug line.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5246 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-15 16:25:48 +00:00
carneiro e45b699ac0 standardizing the name of the scripts and fixing some bugs with the remapping.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5245 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-15 16:19:58 +00:00
carneiro 5f10fffa47 merge intervals now prints a sorted list at the end.
added the ccs datasets to the pbCalling pipeline.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5233 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-11 20:57:59 +00:00
carneiro 50c2fa3c3a this -1 made ALL the difference in the world. Minor bug fix.
Regular updates to the pbCalling pipeline.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5232 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-11 19:25:09 +00:00
carneiro e7d38247bb chunkIntervals.lua creates 1Mb interval chunks out of any .intervals file. Useful for methods development pipeline datasets.
remapAmplicons.lua takes a SAM file with reads aligned to amplicon references, a reference genome, and an amplicon reference mapping table, and rewrites the SAM file with mappings to the reference sequence.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5230 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-11 18:21:31 +00:00
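
Note on chunkIntervals.lua: the chunking idea in the commit above can be sketched as follows. This is a Python illustration, not the original Lua code. It assumes "1Mb" means 1,000,000 bases and that coordinates are inclusive; `chunk_interval` and `CHUNK_SIZE` are hypothetical names.

```python
CHUNK_SIZE = 1_000_000  # assumed meaning of "1Mb"


def chunk_interval(chrom, start, end, chunk_size=CHUNK_SIZE):
    """Split one inclusive interval into consecutive chunks of at most chunk_size bases."""
    chunks = []
    pos = start
    while pos <= end:
        # The last chunk is truncated at the interval end.
        chunk_end = min(pos + chunk_size - 1, end)
        chunks.append((chrom, pos, chunk_end))
        pos = chunk_end + 1
    return chunks
```

Intervals shorter than the chunk size come back unchanged as a single chunk.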
carneiro aab0ec209b small bug fix on chromosome names.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5168 348d0f76-0448-11de-a6fe-93d51630548a
2011-02-01 20:55:19 +00:00
carneiro a2e2a6a9c3 A script to convert an intervals file into a sorted non-overlapping intervals file. Super fast implementation using multiple hashes, should do the job instantly no matter how big the intervals are.
ps: this utility is similar to -im on GATK, but for queue scripts we need the intervals file fixed before we can run Unified Genotyper because of scatter-gather.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5117 348d0f76-0448-11de-a6fe-93d51630548a
2011-01-28 21:46:41 +00:00
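
Note on the interval-merging script: the commit above describes a hash-based Lua implementation, which is not reproduced here. The sketch below uses the simpler sort-and-sweep technique to reach the same kind of result, a sorted list with overlaps merged. The name `merge_intervals`, the tuple layout, inclusive coordinates, and lexicographic chromosome ordering are assumptions; real pipelines usually order chromosomes by the reference sequence dictionary.

```python
def merge_intervals(intervals):
    """Return sorted, non-overlapping intervals.

    intervals: iterable of (chrom, start, end) tuples with inclusive coordinates
    (hypothetical layout). Chromosomes are sorted by name, which is an assumption.
    """
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps the previous interval on the same chromosome: extend it.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged
```

In this sketch, intervals that merely abut (for example 1-10 and 11-20) are kept separate; only true overlaps are merged.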