243 Commits (f8e1ea7b64b71c81c5ab2cf86e4311c4a8cceeab)

Author SHA1 Message Date
hanna 596773e6c6 Cleanup.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@931 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-07 20:25:08 +00:00
hanna e6aa058ec4 Tighten up error handling a bit.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@920 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-06 03:40:50 +00:00
depristo 819862e04e major restructuring of generalized variant analysis framework. Now trivially easy to add additional analyses. Easy partitioning of all analyses by features, such as singleton status. Now has transition/transversion bias, counting, dbSNP coverage, HWE violation, selection of variants by presence/absence in dbs. Also restructured the ROD system to make it easier to add tracks. Also, added the interval track -- if you provide an interval list, then the system automatically makes this available to you as a bound rod -- you can always find out where you are in the interval at every site. Python scripts improved to handle more merging, etc, into population snps.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@918 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-05 23:34:37 +00:00
hanna 050d55cdb0 Basic graph support for testing.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@916 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-05 21:04:01 +00:00
hanna 2035d7dfd3 Revert some debug code in RecalQual.py. Make LogisticRegression easier to Ctrl-C out of.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@904 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-05 01:53:48 +00:00
hanna 61ae00c7bf Lots of cleanup.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@903 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-05 01:26:10 +00:00
hanna 9689bb3331 Very early draft of script integrating the covariate counting / logistic regression. Deleted some unused code and spurious debug info.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@902 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-04 22:52:11 +00:00
hanna 40bc4ae39a The building blocks for segmenting covariate counting data by read group.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@899 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-04 19:55:24 +00:00
depristo 67112c79a1 More robust individual genotypes to population script
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@893 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-04 00:12:31 +00:00
andrewk 7755476d36 Updated converter to reflect change in contig ordering in Geli files
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@888 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-03 10:05:28 +00:00
andrewk 080af519cb Added R script and uncommented a line in recal_qual.py
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@886 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-03 03:15:45 +00:00
andrewk b2eb724456 First commit of recalibration master control script for recalibrating quality scores.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@885 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-03 02:17:10 +00:00
depristo 3998085e4b more and better python scripts for dealing with calls
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@881 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-02 20:37:19 +00:00
andrewk 587d07da00 Merged functionality of two python scripts into LogRegression.py, some clarity updates to covariate and regression java files.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@876 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-02 16:55:05 +00:00
depristo ae2eddec2d Improving, yet again, the merging of bam files
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@874 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-02 13:31:12 +00:00
depristo 543c68cdd8 First version of individual geli files to population SNPS
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@865 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-31 15:29:10 +00:00
depristo 6adef28b97 Now supports automatic merging by population
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@864 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-31 15:28:44 +00:00
depristo e0803eabd9 enabled underlying filtering of zero mapping quality reads, vastly improves system performance
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@853 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-29 14:51:08 +00:00
depristo c72601322a now returns the farm id when submitting a job!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@825 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-26 22:23:24 +00:00
depristo 04e51c8d1d Better version of MergeBAMBatch -- more options for creating the file
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@787 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-21 22:26:19 +00:00
depristo 3b1f84e15b Slightly improved interface to merging utility for multiple bam files
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@757 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-20 12:54:41 +00:00
depristo e9f85ef920 Better merge support
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@748 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-18 21:18:51 +00:00
depristo 9dec783a82 Actually writes out a good header now
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@744 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-18 13:34:52 +00:00
depristo 8e9e2f4502 Revised ROD system. Split the system into a basic type and an interface. Enabled more control over rod accessing, including an initialize() function to fetch headers and other options from the file. Added general tabular rod, which has named columns and supports a map<String,String> interface. Comes with shiny new JUnit system for RODs. Also, added simple python script for accessing picard data.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@716 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-14 21:06:28 +00:00
hanna de1c282e62 Reference-ordered data relies on bugs in the old command-line argument system to work. Update the ROD system from -B track1 type1 file1 track2 type2 file2 to -B track1,type1,file1 -B track2,type2,file2.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@640 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 15:28:19 +00:00
depristo 30218ee31a Better validation scripts and data
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@562 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-29 17:40:07 +00:00
andrewk 58b2578c44 Several changes to CovariateCounter walker to print more tables (called vs. observed Q scores), bug fixes to LogisticRecalibrationWalker and LogisticRegressor, and print string functionality added to Pair.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@550 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-28 00:37:48 +00:00
depristo 3739682bef Actually has working version of the python script to merge multiple bam files
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@530 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-24 19:15:55 +00:00
depristo 40a2b3eeb3 Basic logistic regression support for calibrating qualities; mostly for Andrew to experiment with
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@529 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-24 19:09:50 +00:00
andrewk 38c2f73457 LogRegression.py script that converts parameter files for each dinucleotide regression into one file to be read in by correction script.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@528 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-24 18:31:26 +00:00
depristo b8a6f6e830 Support for indexBAM command
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@496 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-22 19:39:07 +00:00
depristo e842b543c9 Better validation scripts
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@458 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 23:18:00 +00:00
depristo f47f640df6 Better debugging output and testing
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@455 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 21:54:56 +00:00
depristo 2eabcfedb7 Fixed potential bug with next() operation returning empty contexts when a read contains a large deletion. We can now use the look ahead safely...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@439 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 21:41:30 +00:00
depristo 72a3d84ed2 General purpose pileup code -- you can use these features to obtain detailed pileup data from reads and offsets. Useful for all pileup based walkers. Expanded support for rodSAMPileup to enable the new ValidatingPileupWalker, which takes a samtools pileup output and checks that GATK gives the same output as samtools on a per base and per qual pileup. It's going to be a very useful validation tool.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@418 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 22:13:10 +00:00
depristo 49b2622e3d Helper utility for merging BAM files
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@345 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 20:10:41 +00:00
depristo 9d35f0ca67 The system now requires a dictionary file for a fasta file, or it throws an error. You can't just operate without a sequence dictionary any longer. We will transition to a GenomeLoc system that assumes a dictionary is available.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@320 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:21:57 +00:00
andrewk 9dee9ab51c Added Hapmap data track (using rodGFF class for GFF file format) to toolkit as a command line option, Hapmap metrics to AlleleFrequencyMetricsWalker, and a python Geli2GFF file converter.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@163 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 03:58:03 +00:00
hanna 2ee2623926 Move non-java code out of playground.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@154 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-23 19:31:38 +00:00
hanna 5031875507 Move to new directory organization.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@35 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-11 20:58:01 +00:00
depristo bd1fadd9fe Validating walker for lots of bam files
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@10 348d0f76-0448-11de-a6fe-93d51630548a
2009-02-28 17:05:08 +00:00
depristo e892c3fd98 Shouldn't be in the tree
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@9 348d0f76-0448-11de-a6fe-93d51630548a
2009-02-28 15:31:17 +00:00
depristo 17aabb38f9 Basic reorganization of tree
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@8 348d0f76-0448-11de-a6fe-93d51630548a
2009-02-28 15:28:56 +00:00