chartl
|
dc802aa26f
|
Moved CoverageStatistics to core. It will soon be renamed DepthOfCoverage, so please use CoverageStatistics.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3090 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-03-29 13:32:00 +00:00 |
rpoplin
|
06a212e612
|
Adding VariantConcordanceROCCurveWalker to create ROC curves comparing concordance between optimized call sets and validation truth sets in VCF format in order to evaluate performance of variant optimizer independently of achieving a particular novel ti/tv ratio. Added option to ignore only the specified filters in the input call sets via --ignore_filter <String>. Added option to provide a prior estimate of error for known snps via --known_prior <qual>. The het and hom calls are clustered independently. Infrastructure in place to use titv of known snps to inform p(true) of novel snps. Tweaked protection against overfitting based on suggestions from several people. Minor edits to AnalyzeAnnotations.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3071 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-03-24 19:43:10 +00:00 |
rpoplin
|
c78fc23ec5
|
Minor updates to output of variant optimizer.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3031 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-03-18 12:46:47 +00:00 |
rpoplin
|
58a31bab6a
|
Variant optimizer now outputs VCF files via ApplyVariantClustersWalker. Documentation to be added to the wiki. It is ready to be used by other people but only with great caution.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3028 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-03-17 20:41:42 +00:00 |
rpoplin
|
933823c8bc
|
Removed the StingException when mkdir fails for Sendu in AnalyzeCovariates. Incremental updates to VariantOptimizer.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3013 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-03-16 19:45:02 +00:00 |
chartl
|
ee68e38e02
|
Eliminate the shell items, as FH will be calling this with /broad/tools/apps/R-2.72/bin/Rscript
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2968 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-03-09 20:15:21 +00:00 |
chartl
|
aa7191353a
|
PlotDepthOfCoverage now produces a set of useful QC plots. This is currently a first draft, and it is unclear how the visualization will scale with increasing sample size and/or depth.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2962 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-03-09 16:42:35 +00:00 |
chartl
|
81ffb8243d
|
Waypoint commit of plotting R script for Depth Of Coverage/Coverage Statistics
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2958 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-03-08 21:42:51 +00:00 |
kshakir
|
36129e01e4
|
Using bitmap() instead of png() since the former doesn't rely on X11.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2873 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-02-23 05:31:51 +00:00 |
kshakir
|
3738b76320
|
Added a playground concordance analyzer for summarizing VariantEval across a group.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2867 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-02-22 20:28:52 +00:00 |
chartl
|
f02e94ab6f
|
Eliminate the rescale factor -- heatmap automatically normalizes the data
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2845 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-02-16 16:34:33 +00:00 |
chartl
|
37fa1bf0cc
|
Added heatmap function
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2843 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-02-16 15:12:54 +00:00 |
chartl
|
951b7a2433
|
First of what will be an increasingly useful set of tools, compiled into one command-line runnable library -- the goal is to have one plotting library that's callable because of limitations on the number of files you can package with a GenePattern module.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2841 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-02-15 16:51:47 +00:00 |
rpoplin
|
233a652161
|
Making the dotted quartile lines more clear.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2772 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-02-03 22:23:09 +00:00 |
rpoplin
|
64fc76e4bf
|
Added an option to AnalyzeCovariates to set the max value of the histograms to make them easier to directly compare.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2753 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-01-31 23:13:57 +00:00 |
rpoplin
|
16da5011c0
|
Added a new option for indicating the mean number of variants on the AnalyzeAnnotations plots. This way one can say, for example, filtering at this point will keep 75 percent of all the variants.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2744 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-01-29 21:58:31 +00:00 |
rpoplin
|
c6cc844e55
|
Added -name argument to AnalyzeAnnotations that allows one to specify the name of the annotation to be used on the plots. Instead of seeing AB and DP, one can add -name AB,AlleleBalance -name DP,Depth
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2742 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-01-29 20:48:53 +00:00 |
rpoplin
|
4f29a1d4f6
|
AnalyzeAnnotations now plots true positive rate instead of percentage of variants found in the truth set. Committing GCContentCovariate to help people experiment with correcting the pilot3/Kristian base calling error mode in slx.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2740 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-01-29 20:01:56 +00:00 |
rpoplin
|
79c4cc1db7
|
AnalyzeAnnotations now breaks out titv by calls in hapmap and also plots true positive rates. Any ROD passed in whose name starts with 'truth' is considered to be the truth set.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2726 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-01-28 21:41:23 +00:00 |
rpoplin
|
b8ae083d1b
|
AnalyzeAnnotations creates a plot of dbsnp rate as a function of the annotations.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2711 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-01-27 21:08:33 +00:00 |
rpoplin
|
fc4285f9fd
|
AnalyzeAnnotations seems to be popular so I've rewritten the guts to be easier to extend and maintain.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2707 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-01-27 19:30:31 +00:00 |
rpoplin
|
4bcdab580c
|
--output_dir has been changed to --output_prefix to give the user more control over the names of the resulting mass of files in AnalyzeAnnotations. The fontsize of the axes is increased. Cumulative filtering plots are removed since the binned filtering plots are much more useful.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2700 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-01-27 04:50:54 +00:00 |
rpoplin
|
24d4082925
|
AnalyzeAnnotations can now process only variants that are found in samples that match the -sampleName argument. The x-axis of plots no longer uses annoying scientific notation.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2684 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-01-25 20:52:11 +00:00 |
rpoplin
|
2b51cf18f0
|
AnalyzeAnnotations now outputs plots with log x-axis in addition to standard x-axis so things like DP and MQ0 are easier to see. AnalyzeAnnotations now skips over all annotations that aren't floating point values. Recalibrator now warns users if PL tags are missing and it is therefore reverting to Illumina.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2681 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-01-25 19:39:18 +00:00 |
rpoplin
|
a11503819a
|
AnalyzeAnnotations now breaks out its TiTv plots into novel SNPs, dbSNP sites, and combined.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2659 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-01-22 19:00:23 +00:00 |
rpoplin
|
d9df72e1b5
|
AnalyzeAnnotations now bins variants for each annotation and outputs plots of TiTv ratio as a function of the annotation's value.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2654 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-01-21 21:15:11 +00:00 |
rpoplin
|
ba19afd529
|
Draft version of AnalyzeAnnotations which creates plots of cumulative TiTv ratio versus filter value for each annotation in the input VCF rod. Minor cleanup of recalibration walkers.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2623 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-01-18 20:47:10 +00:00 |
rpoplin
|
7f97041875
|
Update to AnalyzeCovariates to make the histogram of PairedReadOrder look a little nicer
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2575 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-01-13 20:26:31 +00:00 |
rpoplin
|
cea544871d
|
Fixed an issue with recalibrating original quality scores above Q40. There is a new option -maxQ which sets the maximum quality score possible for when a RecalDatum tries to compute its quality score from the mismatch rate. The same option was added to AnalyzeCovariates to help with plotting q scores above Q40. Added an integration test which makes use of this new -maxQ option.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2534 348d0f76-0448-11de-a6fe-93d51630548a
|
2010-01-07 13:50:30 +00:00 |
rpoplin
|
562db45fa5
|
Sites that were marked NO_DINUC no longer get dinuc-corrected but are still recalibrated using the other available covariates. Solid cycle is now the same as Illumina cycle pending an analysis that looks at the effect of PrimerRoundCovariate. Solid color space methods cleaned up to reduce number of calls to read.getAttribute(). Polished NHashMap sort method in preparation for move to core/utils. Added additional plots in AnalyzeCovariates to look at reported quality as a function of the covariate.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2451 348d0f76-0448-11de-a6fe-93d51630548a
|
2009-12-28 20:19:37 +00:00 |
aaron
|
1ae333a1c1
|
R script for graphing depth of coverage by sample name, and generating a loess curve for each sample's data.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2317 348d0f76-0448-11de-a6fe-93d51630548a
|
2009-12-10 21:58:01 +00:00 |
rpoplin
|
088363ce42
|
Added entropy calculation to histogram of quality scores
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2316 348d0f76-0448-11de-a6fe-93d51630548a
|
2009-12-10 21:57:35 +00:00 |
rpoplin
|
12ec154f01
|
Make the AnalyzeCovariate plots look a little nicer when there are a small number of data points
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2298 348d0f76-0448-11de-a6fe-93d51630548a
|
2009-12-09 21:22:40 +00:00 |
rpoplin
|
855face681
|
Histogram of covariate values now goes from 0 to max value which makes it look nicer in most cases.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2259 348d0f76-0448-11de-a6fe-93d51630548a
|
2009-12-04 14:44:03 +00:00 |
rpoplin
|
985daec76e
|
Fixed problem with integer overflow in R scripts.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2258 348d0f76-0448-11de-a6fe-93d51630548a
|
2009-12-04 14:24:49 +00:00 |
rpoplin
|
2508deca37
|
Prevented data points with fewer than N observations from going off the edge of the plots
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2257 348d0f76-0448-11de-a6fe-93d51630548a
|
2009-12-04 13:55:43 +00:00 |
rpoplin
|
46f3d3e39b
|
Added comments to AnalyzeCovariates and R scripts. R script prevents residuals from going off the edge of the plot. Added skeleton code to the recalibration walkers showing how we plan to handle SOLID reference inserting behavior.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2233 348d0f76-0448-11de-a6fe-93d51630548a
|
2009-12-02 23:15:52 +00:00 |
rpoplin
|
9c597309d3
|
Changed the sizes of the dots and bars on the plots generated by the R script which is called from AnalyzeCovariates.java
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2223 348d0f76-0448-11de-a6fe-93d51630548a
|
2009-12-02 19:47:48 +00:00 |
rpoplin
|
67179e2412
|
Initial checkin of AnalyzeCovariates.java which replaces analyzeRecalQuals_1KG.py and is updated to use the new Covariates system. It creates similar plots of residual error for each covariate that was used in the calculation. There is also an option to filter out base qualities below a given threshold.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2215 348d0f76-0448-11de-a6fe-93d51630548a
|
2009-12-02 16:47:35 +00:00 |
andrewk
|
575da25fde
|
R script for selecting a variety of baits (using %GC content and normalized coverage) for Nanostring assessment from those used in the Agilent whole exome hybrid selection design.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1683 348d0f76-0448-11de-a6fe-93d51630548a
|
2009-09-22 18:10:14 +00:00 |
hanna
|
8f9dd03e87
|
Get rid of unnecessary files for generating recalibration data.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1626 348d0f76-0448-11de-a6fe-93d51630548a
|
2009-09-15 15:53:06 +00:00 |
depristo
|
f5b00c20d0
|
Updated python files
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1182 348d0f76-0448-11de-a6fe-93d51630548a
|
2009-07-07 14:15:39 +00:00 |
depristo
|
5289230eb8
|
Version 0.2.1 (released) of the TableRecalibrator
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1108 348d0f76-0448-11de-a6fe-93d51630548a
|
2009-06-25 22:50:55 +00:00 |
depristo
|
9e26550b0d
|
Approach v2. Added Python analysis script, so Java no longer has to be used to analyze quality score data. About to refactor out lots of unneeded code.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1063 348d0f76-0448-11de-a6fe-93d51630548a
|
2009-06-20 16:00:23 +00:00 |
andrewk
|
080af519cb
|
Added R script and uncommented a line in recal_qual.py
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@886 348d0f76-0448-11de-a6fe-93d51630548a
|
2009-06-03 03:15:45 +00:00 |