2852 Commits (36129e01e4804925df6527cc197d72dddf0d303d)

Author SHA1 Message Date
kshakir 36129e01e4 Using bitmap() instead of png() since the former doesn't rely on X11.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2873 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-23 05:31:51 +00:00
hanna a0e8de40cf Bug fix: at one locus in the dataset, two reads were dropped.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2872 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-22 23:54:52 +00:00
aaron 5546aa4416 adding code to deal with the off-spec situation where our minimum likelihood is above the GLF max of 255.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2871 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-22 22:27:39 +00:00
hanna 88d0677379 Misc correctness enhancements: develop the bin selector into a recursive algorithm and return a shard when reads are missing. Also improve the performance of the read filter that clips reads not actually present in the shard.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2870 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-22 22:19:06 +00:00
kshakir 5f9c3f3884 Outputting annotated VCF to the current directory instead of attempting to write in the directory next to the original VCF.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2869 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-22 21:31:24 +00:00
ebanks 8b555ff17c Killed the old cleaner code. Bye bye.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2868 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-22 20:49:58 +00:00
kshakir 3738b76320 Added a playground concordance analyzer for summarizing VariantEval across a group.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2867 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-22 20:28:52 +00:00
ebanks a640bd2d79 ignore uninteresting extended events
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2866 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-22 19:55:46 +00:00
rpoplin 32e5dceef9 Moving comments.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2865 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-22 19:27:31 +00:00
alecw b236714c8a Optimization - Added method to Covariates: void getValues( SAMRecord read, Comparable[] comparable ) which takes an array of size (at least) read.getReadLength() and fills it with covariate values for all positions in the given read. Made CovariateCounterWalker and TableRecalibrationWalker use this method instead of calling getValue(..) for each covariate and each offset.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2863 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-22 17:35:25 +00:00
ebanks 32d14d988e Overload parseIntervalRegion() to allow for the interval merging rule to be passed in (so one is not required to use the value from the GATK arg collection).
Now the IndelRealigner can use this functionality without being forced to merge abutting intervals (which was actually causing a problem with the cleaning).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2862 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-22 04:13:54 +00:00
hanna cc09f48cd8 Correctness fix: index can concat chunks around shard edges, and my code didn't account for that.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2861 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-19 21:44:33 +00:00
chartl 0e05a3acb0 Adding depth of coverage features to firehose summary tools
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2860 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-19 19:47:16 +00:00
hanna 71f18e941f Significant performance improvements made by subtracting out the contents of the prior highest-level bin.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2859 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-19 16:46:16 +00:00
rpoplin 3e0e7aad2d Removing debug statement. oops.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2858 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-19 15:26:22 +00:00
rpoplin 7f19ff1fa1 Added a new option in the recalibrator to be used by people who have SOLiD data in which only a few of the reads have no-calls in the color space. These reads will be skipped over and left in the bam file untouched.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2857 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-19 15:25:23 +00:00
aaron b1a4e6d840 removing non-ascii characters from my Copyright and from VariantEval2Walker
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2856 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-18 18:54:36 +00:00
aaron 33ae256186 a start to some of the infrastructure for Tribble, including dynamic detection of new RMD; not nearly wired in or complete yet.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2855 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-18 18:43:52 +00:00
ebanks bbbad79f8c Forgot to remove debugging code
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2854 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-18 18:12:58 +00:00
aaron cc3c18d3e1 a small dbsnp file for Tribble testing
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2853 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-18 18:12:55 +00:00
ebanks 7669eaaeb3 Optimizations to the cleaner algorithm; reduce total runtime by almost 20%.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2852 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-18 18:10:56 +00:00
aaron 8fd3351971 adding a stripped down Tribble library for the start of integration
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2851 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-17 21:29:25 +00:00
ebanks 79ab7affda - Change sortOnDisk option to sortInMemory
- Fix horrible cleaner bug
- Trivial optimizations to cleaner code - more significant ones coming soon.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2850 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-17 20:52:57 +00:00
ebanks 2520889cb3 Check for bad intervals and don't emit them
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2849 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-16 21:42:36 +00:00
kiran f859e14cc7 Allow no-call alleles to propagate through to the MAF file
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2848 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-16 20:46:16 +00:00
kiran 217deb9809 Changed the INFO field delimiter from a comma to a semicolon
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2847 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-16 20:44:57 +00:00
aaron 653f70efa2 added methods to validate an interval before you try to make a GenomeLoc: boolean validGenomeLoc().
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2846 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-16 20:35:35 +00:00
chartl f02e94ab6f Eliminate the rescale factor -- heatmap automatically normalizes the data
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2845 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-16 16:34:33 +00:00
chartl 0e4b5ad9c6 Check to ensure sample status is "Complete" before writing out the bam file
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2844 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-16 15:36:42 +00:00
chartl 37fa1bf0cc Added heatmap function
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2843 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-16 15:12:54 +00:00
chartl 01af3d0663 Update an error message :)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2842 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-15 23:24:06 +00:00
chartl 951b7a2433 First of what will be an increasingly useful set of tools, compiled into one command-line runnable library -- the goal is to have one plotting library that's callable because of limitations on the number of files you can package with a GenePattern module.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2841 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-15 16:51:47 +00:00
jmaguire 81313d9452 added class VCFMerge
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2840 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-15 14:41:50 +00:00
jmaguire 0ef50bcae7 - update to match recent changes in the VCF parser
- compute Het Error Rate in VCFConcordance
- changes to the frequency-specific optimizer
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2839 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-15 14:27:01 +00:00
depristo 8072e9aed5 Should never commit without running integration tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2838 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-12 23:42:37 +00:00
depristo a1a3d5fcb0 Support for reading in a table of rsIDs -> dbSNP builds to back-generate a dbSNP build X from a single file. Very useful indeed. dbSNP -> VC now captures the rsID in the context.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2837 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-12 22:40:55 +00:00
kcibul 28f24ca2ae made some private members/methods protected to allow for subclassing
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2836 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-12 21:16:00 +00:00
hanna 232d884578 Got back most of the performance lost when I fixed the dropped reads problem.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2835 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-12 19:59:56 +00:00
chartl 04a2784bf7 Initial commit of tools under development for data QC through firehose.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2834 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-12 19:13:24 +00:00
hanna 77af5822d4 Correcting my incomplete understanding of how the BAM file index actually works.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2833 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-12 16:15:19 +00:00
depristo 5f74fffa02 Massive improvements to VE2 infrastructure. Now supports VCF writing of interesting sites; multiple comp and eval tracks. Eric will be taking it over and expanding functionality over the next few weeks until it's ready to replace VE1
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2832 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-12 15:26:52 +00:00
depristo 197dd540b5 added root GATKData variable
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2831 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-12 15:25:34 +00:00
ebanks c6f6948f9d Haiku:
Eric is a fool.
Matt found his really dumb bug.
Eric is humbled.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2830 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-12 04:51:56 +00:00
kiran d4e4120ca1 Some useful changes that I've had lying around for a while - deletes files from failed runs, automatically adds a memory limit to java commands where one isn't specified, and touches files on the local machine after command completion to get around the problem with the times not being perfectly synchronized across LSF nodes.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2829 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-12 00:18:34 +00:00
rpoplin ecebf0bc62 Bug fix for null pointer exception in AnalyzeAnnotations if -name argument isn't specified
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2828 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-11 18:39:26 +00:00
mmelgar ad608d0e9d Cleaned up documentation on SecondaryBaseTransitionTableWalker and added Read Group and Allele Balance to the info.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2827 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-11 17:20:35 +00:00
hanna 34e566c90d Fixed bug where new sharding system wasn't grabbing the reads that start at the end of a bin. Caused by what I currently believe to be a bug in Picard -- will verify with Alec.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2826 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-11 17:00:04 +00:00
ebanks 96fee7cf7a Disabling input of known indels for use as alternate consensuses. When we get rods in a read traversal, it will be trivial to hook it into the cleaner (the code is already there).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2825 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-11 15:52:21 +00:00
chartl e491b42951 Dumb little script that grabs Picard metrics (alignment, hybrid selection, insert size) from picard_aggregation given the path to the bam file, zips them up, and spits them out, for use with Firehose.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2824 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-11 14:09:30 +00:00
ebanks a4a2c9b172 Deal with bad input; also N-way out isn't default.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2823 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-11 03:44:56 +00:00