Commit Graph

3280 Commits (ad11201235e851d56595b609d2f3c24ba7bd3aa4)

Author SHA1 Message Date
aaron ad11201235 adding more ROD pile-up tests
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3301 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-05 16:01:11 +00:00
asivache 0338345bee Fixing the issue with reads having insertion immediately followed by a S/H cigar element causing out of window error.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3300 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-05 15:42:27 +00:00
ebanks 64640d6b17 Complete the switch statement to deal with all possible cigar operators for Kris.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3299 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-05 13:41:05 +00:00
aaron f75e54e3f7 fixes for new package names in tribble 74
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3298 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-05 05:47:04 +00:00
aaron 97dd04cbf0 updating Tribble ahead of the big VCF commit
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3297 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-05 05:17:54 +00:00
chartl 617542853f Walker that can be used with refGene and a TCGA bed file to annotate intervals in an interval list with the genes and exons they overlap.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3296 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-05 02:55:01 +00:00
chartl 354262eabe New convenience methods to rodRefSeq for dealing with intervals that may be a superset of multiple exons. Needed for next commit.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3295 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-05 02:54:18 +00:00
chartl d5b675b3e6 Added - Q&D script to gather verbose bed files to a VCF.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3294 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-05 02:49:16 +00:00
ebanks 03bea70f3a Fixed edge case bug in cleaner: when no -L argument is used and a target interval abuts the end of the reference genome, we'll NullPointer at the first unmapped read.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3293 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-04 16:49:21 +00:00
kiran 510b3efcc2 Fixed an issue where asking for the alternate alleles at hom-ref sites would result in an array out-of-bounds exception.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3292 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-03 18:46:33 +00:00
weisburd a462b5e1e7 Changed a default path
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3291 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-03 17:07:21 +00:00
sjia 94b51de401 HLA caller updated to examine class II loci, updated pointers to dictionary, allele frequencies.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3290 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-03 14:54:52 +00:00
rpoplin 97fdd92e7b Clean up the code to have a unified approach to calculating p(true) for both with and without ti/tv models
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3289 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-03 13:30:20 +00:00
aaron f497213933 DbSNP moved over to tribble
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3288 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-03 06:02:35 +00:00
aaron 447081583a rev tribble with updated version
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3287 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-03 04:07:28 +00:00
rpoplin 9d01670f62 Major update to the Variant Optimizer. It now performs clustering for both the titv and titv-less models simultaneously, outputting the cluster files at every iteration. It makes use of the Jama matrix library to do full inverse and determinant calculation for the covariance matrix where before it was using only approximations.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3286 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-02 19:21:23 +00:00
weisburd a318b1871d Removed unused column
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3285 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 21:29:34 +00:00
ebanks 9dff578706 Added PG tag to bam header to let people know it's been cleaned.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3284 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 17:30:30 +00:00
weisburd 28f746b76a Added option to generate UCSC or NCBI sequence
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3283 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 17:26:00 +00:00
ebanks 0e10359a5e Okay, finished up the ability to cap a base's qual by its read's mapping quality.
This is experimental - I have not tested its performance on SNP calling, or even played around with it.  If you want to test it out, go nuts.  But don't come running to me if your results are not good.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3282 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 16:58:30 +00:00
ebanks 850f36aa61 Changes to the Unified Genotyper's arguments:
1. User can specify 4 confidence thresholds: for calling vs. emitting and at standard vs. 'trigger' sites.
2. User can cap the base quality by the read's mapping quality (not done yet).
3. Default confidence threshold is now Q30.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3281 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 16:44:24 +00:00
weisburd 8b2ce128b5 Optimized the join(..) method.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3280 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 15:55:07 +00:00
weisburd c214056d88 Script for concatenating results of GenerateTranscriptToInfo.py into one big file
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3279 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 15:47:08 +00:00
hanna 8bb15ef812 Checking in the reference implementation of the downsampler for back comparison.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3278 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 15:41:13 +00:00
weisburd 0069cb426d Script for spawning LSF jobs that run the TranscriptToInfo.java walker on each of the 50 contigs.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3277 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 15:27:52 +00:00
weisburd ba7fe7c4e1 Renamed
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3276 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 15:25:07 +00:00
weisburd 4937295a0b Renamed
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3275 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 15:24:56 +00:00
ebanks 1714c322c2 Reorg of UG args; checking in first before upcoming changes that will break integration tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3274 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 14:48:46 +00:00
weisburd ba78d146ec Finished implementing
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3273 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 14:14:31 +00:00
weisburd 5d5c7f9d34 Changed short code of stop codon to 'stop'
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3272 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 13:55:52 +00:00
aaron cbed0b1ade Adding GeliText tribble track as the first enabled Tribble track. This mean 'Variants' is no longer valid for a ROD type, use GeliText instead. I've updated all the references in the codebase.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3271 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-29 22:50:17 +00:00
aaron 7fbfd34315 adding the GELI ROD validation
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3270 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-29 21:43:00 +00:00
chartl 82818a417b Allow header fields to come in any order...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3269 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-29 18:33:10 +00:00
hanna 4617abf1ff Fix bug in the interval sharder in cases where contigs specified in intervals are not present in any supplied BAM file.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3268 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-28 20:42:04 +00:00
aaron b648e89096 updating Tribble with a bunch of bug and performance fixes found while performance testing GeliText in the GATK
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3267 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-28 18:45:10 +00:00
chartl e2ff4167af Added "#Family ID" as a possible header value for PlinkRod ... since that's in the new sequenom headers for pilot 3 validation
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3266 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-28 18:38:33 +00:00
depristo 5dce16a8f1 Better genotype concordance module. Code refactoring for clarity (please see below/after for educational purposes). Now reports variant sensitivity, concordance, and genotype error rate by default. Also aggregates this data across all samples, so you get a per sample and overall stats for each of these in the allSamples row.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3265 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-28 13:10:11 +00:00
aaron 64c5f287c5 fixes for edge-cases when using reflections to find classes outside of the main jar. Will push as a patch to reflections
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3264 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-27 17:46:46 +00:00
ebanks ca649d13aa Adding the post-processing indel filter to GATK
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3263 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-27 14:43:39 +00:00
aaron c647153b10 Adding Jama for Ryan.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3262 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-27 14:30:36 +00:00
aaron f6468f9143 a fix for a bug we've worked around in the reflections package: previously it didn't find classes that weren't in the main jar. Fixed in this version.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3261 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-27 04:49:49 +00:00
depristo bf3dbd8401 some useful routines for working with project processing. madPipeline contains a bunch of useful routines for building pipelines that I finally put into one file. Let's just say that I'm really looking forward to the new pipeline system...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3260 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-26 12:34:04 +00:00
ebanks df31eeff9f minor change
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3259 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-26 06:05:29 +00:00
aaron 68bdac254b a utility walker for validating changes made to the underlying ROD system in the transistion to Tribble.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3258 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-26 05:21:24 +00:00
ebanks d9bf441391 Have UG emit calls at sites from one or more 'trigger' tracks when provided
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3257 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-26 05:04:43 +00:00
ebanks 8f2bfac7a6 Bug fix for NullPointerException
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3256 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-26 05:02:09 +00:00
ebanks f5a3b128c8 Fixing bug that's not caught by integration tests:
If the first eval seen has one or more no-calls, then that's the 2N chromosome count that gets set as the max for the metrics.  Instead, just check that any eval's no-call count is 0.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3255 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-26 02:40:34 +00:00
depristo 29ab59a7b3 Bug fix for Kiran; insertions now get a null reference allele even if the ref input object is null
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3254 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-24 21:31:03 +00:00
aaron c8d09a29ed some quick changes to the VE output system - more to come.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3253 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-23 21:55:08 +00:00
depristo 7f4d5d9973 Ti/Tv by AC
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3252 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-23 17:56:29 +00:00