Commit Graph

2932 Commits (0b03e28b60876c00370b577016b4e5d0ea5e6202)

Author SHA1 Message Date
weisburd 4aa749c709 Moved AnnotatorInputTableFeature and Codec to org.broadinstitute.sting.gatk.refdata.features.annotator
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3426 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-24 14:38:07 +00:00
weisburd aca3bcb193 Moved AnnotatorInputTableFeature and Codec to org.broadinstitute.sting.gatk.refdata.features.annotator
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3425 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-24 14:37:17 +00:00
weisburd 64ed770250 Moved AnnotatorInputTableFeature and Codec to org.broadinstitute.sting.gatk.refdata.features.annotator
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3424 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-24 14:36:28 +00:00
hanna ee3f2eb1d0 Don't output traversal reduce result in the logger. In many cases, the reduce
result is tangential to the product of the analysis and having the logger always
emit it can confuse the output (such as in the new reduceByInterval 
DepthOfCoverage walker).  If users want to emit it, they can choose not to override
onTraversalDone, or override onTraversalDone and write results to the output
stream / logger / whatever their choice.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3422 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-23 22:41:43 +00:00
hanna a40e64e47b A downsampling validator. Compares the generated pileup passed in from the alignment context to the reads,
passed in as a Tribble SAM text feature.  If the generated pileup contains a valid set of reads according to
the downsampling rules, the test passes.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3421 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-23 21:49:54 +00:00
delangel a280a0ff0d a) Made HaplotypeScore default annotation. This changed several integration tests, whose MD5 is now updated.
b) Disabled BaseQualRankSumTest, the returned p-values differ wildly from Matlab/R-provided ones, cause TBD.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3419 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-21 22:25:17 +00:00
hanna b10950c691 Simple performance optimization -- cache the number of reads in the locus hanger.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3417 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-21 19:26:16 +00:00
delangel 355396109b Bug fix to avoid build failure (class changed under me??)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3416 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-21 18:48:56 +00:00
delangel 1753d07b02 Added AnnotationByAlleleFrequencyWalker - walker takes an input vcf, a reference vcf and a list of annotations (with the -A argument). For each site present in both VCFs, it outputs the given annotations to the screen as well as allele frequency. Since HapMap vcf reference doesn't include AF in annotations, it computes it from Chromosome, Het and HomVar counts.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3415 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-21 18:31:34 +00:00
chartl 745d7c582f added integration test for intervals with no coverage due to filtering
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3414 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-21 16:52:42 +00:00
chartl 7fb3f2d3eb Annotator now buffers indel calls (prevents double-output from double-calls to map)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3413 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-21 16:34:34 +00:00
chartl 4e834b5e35 VFW now uses a ref window and thus is compatible with indels.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3412 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-21 15:59:42 +00:00
chartl 88cb93cc3c Changes to Depth of Coverage (added maximum base and mapping quality flags, with new integration tests -- because they use b36, and the other test uses hg18, they are in a different class, since the integration test system can't change refs on the fly). Initial change to VariantAnnotator to allow it to see extended event pileups; you currently have to throw the -dels flag, and it's specified as "very experimental". Yet, all the integration tests pass.
Homopolymer Run now does the "right" thing (e.g. single bases are represented as HRun = 0 rather than HRun = 1) for indels. AlleleBalance now does something close enough to correct.

Added a convenience method to VariantContext that will return the indel length (or lengths if a site is not biallelic).



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3409 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-21 13:02:01 +00:00
depristo 6faf101c6c Minor improvements to Callable Loci for public consumption
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3408 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-21 12:50:11 +00:00
hanna 388dd8d64d Fixing bugs in downsampler introduced when I added Ryan's dup eliminator.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3407 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-21 02:53:12 +00:00
depristo a10fca0d5c Genotyper now is using bytes not chars. Passes all tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3406 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-20 21:02:44 +00:00
hanna 7389077b3b A few misc usability fixes:
- Clarify the message emitted when -XL is supplied so I don't spend another half day chasing a bug that doesn't exist.  
- Crash with a helpful message when running -nt with non-TreeReducible walkers.
- Crash with a helpful message when running -nt with reduceByInterval walkers.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3405 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-20 19:02:02 +00:00
aaron b543dd4ac4 more aggressive checks for the locking, and some more documentation
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3404 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-20 16:16:36 +00:00
depristo 1ab00e5895 Retiring multi-sample genotyper
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3401 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-20 14:10:56 +00:00
depristo 727822adb4 BaseUtils has a clearer distinction between byte and char routines. All char routines are @Deprecated now. Please use bytes. Better organization of reverse(), now in Utils not BaseUtils.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3400 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-20 14:05:13 +00:00
depristo 6ce3835622 Removing unused methods in QualityUtils; ReferenceContext now converting all bases to upper case, but can be disabled with static boolean
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3399 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-20 12:38:06 +00:00
depristo 5abac5c057 A few more char -> byte cleanups
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3398 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-20 00:02:06 +00:00
depristo 8a725b6c93 Restructuring of ReferenceContext and ReadWalkers to accept a ReferenceContext. Now ReferenceContext is byte[] backed not char[]. Please no more chars for the reference. All of the tests pass now. Coming check-ins are going to clean up the char / byte problems in the GATK
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3397 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-19 23:27:55 +00:00
aaron 02cc1afdc8 remove RodBed and all its dependencies.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3396 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-19 19:12:30 +00:00
chartl ffb1b46166 Added a GCCalculatorWalker for a oneoff analysis for Mark Daly (GC content of agilent 1.1 targets)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3395 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-19 18:49:51 +00:00
aaron 0036df7b03 adding a convenience method for getting at the RODs that overlap a specific location as GATKFeatures.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3394 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-19 17:40:20 +00:00
aaron ca386439be only emit a warning if the tribble index is out of date, don't remove and replace it for them. Added a test case where the log4j appender checks the logging messages for the appropriate output.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3393 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-19 15:12:48 +00:00
hanna 017ab6b690 Experimental versions of downsampler and Ryan's deduper are now available either
as walker attributes or from the command-line.  Not ready yet!  Downsampling/deduping 
works in a general sense, but this approach has not been completely optimized or validated.
Use with caution.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3392 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-19 05:40:05 +00:00
weisburd 46ba88018d Updated to the new readHeader(..) api
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3391 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-19 04:06:34 +00:00
weisburd 984c51efd3 Updated to use Tribble-based GATKFeature instead of TabularROD
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3390 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-19 03:42:12 +00:00
weisburd 42ee16f256 Updated to use Tribble-based GATKFeature instead of TabularROD
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3389 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-19 03:41:37 +00:00
weisburd d8469e2fba Updated to use Tribble-based GATKFeature instead of TabularROD
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3388 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-19 03:40:47 +00:00
weisburd d65b2d32d1 Removed AnnotatorROD which has been ported to Tribble
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3387 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-19 03:39:34 +00:00
weisburd b82116f488 Removed AnnotatorROD which has been ported to Tribble
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3386 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-19 03:39:20 +00:00
weisburd 6b96f025f5 Tribble integration for indexing the AnnotatorInputTable format
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3385 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-19 03:37:54 +00:00
weisburd 2f3933148d Added fast split(str, delimiter) method
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3384 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-19 03:37:26 +00:00
hanna aedb9f6734 Bring SAMPileupCodec into compliance with new interface.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3383 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-19 01:23:29 +00:00
aaron 7cfb9ff3dc updates for Tribble 82, fixes for Ryan's case where multiple processes would attempt to read/write to the same index, and a couple other Tribble-centric bug fixes.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3382 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-18 19:34:45 +00:00
chartl 635f61c22d Clone the other guy too
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3381 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-18 18:56:01 +00:00
rpoplin 9e15299475 Misc cleanup in variant recalibrator.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3380 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-18 17:37:01 +00:00
chartl eb200e4cce Hrumph. Don't just add pointers to the same objects, actually clone the underlying arrays.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3379 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-18 17:13:44 +00:00
chartl e016491a3d Major refactoring of Depth of Coverage to allow for more extensible partitions of data (now can do read group, sample, and library; in any combination; adding more is fairly easy). Changed the by-gene code to use clones of stats objects, rather than munging the interval DoCs. (Fix for Avinash. Who, hilariously, thinks my name is Carl.) Added sorting methods to ensure static ordering of header and body fields.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3377 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-18 16:58:13 +00:00
weisburd 3c022e4b0c Improved command-line-arg validation at startup.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3374 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-18 02:46:17 +00:00
weisburd 35b4bba35e Refactored so it could be used for knownGene and CCDS as well as refGene
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3372 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-18 02:44:10 +00:00
weisburd bb86c0e03a Improved error message
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3371 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-18 02:43:13 +00:00
weisburd 68719615be For multiple matches, shifted counter to be 1-based
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3370 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-18 02:41:50 +00:00
hanna 73e2e32837 Fix typo.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3369 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-17 21:04:00 +00:00
chartl ebd0fabf86 First pass updates to annotations to work with indels. HomopolymerRun indel behavior is currently turned off by a global boolean until it's ready to go live.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3368 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-17 21:02:13 +00:00
hanna 0791beab8f Checking in downsampling iterator alongside LocusIteratorByState, and removing
the reference implementation.  Also implemented a heap size monitor that can
be used to programmatically report the current heap size.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3367 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-17 21:00:44 +00:00
chartl b7d21627ab Changes to DepthOfCoverage (JIRA items) and added back an integration test to cover it. Alterations to the design file generator to output all transcripts (rather than choosing one at random).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3366 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-17 17:23:00 +00:00
kiran 4235164359 Removed the confusionMatrix column (of *course* this is a confusion matrix... what else would it be?!).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3365 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-14 21:55:37 +00:00
kiran 95b29f608b Specify default values.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3364 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-14 21:42:53 +00:00
rpoplin 6efd05831b Encapsulating annotation decoding function in order to use same fixed random seed in both VariantOptimizer and ApplyVariantClusters
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3363 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-14 20:03:38 +00:00
ebanks 32389dc0a9 Fixed GQ estimate when chosen genotype isn't the most likely according to the GLs.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3362 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-14 19:17:46 +00:00
depristo 1538dc0144 optimizer now uses -an arguments instead of exclude and force for clarity. command-line length reduced by 50%
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3361 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-14 15:41:44 +00:00
hanna 88bd7a2045 Reenabling UG parallelization performance tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3360 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-13 16:28:08 +00:00
hanna 0490909285 Fixed epic generic paths fail.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3359 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-13 15:59:57 +00:00
hanna 7ef87e5126 An integration test based on validating pileup to test parallelism in reads, reference, and RODs. This test runs in less
than a minute and fell over instantly in the case of the Tribble parallelism issue.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3358 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-13 15:40:43 +00:00
hanna ceec525420 Got rid of stray unicode characters in copyright message.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3357 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-13 14:47:39 +00:00
hanna 3e9ad4bbd0 Porting SAM pileup ROD to Tribble as a case study.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3356 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-13 00:22:59 +00:00
aaron 6839c194cb although holding on to memories can be fun, it's bound to hurt performance.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3355 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-12 19:26:58 +00:00
ebanks c81b910f73 Commenting out the parallelization test which is failing
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3354 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-12 18:39:53 +00:00
aaron cac98ba5ef a couple of small documentation fixes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3353 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-12 17:40:27 +00:00
depristo 3f07611187 Added support for -nSamples to varianteval (and getNSamplesForEval function). Allows you to calculate AC based metrics for files without genotypes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3350 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-12 13:36:31 +00:00
aaron 2c55ac1374 fixes for parallel processing problems with Tribble, a small bug in the resource pool, and some more documentation.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3349 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-12 06:13:26 +00:00
hanna 6868ce988f Fix hanging bug reported by Susanne Pfeifer (tiffy @ get satisfaction) where, if the last read(s) in a shard all have an
indel in roughly the same location and that indel isn't covered by any other reads, LocusIteratorByState goes into an infinite
loop.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3348 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-11 17:31:19 +00:00
ebanks 34969f304c Adding dbsnp to all UG performance tests
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3347 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-11 15:48:05 +00:00
ebanks 140e43b93b Checking in to see whether it fails. If I start getting bombarded with Bamboo error reports, I'm commenting it out...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3346 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-11 15:39:42 +00:00
ebanks 572b383fe2 Make VA annotate dbsnp again
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3345 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-11 14:06:53 +00:00
rpoplin b09e7231d1 A quick implementation of the experimental covariates for the TGen folks to work with.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3344 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-11 01:08:52 +00:00
kiran aec5f7b630 Can now threshold results based on minimum base and/or mapping quality.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3343 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-10 19:58:07 +00:00
kiran 13fd182b7c For dealing with slightly malformed BAMs - mark every alignment as primary, or in the case of some BAM files from UWash, supply the sample information for each read group.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3340 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-10 15:17:05 +00:00
kiran 4a7902bb8e Bases 'A' and 'a' (etc.) no longer considered different.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3339 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-10 14:53:38 +00:00
kiran ec543b7b62 The Complete Genomics confusion matrix rates.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3338 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-10 14:52:10 +00:00
kiran b223b04331 Don't list '.' as an alternate allele, dummy!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3337 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-10 14:51:18 +00:00
kiran 98718d0faa Computes the error rate per cycle
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3336 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-10 14:50:22 +00:00
kiran 7527f950d1 Computes the quality score distribution per readgroup (one column per readgroup)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3335 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-10 14:49:38 +00:00
kiran c111c15072 Computes the distribution of insert size per library (for now, one output file per library)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3334 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-10 14:48:35 +00:00
ebanks a51bd57566 First version of the smart batch merging tool.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3333 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-10 02:18:48 +00:00
rpoplin 33a9549896 Variant Optimizer accepts a dbSNP rod argument to use in determining known/novel status as opposed to using the rsID in the vcf record. VO generates plots of annotation values used in clustering broken out by knowns and novels. Useful for showing which annotations are approximately Gaussian.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3332 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-09 16:48:07 +00:00
hanna 76efa757f0 Switched over to reviewed version of Picard patch. In process, did some optimization to the IntervalSharder
which improved startup time 5-10x when dynamically merging many BAMs.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3331 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-08 14:12:22 +00:00
depristo 504103bd15 Misc. additions to correct utilities
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3329 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-07 21:34:18 +00:00
depristo 64ccaa4c6a Walkers and integration tests that calculate and compare callable bases
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3328 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-07 21:33:47 +00:00
depristo d070554329 A walker that calculates read lengths, number and size of clipping events
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3327 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-07 21:32:51 +00:00
chartl 1749a49042 Mapping and base quality thresholds for DoC default to none
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3326 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-07 18:08:13 +00:00
aaron 7d2df3f511 example windowed ROD walker for Kristian, and updates to Tribble
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3325 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-07 17:12:50 +00:00
rpoplin 57f254b13a VE integration test
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3324 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-07 13:58:25 +00:00
ebanks 44de92e09d Checking in the liftover script. I am including a post-processing walker, written in under 10 minutes, to filter out bad records, as per my agreement with Mark.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3321 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-07 12:31:56 +00:00
ebanks 18f1d31a22 Moving to and organizing in core.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3320 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-07 04:05:36 +00:00
aaron 06ea65e60b again for JIRA GSA-320
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3319 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-07 03:47:58 +00:00
aaron ac9b32db88 a bug fix for Kiran; putting JIRA in for better type determination system for the new Tribble tracks so this doesn't happen again.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3318 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-07 03:31:43 +00:00
hanna 4e0019b04f Repair code that sorts and merges intervals.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3317 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-06 22:37:25 +00:00
aaron 72e030a670 require that snps be biallelic before we pass them to the TiTv calculation.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3316 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-06 22:33:00 +00:00
rpoplin 7cecec7d00 Removing zero no-calls restriction in AC stats
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3314 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-06 18:55:07 +00:00
ebanks 0e58fb7cc0 Moved over to be a walker inside the GATK
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3313 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-06 18:28:03 +00:00
aaron 78409dca0d turned off the progress output from tribble when making an index, and fixing a case where the index file isn't writable so we instead make the index in memory.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3312 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-06 16:36:58 +00:00
ebanks bacc507a48 Don't worry about sorting anymore in the liftover tool. That will come later.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3311 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-06 15:00:30 +00:00
ebanks 5df0361bd2 trivial removal of unnecessary comments
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3309 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-06 03:51:14 +00:00
ebanks 2975e3a4e8 picard Intervals don't sort right - switching to GenomeLocs
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3308 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-06 03:50:28 +00:00
ebanks 1a99fb9318 First pass at liftover tool. Passing buck over to Aaron...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3306 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-05 20:38:19 +00:00