andrewk
4e7e0432a2
Updated SNP calling power from coverage tools to work with new UnifiedGenotyper and DepthOfCoverage tools.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2378 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-16 20:44:30 +00:00
andrewk
f5e547ed6e
Add ability for flat file table parsing module to skip ahead to first occurrence of a regular expression (use case: consistently parsing DepthOfCoverage output for histogram section of file across file format changes)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2377 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-16 20:38:50 +00:00
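The skip-ahead behavior described in this commit can be sketched roughly as below. This is only an illustration: the commit does not show the parsing module or say what language it is written in, so the class `SkipAhead`, the method `skipToFirstMatch`, and the sample header text are all hypothetical.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.regex.Pattern;

// Hypothetical helper; not the actual flat file table parsing module.
class SkipAhead {
    // Consumes lines until one contains a match for the pattern.
    // Returns the matching line, or null if the input ends first.
    // Lines after the match are left unread for the caller to parse.
    static String skipToFirstMatch(Iterator<String> lines, Pattern pattern) {
        while (lines.hasNext()) {
            String line = lines.next();
            if (pattern.matcher(line).find()) {
                return line;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Made-up input: a summary section followed by a histogram section.
        Iterator<String> lines = Arrays.asList(
                "summary line", "depth count", "0 5", "1 7").iterator();
        String header = skipToFirstMatch(lines, Pattern.compile("^depth\\s+count"));
        System.out.println("header: " + header);
        while (lines.hasNext()) {
            System.out.println("row: " + lines.next());
        }
    }
}
```

Anchoring on a pattern rather than a line count is what would keep the parse stable if extra sections are added above the histogram in a later file format.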
ebanks
b626fc0684
Joint Estimate is now the default calculation model.
...
Reworked all of the integration tests so that they're now more comprehensive, cover more of what we want to test, and don't take forever to run.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2376 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-16 19:41:02 +00:00
andrewk
bf76019f22
Minor change to coverage evaluation script, to update for new file format and add output fields
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2375 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-16 18:06:08 +00:00
ebanks
e051311e8c
Added convenience methods in RodVCF to pull out all of the VCF data from the VCFRecord (e.g. getID(), getSamples(), getInfoValues())
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2374 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-16 17:58:41 +00:00
ebanks
bb312814a2
UG is now officially in the business of making good SNP calls (as opposed to being hyper-aggressive in its calls and expecting the end-user to filter).
...
Bad/suspicious bases/reads (high mismatch rate, low MQ, low BQ, bad mates) are now filtered out by default (and not used for the annotations either), although this can all be turned off.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2373 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-16 17:28:09 +00:00
aaron
af440943a4
Fixing a bug that Steven uncovered; we had an ambiguous contract for peek() in PushbackIterator, and SeekableRODIterator wasn't checking to see if its PushbackIterator's hasNext() was true before calling peek().
...
Changed peek() to element() to be consistent with the Java standards of the Queue and Stack classes (element() throws an exception if a record isn't available).
Also updated some of the ROD iterator next() methods to throw NoSuchElementException if next() is called when a record isn't available.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2372 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 23:04:40 +00:00
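The contract described in this commit can be sketched as follows. This is a minimal illustration, not the actual Sting code: the class `PushbackSketch` and its fields are hypothetical. Only the behavior named in the message is taken from the source: element() throws when no record is available, next() throws NoSuchElementException, and callers are expected to check hasNext() first.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a pushback-style iterator; not the real PushbackIterator.
class PushbackSketch<T> implements Iterator<T> {
    private final Iterator<T> underlying;
    private T pushedBack;          // record fetched early by element(), if any
    private boolean hasPushedBack; // separate flag so null records are handled

    PushbackSketch(Iterator<T> underlying) {
        this.underlying = underlying;
    }

    @Override
    public boolean hasNext() {
        return hasPushedBack || underlying.hasNext();
    }

    // Like java.util.Queue.element(): returns the next record without
    // consuming it, and throws if there is none.
    public T element() {
        if (!hasPushedBack) {
            if (!underlying.hasNext()) {
                throw new NoSuchElementException("no record available");
            }
            pushedBack = underlying.next();
            hasPushedBack = true;
        }
        return pushedBack;
    }

    @Override
    public T next() {
        if (hasPushedBack) {
            hasPushedBack = false;
            T result = pushedBack;
            pushedBack = null;
            return result;
        }
        if (!underlying.hasNext()) {
            throw new NoSuchElementException("no record available");
        }
        return underlying.next();
    }

    public static void main(String[] args) {
        PushbackSketch<String> it = new PushbackSketch<>(Arrays.asList("a", "b").iterator());
        // Callers guard element() with hasNext(), which is the check the commit adds.
        while (it.hasNext()) {
            System.out.println("peeked " + it.element() + ", consumed " + it.next());
        }
    }
}
```

Throwing instead of returning null removes the ambiguity between "no record" and "a null record", which is the kind of unclear contract the commit message refers to.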
andrewk
1035abc85f
Add minimum base quality thresholding to depth of coverage via getBaseAndMappingFilteredPileup
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2371 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 22:58:30 +00:00
sjia
2deae95df9
Updated documentation
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2370 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 21:31:47 +00:00
hanna
555976d575
One more walker with formatting to fix.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2369 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 21:23:13 +00:00
hanna
cf46472419
Fix up Sherman's new docs in compliance with javadoc specs.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2368 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 21:20:38 +00:00
sjia
df79ed8db1
Updated documentation
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2367 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 20:53:41 +00:00
sjia
a80a5f1036
Updated documentation
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2366 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 20:52:08 +00:00
sjia
18f61d2586
Updated documentation
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2365 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 20:45:19 +00:00
sjia
5974c42468
Updated documentation
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2364 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 20:41:35 +00:00
sjia
d8cfd707bc
Updated documentation
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2363 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 20:35:18 +00:00
sjia
4322beeb35
Updated documentation
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2362 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 20:33:38 +00:00
sjia
4148991d81
Now also encodes amino acids, includes documentation.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2361 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 20:26:56 +00:00
ebanks
9b0bdbbf29
Fix for homopolymer bug: ref was lowercase, alt allele was uppercase, so alt != ref. Yuck.
...
This is a temporary fix - pushed more elegant solution over to Matt.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2360 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 19:02:23 +00:00
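The bug described in this commit comes down to comparing bases without normalizing case. The commit does not show how the temporary fix was written, so the following is only one plausible shape for it; the class `BaseCompare` and the method `sameBase` are hypothetical names.

```java
// Hypothetical helper; not the actual fix that was checked in.
class BaseCompare {
    // Treats 'a' and 'A' as the same base by normalizing case before comparing.
    static boolean sameBase(char a, char b) {
        return Character.toUpperCase(a) == Character.toUpperCase(b);
    }

    public static void main(String[] args) {
        char ref = 'a'; // lowercase reference base, as in the bug report
        char alt = 'A'; // uppercase allele from the reads
        System.out.println("naive comparison says different: " + (ref != alt));
        System.out.println("case-insensitive comparison says same: " + sameBase(ref, alt));
    }
}
```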
depristo
a810586418
Check-in without javadoc = smackdown
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2359 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 15:32:39 +00:00
ebanks
b234019cf5
Readded locus printing suppression to DoC walker
...
(and removed unused import from UG)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2358 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 14:50:56 +00:00
depristo
0d2a761460
Bugfix for minBaseQuality to ignore deletion reads. LocusMismatch walker now allows us to skip every nth eligible site
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2357 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 14:38:39 +00:00
ebanks
bf7bab754e
Made getPileupWithoutMappingQualityZeroReads() and getPileupWithoutDeletions() more efficient, per Mark's cue.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2356 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 04:35:21 +00:00
ebanks
874552ff75
Pull the genotype (and genotype quality) calculation out of the VCF code and into the Genotyper.
...
[Also, enable Mark's new UG arguments]
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2355 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 04:29:28 +00:00
depristo
2cbc85cc7a
min mapping quality and min base quality arguments for UG
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2354 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 03:57:27 +00:00
depristo
faa638532a
Correct location
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2353 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 02:42:21 +00:00
depristo
1da97ebb85
Walker for calculating non-independent base errors, v1. Will be moved to somewhere not in core
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2352 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-15 02:40:15 +00:00
chartl
1389ac6bdf
Hurrr -- this uses power as part of its output. Changes to the power calculation broke the md5s RIGHT AFTER I HAD FIXED THEM arghflrg.
...
Will fix again.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2351 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-14 22:42:50 +00:00
chartl
b42fc905e8
Added - new tests (Hapmap was re-added)
...
Modified - Hapmap now takes a -q command to filter out variants by quality
Modified - MathUtils - cumBinomialProbLog now uses BigDecimal to handle some numerical imprecisions
Modified - PowerBelowFrequency - returns 0.0 if called with a negative number (can't be done from inside the walker itself, but since it's called elsewhere one can't be too careful)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2350 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-14 21:57:20 +00:00
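As a rough illustration of the MathUtils change in this commit, a cumulative binomial probability can be accumulated in BigDecimal so that the sum of many small terms does not lose precision. This sketch is an assumption about the general idea only: it does not reproduce cumBinomialProbLog (in particular it does not work in log space), and the class `CumBinomialSketch` and its method signature are hypothetical.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical sketch; not the actual MathUtils.cumBinomialProbLog.
class CumBinomialSketch {
    // Returns P(lo <= X <= hi) for X ~ Binomial(n, p), assuming 0 <= lo <= hi <= n.
    static BigDecimal cumulative(int n, int lo, int hi, double p) {
        BigDecimal bp = BigDecimal.valueOf(p);
        BigDecimal bq = BigDecimal.ONE.subtract(bp);
        BigDecimal sum = BigDecimal.ZERO;
        BigInteger coeff = BigInteger.ONE; // C(n, 0)
        for (int k = 0; k <= hi; k++) {
            if (k >= lo) {
                // Term C(n, k) * p^k * (1 - p)^(n - k), computed without rounding.
                sum = sum.add(new BigDecimal(coeff).multiply(bp.pow(k)).multiply(bq.pow(n - k)));
            }
            // C(n, k + 1) = C(n, k) * (n - k) / (k + 1); the division is exact.
            coeff = coeff.multiply(BigInteger.valueOf(n - k)).divide(BigInteger.valueOf(k + 1));
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(cumulative(20, 0, 3, 0.1).doubleValue());
    }
}
```

Exact arithmetic is slower than double arithmetic, so a sketch like this only makes sense for small n or where the imprecision actually affects results.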
rpoplin
8e44bfd2ef
CycleCovariate and PrimerRoundCovariate now correctly handle negative strand 454 and SOLID reads.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2349 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-14 21:52:30 +00:00
ebanks
c7b23d6ca5
Now that VCFGenotypeRecords implement SampleBacked (as they should), a quick fix was needed to get the GenotypeConcordance working when no direct samples were provided in a samples file.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2348 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-14 04:27:16 +00:00
asivache
bd7b07f3f1
added PrimitivePair.Long and a few shortcut utility methods to PrimitivePairs: add(pair), subtract(pair), assignFrom(pair)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2347 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-14 00:15:44 +00:00
ebanks
97618663ef
Refactored and generalized the VCF header info code.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2346 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-13 21:02:45 +00:00
depristo
05b8782d5f
Documentation updates. Moved CountX.java walkers to QC
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2345 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-13 18:40:22 +00:00
depristo
92307361a4
In preparation for move
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2344 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-13 18:28:06 +00:00
depristo
56467df49a
minor improvements to snpSelector to work with hapmap chip VCF files
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2343 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-13 17:59:32 +00:00
ebanks
45199136f0
Completed my documentation responsibilities - based on Mark's reasonable assignment and not the one Matt made up while on Meth.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2342 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-13 04:13:30 +00:00
ebanks
bd2a46ab4c
I want to move over to hpprojects tonight, so I'm checking in various changes all in one go:
...
1. Initial code for annotating calls with the base mismatch rate within a reference window (still needs analysis).
2. Move error checking code from rodVCF to VCFRecord.
3. More improvements to SNP Genotype callset concordance.
4. Fixed some comments in Variation/Genotype
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2341 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-13 02:52:18 +00:00
kiran
2748eb60e1
Added short documentation for each class so that it appears in the walker command-line documentation.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2340 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-12 21:41:07 +00:00
rpoplin
78e94b5a84
TableRecalibration now puts the full list of walker arguments into the PG tag of the bam file it creates. Thanks Matt and Eric. Also, the default nback for the HomopolymerCovariate is 8, down from 10.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2339 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-12 17:29:41 +00:00
rpoplin
014013630f
Added hierarchy to the covariate classes: Required, Standard, and Experimental. Required covariates (rg and reported quality) are added for the user whether or not they are specified in the -cov list. There is now a -standard option in CountCovariates which will add in all of the standard covariates so the user doesn't have to type them all out or even know which ones are the standard. There is logger output to say which covariates are being used of course. The list of covariates used is also added to the PG tag in the bam file produced by TableRecalibration.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2338 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-12 16:34:05 +00:00
hanna
6955b5bf53
Cleanup of the doc system, and introduce Kiran's concept of a detailed summary
...
below the specific command-line arguments for the walker. Also introduced
@help.summary to override summary descriptions if required.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2337 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-12 04:04:37 +00:00
hanna
cdfe204d19
Incorporated feedback from Kiran. Use the Javadoc first sentence extraction capability to just show the first sentence from each line of Javadoc. @help.description can still be used to produce exceptionally verbose descriptions.
...
Also increased the line width as much as I could tolerate (100 characters -> 120 characters).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2336 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-11 21:59:55 +00:00
rpoplin
4fa4e95fbc
Updated AnalyzeCovariates to extend org.broadinstitute.sting.utils.cmdLine.CommandLineProgram and use the standard argument parsing.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2335 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-11 21:57:18 +00:00
kiran
38d9f7b903
Renamed ReferenceContext's getSimpleBase() method to getBaseIndex()
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2334 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-11 20:14:39 +00:00
aaron
09811b9f34
Now that we always output the VCF header, make sure that we correctly handle the situation where there are no records in the file. Added unit tests as well.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2333 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-11 19:51:05 +00:00
hanna
0da2105e3c
Moving DuplicateQualsWalker to oneoffprojects.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2332 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-11 19:22:32 +00:00
rpoplin
60c3eb4b60
Added help.description to the recalibration walkers.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2331 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-11 19:02:29 +00:00
ebanks
2ea7632b76
The SNP genotype concordance module is now more comprehensive.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2330 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-11 18:34:33 +00:00
hanna
590aeee7d2
Documentation for more basic walkers.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2329 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-11 18:15:40 +00:00