124 Commits (decf45664aebe5ea37f5fdca782f14c7d8d4f987)

Author SHA1 Message Date
kiran 7d889c0661 Refactored into oblivion.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@276 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 22:12:15 +00:00
kiran dffc879240 Should now be appropriately using Bustard data to call bases (there are some mathematical subtleties that arise when no longer using ICs as initialization data). Also writes some more relevant fields in the SAM records. WAAAAAY simpler than old version. Like, super way.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@275 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 22:10:13 +00:00
kiran 59334b0270 A convenience class for manipulating base probability distributions.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@274 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 22:08:31 +00:00
kiran 399d9b8c1e A class that represents the model parameters for all of the Gaussian models for all cycles.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@273 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 22:08:10 +00:00
kiran f0f94b6c72 A class that represents the model parameters for all of the Gaussian models at a given cycle. Handles the accumulation of parameter initialization data and provides for efficient computation of base probability distribution.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@272 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 22:07:47 +00:00
kiran a8a6c63a32 A class with some static methods that aid the manipulation of quality scores and probabilities (including a method to compress a base and quality score into a byte for SAM output).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@271 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 22:06:15 +00:00
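
The commit above mentions compressing a base and a quality score into a single byte. The log does not show the actual bit layout, so the following is only a sketch under an assumed layout: the base index in the two low bits and the quality score, capped at 63, in the six high bits. The names `pack_base_and_qual` and `unpack_base_and_qual` are hypothetical and do not come from the repository.

```python
from typing import Tuple

# Assumed layout (not taken from the commit): bits 0-1 hold the base index,
# bits 2-7 hold the quality score.
BASES = "ACGT"
MAX_QUAL = 63  # largest value that fits in 6 bits


def pack_base_and_qual(base: str, qual: int) -> int:
    """Pack one base and one quality score into a single byte value (0-255)."""
    base_index = BASES.index(base.upper())  # raises ValueError for non-ACGT input
    capped = max(0, min(qual, MAX_QUAL))    # clamp so the score fits in 6 bits
    return (capped << 2) | base_index


def unpack_base_and_qual(packed: int) -> Tuple[str, int]:
    """Inverse of pack_base_and_qual under the same assumed layout."""
    return BASES[packed & 0b11], packed >> 2
```

Under this layout any quality above 63 is lost on a round trip, which is the usual trade-off of squeezing two fields into one byte.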
jmaguire b7a67da775 Expose the underlying SAM reader to the walkers.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@270 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 21:38:00 +00:00
jmaguire 8ce4dabd7c Print coverage per reference base for each sample in a merged BAM file.
This is a good example of how to untangle a merged BAM file.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@269 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 21:35:31 +00:00
asivache 5d9b068b8b generic declarations added here and there to eliminate a few annoying warnings; no consequential changes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@268 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 20:53:01 +00:00
asivache 4bc035d919 half-way through making rodDbSNP implement AllelicVariant interface; does not work yet
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@267 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 20:48:59 +00:00
ebanks 4faa680887 *Massive* speed-up for interval-based by-read traversals.
[Could do more optimizing, but this simple fix was good enough for now]


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@266 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 20:19:39 +00:00
kcibul c192a95998 changes in three files to make the HapMap RODs work:
- HapMapAlleleFrequenciesROD.java - the referenceOrderedDatum implementation
- PrepareROD.java - has a static block that loads the known ROD classes, had to add the above
- GenomeAnalysisTK.java - when supplied a hapmap argument... loads the ROD

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@265 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 19:55:19 +00:00
asivache b4cdd1d9a1 correct package name
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@264 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 18:09:31 +00:00
depristo 93fc768c38 Fixing problems with SAMQueryIterator and reads
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@263 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 18:04:28 +00:00
jmaguire d202264b23 initial add of pooled calling experiment walker.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@262 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 17:55:40 +00:00
ebanks 3248176118 Die with appropriate error message if we try to read past the end
of a chromosome.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@261 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 16:44:32 +00:00
depristo 24e8581c30 Slight improvements to allele caller interface; fixed problem with printing progress
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@260 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 16:44:12 +00:00
asivache 20d4bcbb2e I said - delete!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@259 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 16:21:21 +00:00
jmaguire 25ace306b9 GenomeAnalysisTK: better documentation of validation option.
AlleleFrequencyWalker: output the last reference interval if it's left hanging open.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@258 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 16:11:20 +00:00
asivache 816e768a74 move interface from playground
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@257 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 15:58:01 +00:00
asivache f26055c926 interface representing allele variants/genotype calls
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@256 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 15:57:19 +00:00
jmaguire f42b75da72 restore GFF_OUTPUT_FILE to a required argument.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@255 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 14:34:08 +00:00
depristo 2cd9a1597f Simple improvements to allele caller
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@254 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 14:09:14 +00:00
depristo d952790258 GFF now parses attributes correctly and efficiently. Slightly better interface to Utils.join
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@253 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 22:54:38 +00:00
hanna ce57fed2fb Hack to work around an Apache CLI bug, where core arguments couldn't be commingled with walker arguments. These arguments can commingle now. Everybody into the pool.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@252 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 20:56:42 +00:00
ebanks 6cc2fa24d5 Added ability to downsample to a particular coverage
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@250 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 20:27:06 +00:00
jmaguire bb3dbb5756 change default onTraversalDone to use the new output streams
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@249 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 19:50:31 +00:00
jmaguire 4faacac315 Now handle the case where we don't actually SEE all of the positions.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@248 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 19:50:07 +00:00
jmaguire 675505646d now makes confident reference intervals.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@247 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 18:46:14 +00:00
ebanks 6994cca988 added precision
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@246 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 16:21:29 +00:00
hanna 16c2ea4673 Invalid arguments are not always flagged when stopAtNonOption is false. Make sure stopAtNonOption is true when we do final argument validation.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@245 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 15:58:57 +00:00
hanna 7ee792df04 Print correct help if core arguments (--input-file et al) aren't correctly specified.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@244 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 15:16:49 +00:00
ebanks 3af4290a49 Added iterator to randomly downsample to a given fraction of the reads.
Also, updated sort iterator to allow user to input max sorts.
Put in placeholder for downsampling to given coverage.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@243 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 02:11:13 +00:00
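
The commit above adds an iterator that randomly downsamples to a given fraction of the reads. One common way to do this, sketched here as an assumption rather than as the committed implementation, is to keep each read independently with probability equal to the fraction. The function name `downsample` and its parameters are hypothetical.

```python
import random
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def downsample(reads: Iterable[T], fraction: float,
               rng: Optional[random.Random] = None) -> Iterator[T]:
    """Lazily yield each read with probability `fraction`, preserving order."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if rng is None:
        rng = random.Random()
    for read in reads:
        # random() returns a value in [0, 1), so 1.0 keeps everything
        # and 0.0 keeps nothing.
        if rng.random() < fraction:
            yield read
```

Because each read is kept or dropped independently, the number of reads returned only approximates the requested fraction; passing a seeded `random.Random` makes a run reproducible.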
depristo 385736469c High performance pileup code and utilities
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@242 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 00:47:47 +00:00
aaron ad63633b1c forgot to change the chunks dir to shards before
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@241 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 20:28:20 +00:00
jmaguire ede52f7359 - take command line arguments
- output GFF lines to a file (specified by a command line argument)
- improve the GFF output string


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@240 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 18:43:00 +00:00
ebanks 8d601a6a42 unbox
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@239 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 15:51:59 +00:00
ebanks 234137dee8 use boolean instead of String for flag to suppress printing in map
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@236 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 15:14:00 +00:00
ebanks 907c183242 update walkers so that onTraversalDone works (it now takes an arg)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@235 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 15:05:33 +00:00
ebanks 3896cc8f17 Moved avg depth of coverage functionality into the core depth of coverage
walker.  Used new command line args for walkers.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@234 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 05:02:33 +00:00
ebanks 007ecc8616 Added a stateless walker to give the average depth of coverage for given reads
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@233 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 02:33:59 +00:00
ebanks 89c1762aa9 Apparently, no one else has tried to create a stateless walker over loci until
now, as this should have come up: make sure reduce sums get transferred to the
next reduce.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@232 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 02:31:51 +00:00
aaron ba99e9f648 checking in some of the more static Data Source dependent code at this point. They don't do much on their own, but are needed for the base data source code I'm writing.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@231 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 00:04:03 +00:00
hanna 7fda409f4e Fixed bug where read traversals would fail with an exception when not called with a genome_region (-L) argument. From TraversalEngine, line 455, looks like Mark intended an invariant where the list of locations is 0 length if not specified. Made GenomeLoc code compliant with that.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@230 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 23:43:12 +00:00
hanna e812cfbf55 Refactor common functionality out of WalkerManager and into JVMUtils and PathUtils. Add support for loading walkers from a jar.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@229 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 23:20:55 +00:00
hanna 36f851362e Oops. While writing command-line argument docs, I realized I introduced
a regression in default value handling.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@226 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 18:51:39 +00:00
jmaguire 875802e8fc print output as a GFF line.
still need to add printing GFF intervals for stretches of confident reference calls.

does the GFF ROD class handle intervals?? We'll find out. >:)


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@225 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 17:47:35 +00:00
jmaguire b752960586 rearranged some stuff and eliminated the binomial prior in the N!=2 case. Much faster.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@224 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 17:26:05 +00:00
hanna 7c6455fe36 Handle the case where a walker is being run outside of the GATK framework, such as JUnit tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@222 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-29 01:50:27 +00:00
depristo d7c0bcc223 Reorganized GenomeLoc code to more clearly and better use the picard SequenceDictionary information.
All GenomeLoc[] are now ArrayList<GenomeLoc> for clarity and consistency
Parsing now recursively merges contiguous elements chr1:1-10;chr1:11-20 => chr1:1-20
Added support for TraversingByLoci over all reference positions specified by the provided location array.  System dynamically determines which traversal system to use.
Pileup now marks, very clearly, reference positions without covered reads.
Made changes around the codebase to deal with new GenomeLoc structure.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@218 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-28 20:37:27 +00:00
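
The merging rule in the commit above (chr1:1-10;chr1:11-20 => chr1:1-20) can be illustrated with a small sketch. This is not the GenomeLoc code: it is an iterative version, it assumes intervals are inclusive and already sorted by position within each contig, and the helper names `parse_intervals` and `merge_contiguous` are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (contig, start, stop), both ends inclusive


def parse_intervals(spec: str) -> List[Interval]:
    """Parse a string such as 'chr1:1-10;chr1:11-20' into interval tuples."""
    intervals = []
    for part in spec.split(";"):
        contig, span = part.strip().rsplit(":", 1)
        start, stop = span.split("-")
        intervals.append((contig, int(start), int(stop)))
    return intervals


def merge_contiguous(intervals: List[Interval]) -> List[Interval]:
    """Merge neighbours on the same contig that touch or overlap."""
    merged: List[Interval] = []
    for contig, start, stop in intervals:
        if merged:
            prev_contig, prev_start, prev_stop = merged[-1]
            # Adjacent (start == prev_stop + 1) or overlapping intervals fold together.
            if prev_contig == contig and start <= prev_stop + 1:
                merged[-1] = (contig, prev_start, max(prev_stop, stop))
                continue
        merged.append((contig, start, stop))
    return merged
```

With these assumptions, `merge_contiguous(parse_intervals("chr1:1-10;chr1:11-20"))` gives a single interval covering chr1:1-20, while intervals separated by a gap or sitting on different contigs are left apart.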