114 Commits (4faa680887875a0239acd69901fcf5d557d052a6)

Author SHA1 Message Date
ebanks 4faa680887 *Massive* speed-up for interval-based by-read traversals.
[Could do more optimizing, but this simple fix was good enough for now]


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@266 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 20:19:39 +00:00
kcibul c192a95998 changes in three files to make the HapMap RODs work:
- HapMapAlleleFrequenciesROD.java - the referenceOrderedDatum implementation
- PrepareROD.java - has a static block that loads the known ROD classes, had to add the above
- GenomeAnalysisTK.java - when supplied a hapmap argument... loads the ROD

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@265 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 19:55:19 +00:00
asivache b4cdd1d9a1 correct package name
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@264 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 18:09:31 +00:00
depristo 93fc768c38 Fixing problems with SAMQueryIterator and reads
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@263 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 18:04:28 +00:00
jmaguire d202264b23 initial add of pooled calling experiment walker.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@262 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 17:55:40 +00:00
ebanks 3248176118 Die with appropriate error message if we try to read past the end
of a chromosome.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@261 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 16:44:32 +00:00
depristo 24e8581c30 Slight improvements to allele caller interface; fixed problem with printing progress
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@260 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 16:44:12 +00:00
asivache 20d4bcbb2e I said - delete!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@259 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 16:21:21 +00:00
jmaguire 25ace306b9 GenomeAnalysisTK: better documentation of validation option.
AlleleFrequencyWalker: output the last reference interval if it's left hanging open.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@258 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 16:11:20 +00:00
asivache 816e768a74 move interface from playground
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@257 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 15:58:01 +00:00
asivache f26055c926 interface representing allele variants/genotype calls
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@256 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 15:57:19 +00:00
jmaguire f42b75da72 restore GFF_OUTPUT_FILE to a required argument.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@255 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 14:34:08 +00:00
depristo 2cd9a1597f Simple improvements to allele caller
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@254 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 14:09:14 +00:00
depristo d952790258 GFF now parses attributes correctly and efficiently. Slightly better interface to Utils.join
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@253 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 22:54:38 +00:00
hanna ce57fed2fb Hack to work around an Apache CLI bug, where core arguments couldn't be commingled with walker arguments. These arguments can commingle now. Everybody into the pool.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@252 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 20:56:42 +00:00
ebanks 6cc2fa24d5 Added ability to downsample to a particular coverage
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@250 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 20:27:06 +00:00
jmaguire bb3dbb5756 change default onTraversalDone to use the new output streams
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@249 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 19:50:31 +00:00
jmaguire 4faacac315 Now handle the case where we don't actually SEE all of the positions.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@248 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 19:50:07 +00:00
jmaguire 675505646d now makes confident reference intervals.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@247 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 18:46:14 +00:00
ebanks 6994cca988 added precision
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@246 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 16:21:29 +00:00
hanna 16c2ea4673 Invalid arguments are not always flagged when stopAtNonOption is false. Make sure stopAtNonOption is true when we do final argument validation.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@245 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 15:58:57 +00:00
hanna 7ee792df04 Print correct help if core arguments (--input-file et al) aren't correctly specified.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@244 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 15:16:49 +00:00
ebanks 3af4290a49 Added iterator to randomly downsample to a given fraction of the reads.
Also, updated sort iterator to allow user to input max sorts.
Put in placeholder for downsampling to given coverage.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@243 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 02:11:13 +00:00
depristo 385736469c High performance pileup code and utilities
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@242 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 00:47:47 +00:00
aaron ad63633b1c forgot to change the chunks dir to shards before
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@241 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 20:28:20 +00:00
jmaguire ede52f7359 - take command line arguments
- output GFF lines to a file (specified by a command line argument)
- improve the GFF output string


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@240 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 18:43:00 +00:00
ebanks 8d601a6a42 unbox
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@239 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 15:51:59 +00:00
ebanks 234137dee8 use boolean instead of String for flag to suppress printing in map
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@236 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 15:14:00 +00:00
ebanks 907c183242 update walkers so that onTraversalDone works (it now takes an arg)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@235 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 15:05:33 +00:00
ebanks 3896cc8f17 Moved avg depth of coverage functionality into the core depth of coverage
walker.  Used new command line args for walkers.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@234 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 05:02:33 +00:00
ebanks 007ecc8616 Added a stateless walker to give the average depth of coverage for given reads
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@233 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 02:33:59 +00:00
ebanks 89c1762aa9 Apparently, no one else has tried to create a stateless walker over loci until
now, as this should have come up: make sure reduce sums get transferred to the
next reduce.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@232 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 02:31:51 +00:00
aaron ba99e9f648 checking in some of the more static Data Source dependent code at this point. They don't do much on their own, but are needed for the base data source code I'm writing.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@231 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 00:04:03 +00:00
hanna 7fda409f4e Fixed bug where read traversals would fail with an exception when not called with a genome_region (-L) argument. From TraversalEngine, line 455, looks like Mark intended an invariant where the list of locations is 0 length if not specified. Made GenomeLoc code compliant with that.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@230 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 23:43:12 +00:00
hanna e812cfbf55 Refactor common functionality out of WalkerManager and into JVMUtils and PathUtils. Add support for loading walkers from a jar.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@229 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 23:20:55 +00:00
hanna 36f851362e Oops. While writing command-line argument docs, I realized I introduced
a regression in default value handling.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@226 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 18:51:39 +00:00
jmaguire 875802e8fc print output as a GFF line.
still need to add printing GFF intervals for stretches of confident reference calls.

does the GFF ROD class handle intervals?? We'll find out. >:)


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@225 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 17:47:35 +00:00
jmaguire b752960586 rearranged some stuff and eliminated the binomial prior in the N!=2 case. Much faster.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@224 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 17:26:05 +00:00
hanna 7c6455fe36 Handle the case where a walker is being run outside of the GATK framework, such as JUnit tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@222 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-29 01:50:27 +00:00
depristo d7c0bcc223 Reorganized GenomeLoc code to more clearly and better use the picard SequenceDictionary information.
All GenomeLoc[] are now ArrayList<GenomeLoc> for clarity and consistency
Parsing now recursively merges contiguous elements chr1:1-10;chr1:11-20 => chr1:1-20
Added support for TraversingByLoci over all reference positions specified by the provided location array.  System dynamically determines which traversal system to use.
Pileup now marks, very clearly, reference positions without covered reads.
Made changes around the codebase to deal with new GenomeLoc structure.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@218 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-28 20:37:27 +00:00
depristo c2ae6765a3 Removed unnecessary dependence on playground code...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@217 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 22:48:51 +00:00
hanna b17a03abbd Fix argument parser test case.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@215 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 16:05:18 +00:00
depristo cfee59e0e6 New type hierarchy for Traversals. There's a new package to hold them (traversals) and an easy system to create new ones. We are now one step closer to supporting the execution manager (a totally non-functional version is included here) that actually executes walkers in parallel using N threads.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@214 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 15:40:45 +00:00
hanna 4a6be896b9 Provide out and err PrintStreams to the walkers.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@213 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 15:03:32 +00:00
asivache c6d9848d08 synchronizing latest changes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@212 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 14:15:44 +00:00
aaron 230c1ad161 moved a bunch of files over to the logging system. In some cases I ballparked the severity level of an error, so if you see something wrong feel free to make changes.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@211 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 14:02:55 +00:00
depristo 826781a760 The traversal engine now passes the reduce result to OnTraversalDone() in the walker base class
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@210 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 13:44:46 +00:00
aaron d115209e86 moved a bunch of files over to the logging system. In some cases I ballparked the severity level of an error, so if you see something wrong feel free to make changes.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@209 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 13:27:04 +00:00
aaron 935a4d81c9 fixed the problem where you could specify a logging level that didn't exist
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@208 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 04:29:27 +00:00
depristo 3abaaa3cc3 Tried to add a poor man's version of seeing all reference sites in an interval, and failed. However, I did add the command line argument and a few pieces of useful code.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@206 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 00:12:35 +00:00