aaron
ad63633b1c
forgot to change the chunks dir to shards before
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@241 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 20:28:20 +00:00
jmaguire
ede52f7359
- take command line arguments
- output GFF lines to a file (specified by a command line argument)
- improve the GFF output string
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@240 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 18:43:00 +00:00
ebanks
8d601a6a42
unbox
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@239 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 15:51:59 +00:00
ebanks
234137dee8
use boolean instead of String for flag to suppress printing in map
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@236 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 15:14:00 +00:00
ebanks
907c183242
update walkers so that onTraversalDone works (it now takes an arg)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@235 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 15:05:33 +00:00
ebanks
3896cc8f17
Moved avg depth of coverage functionality into the core depth of coverage
walker. Used new command line args for walkers.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@234 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 05:02:33 +00:00
ebanks
007ecc8616
Added a stateless walker to give the average depth of coverage for given reads
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@233 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 02:33:59 +00:00
ebanks
89c1762aa9
Apparently, no one else has tried to create a stateless walker over loci until
now, as this should have come up: make sure reduce sums get transferred to the
next reduce.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@232 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 02:31:51 +00:00
aaron
ba99e9f648
checking in some of the more static Data Source-dependent code at this point. These classes don't do much on their own, but are needed for the base data source code I'm writing.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@231 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 00:04:03 +00:00
hanna
7fda409f4e
Fixed bug where read traversals would fail with an exception when not called with a genome_region (-L) argument. From TraversalEngine, line 455, looks like Mark intended an invariant where the list of locations is 0 length if not specified. Made GenomeLoc code compliant with that.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@230 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 23:43:12 +00:00
hanna
e812cfbf55
Refactor common functionality out of WalkerManager and into JVMUtils and PathUtils. Add support for loading walkers from a jar.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@229 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 23:20:55 +00:00
hanna
36f851362e
Oops. While writing command-line argument docs, I realized I introduced
a regression in default value handling.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@226 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 18:51:39 +00:00
jmaguire
875802e8fc
print output as a GFF line.
still need to add printing GFF intervals for stretches of confident reference calls.
does the GFF ROD class handle intervals?? We'll find out. >:)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@225 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 17:47:35 +00:00
jmaguire
b752960586
rearranged some stuff and eliminated the binomial prior in the N!=2 case. Much faster.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@224 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 17:26:05 +00:00
hanna
7c6455fe36
Handle the case where a walker is being run outside of the GATK framework, such as JUnit tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@222 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-29 01:50:27 +00:00
depristo
d7c0bcc223
Reorganized GenomeLoc code to more clearly and better use the picard SequenceDictionary information.
All GenomeLoc[] are now ArrayList<GenomeLoc> for clarity and consistency
Parsing now recursively merges contiguous elements chr1:1-10;chr1:11-20 => chr1:1-20
Added support for TraversingByLoci over all reference positions specified by the provided location array. System dynamically determines which traversal system to use.
Pileup now marks, very clearly, reference positions without covered reads.
Made changes around the codebase to deal with new GenomeLoc structure.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@218 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-28 20:37:27 +00:00
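The merge rule quoted in the entry above (chr1:1-10;chr1:11-20 => chr1:1-20) can be illustrated with a minimal Java sketch. This is not the GATK code: `Loc`, `parse`, and `mergeContiguous` are hypothetical names, the `contig:start-stop` format separated by `;` is assumed from the commit message, and the merge here is a single iterative pass over sorted input rather than the recursive merge the commit describes.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeIntervals {
    // Hypothetical stand-in for GenomeLoc: contig name plus inclusive start/stop.
    static final class Loc {
        final String contig;
        final long start;
        final long stop;

        Loc(String contig, long start, long stop) {
            this.contig = contig;
            this.start = start;
            this.stop = stop;
        }

        @Override
        public String toString() {
            return contig + ":" + start + "-" + stop;
        }
    }

    // Parses "chr1:1-10;chr1:11-20" into a list of Loc (format assumed, see lead-in).
    static List<Loc> parse(String spec) {
        List<Loc> locs = new ArrayList<>();
        for (String part : spec.split(";")) {
            String[] contigAndRange = part.trim().split(":");
            String[] range = contigAndRange[1].split("-");
            locs.add(new Loc(contigAndRange[0],
                    Long.parseLong(range[0]), Long.parseLong(range[1])));
        }
        return locs;
    }

    // Merges neighbours on the same contig that overlap or abut.
    // Assumes the input is already sorted by contig and start position.
    static List<Loc> mergeContiguous(List<Loc> sorted) {
        List<Loc> merged = new ArrayList<>();
        for (Loc next : sorted) {
            if (!merged.isEmpty()) {
                Loc last = merged.get(merged.size() - 1);
                if (last.contig.equals(next.contig) && next.start <= last.stop + 1) {
                    // Extend the previous interval instead of appending a new one.
                    merged.set(merged.size() - 1,
                            new Loc(last.contig, last.start, Math.max(last.stop, next.stop)));
                    continue;
                }
            }
            merged.add(next);
        }
        return merged;
    }

    public static void main(String[] args) {
        // prints [chr1:1-20]
        System.out.println(mergeContiguous(parse("chr1:1-10;chr1:11-20")));
    }
}
```

Intervals separated by a gap (for example chr1:1-10;chr1:12-20) or on different contigs are left unmerged in this sketch.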
depristo
c2ae6765a3
Removed unnecessary dependence on playground code...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@217 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 22:48:51 +00:00
hanna
b17a03abbd
Fix argument parser test case.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@215 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 16:05:18 +00:00
depristo
cfee59e0e6
New type hierarchy for Traversals. There's a new package to hold them (traversals) and an easy system to create new ones. We are now one step closer to supporting the execution manager (a totally non-functional version is included here) that actually executes walkers in parallel using N threads.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@214 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 15:40:45 +00:00
hanna
4a6be896b9
Provide out and err PrintStreams to the walkers.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@213 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 15:03:32 +00:00
asivache
c6d9848d08
synchronizing latest changes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@212 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 14:15:44 +00:00
aaron
230c1ad161
moved a bunch of files over to the logging system. In some cases I ballparked the severity level of an error, so if you see something wrong feel free to make changes.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@211 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 14:02:55 +00:00
depristo
826781a760
The traversal engine now passes the reduce result to OnTraversalDone() in the walker base class
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@210 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 13:44:46 +00:00
aaron
d115209e86
moved a bunch of files over to the logging system. In some cases I ballparked the severity level of an error, so if you see something wrong feel free to make changes.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@209 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 13:27:04 +00:00
aaron
935a4d81c9
fixed the problem where you could specify a logging level that didn't exist
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@208 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 04:29:27 +00:00
depristo
3abaaa3cc3
Tried to add a poor man's version of seeing all reference sites in an interval, and failed. However, I did add the command line argument and a few pieces of useful code.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@206 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 00:12:35 +00:00
hanna
f7097c8ee7
Cleanup.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@205 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 21:24:12 +00:00
hanna
728f932ecf
Fix exclusive options.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@204 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 20:59:32 +00:00
hanna
53fe9acf65
Make command-line arguments available in walker constructor, provide back door from
walker into GATK itself, do some cleanup of output messages, and add some bug fixes.
Command-line arguments in walkers are now feature-complete, but still a bit messy.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@203 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 20:45:27 +00:00
hanna
5f9010116a
Collapse the walker hierarchy, in preparation for in-walker output streams and less hokey walker args.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@201 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 16:22:35 +00:00
depristo
7cad3acc61
Support for dynamically merging data files. Preliminary only -- everything in these systems is still being tested
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@200 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 14:40:50 +00:00
hanna
2808fd4bbd
Better support for required mutually exclusive options.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@199 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 03:22:30 +00:00
hanna
08ece8df79
Bug fixes and support for mutually exclusive options. Still a bit rough, but will
be easier to clean up after a walker refactoring.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@198 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 03:11:56 +00:00
asivache
f47a214f96
massive changes everywhere; lots of bugs fixed; methods moved around; computation and printout of overall stats added; now decides whether to accept or reject 'improvement'; writes alignments into two output sam files (unmodified reads/failed piles into one, realigned piles into the other); special treat for paranoids: writes third sam file with all the analyzed reads, unmodified
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@197 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 02:26:17 +00:00
andrewk
0331cd8e95
Updated AlleleFrequency* classes to calculate separate lods for VarVsRef and BestVsNextBest mixture (qstar) theories; AFWMetrics now reports single sample performance w.r.t. Hapmap chip using the appropriate lod for genotyping (BestVsNextBest) or variant / reference calling (VarVsRef).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@196 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 02:10:18 +00:00
andrewk
c88a17dfee
AlleleFrequencyWalker now can parse 4-base probs
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@195 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 20:33:05 +00:00
hanna
4b7bfb284a
Support for more complex command-line types: arrays, untyped collections, typed collections, interfaces to typed and untyped collections.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@194 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 20:11:31 +00:00
jmaguire
2ed63fe17c
a bunch of changes that support pools.
they don't appear to break single sample:
Allele Frequency Metrics (LOD >= 5)
-------------------------------------------------
Total loci : 9000
Total called with confidence : 8138 (90.42%)
Number of variants : 11 (0.14%) (1/739)
Fraction of variant sites in dbSNP : 81.82%
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@192 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 18:52:42 +00:00
depristo
d457778283
Unified byLoci and byLociByInterval traversals. It now figures out what to do for you based on the presence of an index and set of required locations to process.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@191 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 16:01:58 +00:00
depristo
c18f8fbf5f
Documentation and cleanup of xReadLines.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@190 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 15:36:21 +00:00
kiran
607731da91
Fixed a harmless (but annoying) bug wherein the read name for the SAMRecords increases by two on every iteration rather than one.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@189 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 15:20:29 +00:00
jmaguire
44acc358b7
Add a "notes" member to the AlleleFrequencyEstimate, e.g. for hapmap metadata.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@188 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 15:18:10 +00:00
depristo
d11bb0fc64
Added xReadLines class to utils. It is an iterator<string> and iterable<string> so you can easily read all lines from a file. It's been used to simplify the code to process intervals, and will be used to add merging data support to the system...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@187 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 15:17:38 +00:00
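The idea in the entry above, one class that is both an iterator and an iterable over the lines of a file, can be sketched roughly as follows. This is an assumption-laden illustration, not the actual xReadLines source: the class name `LineReaderSketch` and its constructor taking a `java.io.Reader` are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch: acts as both Iterator<String> and Iterable<String>,
// so it can be used directly in a for-each loop.
public class LineReaderSketch implements Iterator<String>, Iterable<String> {
    private final BufferedReader reader;
    private String nextLine; // line to be returned by the next call to next(), or null at EOF

    LineReaderSketch(Reader source) {
        this.reader = new BufferedReader(source);
        advance();
    }

    // Reads one line ahead; closes the reader once the input is exhausted.
    private void advance() {
        try {
            nextLine = reader.readLine();
            if (nextLine == null) {
                reader.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return nextLine != null;
    }

    @Override
    public String next() {
        if (nextLine == null) {
            throw new NoSuchElementException();
        }
        String current = nextLine;
        advance();
        return current;
    }

    // Returns itself, so the lines can only be traversed once.
    @Override
    public Iterator<String> iterator() {
        return this;
    }

    public static void main(String[] args) {
        // A StringReader stands in for a file of intervals here.
        for (String line : new LineReaderSketch(new StringReader("chr1:1-10\nchr1:11-20\n"))) {
            System.out.println(line);
        }
    }
}
```

Returning `this` from `iterator()` keeps the sketch short, at the cost of being single-pass; to read a real file, a `java.io.FileReader` could be passed in instead of the `StringReader`.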
asivache
4c29dca70d
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@186 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 09:23:42 +00:00
asivache
71d3e8e99b
fixed another bug in gapped alignment computation
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@185 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 08:33:57 +00:00
asivache
40f45c2333
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@184 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 05:48:10 +00:00
depristo
8bdf49a01f
Added slightly more useful output to the Depth of Coverage walker (it now prints the number of loci). Traversal engine now actually prints the reduce result (key) and no longer prints millions of locus interval updates
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@183 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 03:12:54 +00:00
depristo
ff98e28abf
High-performance interval list implementation -- uses StringBuilder to avoid n^2 calculation. Can handle millions of locations quickly now
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@182 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 02:17:48 +00:00
andrewk
30babbf5b9
Restructured AlleleFrequencyMetricsWalker to correctly report Hapmap concordance numbers for genotyping and added reporting for Hapmap reference/variant calling. Also, tiny bugfix in interval code.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@181 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 01:12:05 +00:00
hanna
9e2a373184
Prototype, buggy implementation of walker command-line arguments. Doesn't
(yet) deal elegantly with even simple cases.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@180 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 00:12:00 +00:00