hanna
8a1207e4db
Bringing up scaffolding for integration of locus traversals by reference with Aaron's data source code.
Reverts to original TraverseByLociByReference behavior unless a special combination of command-line flags is used.
Lightly tested at best, and major flaws include:
- MicroManager is not doing MicroScheduling right now; it's driving the traversals.
- New database-ish data providers imply by their interface that they're stateless, but they're highly stateful.
- Using static objects to circumvent encapsulation.
- Code duplication is rampant.
- Plus more!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@346 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 20:28:17 +00:00
aaron
8e2f5471a1
Some cleanup to the data source, and another JUnit test case.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@344 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 14:58:05 +00:00
aaron
d56193b6df
Cleanup of a couple of output statements
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@343 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 14:09:07 +00:00
aaron
12752cf893
Added a bunch of fixes: MSRI wasn't working, sharding had broken edge cases, and SAMBAM DS needed to close the file handles.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@341 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 00:20:15 +00:00
aaron
d4ab95c098
Added a constructor, took out a copy constructor, and changed some SAMBAM code.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@335 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 19:53:20 +00:00
kcibul
0b81a76420
added support for Picard IntervalList files to --interval_file
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@334 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 16:49:43 +00:00
aaron
295c269a64
Remove the main() I put in for debugging
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@333 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 16:43:44 +00:00
aaron
d517245beb
Fixes for shattering, added JUnit test case
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@332 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 16:37:34 +00:00
asivache
453d13415d
count variant as biallelic if it's just a non-ref homogeneous site!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@326 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 01:57:27 +00:00
depristo
b49f713336
Enabled multiple argument for GATK driver; first step towards generalized -rods <name> <type> <file> argument structure
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@325 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 01:52:13 +00:00
asivache
1ade22121b
cruel hack: new toolkit-wide optional cmdline arguments added to allow for loading trio genotyping tracks; to be moved back to walker when walkers can register their data needs with the toolkit
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@324 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:33:26 +00:00
asivache
8ec427ab66
latest version... still under dev/testing
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@323 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:31:06 +00:00
depristo
00722e19bc
The system now requires a dictionary file for a fasta file, or it throws an error. You can't just operate without a sequence dictionary any longer. We will transition to a GenomeLoc system that assumes a dictionary is available.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@319 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:19:54 +00:00
asivache
b64e4d1a04
seekForwardOffset changed (improved?): first, compareContigs does *not*, in general, return -1, 0, or 1 if no dictionary is available; second, be more flexible in trying to jump to the right contig (current implementation of FastaFile2 will still throw an exception if there's no dictionary, but the iterator itself behaves transparently)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@317 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 21:42:33 +00:00
aaron
2663ac3e4a
documentation fix
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@316 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 21:39:50 +00:00
aaron
8a357a88a2
right...exponential should be exponential, so I might want to increment the exponent
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@315 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 20:12:05 +00:00
aaron
6ce9e0f941
delete the old strategy
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@314 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 19:40:03 +00:00
aaron
08fddd43af
-Replaced adaptive and linear strategies with an adaptive linear strategy
-Added the exponential growth strategy
-Added factory code that allows you to transition between strategies, so if you want to move from linear to exp at a point, and then back when you've hit a runtime threshold, it will take care of it for you.
-Changed the code to return a Shard instead of a GenomeLoc
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@313 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 19:37:38 +00:00
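Note on r313/r315: a minimal sketch of what a linear versus exponential shard-size strategy could look like, with the exponent incremented after every shard as r315 describes. The names ShardSizeStrategy, LinearStrategy, ExponentialStrategy, and shatter are hypothetical and are not taken from the committed code; shards are shown as plain inclusive [start, stop] pairs rather than the real Shard type.

```java
import java.util.ArrayList;
import java.util.List;

public class ShardSizeSketch {
    /** Hypothetical strategy interface: returns the size of the next shard. */
    interface ShardSizeStrategy {
        long nextSize();
    }

    /** Every shard has the same fixed size. */
    static class LinearStrategy implements ShardSizeStrategy {
        private final long size;
        LinearStrategy(long size) { this.size = size; }
        public long nextSize() { return size; }
    }

    /** Shard size is base^exponent; the exponent grows by one per shard. */
    static class ExponentialStrategy implements ShardSizeStrategy {
        private final long base;
        private int exponent = 1;
        ExponentialStrategy(long base) { this.base = base; }
        public long nextSize() {
            long size = 1;
            for (int i = 0; i < exponent; i++) {
                size *= base;
            }
            exponent++; // without this increment the growth would be linear
            return size;
        }
    }

    /** Cuts the inclusive range [start, end] into inclusive [start, stop] pairs. */
    static List<long[]> shatter(long start, long end, ShardSizeStrategy strategy) {
        List<long[]> shards = new ArrayList<>();
        long pos = start;
        while (pos <= end) {
            long stop = Math.min(end, pos + strategy.nextSize() - 1);
            shards.add(new long[] { pos, stop });
            pos = stop + 1;
        }
        return shards;
    }

    public static void main(String[] args) {
        for (long[] shard : shatter(1, 20, new ExponentialStrategy(2))) {
            System.out.println(shard[0] + "-" + shard[1]);
        }
    }
}
```

Switching strategies mid-run, as the factory in r313 is said to do, would amount to handing shatter a different ShardSizeStrategy at the transition point; that part is not sketched here.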
aaron
6369d23b43
renamed; these files are more strategy than actual shards
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@312 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 16:50:56 +00:00
asivache
e95f427965
Added isReference() to AllelicVariant and updated rodDbSNP accordingly
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@311 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 14:49:20 +00:00
aaron
b42d8df646
The new shatter method, independent of the underlying data. The only thing needed to create a Shard is the reference seq, which may be a problem in reference-less traversals, so the builder class is there so that we can support different construction schemes.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@308 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 00:32:57 +00:00
aaron
150bca30aa
typO in the documentation...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@306 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 23:05:59 +00:00
aaron
4aa9c0d591
Matt made a good point that the Reference Iterator we were using wasn't bounded; the BoundedReferenceIterator takes a GenomeLoc to bound the iterations by.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@305 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 23:03:56 +00:00
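Note on r305: the idea of bounding a reference iterator by a location can be illustrated roughly as follows. This is only a sketch under stated assumptions, not the committed BoundedReferenceIterator: Interval is a hypothetical stand-in for GenomeLoc, the reference is a plain in-memory String, and coordinates are assumed to be 1-based and inclusive.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

public class BoundedIteratorSketch {
    /** Hypothetical stand-in for a GenomeLoc: contig plus inclusive 1-based start/stop. */
    static final class Interval {
        final String contig;
        final long start;
        final long stop;
        Interval(String contig, long start, long stop) {
            this.contig = contig;
            this.start = start;
            this.stop = stop;
        }
    }

    /** Iterates over the bases of a sequence, but never outside the given interval. */
    static final class BoundedBaseIterator implements Iterator<Character> {
        private final String bases; // position p maps to bases.charAt(p - 1)
        private final long last;    // last position that may be returned
        private long pos;           // next position to return

        BoundedBaseIterator(String bases, Interval bound) {
            this.bases = bases;
            this.pos = Math.max(1, bound.start);
            this.last = Math.min(bases.length(), bound.stop);
        }

        public boolean hasNext() {
            return pos <= last;
        }

        public Character next() {
            if (!hasNext()) {
                throw new NoSuchElementException("iterated past the bounding interval");
            }
            char base = bases.charAt((int) (pos - 1));
            pos++;
            return base;
        }

        public void remove() {
            throw new UnsupportedOperationException();
        }
    }

    /** Convenience helper: collects the bounded bases into a String. */
    static String collect(String bases, Interval bound) {
        StringBuilder out = new StringBuilder();
        Iterator<Character> it = new BoundedBaseIterator(bases, bound);
        while (it.hasNext()) {
            out.append(it.next());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(collect("ACGTACGT", new Interval("chr1", 3, 5)));
    }
}
```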
aaron
0fc8a90553
removing some files from the old approach to dataSource
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@303 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 21:57:34 +00:00
aaron
5feb7ee627
temporary fix, relying on an old reference order data constructor
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@302 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 21:38:41 +00:00
aaron
af5a443e5a
add Synchronized to the has_next and next methods
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@301 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 21:17:11 +00:00
aaron
97d14abe85
Interface check-in for Matt
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@300 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 21:14:19 +00:00
asivache
0d25e71953
a declaration is made generic
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@295 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-04 21:55:02 +00:00
asivache
551ce9130f
added isBiallelic() to the AllelicVariant interface and to rodDbSNP implementation. We probably don't really know how to deal with non-biallelic sites just as yet...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@294 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-04 21:31:16 +00:00
depristo
4eac3193f7
Added RefMetaDataTracker system as a replacement for the List<RefenenceOrderedData> going into walkers. This system allows you to more easily get a tracker for processing using the lookup(name, default) system. See Pileup for an example.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@292 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 19:54:54 +00:00
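Note on r292: the lookup(name, default) pattern mentioned above can be sketched as a name-keyed map that falls back to a caller-supplied default. The class MetaDataTracker and its bind method below are hypothetical illustrations; only the lookup(name, default) call shape comes from the commit message, and the real RefMetaDataTracker may differ.

```java
import java.util.HashMap;
import java.util.Map;

public class TrackerSketch {
    /** Hypothetical tracker: maps a track name to the datum bound at the current locus. */
    static class MetaDataTracker<T> {
        private final Map<String, T> byName = new HashMap<>();

        /** Registers a datum under a track name (assumed helper, not from the commit). */
        void bind(String name, T datum) {
            byName.put(name, datum);
        }

        /** Returns the datum bound to name, or defaultValue if no such track is bound. */
        T lookup(String name, T defaultValue) {
            T datum = byName.get(name);
            return datum != null ? datum : defaultValue;
        }
    }

    public static void main(String[] args) {
        MetaDataTracker<String> tracker = new MetaDataTracker<>();
        tracker.bind("dbSNP", "example-record");
        System.out.println(tracker.lookup("dbSNP", null));
        System.out.println(tracker.lookup("hapmap", "no-track"));
    }
}
```

Compared with scanning a List for the right element, a walker using this shape asks for a track by name and handles the missing case through the default.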
ebanks
42eb356782
1. modified by-read traversals with indexes to be more general
2. GenomeLocs for reads should have ends spanning the read
(moved it to GenomeLoc from Utils)
3. Got rid of those stupid unmappable characters from comments in various files
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@289 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 18:24:08 +00:00
andrewk
86fc18e9fc
Fixed merge bug
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@288 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 17:41:58 +00:00
andrewk
bef475778f
- Updated --hapmap switch to --hapmap-chip to reflect the data being chip data for an individual rather than population allele frequency data in Hapmap
- Corrected some bugs to get metrics logging working
- Added a switch --force_1base_probs to ignore 4-base probabilities if they exist
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@287 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 17:32:31 +00:00
depristo
edc44807af
RODs now have names; use getName() to access them. Next step is a better interface for accessing RODs.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@286 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 16:41:33 +00:00
depristo
f031d882c6
ByReference traversals!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@281 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 13:23:18 +00:00
asivache
c6ab60ee04
change variable type to Boolean from boolean to make cmdline parser happy
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@279 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 22:35:30 +00:00
asivache
16aa979e34
make -A a true flag not an argument that asks for 'true/false' value!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@278 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 22:23:46 +00:00
jmaguire
b7a67da775
Expose the underlying SAM reader to the walkers.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@270 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 21:38:00 +00:00
asivache
5d9b068b8b
generic declarations added here and there to eliminate a few annoying warnings; no consequential changes
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@268 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 20:53:01 +00:00
asivache
4bc035d919
half-way through making rodDbSNP implement AllelicVariant interface; does not work yet
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@267 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 20:48:59 +00:00
ebanks
4faa680887
*Massive* speed-up for interval-based by-read traversals.
[Could do more optimizing, but this simple fix was good enough for now]
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@266 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 20:19:39 +00:00
kcibul
c192a95998
changes in three files to make the HapMap RODs work:
- HapMapAlleleFrequenciesROD.java - the referenceOrderedDatum implementation
- PrepareROD.java - has a static block that loads the known ROD classes, had to add the above
- GenomeAnalysisTK.java - when supplied a hapmap argument... loads the ROD
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@265 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 19:55:19 +00:00
asivache
b4cdd1d9a1
correct package name
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@264 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 18:09:31 +00:00
depristo
93fc768c38
Fixing problems with SAMQueryIterator and reads
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@263 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 18:04:28 +00:00
ebanks
3248176118
Die with appropriate error message if we try to read past the end of a chromosome.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@261 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 16:44:32 +00:00
depristo
24e8581c30
Slight improvements to allele caller interface; fixed problem with printing progress
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@260 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 16:44:12 +00:00
jmaguire
25ace306b9
GenomeAnalysisTK: better documentation of validation option.
AlleleFrequencyWalker: output the last reference interval if it's left hanging open.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@258 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 16:11:20 +00:00
asivache
816e768a74
move interface from playground
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@257 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 15:58:01 +00:00
depristo
d952790258
GFF now parses attributes correctly and efficiently. Slightly better interface to Utils.join
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@253 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 22:54:38 +00:00
ebanks
6cc2fa24d5
Added ability to downsample to a particular coverage
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@250 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 20:27:06 +00:00