hanna
7f8850a8a2
Argument validation.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@631 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 20:28:56 +00:00
depristo
5a6892900e
fixing oddities in duplicates
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@628 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 18:55:45 +00:00
depristo
2204be43eb
System for traversing duplicate reads, along with a walker to compute quality scores among duplicates and a smarter method to combine quality scores across duplicates -- v1
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@624 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 18:06:02 +00:00
hanna
752928df94
Switch to better mechanism for supplying a default.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@615 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 01:22:01 +00:00
hanna
dc944ec69b
First stage of ROD plumbing for MicroScheduler.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@614 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 23:26:21 +00:00
aaron
5136724884
Added code to the schedulers, one step closer to turning on the new reads traversals
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@613 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 22:36:25 +00:00
aaron
0aba688e6f
Added an interface that all our SAMRecord iterators should try to code to. This is in an effort to keep our code generic
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@609 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 21:40:41 +00:00
hanna
98716138e9
Cleanup: add support for non-public fields. Track matches as state of parsing engine as well as definitions.
...
Made fields of command-line argument system non-public by default.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@606 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 19:38:05 +00:00
aaron
f5eae98af2
Fixed a bug where we could ask for a read when there were none in the pool (that's a bad thing).
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@605 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 18:40:55 +00:00
hanna
ef211f96b1
Remove old Apache CLI-based arg system.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@604 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 18:37:51 +00:00
hanna
521aa40baa
Bring new command-line argument parsing system live.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@603 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 18:16:11 +00:00
hanna
4ac9e72739
Migrate default and GATK arguments over to new attribute system in preparation for conversion.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@600 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-05 23:57:48 +00:00
hanna
b0cdba8bb3
Acting on Kiran's suggestion to make the doc tag in the @Argument annotation required.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@598 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-05 22:43:40 +00:00
aaron
f5880109a7
Added TraverseReads test, some bug fixes discovered in the traversal test
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@594 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-05 20:36:00 +00:00
aaron
daa2163ee8
Made the MergingSamIterator2 peekable. This iterator is becoming a duct-taped-together Swiss Army knife; the iterators could use a redo soon.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@593 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-05 19:15:07 +00:00
aaron
09b0b6b57d
Fixes to try and speed up unmapped read traversals. Still not nearly as fast as they should be, but the next step would be to modify samtools code.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@592 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-05 18:17:07 +00:00
hanna
6e38966349
Rename some key classes.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@587 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-01 22:01:04 +00:00
hanna
5bdf653919
Cleanup: prepare for better output handling.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@586 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-01 21:40:46 +00:00
hanna
9f5f6f9bc7
N-way parallelism. Works for small test cases. Untested for large test cases.
...
-Needs more comprehensive unit testing.
-Needs some basic refactoring.
-Needs rethink of interface boundaries.
-Needs to play more nicely in the /tmp sandbox.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@583 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-01 19:34:09 +00:00
depristo
84dae06d5a
Initial version of ByDuplicates traversal, as well as a duplicate quality score estimator
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@576 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-30 22:16:21 +00:00
depristo
ff420f5f6f
Enabled iterator() function
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@575 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-30 22:15:14 +00:00
aaron
63403d32cd
Changes to the interface to the simple data source rippled out to a bunch of files.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@572 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-30 20:35:56 +00:00
hanna
7f173af2ea
Encapsulate output tracking a bit.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@570 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-30 15:12:13 +00:00
hanna
ba9a0b5da8
Break some of the weird inner classes out of the HierarchicalMicroScheduler.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@566 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-29 21:07:07 +00:00
hanna
95d10ba314
Sketch of hierarchical reduce process, with unit tests for some core classes. Requires breakout of inner classes, testing.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@565 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-29 20:26:16 +00:00
ebanks
7de5da7065
Start getting the cleaner working in Walker
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@561 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-29 14:59:53 +00:00
hanna
6ecc43f385
Provide a default logger, some config settings, and some doc updates.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@557 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-29 02:06:05 +00:00
aaron
b836761104
removed the test cases from the bottom of this file
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@556 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-28 21:50:22 +00:00
aaron
d4de68e260
added changes for the readsTraversal to accommodate design changes
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@553 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-28 19:49:58 +00:00
aaron
b6874f30cb
Added changes to bounded read iterator; it now explicitly takes a MSRI2 instead of the interface ClosableIterator<SAMRecord>. It would be good to fix this in the future with an interface that lets you get the (possibly merged) header.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@552 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-28 17:57:54 +00:00
aaron
395aaf48b0
Added the new by reads traversal, still needs to be sewn into the micromanager code.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@551 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-28 17:55:08 +00:00
aaron
a343f3eab7
Fixed bug where we weren't setting the read group correctly. Also added code to set the printMetrics field of the singleSampleGenotyper from the Pool caller; without that set, it was throwing a null pointer exception for me.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@548 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-27 15:17:20 +00:00
hanna
9a8902571c
Placeholder for parallel MicroManager.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@542 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-26 23:08:12 +00:00
hanna
1daa011387
Interval-based traversals were bleeding file handles. Fixed.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@541 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-26 18:35:54 +00:00
hanna
1e2e78265d
Inadvertently removed interval file support in new TbLbR. Fixed.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@540 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-26 18:15:42 +00:00
hanna
c9e9731495
More cleanup.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@539 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-26 17:46:52 +00:00
hanna
4036f24909
Documentation and cleanup work in preparation for parallelism.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@538 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-26 17:42:00 +00:00
ebanks
0c76a70313
Renamed traversal by "interval" to "locusWindow"
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@537 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-26 02:26:08 +00:00
depristo
9a299c11d3
Oops, typo and build problems. FYI, fixing typos is better than packing...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@536 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-25 01:37:17 +00:00
depristo
ce470702fc
consistency with java naming conventions
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@535 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-24 21:44:48 +00:00
depristo
bfce0c93ab
removing bad file
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@534 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-24 21:40:04 +00:00
depristo
05c6679321
Enabled ReduceByInterval
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@533 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-24 21:39:44 +00:00
hanna
ee2f022c71
Make new TraverseByLociByReference the default.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@532 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-24 19:50:11 +00:00
hanna
e50ae97fe1
Introduce new index-based fasta reader. Clean up MicroManager code, pushing necessary code back into TraversalEngine.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@531 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-24 19:40:21 +00:00
jmaguire
dd408a2a9a
First draft of actual pooled EM caller.
...
Produces sane looking output on region of 1kG pilot1:
CALL NA12813.SRP000031.2009_02.bam CC 0.609084 0.609084
CALL NA12003.SRP000031.2009_02.bam CC 2.114234 2.114234 CCCCC
CALL NA06994.SRP000031.2009_02.bam CC 0.910114 0.910114 C
CALL NA18940.SRP000031.2009_02.bam CT 2.589749 0.910114 T
CALL NA18555.SRP000031.2009_02.bam CC 0.609084 0.609084
Next up, eval vs. Baseline pilot1 calls and pilot3 deep-coverage truth.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@525 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-24 13:42:15 +00:00
ebanks
13d4692d2e
1. Added a by-interval traversal.
...
2. Added a shell for the indel cleaner walker (it's currently being used to test the interval traversal).
3. Fixed small bug in downsampling (make sure to downsample the offsets too)
4. GenomeAnalysisTK.execute => anyone object to my change to "instanceof" instead of trying to catch a ClassCastException (yuck)?
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@524 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-24 04:33:35 +00:00
aaron
bd4cacb832
Added code to make a read group and sample name for BAM files that don't annotate them on reads. The defaults for both are now the filename, but this may be shortened in the future.
...
The sample name for a read can be retrieved with the command:
read.getAttribute(SAMTag.RG.toString());
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@518 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-24 00:31:00 +00:00
aaron
635bfd8604
Added a bit of a hack to get the header back to the walker by initialization time, which was before sharding in the last version.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@516 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-23 21:07:11 +00:00
aaron
0208d201c7
Forgot this in the last commit...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@515 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-23 20:47:22 +00:00
aaron
3dc2afd7ab
Added the ability to get a merged header in a LociByReference traversal
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@514 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-23 20:34:52 +00:00
hanna
282f1d88b8
Make the operation 'read from the iterator and place on the queue' atomic with respect to hasNext(), next().
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@513 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-23 20:16:26 +00:00
aaron
8c13940c5a
A lot of changes to support by-read sharding and some from debugging of the by loci traversals
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@511 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-23 19:03:14 +00:00
hanna
3d7575bbb8
Oops...omitted walker.initialize().
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@504 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-23 17:35:28 +00:00
hanna
1bf4d040d8
Increase default shard size from 5 to 100000.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@494 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-22 18:29:44 +00:00
hanna
3af66a462e
Make PrintLocusContextWalker less verbose.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@493 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-22 18:28:02 +00:00
hanna
4cafb95be8
TraverseByLoci / TraverseByLociByReference suffered from the same sam-triggered off-by-one (?) bug as TraverseByReference; it was just less obvious here because these versions don't shard.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@491 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-22 15:48:20 +00:00
kcibul
cb2f621d01
reverting accidental commit of change to shard size
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@490 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-22 00:33:28 +00:00
kcibul
b820130dce
* added ability to load multiple BAM files from command line
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@489 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-22 00:28:08 +00:00
kiran
5abfc7d079
Added an argument ('extended' or 'ext') that outputs the four-base probs in a long format.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@485 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-21 22:27:26 +00:00
asivache
521e202a10
updated interface
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@482 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-21 21:07:20 +00:00
asivache
55ca272919
reimplemented; now implements Genotype interface instead of AllelicVariant
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@481 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-21 21:06:42 +00:00
hanna
eafb4633ba
Temporary workaround for samtools index bug: there seems to be an off-by-one error. Will file bug report.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@470 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 23:14:41 +00:00
asivache
f2f9fa3ed4
doc added
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@464 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 16:43:25 +00:00
hanna
d639ec3776
Remove some copied code to make sure the traversal engine stays in sync with the locus context provider.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@463 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 16:41:56 +00:00
depristo
50ae1763f7
Support for -continue_after_errors flag in the validating pileup walker in case you want to see errors as they arise, rather than aborting greedily
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@461 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 03:13:11 +00:00
depristo
ee5ab9536f
trivial checking / flagging issues to enable testing of merging iterator performance
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@460 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 03:11:59 +00:00
depristo
dbf2344cef
Fixes for including duplicate reads in the locus traversal; now checks that the ref arg is provided when needed
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@459 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 01:27:36 +00:00
hanna
01be8f09e3
Exception cleanup. All our non-runtime exceptions should extend from StingException, StingException needs to be lower in the tree to build.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@457 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 22:17:25 +00:00
aaron
e5c80e59dc
fixed the case when you're not seeking, it didn't initialize
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@456 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 22:16:03 +00:00
hanna
165e504d1c
Turning on the new TraverseLociByReference is now only dependent on the -et flag. REGION_STR does not matter.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@454 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 19:45:47 +00:00
aaron
12e1f192c4
Fixed a bug in this code where it would eat reads that didn't start at the beginning of the provided interval. This should fix / help fix Kristian's problem
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@453 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 18:42:00 +00:00
asivache
835f1067d8
added isHom() and isHet() queries to the Genotype interface (with the obvious meaning)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@452 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 18:41:39 +00:00
kcibul
d35a542bb9
* fixed bug where the merged header was not being set on the read (although the read group was)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@445 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 12:53:07 +00:00
asivache
0d324354ae
separate interface for genotypes as opposed to (population) allelic variants
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@443 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 03:55:16 +00:00
depristo
7261787b71
Fixed potential bug with next() operation returning empty contexts when a read contains a large deletion. We can now use the look ahead safely...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@438 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 21:38:28 +00:00
aaron
e70aecf518
bug fix, but important
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@437 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 21:07:20 +00:00
aaron
67ea66c866
Bug fix
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@434 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 19:12:18 +00:00
depristo
1edfe48194
Better debugging output with .debug
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@433 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 19:09:18 +00:00
depristo
9cc808104e
Fixed subtle bug in permitting EXPAND_WINDOW to be > 1. We now use the right window size so we avoid including empty hangers. There's still a rare bug to sort out, which occurs in the case where a read with an indel can generate empty hangers.
...
Also cleaned up the debugging output.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@432 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 19:08:26 +00:00
aaron
180ff13290
Added a bunch of changes to support the new MicroManager code
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@431 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 18:29:38 +00:00
aaron
12407b5b1a
Deleted the old file
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@427 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 13:55:01 +00:00
aaron
6db9127f90
Added changes to shattering, refactored SAMBAM into SAM
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@426 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 13:52:56 +00:00
depristo
24722a442e
Slight code cleanup
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@421 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 22:21:36 +00:00
aaron
13b0995d54
Adding an iterator that bounds the number of reads
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@419 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 22:18:31 +00:00
depristo
72a3d84ed2
General purpose pileup code -- you can use these features to obtain detailed pileup data from reads and offsets. Useful for all pileup based walkers. Expanded support for rodSAMPileup to enable the new ValidatingPileupWalker, which takes a samtools pileup output and checks that GATK gives the same output as samtools on a per base and per qual pileup. It's going to be a very useful validation tool.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@418 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 22:13:10 +00:00
hanna
0629f79049
Moved fasta support files into their own package.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@408 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 18:13:23 +00:00
aaron
eb4b4a053b
A bunch of updates to the SAM/BAM data source, along with test cases for the merging of multiple files (it works!).
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@399 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 14:19:20 +00:00
kiran
30121534ed
Outputs the secondary bases and quals (if available) in verbose mode. Prefixed with the tag 'SQ='.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@398 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 13:58:28 +00:00
depristo
8b2c2e677b
Uses the cleaner new GenomeLoc(read) syntax
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@396 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 00:55:43 +00:00
depristo
1cee7948ab
Added lots of assertions to check for problems.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@395 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 00:55:19 +00:00
depristo
794360c410
Added verbose option to show mapping qualities and base qualities as ints!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@394 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 00:54:48 +00:00
depristo
cc75e8f712
Uses the cleaner new GenomeLoc(read) syntax
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@393 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 00:53:58 +00:00
asivache
8e6093d5a5
remove mom/dad/kid cmd line arguments that were needed for mendelian walker; now we can use generic track binding!!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@389 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 00:45:34 +00:00
aaron
887adcfc7f
Some minor fixes to the last check-in
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@387 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 18:24:51 +00:00
aaron
f2d0d73309
removed old shard strategy code
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@386 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 18:13:45 +00:00
aaron
dd604799dc
Added some new code for shard support over reads
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@385 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 18:11:43 +00:00
hanna
e91a429c58
A class to print out as much context about the given locus site as is possible. Useful for testing traversal engines -- run old and new code across a given region and diff the output to make sure they have the same context.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@383 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 15:29:55 +00:00
asivache
b4136b6d6e
a few tweaks to make it more robust: ignore reads with cigars containing anything but I,D,M; don't set up contig ordering manually, rely upon reference sequence and its dictionary; don't die if a record does not have NM tag, but fall back to direct counting instead; now requires reference as a cmdline arg
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@378 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 04:49:19 +00:00
kiran
756e6c61d8
Strictness args are presented as lowercase in the help, but only accepted if uppercase. Changed help to list the valid arguments in uppercase.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@376 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 00:50:19 +00:00
kcibul
c7777d46d6
* re-enabled setting of sequence dictionary information on GenomeLoc
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@366 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-12 02:44:14 +00:00
kcibul
ce72932a45
* refactored GenomeLoc to use contigIndex internally for performance and fixed several calling classes
...
* added basic unit test for GenomeLoc
* fixed bug when parsing genome locations like chr1:5000 the start position was being left as maxint rather than being set to the same as the stop position.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@365 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-12 02:25:17 +00:00
hanna
608a66e6ab
TbyLocibyRef previously didn't seem to support traversals with no interval specified. Put in a temporary fix until the threaded approach is in place.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@363 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-10 22:14:06 +00:00
hanna
c2669021b8
Cleanup, and support either by-interval traversals or full traversals in data source-backed code.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@362 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-10 22:09:01 +00:00
hanna
2322bb7d86
Workaround: use a single ReferenceIterator for an entire micromanaged traversal. We'll have to
...
do something about ReferenceIterator thread safety later.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@361 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-10 20:50:28 +00:00
hanna
95753e1b34
Should've been calling queryOverlapping in locus mode.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@360 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-10 20:22:04 +00:00
depristo
17b3d5b554
New ROD accessing system, including a generalized interface for binding ROD on the command line that doesn't require you to change GenomeAnalysisTK.java
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@355 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 22:04:59 +00:00
hanna
0d825ccfc1
Oops. Fixed duplicate reference to the reference.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@353 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 21:27:57 +00:00
aaron
9afa101465
Add interval support to the
...
shatter
classes!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@352 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 21:23:43 +00:00
hanna
8a1207e4db
Bringing up scaffolding for integration of locus traversals by reference with Aaron's data source code.
...
Reverts to original TraverseByLociByReference behavior unless a special combination of command-line flags are used.
Lightly tested at best, and major flaws include:
- MicroManager is not doing MicroScheduling right now; it's driving the traversals.
- New database-ish data providers imply by their interface that they're stateless, but they're highly stateful.
- Using static objects to circumvent encapsulation.
- Code duplication is rampant.
- Plus more!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@346 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 20:28:17 +00:00
aaron
8e2f5471a1
Some cleanup to the data source, and another JUnit test case.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@344 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 14:58:05 +00:00
aaron
d56193b6df
Cleanup of a couple of output statements
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@343 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 14:09:07 +00:00
aaron
12752cf893
Added a bunch of fixes: MSRI wasn't working, sharding had broken edge cases, and SAMBAM DS needed to close the file handles.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@341 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 00:20:15 +00:00
aaron
d4ab95c098
Added a constructor, took out a copy constructor, and changed some SAMBAM code.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@335 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 19:53:20 +00:00
kcibul
0b81a76420
added support for Picard IntervalList files to --interval_file
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@334 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 16:49:43 +00:00
aaron
295c269a64
Remove the main() I put in for debugging
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@333 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 16:43:44 +00:00
aaron
d517245beb
Fixes for shattering, added JUnit test case
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@332 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 16:37:34 +00:00
asivache
453d13415d
count variant as biallelic if it's just a non-ref homogeneous site!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@326 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 01:57:27 +00:00
depristo
b49f713336
Enabled multiple argument for GATK driver; first step towards generalized -rods <name> <type> <file> argument structure
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@325 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 01:52:13 +00:00
asivache
1ade22121b
cruel hack: new toolkit-wide optional cmdline arguments added to allow for loading trio genotyping tracks; to be moved back to walker when walkers can register their data needs with the toolkit
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@324 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:33:26 +00:00
asivache
8ec427ab66
latest version... still under dev/testing
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@323 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:31:06 +00:00
depristo
00722e19bc
The system now requires a dictionary file for a fasta file, or it throws an error. You can't just operate without a sequence dictionary any longer. We will transition to a GenomeLoc system that assumes a dictionary is available.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@319 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:19:54 +00:00
asivache
b64e4d1a04
seekForwardOffset changed (improved?): first, compareContigs does *not*, in general, return -1,0 or 1 if no dictionary is available; second, be more flexible in trying to jump to the right contig (current implementation of FastaFile2 will still throw an exception if there's no dictionary, but iterator itself behaves transparently)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@317 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 21:42:33 +00:00
aaron
2663ac3e4a
documentation fix
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@316 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 21:39:50 +00:00
aaron
8a357a88a2
right...exponential should be exponential, so I might want to increment the exponent
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@315 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 20:12:05 +00:00
aaron
6ce9e0f941
delete the old strategy
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@314 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 19:40:03 +00:00
aaron
08fddd43af
-Replaced adaptive and linear strategies with an adaptive linear strategy
...
-Added the exponential growth strategy
-Added factory code that allows you to transition between strategies, so if you want to move from linear to exp at a point, and then back when you've hit a runtime threshold, it will take care of it for you.
-Changed the code to return a Shard instead of a GenomeLoc
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@313 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 19:37:38 +00:00
aaron
6369d23b43
renamed; these files are more strategy than actual shards
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@312 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 16:50:56 +00:00
asivache
e95f427965
Added isReference() to AllelicVariant and updated rodDbSNP accordingly
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@311 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 14:49:20 +00:00
aaron
b42d8df646
the new shatter method, independent of the underlying data. The only thing needed to create a Shard is the reference seq, which may be a problem in reference-less traversals, so the builder class is there so we can make different construction schemes.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@308 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 00:32:57 +00:00
aaron
150bca30aa
typO in the documentation...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@306 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 23:05:59 +00:00
aaron
4aa9c0d591
Matt made a good point that the Reference Iterator we were using wasn't bounded; the BoundedReferenceIterator takes a GenomeLoc to bound the iterations by
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@305 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 23:03:56 +00:00
aaron
0fc8a90553
removing some files from the old approach to dataSource
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@303 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 21:57:34 +00:00
aaron
5feb7ee627
temporary fix, relying on an old reference order data constructor
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@302 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 21:38:41 +00:00
aaron
af5a443e5a
add Synchronized to the has_next and next methods
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@301 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 21:17:11 +00:00
aaron
97d14abe85
Interface check-in for Matt
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@300 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 21:14:19 +00:00
asivache
0d25e71953
a declaration is made generic
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@295 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-04 21:55:02 +00:00
asivache
551ce9130f
added isBiallelic() to the AllelicVariant interface and to rodDbSNP implementation. We probably don't really know how to deal with non-biallelic sites just as yet...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@294 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-04 21:31:16 +00:00
depristo
4eac3193f7
Added RefMetaDataTracker system as a replacement for the List<ReferenceOrderedData> going into walkers. This system allows you to more easily get a tracker for processing using the lookup(name, default) system. See Pileup for an example.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@292 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 19:54:54 +00:00
ebanks
42eb356782
1. modified by-read traversals with indexes to be more general
...
2. GenomeLocs for reads should have ends spanning the read
(moved it to GenomeLoc from Utils)
3. Got rid of those stupid unmappable characters from comments in various files
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@289 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 18:24:08 +00:00
andrewk
86fc18e9fc
Fixed merge bug
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@288 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 17:41:58 +00:00
andrewk
bef475778f
- Updated --hapmap switch to --hapmap-chip to reflect the data being chip data for an individual rather than population allele frequency data in Hapmap
...
- Corrected some bugs to get metrics logging working
- Added a switch --force_1base_probs to ignore 4-base probabilities if they exist
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@287 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 17:32:31 +00:00
depristo
edc44807af
RODs now have names. Use getName() to access them. Next step is a better interface for accessing RODs
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@286 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 16:41:33 +00:00
depristo
f031d882c6
ByReference traversals!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@281 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 13:23:18 +00:00
asivache
c6ab60ee04
change variable type to Boolean from boolean to make cmdline parser happy
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@279 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 22:35:30 +00:00
asivache
16aa979e34
make -A a true flag not an argument that asks for 'true/false' value!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@278 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 22:23:46 +00:00
jmaguire
b7a67da775
Expose the underlying SAM reader to the walkers.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@270 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 21:38:00 +00:00
asivache
5d9b068b8b
generic declarations added here and there to eliminate a few annoying warnings; no consequential changes
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@268 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 20:53:01 +00:00
asivache
4bc035d919
half-way through making rodDbSNP implement AllelicVariant interface; does not work yet
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@267 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 20:48:59 +00:00
ebanks
4faa680887
*Massive* speed-up for interval-based by-read traversals.
...
[Could do more optimizing, but this simple fix was good enough for now]
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@266 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 20:19:39 +00:00
kcibul
c192a95998
changes in three files to make the HapMap RODs work:
...
- HapMapAlleleFrequenciesROD.java - the referenceOrderedDatum implementation
- PrepareROD.java - has a static block that loads the known ROD classes, had to add the above
- GenomeAnalysisTK.java - when supplied a hapmap argument... loads the ROD
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@265 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 19:55:19 +00:00
asivache
b4cdd1d9a1
correct package name
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@264 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 18:09:31 +00:00
depristo
93fc768c38
Fixing problems with SAMQueryIterator and reads
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@263 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 18:04:28 +00:00
ebanks
3248176118
Die with appropriate error message if we try to read past the end of a chromosome.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@261 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 16:44:32 +00:00
depristo
24e8581c30
Slight improvements to allele caller interface; fixed problem with printing progress
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@260 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 16:44:12 +00:00
jmaguire
25ace306b9
GenomeAnalysisTK: better documentation of validation option.
...
AlleleFrequencyWalker: output the last reference interval if it's left hanging open.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@258 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 16:11:20 +00:00
asivache
816e768a74
move interface from playground
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@257 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 15:58:01 +00:00
depristo
d952790258
GFF now parses attributes correctly and efficiently. Slightly better interface to Utils.join
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@253 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 22:54:38 +00:00
ebanks
6cc2fa24d5
Added ability to downsample to a particular coverage
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@250 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 20:27:06 +00:00
jmaguire
bb3dbb5756
change default onTraversalDone to use the new output streams
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@249 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 19:50:31 +00:00
ebanks
6994cca988
added precision
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@246 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 16:21:29 +00:00
ebanks
3af4290a49
Added iterator to randomly downsample to a given fraction of the reads.
...
Also, updated sort iterator to allow user to input max sorts.
Put in placeholder for downsampling to given coverage.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@243 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 02:11:13 +00:00
depristo
385736469c
High performance pileup code and utilities
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@242 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 00:47:47 +00:00
aaron
ad63633b1c
forgot to change the chunks dir to shards before
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@241 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 20:28:20 +00:00
ebanks
8d601a6a42
unbox
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@239 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 15:51:59 +00:00
ebanks
234137dee8
use boolean instead of String for flag to suppress printing in map
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@236 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 15:14:00 +00:00
ebanks
3896cc8f17
Moved avg depth of coverage functionality into the core depth of coverage walker.
...
Used new command line args for walkers.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@234 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 05:02:33 +00:00
ebanks
89c1762aa9
Apparently, no one else has tried to create a stateless walker over loci until now, as this should have come up: make sure reduce sums get transferred to the next reduce.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@232 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 02:31:51 +00:00
aaron
ba99e9f648
checking in some of the more static Data Source dependent code at this point. They don't do much on their own, but are needed for the base data source code I'm writing.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@231 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 00:04:03 +00:00
hanna
e812cfbf55
Refactor common functionality out of WalkerManager and into JVMUtils and PathUtils. Add support for loading walkers from a jar.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@229 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 23:20:55 +00:00
hanna
7c6455fe36
Handle the case where a walker is being run outside of the GATK framework, such as JUnit tests.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@222 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-29 01:50:27 +00:00
depristo
d7c0bcc223
Reorganized GenomeLoc code to more clearly and better use the picard SequenceDictionary information.
...
All GenomeLoc[] are now ArrayList<GenomeLoc> for clarity and consistency
Parsing now recursively merges contiguous elements chr1:1-10;chr1:11-20 => chr1:1-20
Added support for TraversingByLoci over all reference positions specified by the provided location array. System dynamically determines which traversal system to use.
Pileup now marks, very clearly, reference positions without covered reads.
Made changes around the codebase to deal with new GenomeLoc structure.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@218 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-28 20:37:27 +00:00
depristo
c2ae6765a3
Removed unnecessary dependence on playground code...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@217 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 22:48:51 +00:00
depristo
cfee59e0e6
New type hierarchy for Traversals. There's a new package to hold them (traversals) and an easy system to create new ones. We are now one step closer to supporting the execution manager (a totally non-functional version is included here) that actually executes walkers in parallel using N threads.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@214 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 15:40:45 +00:00
hanna
4a6be896b9
Provide out and err PrintStreams to the walkers.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@213 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 15:03:32 +00:00
depristo
826781a760
The traversal engine now passes the reduce result to OnTraversalDone() in the walker base class
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@210 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 13:44:46 +00:00
aaron
d115209e86
moved a bunch of files over to the logging system. In some cases I ballparked the severity level of an error, so if you see something wrong feel free to make changes.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@209 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 13:27:04 +00:00
depristo
3abaaa3cc3
Tried to add a poor man's version of seeing all reference sites in an interval, and failed. However, I did add the command line argument and a few pieces of useful code.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@206 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 00:12:35 +00:00
hanna
53fe9acf65
Make command-line arguments available in walker constructor, provide back door from walker into GATK itself, do some cleanup of output messages, and add some bug fixes.
...
Command-line arguments in walkers are now feature-complete, but still a bit messy.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@203 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 20:45:27 +00:00
hanna
5f9010116a
Collapse the walker hierarchy, in preparation for in-walker output streams and less hokey walker args.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@201 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 16:22:35 +00:00
depristo
7cad3acc61
Support for dynamically merging data files. Preliminary only -- everything in these systems is still being tested
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@200 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 14:40:50 +00:00
depristo
d457778283
Unified byLoci and byLociByInterval traversals. It now figures out what to do for you based on the presence of an index and set of required locations to process.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@191 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 16:01:58 +00:00
depristo
d11bb0fc64
Added xReadLines class to utils. It is an iterator<string> and iterable<string> so you can easily read all lines from a file. It's been used to simplify the code to process intervals, and will be used to add merging data support to the system...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@187 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 15:17:38 +00:00
depristo
8bdf49a01f
added slightly more useful output to Depth of Coverage walker. (now prints number of loci). Traversal engine now actually prints the reduce result (key) and no longer prints millions of locus interval updates
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@183 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 03:12:54 +00:00
depristo
ff98e28abf
High-performance interval list implementation -- uses StringBuilder to avoid n^2 calculation. Can handle millions of locations quickly now
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@182 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 02:17:48 +00:00
andrewk
30babbf5b9
Restructured AlleleFrequencyMetricsWalker to correctly report Hapmap concordance numbers for genotyping and added reporting for Hapmap reference/variant calling. Also, tiny bugfix in interval code.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@181 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 01:12:05 +00:00
hanna
9e2a373184
Prototype, buggy implementation of walker command-line arguments. Doesn't (yet) deal elegantly with even simple cases.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@180 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 00:12:00 +00:00
depristo
919a86e876
Cleaned up code for by interval traversals for Jared. Initialization code refactored and made clear. by loci and by loci by interval use the same underlying code now. Everyone uses the same initialization code to set things up. It's a party in the TraversalEngine and everyone's invited...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@179 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 22:32:45 +00:00
depristo
6df19ab793
Support for byInterval traversals for Jared. Do not use them.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@175 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 20:55:34 +00:00
depristo
9f500215da
Support for resetting the system; Cleanup later
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@174 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 20:52:11 +00:00
andrewk
9dee9ab51c
Added Hapmap data track (using rodGFF class for GFF file format) to toolkit as a command line option, Hapmap metrics to AlleleFrequencyMetricsWalker, and a python Geli2GFF file converter.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@163 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 03:58:03 +00:00
hanna
f7363cf935
Support for loading from either a jar or a class directory. Fixes troubles with IntelliJ debugging.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@162 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 03:56:49 +00:00
hanna
63cd1fe201
Push core / playground lower into the tree.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@160 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-23 23:19:54 +00:00