kiran
c5220c0822
Four-base probs are now decoded with the relevant method in QualityUtils
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@351 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 20:52:17 +00:00
kiran
9bc763a835
A better (aka 'working') tool for combining four-base probs with an aligned sam file.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@350 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 20:51:37 +00:00
kiran
b7a2e82b46
Can optionally process raw or corrected intensities.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@349 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 20:50:11 +00:00
kiran
6cdad10dd1
Make output type identical to the bustard parser so the values can be easily swapped for one another.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@348 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 20:49:34 +00:00
kiran
d0ce56e018
Remember to take the strand flag into account when calculating error rate per cycle as a surrogate for instrument performance.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@347 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 20:48:45 +00:00
hanna
8a1207e4db
Bringing up scaffolding for integration of locus traversals by reference with Aaron's data source code.
...
Reverts to original TraverseByLociByReference behavior unless a special combination of command-line flags is used.
Lightly tested at best, and major flaws include:
- MicroManager is not doing MicroScheduling right now; it's driving the traversals.
- New database-ish data providers imply by their interface that they're stateless, but they're highly stateful.
- Using static objects to circumvent encapsulation.
- Code duplication is rampant.
- Plus more!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@346 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 20:28:17 +00:00
depristo
49b2622e3d
Helper utility for merging BAM files
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@345 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 20:10:41 +00:00
aaron
8e2f5471a1
Some cleanup to the data source, and another JUnit test case.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@344 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 14:58:05 +00:00
aaron
d56193b6df
Cleanup of a couple of output statements
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@343 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 14:09:07 +00:00
kcibul
c556a97f17
Skeleton of Somatic Coverage tool
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@342 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 02:34:03 +00:00
aaron
12752cf893
Added a bunch of fixes: MSRI wasn't working, sharding had broken edge cases, and SAMBAM DS needed to close the file handles.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@341 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 00:20:15 +00:00
hanna
8efedacabf
Bump sam jdk to svn rev 207.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@340 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 22:16:46 +00:00
kiran
089bf30cf4
Send things to the out file via the logger.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@339 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 21:49:03 +00:00
kiran
6db9a00a0b
SAMFileWriter doesn't appear to flush the buffer when its destructor is called. You have to call the close() method. Also, choose a random base for Ns in the forward and reverse strands so that samtools doesn't pitch a fit.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@338 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 21:48:24 +00:00
kiran
eb2f0ebd62
If the first base of a read is 'N', and the alignment cigar says every base matches, samtools calls shenanigans. Now I just output an A, but the real way to do this is to modify the cigar string accordingly.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@337 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 19:58:18 +00:00
kiran
0e7d962eca
Oops. Slight twiddle of the math here so that I'm not asking if bestBase == nextBestBase.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@336 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 19:56:54 +00:00
aaron
d4ab95c098
Added a constructor, took out a copy constructor, and changed some SAMBAM code.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@335 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 19:53:20 +00:00
kcibul
0b81a76420
added support for Picard IntervalList files to --interval_file
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@334 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 16:49:43 +00:00
aaron
295c269a64
Remove the main() I put in for debugging
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@333 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 16:43:44 +00:00
aaron
d517245beb
Fixes for shattering, added JUnit test case
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@332 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 16:37:34 +00:00
kiran
62ac7366ed
A quick hack to ensure that the sequence, qualities, and secondary qualities are in accordance with the strand flag.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@331 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 15:57:28 +00:00
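
The strand handling described in 62ac7366ed can be sketched roughly as below. This is a minimal illustration, not the committed code: the class and method names are hypothetical, and it assumes the usual SAM convention that reverse-strand reads are stored reverse-complemented with their quality arrays reversed.

```java
// Hypothetical sketch: orient bases and qualities according to the strand flag.
final class StrandOrientationSketch {

    /** Complement of a single base; anything other than A/C/G/T maps to N. */
    static char complement(char base) {
        switch (Character.toUpperCase(base)) {
            case 'A': return 'T';
            case 'C': return 'G';
            case 'G': return 'C';
            case 'T': return 'A';
            default:  return 'N';
        }
    }

    /** Returns the bases unchanged on the forward strand, reverse-complemented otherwise. */
    static String orientBases(String bases, boolean negativeStrand) {
        if (!negativeStrand) {
            return bases;
        }
        StringBuilder sb = new StringBuilder(bases.length());
        for (int i = bases.length() - 1; i >= 0; i--) {
            sb.append(complement(bases.charAt(i)));
        }
        return sb.toString();
    }

    /** Returns a copy of the qualities, reversed when the read is on the negative strand. */
    static byte[] orientQualities(byte[] quals, boolean negativeStrand) {
        byte[] out = quals.clone();
        if (negativeStrand) {
            for (int i = 0, j = out.length - 1; i < j; i++, j--) {
                byte tmp = out[i];
                out[i] = out[j];
                out[j] = tmp;
            }
        }
        return out;
    }
}
```

The same positional reversal would apply to the secondary qualities; any base identity encoded inside them is not handled in this sketch.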
kiran
25474ebe7e
Computes the read error rate for a bam file. Ignores reads with indels, treats low-quality and high-quality reference bases the same. Does not count ambiguous reference bases as mismatches. Optionally allows for best two bases in read to be used.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@330 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 15:56:10 +00:00
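
The counting rules listed in 25474ebe7e might look something like the following. It is only a sketch under stated assumptions: all names are hypothetical, reads and reference are passed as pre-aligned strings of equal length, comparison is case-insensitive, and bases over an ambiguous reference position still count toward the denominator (the commit message does not say either way).

```java
// Hypothetical sketch of a per-read mismatch count and an aggregate error rate.
final class ReadErrorRateSketch {

    /** True if the CIGAR string contains an insertion or a deletion. */
    static boolean hasIndel(String cigar) {
        return cigar.indexOf('I') >= 0 || cigar.indexOf('D') >= 0;
    }

    /** True for any reference base other than A, C, G or T (for example N). */
    static boolean isAmbiguous(char refBase) {
        char b = Character.toUpperCase(refBase);
        return b != 'A' && b != 'C' && b != 'G' && b != 'T';
    }

    /**
     * Counts mismatches between a read and the reference it aligns to.
     * If secondBest is non-null, a position also counts as a match when the
     * second-best base call agrees with the reference.
     */
    static int countMismatches(String read, String secondBest, String ref) {
        int mismatches = 0;
        for (int i = 0; i < read.length(); i++) {
            char r = Character.toUpperCase(ref.charAt(i));
            if (isAmbiguous(r)) {
                continue; // ambiguous reference bases are never mismatches
            }
            boolean matches = Character.toUpperCase(read.charAt(i)) == r;
            if (!matches && secondBest != null) {
                matches = Character.toUpperCase(secondBest.charAt(i)) == r;
            }
            if (!matches) {
                mismatches++;
            }
        }
        return mismatches;
    }

    /** Error rate over all reads without indels; 0.0 if no bases were examined. */
    static double errorRate(String[] cigars, String[] reads, String[] refs) {
        long mismatches = 0;
        long bases = 0;
        for (int i = 0; i < reads.length; i++) {
            if (hasIndel(cigars[i])) {
                continue; // reads with indels are ignored entirely
            }
            mismatches += countMismatches(reads[i], null, refs[i]);
            bases += reads[i].length();
        }
        return bases == 0 ? 0.0 : (double) mismatches / bases;
    }
}
```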
kiran
59b2e6a90f
Added some stuff for retrieving the base index and probability of a compressed base.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@329 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 15:52:58 +00:00
asivache
8d48bdc9ec
it walks... the version committed actually counts snps only
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@328 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 02:00:41 +00:00
asivache
62d75ced3c
nothing fancy, just a wrapper (aka struct) to pass around a bunch of counts
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@327 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 01:58:57 +00:00
asivache
453d13415d
count variant as biallelic if it's just a non-ref homogeneous site!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@326 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 01:57:27 +00:00
depristo
b49f713336
Enabled multiple arguments for GATK driver; first step towards generalized -rods <name> <type> <file> argument structure
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@325 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 01:52:13 +00:00
asivache
1ade22121b
cruel hack: new toolkit-wide optional cmdline arguments added to allow for loading trio genotyping tracks; to be moved back to walker when walkers can register their data needs with the toolkit
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@324 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:33:26 +00:00
asivache
8ec427ab66
latest version... still under dev/testing
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@323 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:31:06 +00:00
hanna
202c501939
Added a sample xml marshaller / unmarshaller.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@322 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:28:16 +00:00
hanna
abe2d25f10
Added castor dependency.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@321 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:27:39 +00:00
depristo
9d35f0ca67
The system now requires a dictionary file for a fasta file, or it throws an error. You can't just operate without a sequence dictionary any longer. We will transition to a GenomeLoc system that assumes a dictionary is available.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@320 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:21:57 +00:00
depristo
00722e19bc
The system now requires a dictionary file for a fasta file, or it throws an error. You can't just operate without a sequence dictionary any longer. We will transition to a GenomeLoc system that assumes a dictionary is available.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@319 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:19:54 +00:00
asivache
9c4fc633aa
Make it symmetric: if there is no sequence dictionary, also send a message to the logger, just like we do when we find the dict
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@318 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 21:44:39 +00:00
asivache
b64e4d1a04
seekForwardOffset changed (improved?): first, compareContigs does *not*, in general, return -1, 0, or 1 if no dictionary is available; second, be more flexible in trying to jump to the right contig (current implementation of FastaFile2 will still throw an exception if there's no dictionary, but iterator itself behaves transparently)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@317 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 21:42:33 +00:00
aaron
2663ac3e4a
documentation fix
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@316 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 21:39:50 +00:00
aaron
8a357a88a2
right...exponential should be exponential, so I might want to increment the exponent
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@315 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 20:12:05 +00:00
aaron
6ce9e0f941
delete the old strategy
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@314 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 19:40:03 +00:00
aaron
08fddd43af
-Replaced adaptive and linear strategies with an adaptive linear strategy
...
-Added the exponential growth strategy
-Added factory code that allows you to transition between strategies, so if you want to move from linear to exp at a point, and then back when you've hit a runtime threshold, it will take care of it for you.
-Changed the code to return a Shard instead of a GenomeLoc
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@313 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 19:37:38 +00:00
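
One way to picture the strategies from 08fddd43af is shown below. This is a loose sketch, not the committed design: every name is hypothetical, it assumes a strategy's job is to pick the size of the next shard, and the switch between strategies is triggered by a call count rather than the runtime threshold mentioned in the commit message.

```java
// Hypothetical sketch: linear and exponential shard-size strategies plus a simple switch.
final class ShardSizeSketch {

    /** Decides how large the next shard should be. */
    interface Strategy {
        long nextShardSize();
    }

    /** Sizes grow by a fixed step: base, base + step, base + 2 * step, ... */
    static Strategy linear(final long base, final long step) {
        return new Strategy() {
            private long calls = 0;

            public long nextShardSize() {
                return base + step * calls++;
            }
        };
    }

    /** Sizes double each call by incrementing the exponent: base, 2 * base, 4 * base, ... */
    static Strategy exponential(final long base) {
        return new Strategy() {
            private int exponent = 0;

            public long nextShardSize() {
                long size = base << exponent;
                if (exponent < 20) {
                    exponent++; // arbitrary cap so the shift cannot grow without bound
                }
                return size;
            }
        };
    }

    /** Uses the first strategy for switchAfter calls, then the second one. */
    static Strategy switching(final Strategy first, final Strategy second, final int switchAfter) {
        return new Strategy() {
            private int calls = 0;

            public long nextShardSize() {
                return (calls++ < switchAfter ? first : second).nextShardSize();
            }
        };
    }
}
```

For example, `ShardSizeSketch.switching(ShardSizeSketch.linear(1000, 1000), ShardSizeSketch.exponential(1000), 5)` would grow linearly for five shards and exponentially afterwards.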
aaron
6369d23b43
renamed; these files are more strategy than actual shards
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@312 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 16:50:56 +00:00
asivache
e95f427965
Added isReference() to AllelicVariant and updated rodDbSNP accordingly
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@311 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 14:49:20 +00:00
kiran
99579a1ef8
Math correction.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@310 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 02:18:13 +00:00
kiran
9be978e006
Intermediate commit (debugging info).
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@309 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 01:20:15 +00:00
aaron
b42d8df646
the new shatter method, independent of the underlying data. The only thing needed to create a Shard is the reference seq, which may be a problem in reference-less traversals, so the builder class is there so we can make different construction schemes.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@308 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 00:32:57 +00:00
aaron
0baa8c0f76
We need a base exception so we can differentiate between exceptions we've generated and those external to our code. All our exceptions should extend this exception. I'll migrate the ones I can find later on.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@307 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 00:13:45 +00:00
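
The idea behind 0baa8c0f76 can be illustrated with a small hierarchy like the one below. The names are hypothetical and the choice of an unchecked (RuntimeException) base is an assumption; the commit message only says that all project exceptions should extend a common base.

```java
// Hypothetical sketch: a common base exception lets callers tell internal failures from external ones.
final class BaseExceptionSketch {

    /** Base class for every exception raised by the project's own code. */
    static class ToolkitException extends RuntimeException {
        ToolkitException(String message) {
            super(message);
        }

        ToolkitException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Example of a specific project exception extending the base. */
    static class MissingDictionaryException extends ToolkitException {
        MissingDictionaryException(String message) {
            super(message);
        }
    }

    /** Runs a task and reports whether a failure came from project code or elsewhere. */
    static String classify(Runnable task) {
        try {
            task.run();
            return "ok";
        } catch (ToolkitException e) {
            return "internal"; // raised by our own code
        } catch (RuntimeException e) {
            return "external"; // raised by a library or the JVM
        }
    }
}
```

Because the more specific catch clause comes first, any subclass of the base is reported as internal without the caller needing to know the concrete type.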
aaron
150bca30aa
typO in the documentation...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@306 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 23:05:59 +00:00
aaron
4aa9c0d591
Matt made a good point that the Reference Iterator we were using wasn't bounded; the BoundedReferenceIterator takes a GenomeLoc to bound the iterations by
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@305 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 23:03:56 +00:00
kiran
5a5c6d1276
Added some debugging stuff (writes model parameters to one file per cycle).
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@304 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 22:00:58 +00:00
aaron
0fc8a90553
removing some files from the old approach to dataSource
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@303 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 21:57:34 +00:00
aaron
5feb7ee627
temporary fix, relying on an old reference order data constructor
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@302 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 21:38:41 +00:00