hanna
8efedacabf
Bump sam jdk to svn rev 207.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@340 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 22:16:46 +00:00
kiran
089bf30cf4
Send things to the out file via the logger.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@339 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 21:49:03 +00:00
kiran
6db9a00a0b
SAMFileWriter doesn't appear to flush the buffer when its destructor is called. You have to call the close() method. Also, choose a random base for Ns in the forward and reverse strands so that samtools doesn't pitch a fit.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@338 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 21:48:24 +00:00
kiran
eb2f0ebd62
If the first base of a read is 'N', and the alignment cigar says every base matches, samtools calls shenanigans. Now I just output an A, but the real way to do this is to modify the cigar string accordingly.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@337 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 19:58:18 +00:00
kiran
0e7d962eca
Oops. Slight twiddle of the math here so that I'm not asking if bestBase == nextBestBase.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@336 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 19:56:54 +00:00
aaron
d4ab95c098
Added a constructor, took out a copy constructor, and changed some SAMBAM code.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@335 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 19:53:20 +00:00
kcibul
0b81a76420
added support for Picard IntervalList files to --interval_file
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@334 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 16:49:43 +00:00
aaron
295c269a64
Remove the main() I put in for debugging
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@333 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 16:43:44 +00:00
aaron
d517245beb
Fixes for shattering, added JUnit test case
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@332 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 16:37:34 +00:00
kiran
62ac7366ed
A quick hack to ensure that the sequence, qualities, and secondary qualities are in accordance with the strand flag.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@331 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 15:57:28 +00:00
kiran
25474ebe7e
Computes the read error rate for a bam file. Ignores reads with indels, treats low-quality and high-quality reference bases the same. Does not count ambiguous reference bases as mismatches. Optionally allows for best two bases in read to be used.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@330 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 15:56:10 +00:00
kiran
59b2e6a90f
Added some stuff for retrieving the base index and probability of a compressed base.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@329 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 15:52:58 +00:00
asivache
8d48bdc9ec
it walks... the version committed actually counts snps only
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@328 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 02:00:41 +00:00
asivache
62d75ced3c
nothing fancy, just a wrapper (aka struct) to pass around a bunch of counts
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@327 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 01:58:57 +00:00
asivache
453d13415d
count variant as biallelic if it's just a non-ref homogeneous site!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@326 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 01:57:27 +00:00
depristo
b49f713336
Enabled multiple arguments for the GATK driver; first step towards a generalized -rods <name> <type> <file> argument structure
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@325 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 01:52:13 +00:00
asivache
1ade22121b
cruel hack: new toolkit-wide optional cmdline arguments added to allow for loading trio genotyping tracks; to be moved back to walker when walkers can register their data needs with the toolkit
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@324 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:33:26 +00:00
asivache
8ec427ab66
latest version... still under dev/testing
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@323 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:31:06 +00:00
hanna
202c501939
Added a sample xml marshaller / unmarshaller.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@322 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:28:16 +00:00
hanna
abe2d25f10
Added castor dependency.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@321 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:27:39 +00:00
depristo
9d35f0ca67
The system now requires a dictionary file for a fasta file, or it throws an error. You can't just operate without a sequence dictionary any longer. We will transition to a GenomeLoc system that assumes a dictionary is available.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@320 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:21:57 +00:00
depristo
00722e19bc
The system now requires a dictionary file for a fasta file, or it throws an error. You can't just operate without a sequence dictionary any longer. We will transition to a GenomeLoc system that assumes a dictionary is available.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@319 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:19:54 +00:00
asivache
9c4fc633aa
Make it symmetric: if there is no sequence dictionary, also send a message to the logger, just like we do when we find the dict
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@318 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 21:44:39 +00:00
asivache
b64e4d1a04
seekForwardOffset changed (improved?): first, compareContigs does *not*, in general, return -1, 0, or 1 if no dictionary is available; second, be more flexible in trying to jump to the right contig (current implementation of FastaFile2 will still throw an exception if there's no dictionary, but the iterator itself behaves transparently)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@317 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 21:42:33 +00:00
aaron
2663ac3e4a
documentation fix
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@316 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 21:39:50 +00:00
aaron
8a357a88a2
right...exponential should be exponential, so I might want to increment the exponent
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@315 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 20:12:05 +00:00
aaron
6ce9e0f941
delete the old strategy
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@314 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 19:40:03 +00:00
aaron
08fddd43af
-Replaced adaptive and linear strategies with an adaptive linear strategy
-Added the exponential growth strategy
-Added factory code that allows you to transition between strategies, so if you want to move from linear to exp at a point, and then back when you've hit a runtime threshold, it will take care of it for you.
-Changed the code to return a Shard instead of a GenomeLoc
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@313 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 19:37:38 +00:00
aaron
6369d23b43
renamed; these files are more strategy than actual shards
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@312 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 16:50:56 +00:00
asivache
e95f427965
Added isReference() to AllelicVariant and updated rodDbSNP accordingly
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@311 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 14:49:20 +00:00
kiran
99579a1ef8
Math correction.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@310 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 02:18:13 +00:00
kiran
9be978e006
Intermediate commit (debugging info).
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@309 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 01:20:15 +00:00
aaron
b42d8df646
the new shatter method, independent of the underlying data. The only thing needed to create a Shard is the reference seq, which may be a problem in reference-less traversals, so the builder class is there so we can make different construction schemes.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@308 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 00:32:57 +00:00
aaron
0baa8c0f76
We need a base exception so we can differentiate between exceptions we've generated and those external to our code. All our exceptions should extend this exception. I'll migrate the ones I can find later on.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@307 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 00:13:45 +00:00
aaron
150bca30aa
typO in the documentation...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@306 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 23:05:59 +00:00
aaron
4aa9c0d591
Matt made a good point that the Reference Iterator we were using wasn't bounded; the BoundedReferenceIterator takes a GenomeLoc to bound the iterations by
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@305 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 23:03:56 +00:00
kiran
5a5c6d1276
Added some debugging stuff (writes model parameters to one file per cycle).
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@304 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 22:00:58 +00:00
aaron
0fc8a90553
removing some files from the old approach to dataSource
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@303 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 21:57:34 +00:00
aaron
5feb7ee627
temporary fix, relying on an old reference order data constructor
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@302 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 21:38:41 +00:00
aaron
af5a443e5a
add Synchronized to the has_next and next methods
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@301 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 21:17:11 +00:00
aaron
97d14abe85
Interface check-in for Matt
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@300 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 21:14:19 +00:00
hanna
820cf09198
Updated with last week / next week for 6 April.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@299 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 14:05:20 +00:00
ebanks
d1c5e986d5
Another check to deal with bad reads (BWA output throws bad exceptions)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@298 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 04:58:22 +00:00
ebanks
3f75fc4e83
Unfortunately, because BWA occasionally outputs crazy reads, we need to make sure not to have an ArrayIndexOutOfBoundsException thrown.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@297 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 03:51:35 +00:00
kiran
f12d40dde8
Simplified SAMRecord construction and emission.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@296 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-05 04:48:31 +00:00
asivache
0d25e71953
a declaration is made generic
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@295 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-04 21:55:02 +00:00
asivache
551ce9130f
added isBiallelic() to the AllelicVariant interface and to rodDbSNP implementation. We probably don't really know how to deal with non-biallelic sites just as yet...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@294 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-04 21:31:16 +00:00
ebanks
2e89d5e46f
That was an annoying bug to find. Mark, I want a beer.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@293 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 20:05:24 +00:00
depristo
4eac3193f7
Added RefMetaDataTracker system as a replacement for the List<RefenenceOrderedData> going into walkers. This system allows you to more easily get a tracker for processing using the lookup(name, default) system. See Pileup for an example.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@292 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 19:54:54 +00:00
depristo
c1abcfb014
Fixed problem where we were considering reads out of order because their stop positions were out of order, but with equal starts. This involved a change in the ordering feature of GenomeLoc, which now no longer sorts by both start and stop. So as long as the start positions are equal, things are considered "in order". Perhaps this isn't a good idea to change...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@291 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 19:53:33 +00:00