kcibul
7e05b43f40
* added some error checking for read groups
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@442 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 03:22:49 +00:00
hanna
56f6847456
Changed interface from contig,pos,length to more common contig,start,stop interface.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@441 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 00:04:41 +00:00
depristo
7261787b71
Fixed potential bug with next() operation returning empty contexts when a read contains a large deletion. We can now use the look ahead safely...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@438 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 21:38:28 +00:00
aaron
e70aecf518
bug fix, but important
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@437 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 21:07:20 +00:00
depristo
76d833e39b
Meaningful assert messages
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@435 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 19:12:28 +00:00
aaron
67ea66c866
Bug fix
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@434 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 19:12:18 +00:00
depristo
1edfe48194
Better debugging output with .debug
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@433 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 19:09:18 +00:00
depristo
9cc808104e
Fixed subtle bug in permitting EXPAND_WINDOW to be > 1. We now use the right window size so we avoid including empty hangers. There's still a rare bug to sort out, which occurs in the case where a read with an indel can generate empty hangers.
...
Also cleaned up the debugging output.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@432 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 19:08:26 +00:00
aaron
180ff13290
Added a bunch of changes to support the new MicroManager code
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@431 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 18:29:38 +00:00
hanna
339261c4a9
Load the dictionary and sanity check it against the index.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@430 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 18:04:13 +00:00
hanna
26e84d7fd6
Added index iteration for ReferenceSequenceFile interface compatibility.
...
Added better error checking for querying past the end of a contig.
Lots more testing.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@429 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 17:17:11 +00:00
kcibul
3fda8613c3
* minor formatting changes
...
* support for "extended" output
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@428 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 15:11:05 +00:00
aaron
12407b5b1a
Deleted the old file
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@427 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 13:55:01 +00:00
aaron
6db9127f90
Added changes to shattering, refactored SAMBAM into SAM
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@426 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 13:52:56 +00:00
hanna
182626576f
Basic indexed fasta POC in place. Requires a more complete implementation of the ReferenceSequenceFile interface,
...
and much more testing.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@425 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 13:46:56 +00:00
kiran
7949e377e4
Intermediate commit. Refactored some simple base manipulation stuff into BaseUtils.java. Generalized some likelihood computation logic to make future possible EM-ing easier.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@424 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 04:18:07 +00:00
kiran
d0b8d311e6
Can now optionally print the read and the alignment region of the reference.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@423 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 04:10:30 +00:00
kcibul
d4aaa1bef4
* fixed (with Matt's help) the argument parsing
...
* outputting UCSC wiggle format
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@422 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-15 02:17:39 +00:00
depristo
24722a442e
Slight code cleanup
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@421 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 22:21:36 +00:00
aaron
e6fb122d7d
Added some fixes and new iterator tests
...
--This lin e, and those below, will be ignored--
A gatk/iterators
AM gatk/iterators/BoundedReadIteratorTest.java
M gatk/dataSources/simpleDataSources/SAMBAMDataSourceTest.java
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@420 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 22:19:36 +00:00
aaron
13b0995d54
Adding an iterator that bounds the number of reads
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@419 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 22:18:31 +00:00
depristo
72a3d84ed2
General purpose pileup code -- you can use these features to obtain detailed pileup data from reads and offsets. Useful for all pileup based walkers. Expanded support for rodSAMPileup to enable the new ValidatingPileupWalker, which takes a samtools pileup output and checks that GATK gives identical output as samtools on a per base and per qual pileup. It's going to be a very useful validation tool.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@418 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 22:13:10 +00:00
asivache
baae98c6d5
and don't allocate new 200M string every time please, just pass byte array!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@417 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 21:55:33 +00:00
asivache
9d56355abe
bug fixed when reference name was passed as a string instead of actual reference bases
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@416 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 21:46:27 +00:00
kiran
222c4e5865
Commented out some debugging lines
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@415 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 20:15:41 +00:00
kiran
49d76014d1
Commented out a debugging line
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@414 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 20:15:11 +00:00
kiran
b39e584787
Primary or secondary bases that got a quality score of literally zero led to unfortunate infinities. Added an epsilon (1e-5) to every prob.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@413 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 20:04:49 +00:00
jmaguire
d28e9f9b98
search over q's for finding argmax[q] p(D|q)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@412 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 19:15:45 +00:00
aaron
96248cdec4
Added some output to all the classes, including build in runtime analysis
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@411 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 19:14:53 +00:00
ebanks
647827b18c
Transitioned indel code to use GATK and Walkers
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@410 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 19:14:15 +00:00
ebanks
b363eedd2c
Deal with screwy reads by changing logic to determine whether we are
...
past the last interval
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@409 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 19:13:16 +00:00
hanna
0629f79049
Moved fasta support files into their own package.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@408 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 18:13:23 +00:00
hanna
7a4a5a17c0
Made sequence index compatible with Aaron's junit changes.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@407 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 17:53:20 +00:00
aaron
88ebf1a05b
Fied some documentation
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@406 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 17:41:38 +00:00
hanna
186c799ffc
Class to read an .fai file.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@405 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 17:37:18 +00:00
aaron
704f1bd634
It helps if I check the new base class in with my changes
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@404 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 17:18:16 +00:00
aaron
4b3578e1de
Added the base test case, fixed the rest of the test cases to follow suit. Added more verbose output to ant for junit tests.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@403 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 17:11:38 +00:00
jmaguire
961dbbd4ef
Now output bases and qhat and qstar into the GFF.
...
Quals coming soon (four-base)
QHAT : Most likely alt allele freq (unconstrained by number of chromosomes).
QSTAR : Most likely alt allele freq (constrained by number of chromosomes).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@402 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 15:23:00 +00:00
kiran
dafdff1974
All bases are now indexed as A:0, C:1, G:2, T:3.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@401 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 14:49:43 +00:00
kiran
40ea22eb17
Added some methods to return the cross-talk partner base of a given base or base index.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@400 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 14:49:12 +00:00
aaron
eb4b4a053b
A bunch of updates to the SAM/BAM data source, along with test cases for the merging of multiple files (it works!).
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@399 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 14:19:20 +00:00
kiran
30121534ed
Outputs the secondary bases and quals (if available) in verbose mode. Prefixed with the tag 'SQ='.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@398 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 13:58:28 +00:00
kiran
998fad76c6
Some utility methods for creating pileups of secondary bases and secondary quals.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@397 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 13:57:54 +00:00
depristo
8b2c2e677b
Uses the cleaner new GenomeLoc(read) syntax
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@396 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 00:55:43 +00:00
depristo
1cee7948ab
Added lots of assertions to check for problems.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@395 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 00:55:19 +00:00
depristo
794360c410
Added verbose option to show mapping qualities and base qualities as ints!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@394 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 00:54:48 +00:00
depristo
cc75e8f712
Uses the cleaner new GenomeLoc(read) syntax
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@393 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 00:53:58 +00:00
depristo
11377ef390
Added lots of assertions to check for problems. The current GenomeLoc needs to be cleaned up and refactored but at least it runs. We need unit tests ASAP
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@392 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 00:53:08 +00:00
depristo
bb666ce392
Added mappingQualPileup function for use in the verbose mode of Pileup
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@391 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 00:51:26 +00:00
asivache
bc43c0eefc
there are really cases when we can not merge until we get just two pilesant now we do not crash in those cases but print a warning and just show the resulting n piles even when n>2
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@390 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 00:45:47 +00:00
asivache
8e6093d5a5
remove mom/dad/kid cmd line arguments that were needed for mendelian walker; now we can use generic track binding!!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@389 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 00:45:34 +00:00
kiran
f838a5e511
Changed some double comparisons of the form a == b to abs(a - b) <= precision. Now we shouldn't be passing or failing some if conditions due to floating-point precision.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@388 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 20:05:46 +00:00
aaron
887adcfc7f
Some minor fixes to the last check-in
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@387 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 18:24:51 +00:00
aaron
f2d0d73309
removed old shard strategy code
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@386 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 18:13:45 +00:00
aaron
dd604799dc
Added some new code for shard support over reads
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@385 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 18:11:43 +00:00
asivache
d44c30154a
added MAX_READ_LENGTH - now we can ignore long reads (454?); a bad idea in general, but the performance hit is to hard to take, at least for preliminary testing runs...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@384 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 16:53:12 +00:00
hanna
e91a429c58
A class to print out as much context about the given locus site as is possible. Useful for testing traversal engines -- run old and new code across a given region and diff the output to make sure they have the same context.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@383 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 15:29:55 +00:00
jmaguire
6652f13a17
more verbose gff output!
...
EVEN MORE verbosity to come!
Tremble in anticipation.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@382 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 15:21:23 +00:00
hanna
cf929a8275
Get rid of test case's dependence on transient methods.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@381 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 15:16:42 +00:00
jmaguire
6e180ed44e
Unified caller is go.
...
AlleleFrequencyWalker and related classes work equally well for 2 or 200 chromosomes.
Single Sample Calling:
Allele Frequency Metrics (LOD >= 5)
-------------------------------------------------
Total loci : 171575
Total called with confidence : 168615 (98.27%)
Number of variants : 111 (0.07%) (1/1519)
Fraction of variant sites in dbSNP : 87.39%
-------------------------------------------------
Hapmap metrics are coming up all zero. Will fix.
Pooled Calling:
AAF r-squared after EM is 0.99.
AAF r-squared after EM for alleles < 20% (in pools of ~100-200 chromosomes) is 0.95 (0.75 before EM)
Still not using fractional genotype counts in EM. That should improve r-squared for low frequency alleles.
Chores still outstanding:
- make a real pooled caller walker (as opposed to my experiment framework).
- add fractional genotype counts to EM cycle.
- add pool metrics to the metrics class? *shrug* we don't really have truth outside of a contrived experiment...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@380 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 12:29:51 +00:00
jmaguire
f39092526d
Added function RandomSubset
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@379 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 12:14:53 +00:00
asivache
b4136b6d6e
a few tweaks to make it more robust: ignore reads with cigars containing anything but I,D,M; don't set up contig ordering manually, rely upon reference sequence and its dictionary; don't die if a record does not have NM tag, but faal back to direct counting instead; now requires reference as a cmdline arg
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@378 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 04:49:19 +00:00
kiran
756e6c61d8
Strictness args are presented as lowercase in the help, but only accepted if uppercase. Changed help to list the valid arguments in uppercase.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@376 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 00:50:19 +00:00
kiran
c51f51f255
Make sure we always write at least 1000 points per base in each cycle's scatterplot. Print the disagreement rate between Bustard and FourBaseRecaller.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@375 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 00:49:41 +00:00
kiran
1fb16d54e0
For SAM files that have no alignments and when no reference is specified, contigInfo.getSequence() is null, causing an error when getSequenceName() is called on the resulting null pointer. Check for null instead and return that instead of barfing here.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@374 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 00:48:21 +00:00
kiran
5e96ab6161
Helpful functions for converting a base (char) to a base index (A:0, C:1, G:2, T:3, alphabetical and consistent with Illumina conventions to minimize confusion.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@373 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 00:46:23 +00:00
kiran
35fc002d5d
Debugging information is now written in such a way to make it easier to import into R.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@372 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-12 19:45:33 +00:00
kiran
6ee4fe5a20
Fixed a Bustard/Firecrest file synchronization bug.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@371 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-12 19:44:07 +00:00
kiran
817278be46
If a SAMRecord is on the negative strand, reverse complement the SQ tag.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@370 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-12 19:42:24 +00:00
kiran
1d5a22cacf
Extracts a Fastq file and the SQ tags to a separate file.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@369 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-12 19:41:44 +00:00
kiran
e410c005c0
A debugging tool to ensure the SQ tag in a four-prob SAM file matches the SAMRecord strand orientation.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@368 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-12 19:40:42 +00:00
hanna
9c37400c4f
Added basic performance testing so I can make sure concurrent access doesn't slow down overall fasta access.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@367 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-12 18:05:56 +00:00
kcibul
c7777d46d6
* re-enabled setting of sequence dictionary information on GenomeLoc
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@366 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-12 02:44:14 +00:00
kcibul
ce72932a45
* refactored GenomeLoc to use contigIndex internally for performance and fixed several calling classes
...
* added basic unit test for GenomeLoc
* fixed bug when parsing genome locations like chr1:5000 the start position was being left as maxint rather than being set to the same as the stop position.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@365 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-12 02:25:17 +00:00
hanna
49fd951d8c
Initial test suite for FastaSequenceFile2, so I can add parallelism support with abandon.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@364 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-11 21:10:42 +00:00
hanna
608a66e6ab
TbyLocibyRef previously didn't seem to support traversals with no interval specified. Put in a temporary fix until the threaded approach is in place.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@363 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-10 22:14:06 +00:00
hanna
c2669021b8
Cleanup, and support either by-interval traversals or full traversals in data source-backed code.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@362 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-10 22:09:01 +00:00
hanna
2322bb7d86
Workaround: use a single ReferenceIterator for an entire micromanaged traversal. We'll have to
...
do something about ReferenceIterator thread safety later.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@361 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-10 20:50:28 +00:00
hanna
95753e1b34
Should've been calling queryOverlapping in locus mode.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@360 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-10 20:22:04 +00:00
kiran
2b59110dca
CombineSamAndFourProbs is better.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@358 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-10 04:19:53 +00:00
kiran
56aa98ad30
Ignore null values.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@357 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-10 04:18:20 +00:00
kiran
2ef2c9e121
Fixed an issue wherein the SQ field was only being pulled from the first read of the pileup, no matter what. Fixed an issue wherein Andrew enumerates his bases as A:0, C:1, T:2, G:3, and Kiran's QualityUtils methods enumerate bases as A:0, C:1, G:2, T:3 (we should standardize this). Fixed an issue wherein the remaining probability was being divided by 3 rather than 2 when four-base probs are enabled.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@356 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-10 04:17:53 +00:00
depristo
17b3d5b554
New ROD accessing system, including a generalized interface for binding ROD on the command line that doesn't require you to chance GenomeAnalysisTK.java
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@355 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 22:04:59 +00:00
kiran
f5cc2d8b0b
Commented out import of IlluminaParser.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@354 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 21:30:29 +00:00
hanna
0d825ccfc1
Oops. Fixed duplicate reference to the reference.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@353 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 21:27:57 +00:00
aaron
9afa101465
Add interval support to the
...
.__ __ __
_____| |__ _____ _/ |__/ |_ ___________
/ ___/ | \\__ \\ __\ __\/ __ \_ __ \
\___ \| Y \/ __ \| | | | \ ___/| | \/
/____ >___| (____ /__| |__| \___ >__|
\/ \/ \/ \/
classes!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@352 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 21:23:43 +00:00
kiran
c5220c0822
Four-base probs are now decoded with the relevant method in QualityUtils
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@351 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 20:52:17 +00:00
kiran
9bc763a835
A better (aka 'working') tool for combining four-base probs with an aligned sam file.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@350 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 20:51:37 +00:00
kiran
b7a2e82b46
Can optionally process raw or corrected intensities.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@349 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 20:50:11 +00:00
kiran
6cdad10dd1
Make output type identical to the bustard parser so the values can be easily swapped for one another.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@348 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 20:49:34 +00:00
kiran
d0ce56e018
Remember to take the strand flag into account when calculating error rate per cycle as a surrogate for instrument performance.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@347 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 20:48:45 +00:00
hanna
8a1207e4db
Bringing up scaffolding for integration of locus traversals by reference with Aaron's data source code.
...
Reverts to original TraverseByLociByReference behavior unless a special combination of command-line flags are used.
Lightly tested at best, and major flaws include:
- MicroManager is not doing MicroScheduling right now; it's driving the traversals.
- New database-ish data providers imply by their interface that they're stateless, but they're highly stateful.
- Using static objects to circumvent encapsulation.
- Code duplication is rampant.
- Plus more!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@346 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 20:28:17 +00:00
aaron
8e2f5471a1
Some cleanup to the data source, and another JUnit test case.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@344 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 14:58:05 +00:00
aaron
d56193b6df
Cleanup of a couple of output statements
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@343 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 14:09:07 +00:00
kcibul
c556a97f17
Skeleton of Somatic Coverage tool
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@342 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 02:34:03 +00:00
aaron
12752cf893
Added a bunch of fixes: MSRI wasn't working, sharding had broken edge cases, and SAMBAM DS needed to close the file handles.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@341 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-09 00:20:15 +00:00
kiran
089bf30cf4
Send things to the out file via the logger.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@339 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 21:49:03 +00:00
kiran
6db9a00a0b
SAMFileWriter doesn't appear to flush the buffer when its destructor is called. You have to call the close() method. Also, choose a random base for Ns in the forward and reverse strands so that samtools doesn't pitch a fit.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@338 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 21:48:24 +00:00
kiran
eb2f0ebd62
If the first base of a read is 'N', and the alignment cigar says every base matches, samtools calls shennanigans. Now I just output an A, but the real way to do this is to modify the cigar string accordingly.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@337 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 19:58:18 +00:00
kiran
0e7d962eca
Oops. Slight twiddle of the math here so that I'm not asking if bestBase == nextBestBase.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@336 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 19:56:54 +00:00
aaron
d4ab95c098
Added a constructor, took out a copy constructor, and changed some SAMBAM code.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@335 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 19:53:20 +00:00
kcibul
0b81a76420
added support for Picard IntervalList files to --interval_file
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@334 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 16:49:43 +00:00
aaron
295c269a64
Remove the main() I put in for debugging
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@333 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 16:43:44 +00:00
aaron
d517245beb
Fixes for shattering, added JUnit test case
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@332 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 16:37:34 +00:00
kiran
62ac7366ed
A quick hack to ensure that the sequence, qualities, and secondary qualities are in accordance with the strand flag.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@331 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 15:57:28 +00:00
kiran
25474ebe7e
Computes the read error rate for a bam file. Ignores reads with indels, treats low-quality and high-quality reference bases the same. Does not count ambiguous reference bases as mismatches. Optionally allows for best two bases in read to be used.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@330 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 15:56:10 +00:00
kiran
59b2e6a90f
Added some stuff for retreiving the base index and probability of a compressed base.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@329 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 15:52:58 +00:00
asivache
8d48bdc9ec
it walks... the version committed actually counts snps only
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@328 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 02:00:41 +00:00
asivache
62d75ced3c
nothing fancy, just a wrapper (aka struct) to pass around a bunch of counts
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@327 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 01:58:57 +00:00
asivache
453d13415d
count variant as biallelic if it's just a non-ref homogeneous site!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@326 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 01:57:27 +00:00
depristo
b49f713336
Enabled multiple argument for GATK driver; first step towards generalized -rods <name> <type> <file> argument structure
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@325 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-08 01:52:13 +00:00
asivache
1ade22121b
cruel hack: new toolkit-wide optional cmdline arguments added to allow for loading trio genotyping tracks; to be moved back to walker when walkers can register their data needs with the toolkit
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@324 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:33:26 +00:00
asivache
8ec427ab66
latest version... still under dev/testing
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@323 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:31:06 +00:00
hanna
202c501939
Added a sample xml marshaller / unmarshaller.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@322 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:28:16 +00:00
depristo
00722e19bc
The system now requires a dictionary file for a fasta file, or it throws an error. You can't just operate without a sequence dictionary any longer. We will transition to a GenomeLoc system that assumes a dictionary is available.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@319 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 22:19:54 +00:00
asivache
9c4fc633aa
Make it symmetric: if there is no sequence dictionary, also send a message to the logger, just like we do when we find the dict
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@318 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 21:44:39 +00:00
asivache
b64e4d1a04
seekForwardOffset changed (improved?): first, compareContigs does *not*, in general, return -1,0 or 1 if no dictionary is available; second, be more flexible in trying to jump to the right contig (current implementation of FastaFile2 will still through an exception if there's no dictionary, but iterator itself behaves transparently)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@317 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 21:42:33 +00:00
aaron
2663ac3e4a
documentation fix
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@316 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 21:39:50 +00:00
aaron
8a357a88a2
right...exponential should be exponential, so I might want to increment the exponent
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@315 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 20:12:05 +00:00
aaron
6ce9e0f941
delete the old strategy
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@314 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 19:40:03 +00:00
aaron
08fddd43af
-Replaced adaptive and linear strategies with an adaptive linear strategy
...
-Added the exponential growth strategy
-Added factory code that allows you to transitition between strategies, so if you want to move from linear to exp at a point, and then back when you've hit a runtime threshold, it will take care of it for you.
-Changed the code to return a Shard instead of a GenomeLoc
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@313 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 19:37:38 +00:00
aaron
6369d23b43
renamed; these files are more strategy than actual shards
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@312 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 16:50:56 +00:00
asivache
e95f427965
Added isReference() to AllelicVariant and updated rodDbSNP accordingly
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@311 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 14:49:20 +00:00
kiran
99579a1ef8
Math correction.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@310 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 02:18:13 +00:00
kiran
9be978e006
Intermediate commit (debugging info).
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@309 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 01:20:15 +00:00
aaron
b42d8df646
the new shatter method, independent of the underlying data. The only thing needed to create a Shard is the reference seq, which may be a problem in reference less traversals, so the builder class is there so we can make different construction schemes.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@308 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 00:32:57 +00:00
aaron
0baa8c0f76
We need a base exception so we can differentiate between exceptions we've generated and those external to our code. All our exceptions should extend this exception. I'll migrate the ones I can find later on.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@307 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-07 00:13:45 +00:00
aaron
150bca30aa
typO in the documentation...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@306 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 23:05:59 +00:00
aaron
4aa9c0d591
Matt make a good point that the Reference Iterator we were using wasn't bounded; The BoundedReferenceIterator takes a GenomeLoc to bound the iterations by
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@305 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 23:03:56 +00:00
kiran
5a5c6d1276
Added some debugging stuff (writes model parameters to one file per cycle).
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@304 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 22:00:58 +00:00
aaron
0fc8a90553
removing some files from the old approach to dataSource
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@303 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 21:57:34 +00:00
aaron
5feb7ee627
temperary fix, relying on a old reference order data constructor
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@302 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 21:38:41 +00:00
aaron
af5a443e5a
add Synchronized to the has_next and next methods
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@301 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 21:17:11 +00:00
aaron
97d14abe85
Interface check-in for Matt
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@300 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 21:14:19 +00:00
ebanks
d1c5e986d5
Another check to deal with bad reads (BWA output throws bad exceptions)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@298 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 04:58:22 +00:00
ebanks
3f75fc4e83
Unfortunately, because BWA occasionally outputs crazy reads, we need
...
to make sure not to have an ArrayIndexOutOfBoundsException thrown.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@297 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-06 03:51:35 +00:00
kiran
f12d40dde8
Simplified SAMRecord construction and emission.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@296 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-05 04:48:31 +00:00
asivache
0d25e71953
a declaration is made generic
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@295 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-04 21:55:02 +00:00
asivache
551ce9130f
added isBiallelic() to the AllelicVariant interface and to rodDbSNP implementation. We probably don't really know how to deal with non-biallelic sites just as yet...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@294 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-04 21:31:16 +00:00
ebanks
2e89d5e46f
That was an annoying bug to find. Mark, I want a beer.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@293 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 20:05:24 +00:00
depristo
4eac3193f7
Added RefMetaDataTracker system as a replacement for the List<RefenenceOrderedData> going into walkers. This system allows you to more easily get a tracker for processing using the lookup(name, default) system. See Pileup for an example.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@292 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 19:54:54 +00:00
depristo
c1abcfb014
Fixed problem where we were considering reads out of order because their stop positions where out of order, but with equal starts. This involved a change in the ordering feature of GenomeLoc, which now no longer sorts by both start and stop. So as long as the start positions are equal, things are considered "in order". Perhaps this isn't a good idea to change...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@291 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 19:53:33 +00:00
kiran
ef06924f73
JavaDocs!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@290 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 19:19:17 +00:00
ebanks
42eb356782
1. modifed by read traversals with indexes to be more general
...
2. GenomeLocs for reads should have ends spanning the read
(moved it to GenomeLoc from Utils)
3. Got rid of those stupid unmappable characters from comments in various files
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@289 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 18:24:08 +00:00
andrewk
86fc18e9fc
Fixed merge bug
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@288 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 17:41:58 +00:00
andrewk
bef475778f
- Updated --hapmap switch to --hapmap-chip to reflect the data being chip data for an individual rather than population allele frequency data in Hapmap
...
- Corrected some bugs to get metrics logging working
- Added a switch --force_1base_probs to ignore 4-base probalities if they exist
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@287 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 17:32:31 +00:00
depristo
edc44807af
rod's now have names. Use getName() to access it. Next step is better interface to accessing rods
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@286 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 16:41:33 +00:00
kiran
5019971290
Now outputs four-base SAM record (read name prefixed with KIR) and bustard SAM record (prefixed with BUS) for easy debugging.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@285 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 15:48:51 +00:00
kiran
15151ac125
Corrected the use of the prior.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@284 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 15:47:47 +00:00
kiran
b854c24575
Oops. I gave this method the wrong name first time around.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@283 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 15:46:26 +00:00
kcibul
9bbce32064
Basic dbSNP and HapMap frequency aware SNP caller... still in progress
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@282 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 14:24:09 +00:00
depristo
f031d882c6
ByReference traversals!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@281 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 13:23:18 +00:00
andrewk
e3ac0cb500
- A lot of code cleaned up; separated metrics code from AlleleFrequencyMetricsWalker into AlleleMetrics and eliminated the former class. AFMW (aside from being a name so long that it warrants an acronym) can now be implemented by passing an option to AlleleFreqeuncyWalker that logs metrics to a file.
...
- AlleleMetrics and AlleleMetricrsWalker are now ready to take a list of clasess that implement the AllelicVariant interface
- Switched a genome location in AlleleFrequencyEstimate from String to GenomeLoc which makes way more sense.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@280 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 02:09:10 +00:00
asivache
c6ab60ee04
change variable type to Boolean from boolean to make cmdline parser happy
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@279 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 22:35:30 +00:00
asivache
16aa979e34
make -A a true flag not an argument that asks for 'true/false' value!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@278 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 22:23:46 +00:00
kiran
7d889c0661
Refactored into oblivon.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@276 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 22:12:15 +00:00
kiran
dffc879240
Should now be appropriately using Bustard data to call bases (there are some mathematical subtleties that arise when no longer using ICs as initialization data. Also writes some more relevant fields in the SAM records. WAAAAAY simpler than old version. Like, super way.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@275 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 22:10:13 +00:00
kiran
59334b0270
A convenience class for manipulation base probability distributions.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@274 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 22:08:31 +00:00
kiran
399d9b8c1e
A class that represents the model parameters for all of the Gaussian models for all cycles.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@273 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 22:08:10 +00:00
kiran
f0f94b6c72
A class that represents the model parameters for all of the Gaussian models at a given cycle. Handles the accumulation of parameter initialization data and provides for efficient computation of base probability distribution.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@272 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 22:07:47 +00:00
kiran
a8a6c63a32
A class with some static methods that aid the manipulation of quality scores and probabilities (including a method to compress a base and quality score into a byte for SAM output.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@271 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 22:06:15 +00:00
jmaguire
b7a67da775
Expose the underlying SAM reader to the walkers.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@270 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 21:38:00 +00:00
jmaguire
8ce4dabd7c
Print coverage per reference base for each sample in a merged BAM file.
...
This is a good example for how to untangle a merged BAM file.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@269 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 21:35:31 +00:00
asivache
5d9b068b8b
generic declarations added here and there to eliminate a few annoying warnings; no consequential changes
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@268 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 20:53:01 +00:00
asivache
4bc035d919
half-way through making rodDbSNP implement AllelicVariant interface; does not work yet
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@267 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 20:48:59 +00:00
ebanks
4faa680887
*Massive* speed-up for interval-based by-read traversals.
...
[Could do more optimizing, but this simple fix was good enough for now]
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@266 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 20:19:39 +00:00
kcibul
c192a95998
changes in three files to make the HapMap RODs work:
...
- HapMapAlleleFrequenciesROD.java - the referenceOrderedDatum implementation
- PrepareROD.java - has a static block that loads the known ROD classes, had to add the above
- GenomeAnalysisTK.java - when supplied a hapmap argument... loads the ROD
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@265 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 19:55:19 +00:00
asivache
b4cdd1d9a1
correct package name
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@264 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 18:09:31 +00:00
depristo
93fc768c38
Fixing problems with SAMQueryIterator and reads
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@263 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 18:04:28 +00:00
jmaguire
d202264b23
initial add of pooled calling experiment walker.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@262 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 17:55:40 +00:00
ebanks
3248176118
Die with appropriate error message if we try to read past the end
...
of a chromosome.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@261 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 16:44:32 +00:00
depristo
24e8581c30
Slight improvements to allele caller interface; fixed problem with printing progress
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@260 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 16:44:12 +00:00
asivache
20d4bcbb2e
I said - delete!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@259 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 16:21:21 +00:00
jmaguire
25ace306b9
GenomeAnalysisTK: better documentation of validation option.
...
AlleleFrequencyWalker: output the last reference interval if it's left hanging open.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@258 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 16:11:20 +00:00
asivache
816e768a74
move interface from playground
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@257 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 15:58:01 +00:00
asivache
f26055c926
interface representing allele variants/genotype calls
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@256 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 15:57:19 +00:00
jmaguire
f42b75da72
restore GFF_OUTPUT_FILE to a required argument.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@255 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 14:34:08 +00:00
depristo
2cd9a1597f
Simple improvements to allele caller
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@254 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 14:09:14 +00:00
depristo
d952790258
GFF now parses attributes correctly and efficiently. Slightly better interface to Utils.join
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@253 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 22:54:38 +00:00
hanna
ce57fed2fb
Hack to work around an Apache CLI bug, where core arguments couldn't be commingled with walker arguments. These arguments can commingle now. Everybody into the pool.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@252 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 20:56:42 +00:00
ebanks
6cc2fa24d5
Added ability to downsample to a particular coverage
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@250 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 20:27:06 +00:00
jmaguire
bb3dbb5756
change default onTraversalDone to use the new output streams
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@249 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 19:50:31 +00:00
jmaguire
4faacac315
Now handle the case where we don't actually SEE all of the positions.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@248 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 19:50:07 +00:00
jmaguire
675505646d
now makes confident reference intervals.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@247 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 18:46:14 +00:00
ebanks
6994cca988
added precision
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@246 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 16:21:29 +00:00
hanna
16c2ea4673
Invalid arguments are not always flagged when stopAtNonOption is false. Make sure stopAnNonOption is true when we do final argument validation.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@245 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 15:58:57 +00:00
hanna
7ee792df04
Print correct help if core arguments (--input-file et al) aren't correctly specified.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@244 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 15:16:49 +00:00
ebanks
3af4290a49
Added iterator to randomly downsample to a given fraction of the reads.
...
Also, updated sort iterator to allow user to input max sorts.
Put in placeholder for downsampling to given coverage.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@243 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 02:11:13 +00:00
depristo
385736469c
High performance pileup code and utilities
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@242 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 00:47:47 +00:00
aaron
ad63633b1c
forgot to change the chunks dir to shards before
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@241 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 20:28:20 +00:00
jmaguire
ede52f7359
- take command line arguments
...
- output GFF lines to a file (specified by a command line argument)
- improve the GFF output string
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@240 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 18:43:00 +00:00
ebanks
8d601a6a42
unbox
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@239 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 15:51:59 +00:00
ebanks
234137dee8
use boolean instead of String for flag to suppress printing in map
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@236 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 15:14:00 +00:00
ebanks
907c183242
update walkers so that onTraversalDone works (it now takes an arg)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@235 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 15:05:33 +00:00
ebanks
3896cc8f17
Moved avg depth of coverage functionality into the core depth of coverage
...
walker. Used new command line args for walkers.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@234 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 05:02:33 +00:00
ebanks
007ecc8616
Added a stateless walker to give the average depth of coverage for given reads
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@233 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 02:33:59 +00:00
ebanks
89c1762aa9
Apparently, no one else has tried to create a stateless walker over loci until
...
now, as this should have come up: make sure reduce sums get transferred to the
next reduce.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@232 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 02:31:51 +00:00
aaron
ba99e9f648
checking in some of the more static Data Source dependent code at this point. They don't do much on their own, but are need for the base data source code I'm writing.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@231 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 00:04:03 +00:00
hanna
7fda409f4e
Fixed bug where read traversals would fail with an exception when not called with a genome_region (-L) argument. From TraversalEngine, line 455, looks like Mark intended an invariant where the list of locations is 0 length if not specified. Made GenomeLoc code compliant with that.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@230 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 23:43:12 +00:00
hanna
e812cfbf55
Refactor common functionality out of WalkerManager and into JVMUtils and PathUtils. Add support for loading walkers from a jar.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@229 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 23:20:55 +00:00
hanna
36f851362e
Oops. While writing command-line argument docs, I realized I introduced
...
a regression in default value handling.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@226 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 18:51:39 +00:00
jmaguire
875802e8fc
print output as a GFF line.
...
still need to add printing GFF intervals for stretches of confident reference calls.
does the GFF ROD class handle intervals?? We'll find out. >:)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@225 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 17:47:35 +00:00
jmaguire
b752960586
rearranged some stuff and eliminated the binomial prior in the N!=2 case. Much faster.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@224 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-30 17:26:05 +00:00
hanna
7c6455fe36
Handle the case where a walker is being run outside of the GATK framework, such as JUnit tests.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@222 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-29 01:50:27 +00:00
depristo
d7c0bcc223
Reorganized GenomeLoc code to more clearly and better use the picard SequenceDictionary information.
...
All GenomeLoc[] are not ArrayList<GenomeLoc> for clarity and consistency
Parsing now recursively merges contiguous elements chr1:1-10;chr1:11-20 => chr1:1-20
Added support for TraversingByLoci over all reference positions specified by the provided location array. System dynamically determines which traversal system to use.
Pileup now marks, very clearly, reference positions without covered reads.
Made changes around the codebase to deal with new GenomeLoc structure.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@218 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-28 20:37:27 +00:00
depristo
c2ae6765a3
Removed unnecessary dependence on playground code...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@217 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 22:48:51 +00:00
hanna
b17a03abbd
Fix argument parser test case.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@215 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 16:05:18 +00:00
depristo
cfee59e0e6
New type hierarchy for Traversals. There's a new package to hold them (traversals) and an easy system to create new ones. We are now one step closer to supporting the execution manager (a totally non-functional version is included here) that actually executes walkers in parallel using N threads.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@214 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 15:40:45 +00:00
hanna
4a6be896b9
Provide out and err PrintStreams to the walkers.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@213 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 15:03:32 +00:00
asivache
c6d9848d08
synchronizing latest changes
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@212 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 14:15:44 +00:00
aaron
230c1ad161
moved a bunch of files over to the logging system. In some cases I ballparked the severity level of an error, so if you see something wrong feel free to make changes.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@211 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 14:02:55 +00:00
depristo
826781a760
The traversal engine now passes the reduce result to OnTraversalDone() in the walker base class
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@210 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 13:44:46 +00:00
aaron
d115209e86
moved a bunch of files over to the logging system. In some cases I ballparked the severity level of an error, so if you see something wrong feel free to make changes.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@209 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 13:27:04 +00:00
aaron
935a4d81c9
fixed the problem where you could specify a logging level that didn't exist
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@208 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 04:29:27 +00:00
depristo
3abaaa3cc3
Tried to add a poor man's version of seeing all reference sites in an interval, and failed. However, I did add the command line argument and a few pieces of useful code.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@206 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-27 00:12:35 +00:00
hanna
f7097c8ee7
Cleanup.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@205 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 21:24:12 +00:00
hanna
728f932ecf
Fix exclusive options.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@204 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 20:59:32 +00:00
hanna
53fe9acf65
Make command-line arguments available in walker constructor, provide back door from
...
walker into GATK itself, do some cleanup of output messages, and add some bug fixes.
Command-line arguments in walkers are now feature-complete, but still a bit messy.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@203 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 20:45:27 +00:00
hanna
5f9010116a
Collapse the walker hierarchy, in preparation for in-walker output streams less hokey walker args.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@201 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 16:22:35 +00:00
depristo
7cad3acc61
Support for dynamically merging data files. Preliminary only -- everything in these systems is still being tested
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@200 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 14:40:50 +00:00
hanna
2808fd4bbd
Better support for required mutually exclusive options.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@199 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 03:22:30 +00:00
hanna
08ece8df79
Bug fixes and support for mutually exclusive options. Still a bit rough, but will
...
be easier to clean up after a walker refactoring.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@198 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 03:11:56 +00:00
asivache
f47a214f96
massive changes everywhere; lots of bugs fixed; methods moved around; computation and printout of overall stats added; now decides whether to accept or reject 'improvement'; writes alignments into two output sam files (unmodified reads/failed piles into one, realigned piles into the other); special treat for paranoids: writes third sam file with all the analyzed reads, unmodified
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@197 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 02:26:17 +00:00
andrewk
0331cd8e95
Updated AlleleFrequency* classes to calculate separate lods for VarVsRef and BestVsNextBest mixture (qstar) theories; AFWMetrics now reports single sample performance w.r.t. Hapmap chip using the appropriate lod for gentoyping (BestVsNextBest) or variant / reference calling (VarVsRef).
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@196 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-26 02:10:18 +00:00
andrewk
c88a17dfee
AlleleFrequencyWalker now can parse 4-base probs
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@195 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 20:33:05 +00:00
hanna
4b7bfb284a
Support for more complex command-line types: arrays, untyped collections, typed collections, interfaces to typed and untyped collections.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@194 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 20:11:31 +00:00
jmaguire
2ed63fe17c
a bunch of changes that support pools.
...
they don't appear to break single sample:
Allele Frequency Metrics (LOD >= 5)
-------------------------------------------------
Total loci : 9000
Total called with confidence : 8138 (90.42%)
Number of variants : 11 (0.14%) (1/739)
Fraction of variant sites in dbSNP : 81.82%
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@192 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 18:52:42 +00:00
depristo
d457778283
Unified byLoci and byLociByInterval traversals. It now figures out what to do for you based on the presence of an index and set of required locations to process.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@191 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 16:01:58 +00:00
depristo
c18f8fbf5f
Documentation and cleanup of xReadLines.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@190 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 15:36:21 +00:00
kiran
607731da91
Fixed a harmless (but annoying) bug wherein the read name for the SAMRecords increases by two on every iteration rather than one.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@189 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 15:20:29 +00:00
jmaguire
44acc358b7
Add a "notes" member to the AlleleFreqencyEstimate, e.g. for hapmap metadata.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@188 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 15:18:10 +00:00
depristo
d11bb0fc64
Added xReadLines class to utils. It is a iterator<string> and iterable<string> so you can easily read all lines from a file. It's been used to simplify the code to process intervals, and will be used to add merging data support to the system...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@187 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 15:17:38 +00:00
asivache
4c29dca70d
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@186 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 09:23:42 +00:00
asivache
71d3e8e99b
fixed another bug in gapped alignment computation
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@185 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 08:33:57 +00:00
asivache
40f45c2333
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@184 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 05:48:10 +00:00
depristo
8bdf49a01f
added slightly more useful output to Depth of Coverage walker. (now prints number of loci). Traversal engine now actually prints the reduce result (key) and no longer prints millions of locus interval updates
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@183 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 03:12:54 +00:00
depristo
ff98e28abf
High-performance interval list implement -- uses StringBuilder to avoid n^2 calculation. Can handle millions of locations quickly now
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@182 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 02:17:48 +00:00
andrewk
30babbf5b9
Restructured AlleleFrequencyMetricsWalker to correctly report Hapmap concordance numbers for genotyping and added reporting for Hapmap reference/variant calling. Also, tiny bugfix in interval code.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@181 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 01:12:05 +00:00
hanna
9e2a373184
Prototype, buggy implementation of walker command-line arguments. Doesn't
...
(yet) deal elegantly with even simple cases.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@180 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-25 00:12:00 +00:00
depristo
919a86e876
Cleaned up code for by interval traversals for Jared. Initialization code refactored and made clear. by loci and by loci by interval use the same underlying code now. Everyone uses the same initialization code to set things up. It's a party in the TraversalEngine and everyone's invited...
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@179 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 22:32:45 +00:00
kiran
28c1330b4b
Fixed a bug wherein the loop variable for the second end of the pair was actually looping over the entire raw read (first and second ends combined).
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@178 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 21:59:25 +00:00
aaron
c047b53d6b
added some cleanup of code, and new junit targets to the build file
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@177 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 21:16:12 +00:00
aaron
c2b2ed8e1d
added our first junit test, for the argument parser
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@176 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 21:14:30 +00:00
depristo
6df19ab793
Support for byInterval traversals for Jared. Do not use them.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@175 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 20:55:34 +00:00
depristo
9f500215da
Support for reseting the system; Cleanup later
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@174 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 20:52:11 +00:00
kiran
499c422de6
A version of the four-base caller that computes the probability distribution over base call space by initializing off the Bustard calls rather than the ICs.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@173 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 20:11:39 +00:00
asivache
4222016bf5
stop printing sw matrix and other debug infoant
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@171 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 18:15:52 +00:00
asivache
8ea8a74fbf
fixed bug in calculation of alignment start offset for negative offsets; toString() added
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@170 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 18:05:28 +00:00
asivache
9aa1ccd9b7
fixed some bugs in calling the optimal path; parameters adjusted (?)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@169 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 17:27:51 +00:00
kiran
88d94d407a
Fixed a bug in the parsing of the second end of the pair.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@168 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-24 14:34:37 +00:00