490 Commits (b8a6f6e83046bf990d69bfbe2ca401b3f685cda9)

Author SHA1 Message Date
depristo b8a6f6e830 Support for indexBAM command
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@496 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-22 19:39:07 +00:00
ebanks d99d67d51c Refactored to clean it up a bit
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@495 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-22 19:18:46 +00:00
hanna 1bf4d040d8 Increase default shard size from 5 to 100000.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@494 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-22 18:29:44 +00:00
hanna 3af66a462e Make PrintLocusContextWalker less verbose.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@493 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-22 18:28:02 +00:00
kiran ffcd672c1c Intermediate commit while working on getting four-base probs to work in the single sample genotyper. Has infrastructure for the new combinatorial approach and just choosing the best base more intelligently given a probability distribution over bases and the reference base.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@492 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-22 18:06:50 +00:00
hanna 4cafb95be8 TraverseByLoci / TraverseByLociByReference suffered from the same sam-triggered off-by-one (?) bug as TraverseByReference; it was just less obvious here because these versions don't shard.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@491 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-22 15:48:20 +00:00
kcibul cb2f621d01 reverting accidental commit of change to shard size
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@490 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-22 00:33:28 +00:00
kcibul b820130dce * added ability to load multiple BAM files from command line
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@489 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-22 00:28:08 +00:00
kiran 5b8502745a Added an epsilon (1e-4) to the tertiary and quaternary base hypotheses.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@488 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-22 00:01:37 +00:00
kiran 2ac240d78b Removed an extraneous print statement.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@487 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-21 23:36:36 +00:00
kiran 0149c887ff Fixed a bug wherein the residual probability was not being distributed properly when a file had secondary probs and the best and next-best base agreed.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@486 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-21 23:36:09 +00:00
kiran 5abfc7d079 Added an argument ('extended' or 'ext') that outputs the four-base probs in a long format.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@485 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-21 22:27:26 +00:00
kiran dac76f041b Added some methods to retrieve the probability distributions of individual bases.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@484 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-21 22:26:25 +00:00
kiran 5b2a7c9c23 Added some methods to complement a single simple base ([AaCcGgTt]) and reverse-complement a byte-array of bases.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@483 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-21 22:25:33 +00:00
asivache 521e202a10 updated interface
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@482 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-21 21:07:20 +00:00
asivache 55ca272919 reimplemented; now implements Genotype interface instead of AllelicVariant
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@481 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-21 21:06:42 +00:00
asivache 5f37ba8f26 now can be asked to log at INFO level all concordant or discordant sites, or both
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@480 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-21 21:03:44 +00:00
asivache 1f84b9647d auxiliary data structure for mendelian concordance reporting; it's nice to have the latest version checked in in order for the code to compile...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@479 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-21 21:02:40 +00:00
asivache ece3e9969e one trivial walker to filter reads; bam in -> filter -> bam out
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@478 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-21 20:39:29 +00:00
asivache 61e855200d latest version...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@477 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-21 20:38:37 +00:00
kcibul 64b2fd866f * extracted core quality-score based genotype likelihood code
* precompute expensive operations (log/pow) based on Picard experience

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@476 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-21 18:58:43 +00:00
jmaguire 11c520b283 completed my old draft of the old school single sample genotype walker
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@475 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-21 05:38:04 +00:00
depristo b8233d92c8 Simple IO walker to test / crush file systems and evaluate I/O performance in general
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@474 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-20 14:07:14 +00:00
jmaguire bf76eab955 whoops; fix a comment line.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@473 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-19 17:54:54 +00:00
jmaguire bcba1ff424 Fix a minor rounding bug and putz around with fractional counts in the pooled caller.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@472 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-19 17:52:24 +00:00
jmaguire af6788fa3d Misc:
1. Added logGamma function to utils
2. Required asserts to be enabled in the allele caller (run with java -ea)
3. put checks and asserts of NaN and Infinity in AlleleFrequencyEstimate
4. Added option FRACTIONAL_COUNTS to the pooled caller (not working right yet)

AlleleFrequencyWalker:
5. Made FORCE_1BASE_PROBS not static in AlleleFrequencyWalker (an argument should never be static! Jeez.)
6. changed quality_precision to be 1e-4 (Q40)
7. don't adjust by quality_precision unless the qual is actually zero.
8. added more asserts for NaN and Infinity
9. put in a correction for zero probs in P_D_q
10. changed pG to be hardy-weinberg in the presence of an allele frequency prior (duh)
11. rewrote binomialProb() to not overflow on deep coverage
12. rewrote nchoosek() to behave right on deep coverage
13. put in some binomialProb() tests in the main() routine (they come out right when compared with R)

Hunt for loci where 4bp should change things:
14. added FindNonrandomSecondBestBasePiles walker.

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@471 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-19 15:35:07 +00:00
hanna eafb4633ba Temporary workaround for samtools index bug: there seems to be an off-by-one error. Will file bug report.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@470 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 23:14:41 +00:00
ebanks 758db73b98 Fixed SLOWNESS issue.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@469 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 20:10:34 +00:00
asivache 2a937fa8d3 set SAM file header's sorting order to unsorted, hopefully it will help to speed things up
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@468 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 19:32:24 +00:00
asivache 03ec3452f2 a first, simplest version of a walker that filters out reads based on user-specified criteria and writes remaining reads into a new bam file
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@467 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 18:51:39 +00:00
hanna 1660379753 Matt's current status.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@466 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 18:03:11 +00:00
aaron 01d000411d added some updates to the omniplan
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@465 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 17:44:04 +00:00
asivache f2f9fa3ed4 doc added
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@464 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 16:43:25 +00:00
hanna d639ec3776 Remove some copied code to make sure the traversal engine stays in sync with the locus context provider.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@463 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 16:41:56 +00:00
asivache df5aae5ed4 got rid of a couple of warnings and added percentage(x,base) methods
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@462 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 15:15:21 +00:00
depristo 50ae1763f7 Support for -continue_after_errors flag in the validating pileup walker in case you want to see errors as they arise, rather than aborting greedily
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@461 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 03:13:11 +00:00
depristo ee5ab9536f trivial checking / flagging issues to enable testing of merging iterator performance
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@460 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 03:11:59 +00:00
depristo dbf2344cef Fixes for including duplicate reads in the locus traversal; now checks that the ref arg is provided when needed
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@459 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 01:27:36 +00:00
depristo e842b543c9 Better validation scripts
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@458 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 23:18:00 +00:00
hanna 01be8f09e3 Exception cleanup. All our non-runtime exceptions should extend from StingException, StingException needs to be lower in the tree to build.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@457 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 22:17:25 +00:00
aaron e5c80e59dc fixed the case when you're not seeking, it didn't initialize
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@456 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 22:16:03 +00:00
depristo f47f640df6 Better debugging output and testing
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@455 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 21:54:56 +00:00
hanna 165e504d1c Turning on the new TraverseLociByReference is now only dependent on the -et flag. REGION_STR does not matter.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@454 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 19:45:47 +00:00
aaron 12e1f192c4 Fixed a bug in this code where it would eat reads that didn't start at the beginning of the provided interval. This should fix / help fix Kristian's problem
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@453 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 18:42:00 +00:00
asivache 835f1067d8 added isHom() and isHet() queries to the Genotype interface (with the obvious meaning)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@452 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 18:41:39 +00:00
asivache 55537c0d1e change class name, now it compiles...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@451 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 16:51:00 +00:00
asivache 4f9bc7206f some cleanup, also ensuring that all reads get written into output
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@450 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 16:49:25 +00:00
asivache e8a6cdb386 renamed standalone main
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@449 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 15:56:46 +00:00
asivache 832afd3d60 renamed standalone main
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@448 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 15:56:27 +00:00
asivache 85308f4ddc resurrected indel tool's standalone main
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@447 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-16 15:55:52 +00:00