25 Commits (db40e28e542c932ce2cdbf06dc20f5b0fc408565)

Author SHA1 Message Date
depristo db40e28e54 ReadBackedPileup in all its glory. Documented, aligned with the output of LocusIteratorByState, and caching common outputs for performance
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2165 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-25 20:54:44 +00:00
depristo 03342c1fdd Restructuring and interface change to ReadBackedPileup. We no longer support the Pileup interface, the BasicPileup static methods, and the ReadBackedPileup class. Now everything is a ReadBackedPileup and all methods to manipulate pileups are off of it. Also provides the recommended iterable() interface of pileup elements so you can use the syntax for (PileupElement p : pileup) and access directly from p.getBase() and p.getQual() and p.getSecondBase(). Only a few straggler walkers use the old style interface -- but those walkers will be retired soon. Documentation coming in the AM. Please everyone use the new syntax, it's safer, and will be more efficient as soon as the LocusIteratorByState directly emits the ReadBackedPileup for the Alignment context, as opposed to the current interface. In the process of the changeover, discovered several bugs in the second-best base code due to things getting out of sync, but these changes were resolved manually. All other integration tests passed without modification.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2154 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-25 03:51:41 +00:00
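The for-each syntax named in the message above can be sketched as follows. This is a minimal illustration, not the GATK source: the PileupElement and ReadBackedPileup classes below are hypothetical stand-ins that reuse only the names and accessors quoted in the commit message (getBase(), getQual(), getSecondBase()), and the basesAsString helper is invented for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the GATK classes.
public class PileupSketch {

    /** One read's contribution at a locus: base, quality, and second-best base. */
    static final class PileupElement {
        private final byte base;
        private final byte qual;
        private final byte secondBase;

        PileupElement(byte base, byte qual, byte secondBase) {
            this.base = base;
            this.qual = qual;
            this.secondBase = secondBase;
        }

        byte getBase() { return base; }
        byte getQual() { return qual; }
        byte getSecondBase() { return secondBase; }
    }

    /** Implementing Iterable is what enables "for (PileupElement p : pileup)". */
    static final class ReadBackedPileup implements Iterable<PileupElement> {
        private final List<PileupElement> elements = new ArrayList<>();

        void add(PileupElement e) { elements.add(e); }

        @Override
        public Iterator<PileupElement> iterator() { return elements.iterator(); }
    }

    /** Assumed helper: concatenates the primary bases in pileup order. */
    static String basesAsString(ReadBackedPileup pileup) {
        StringBuilder sb = new StringBuilder();
        for (PileupElement p : pileup) {
            sb.append((char) p.getBase());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ReadBackedPileup pileup = new ReadBackedPileup();
        pileup.add(new PileupElement((byte) 'A', (byte) 30, (byte) 'C'));
        pileup.add(new PileupElement((byte) 'G', (byte) 25, (byte) 'T'));
        System.out.println(basesAsString(pileup));
    }
}
```

Because each element carries its own base, quality, and second-best base, the three values cannot drift out of sync the way parallel arrays can, which is presumably the safety benefit the message refers to.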
aaron 33dcfc858d updates to the paper genotyper based on Mark's comments. There's still more work to do, including more testing.
Also a 250% improvement in the getBases() and getQuals() of BasicPileup, which was nearly all of the runtime for the genotyper (using primitives instead of objects when possible).

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2097 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-19 23:06:49 +00:00
depristo 6fe1c337ff Pileup cleanup; pooled caller v1
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2070 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-18 17:03:48 +00:00
chartl 43bd4c8e8f Ignoring deletions in the primary pileup by default was causing the primary pileup to become shorter than the secondary pileup when building up the secondary base pileup string. This fix makes sure to include the primary Ds within the pileup so that not only are the pileups guaranteed to be the same size, the same offsets will truly correspond with the same read.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2058 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-17 17:20:13 +00:00
mmelgar 3742a05760 Now can read E2 or SQ tag.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2027 348d0f76-0448-11de-a6fe-93d51630548a
2009-11-12 15:18:21 +00:00
chartl ad777a9c14 @BasicPileup - made the counts public so they can be used
@PoolUtils - split reads by indel/simple base
@BaseTransitionTable - complete refactoring, nicer now
@UnifiedArgumentCollection - added PoolSize as an argument
@UnifiedGenotyper - checks to ensure pooled sequencing uses the appropriate model
@GenotypeCalculationModel - instantiates with the new PoolSize argument
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1867 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-16 21:56:56 +00:00
ebanks f9a1598d75 Reformatting
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1778 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-07 20:03:34 +00:00
ebanks 8bd345ba00 Generalized deletions in pileup
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1739 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-29 15:58:43 +00:00
ebanks a7c306f757 -deal with offsets that can be -1
-added option to have "D"s inserted for deleted bases in pileup strings


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1635 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-16 16:44:57 +00:00
hanna 21d1eba502 Cleaned division of responsibilities between arguments to map function. Reference has been changed
from an array of bases to an object (ReferenceContext), and LocusContext has been renamed to reflect
the fact that it contains contextual information only about the alignments, not the locus in general.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1376 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-04 21:01:37 +00:00
depristo 9c12c02768 AlleleBalance and on/off primary base filters -- version 0.0.1 -- for experimental use only
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1294 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-22 17:54:44 +00:00
hanna d19366eaad Cleanup emergency fixes for out-of-bounds issues in reference retrieval. Fix spelling mistakes.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1173 348d0f76-0448-11de-a6fe-93d51630548a
2009-07-06 15:41:30 +00:00
kiran b0cc763eb5 Added some methods to format bases such that read bases on the forward strand are in uppercase, while those on the negative strand are lowercase. This does *not* affect the default functionality of the standard PileupWalker
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@969 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 17:31:00 +00:00
kiran 2b0e7f612b Handles bam pileups where some of the reads have SQ tags and some don't.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@958 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-10 08:17:15 +00:00
depristo dc17a5661d Better accessors for dealing with second base prob pileups
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@785 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-21 22:25:16 +00:00
jmaguire 11723fbcc2 added method indelPileup. Generates a pileup of indel alleles given reads and offsets (as from a locus walker).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@663 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 15:08:24 +00:00
depristo 5a4bb76cc3 More capabilities for the pileup
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@621 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 18:03:13 +00:00
kiran 4f818f5c1c Choose a random base to stick in the pileup if the 2nd-best base matches the best base.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@578 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-01 06:27:37 +00:00
kiran 135d3eabeb Now only distributes 80% of the residual probability to the secondary base, 10% each to the other two bases. Nicer labelling for stringified probability distribution output.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@521 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-24 03:34:43 +00:00
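The 80/10/10 split described in the message above can be worked through in a short sketch. This is an assumption-laden illustration, not the committed code: the class name, the distribute method, and the choice to take the best base's probability as a direct input are all hypothetical, and the case where the best and second-best base agree is simply rejected here.

```java
// Hypothetical sketch of the 80/10/10 residual split; not the GATK implementation.
public class ResidualSplitSketch {
    static final char[] BASES = {'A', 'C', 'G', 'T'};

    /** Returns the index of a base in BASES, or throws if it is not A, C, G, or T. */
    static int indexOf(char base) {
        for (int i = 0; i < BASES.length; i++) {
            if (BASES[i] == base) {
                return i;
            }
        }
        throw new IllegalArgumentException("Unsupported base: " + base);
    }

    /**
     * Assigns bestProb to the best base, 80% of the residual (1 - bestProb)
     * to the second-best base, and 10% of the residual to each remaining base.
     * The result is ordered A, C, G, T.
     */
    static double[] distribute(char best, char secondBest, double bestProb) {
        int bestIdx = indexOf(best);
        int secondIdx = indexOf(secondBest);
        if (bestIdx == secondIdx) {
            throw new IllegalArgumentException("Best and second-best base must differ");
        }
        double residual = 1.0 - bestProb;
        double[] probs = new double[BASES.length];
        for (int i = 0; i < BASES.length; i++) {
            if (i == bestIdx) {
                probs[i] = bestProb;
            } else if (i == secondIdx) {
                probs[i] = 0.8 * residual;
            } else {
                probs[i] = 0.1 * residual;
            }
        }
        return probs;
    }

    public static void main(String[] args) {
        double[] probs = distribute('A', 'C', 0.9);
        for (int i = 0; i < BASES.length; i++) {
            System.out.println(BASES[i] + ": " + probs[i]);
        }
    }
}
```

With a best-base probability of 0.9, the residual is 0.1, so the second-best base receives roughly 0.08 and each of the other two bases roughly 0.01, and the four values sum to 1.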
kiran 5b8502745a Added an epsilon (1e-4) to the tertiary and quaternary base hypotheses.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@488 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-22 00:01:37 +00:00
kiran 2ac240d78b Removed an extraneous print statement.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@487 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-21 23:36:36 +00:00
kiran 0149c887ff Fixed a bug wherein the residual probability was not being distributed properly when a file had secondary probs and the best and next-best base agreed.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@486 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-21 23:36:09 +00:00
kiran dac76f041b Added some methods to retrieve the probability distributions of individual bases.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@484 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-21 22:26:25 +00:00
depristo 72a3d84ed2 General purpose pileup code -- you can use these features to obtain detailed pileup data from reads and offsets. Useful for all pileup based walkers. Expanded support for rodSAMPileup to enable the new ValidatingPileupWalker, which takes a samtools pileup output and checks that GATK gives identical output as samtools on a per base and per qual pileup. It's going to be a very useful validation tool.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@418 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 22:13:10 +00:00