Commit Graph

61 Commits (9de3e58aa8ab9532930f2bae2b84800c5dc494eb)

Author SHA1 Message Date
depristo 9de3e58aa8 qualsAsInt argument for Pileup
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@896 348d0f76-0448-11de-a6fe-93d51630548a
2009-06-04 18:37:39 +00:00
hanna 54bb643d19 Validated Mark's assertion that GSA-27 is fixed. Also did some cleanup on the pileup walker so that it doesn't output to System.out.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@812 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-26 15:58:21 +00:00
hanna 008d677bea Fixed ValidatingPileup to work with Andrey's new rodSAMPileup -> GenotypeList type hierarchy.
Fixed reference-ordered data validation system to validate class hierarchies instead of specific class types.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@811 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-23 20:50:28 +00:00
hanna ec2e8d5726 Fixes for getting ValidatingPileup running in parallel.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@807 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-22 21:20:24 +00:00
ebanks f2ea193149 For some reason the apostraphes in the comments were throwing annoying
compile-time warnings: "unmappable character for encoding UTF8"


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@792 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-22 14:07:07 +00:00
depristo 30c63daf89 More improvements to the duplicate quality combiner, making progress towards a clean system
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@788 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-21 22:26:57 +00:00
depristo 65995887fc Releasable version of the Pileup walker
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@786 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-21 22:25:37 +00:00
andrewk 0219d33e10 QualityUtils: added reverse function to reverse an array of bytes (and not complement it), BaseUtils: split qualToProb into itself and qualToErrProb, CovariateCounterWalker and LogisticRecalibrationWalker: several changes including a properly acocunting (only partly complete) for reversing AND complementing bases that are negative strand, PrintReadsWalker: created option to output reads to a BAM file rather than just to the sceern (useful for creating a downsampled BAM file)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@770 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-21 18:30:45 +00:00
ebanks 57918de753 add the @Requires for this walker
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@762 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-20 17:03:12 +00:00
hanna 01a3cb27c7 @Required / @Allows flags for main arguments.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@751 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-19 23:26:17 +00:00
hanna ff798fe483 Reintroduce support for interval-based traversals.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@749 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-18 22:54:18 +00:00
hanna 2c4de7b5c5 Switch TraverseByLoci over to new sharding system, and cleanup some code in passing read files along
the pathway from command line to traversal engine.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@727 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-15 21:02:12 +00:00
hanna 862b8a6787 intervals_file + genome_loc => intervals.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@659 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-12 01:04:18 +00:00
hanna 23e9e29964 Changed reads traversals from providing a LocusContext from which the reference sequence
could be extracted to a char[] containing the reference bases.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@657 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 22:45:11 +00:00
hanna 052819bed5 Switched dependencies of GenomeAnalysisTK to depend on GenomeAnalysisEngine.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@656 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-11 22:33:00 +00:00
ebanks 3aabc144c6 Added functionality to allow for a contract between LocusWindowTraversalEngine and LocusWindowWalker which allows the Walker to act upon reads outside of the provided intervals.
(Really, all we want to do is spit out all reads, but this allows the Walker to do other things with the reads if it wants)


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@641 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 17:28:16 +00:00
hanna 7a9cfe1f75 Push reduceInit down a level so that the walker can call into it without weird casts.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@638 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-08 13:46:28 +00:00
hanna 7f8850a8a2 Argument validation.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@631 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 20:28:56 +00:00
depristo 2204be43eb System for traversing duplicate reads, along with a walker to compute quality scores among duplicates and a smarter method to combine quality scores across duplicates -- v1
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@624 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 18:06:02 +00:00
hanna 752928df94 Switch to better mechanism for supplying a default.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@615 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-07 01:22:01 +00:00
hanna dc944ec69b First stage of ROD plumbing for MicroScheduler.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@614 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-06 23:26:21 +00:00
hanna b0cdba8bb3 Acting on Kiran's suggestion to make the doc tag in the @Argument annotation required.x
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@598 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-05 22:43:40 +00:00
hanna 5bdf653919 Cleanup: prepare for better output handling.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@586 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-01 21:40:46 +00:00
hanna 9f5f6f9bc7 N-way parallelism. Works for small test cases. Untested for large test cases.
-Needs more comprehensive unit testing.
-Needs some basic refactoring.
-Needs rethink of interface boundaries.
-Needs to play more nicely in the /tmp sandbox.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@583 348d0f76-0448-11de-a6fe-93d51630548a
2009-05-01 19:34:09 +00:00
depristo 84dae06d5a Initial version of ByDuplicates traversal, as well as a duplicate quality score estimator
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@576 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-30 22:16:21 +00:00
hanna 7f173af2ea Encapsulate output tracking a bit.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@570 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-30 15:12:13 +00:00
hanna 6ecc43f385 Provide a default logger, some config settings, and some doc updates.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@557 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-29 02:06:05 +00:00
hanna 9a8902571c Placeholder for parallel MicroManager.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@542 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-26 23:08:12 +00:00
ebanks 0c76a70313 Renamed traversal by "interval" to "locusWindow"
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@537 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-26 02:26:08 +00:00
depristo ce470702fc consistency with java naming conventions
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@535 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-24 21:44:48 +00:00
depristo bfce0c93ab removing bad file
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@534 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-24 21:40:04 +00:00
depristo 05c6679321 Enabled ReduceByInterval
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@533 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-24 21:39:44 +00:00
ebanks 13d4692d2e 1. Added a by-interval traversal.
2. Added a shell for the indel cleaner walker (it's currently being used to test the interval traversal).
3. Fixed small bug in downsampling (make sure to downsample the offsets too)
4. GenomeAnalysisTK.execute => anyone object to my change to "instanceof" instead of trying to catch a ClassCastException (yuck)?



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@524 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-24 04:33:35 +00:00
hanna 3af66a462e Make PrintLocusContextWalker less verbose.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@493 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-22 18:28:02 +00:00
kiran 5abfc7d079 Added an argument ('extended' or 'ext') that outputs the four-base probs in a long format.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@485 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-21 22:27:26 +00:00
asivache f2f9fa3ed4 doc added
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@464 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 16:43:25 +00:00
depristo 50ae1763f7 Support for -continue_after_errors flag in the validating pileup walker in case you want to see errors as they arise, rather than aborting greedily
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@461 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-17 03:13:11 +00:00
depristo 24722a442e Slight code cleanup
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@421 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 22:21:36 +00:00
depristo 72a3d84ed2 General purpose pileup code -- you can use these features to obtain detailed pileup data from reads and offsets. Useful for all pileup based walkers. Expanded support for rodSAMPileup to enable the new ValidatingPileupWalker, which takes a samtools pileup output and checks that GATK gives identical output as samtools on a per base and per qual pileup. It's going to be a very useful validation tool.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@418 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 22:13:10 +00:00
kiran 30121534ed Outputs the secondary bases and quals (if available) in verbose mode. Prefixed with the tag 'SQ='.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@398 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 13:58:28 +00:00
depristo 794360c410 Added verbose option to show mapping qualities and base qualities as ints!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@394 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-14 00:54:48 +00:00
hanna e91a429c58 A class to print out as much context about the given locus site as is possible. Useful for testing traversal engines -- run old and new code across a given region and diff the output to make sure they have the same context.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@383 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-13 15:29:55 +00:00
depristo 4eac3193f7 Added RefMetaDataTracker system as a replacement for the List<RefenenceOrderedData> going into walkers. This system allows you to more easily get a tracker for processing using the lookup(name, default) system. See Pileup for an example.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@292 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 19:54:54 +00:00
depristo f031d882c6 ByReference traversals!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@281 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-03 13:23:18 +00:00
depristo 93fc768c38 Fixing problems with SAMQueryIterator and reads
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@263 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-02 18:04:28 +00:00
jmaguire bb3dbb5756 change default onTraversalDone to use the new output streams
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@249 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 19:50:31 +00:00
ebanks 6994cca988 added precision
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@246 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 16:21:29 +00:00
depristo 385736469c High performance pileup code and utilities
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@242 348d0f76-0448-11de-a6fe-93d51630548a
2009-04-01 00:47:47 +00:00
ebanks 8d601a6a42 unbox
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@239 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 15:51:59 +00:00
ebanks 234137dee8 use boolean instead of String for flag to suppress printing in map
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@236 348d0f76-0448-11de-a6fe-93d51630548a
2009-03-31 15:14:00 +00:00