Mark DePristo
7c5cdb51c2
UnitTests for ActivityProfile and minor ART cleanup
...
-- TODO for ryan -- there are bugs in ActivityProfile code that I cannot fix right now :-(
-- UnitTesting framework for ActivityProfile -- needs to be expanded
-- Minor helper functions for ActiveRegion to help with unit tests
2012-03-14 17:26:37 -04:00
Mark DePristo
e440c9be98
Clean up logic for adding reads to ART cache
...
-- No longer has duplicate code
2012-03-14 17:26:37 -04:00
Mark DePristo
5bcb5c7433
Preliminary refactoring of ART
...
-- Refactored ART into clearer, simpler procedures. Attempted to merge shared code into utility classes.
-- Added some docs
-- Created a new, testable ActivityProfile that represents as a class the probability of a base being active or inactive
-- Separated band-pass filtering from creation of active regions. Now you can band pass filter a profile to make another profile, and then that is explicitly converted to active regions
-- Misc. utility functions in ActiveRegionWalker such as hasPresetActiveRegions()
-- Many TODOs in ActivityProfile.
2012-03-14 17:26:37 -04:00
Ryan Poplin
78718b8d6a
Adding Genotype Given Alleles mode to the HaplotypeCaller. It constructs the possible haplotypes via assembly and then injects the desired allele to be genotyped.
2012-02-18 10:31:26 -05:00
Ryan Poplin
41ffd08d53
On the fly base quality score recalibration now happens up front in a SAMIterator on input instead of in a lazy-loading fashion if the BQSR table is provided as an engine argument. On the fly recalibration is now completely hooked up and live.
2012-02-13 12:35:09 -05:00
Ryan Poplin
9b8fd4c2ff
Updating the half of the code that makes use of the recalibration information to work with the new refactoring of the bqsr. Reverting the covariate interface change in the original bqsr because the error model enum was moved to a different class and didn't make sense any more.
2012-02-11 10:57:20 -05:00
Matt Hanna
b57d4250bf
Documentation request by Eric. At each stage of the GATK where filtering occurs, added documentation suggesting the goal of the filtering along with examples of suggested inputs and outputs.
2012-02-09 11:24:52 -05:00
Ryan Poplin
894d3340be
Active Region Traversal should use GATKSAMRecords everywhere instead of SAMRecords. misc cleanup.
2012-02-03 17:13:52 -05:00
Ryan Poplin
601e53d633
Fix when specifying preset active regions with -AR argument
2012-02-02 16:34:26 -05:00
Ryan Poplin
abb91cf26b
Increasing the size of the active regions that are produced by the active probability integrator; more context is needed to call more complex events
2012-01-30 15:36:12 -05:00
Ryan Poplin
fc08235ff3
Bug fix in active region traversal: locusView.getNext() skips over pileups with zero coverage, but we still need to count them in the active probability integrator
2012-01-27 15:12:37 -05:00
Ryan Poplin
390d493049
Updating ActiveRegionWalker interface to output a probability of active status instead of a boolean. Integrator runs a band-pass filter over this probability to produce actual active regions. First version of HaplotypeCaller which decides for itself where to trigger and assembles those regions.
2012-01-26 11:37:08 -05:00
Ryan Poplin
bbefe4a272
Added option to be able to write out the active regions to an interval list file
2012-01-25 09:47:06 -05:00
Ryan Poplin
9818c69df6
Can now specify active regions to process at the command line, mainly for debugging purposes
2012-01-25 09:32:52 -05:00
Ryan Poplin
4d6312d4ea
HaplotypeCaller is now an ActiveRegionWalker.
2012-01-22 14:31:01 -05:00
Ryan Poplin
ace9333068
Active region walkers can now see the reads in a buffer around their active regions. This buffer size is specified as a walker annotation. Intervals are internally extended by this buffer size so that the extra reads make their way through the traversal engine but the walker author only needs to see the original interval. Also, several corner case bug fixes in active region traversal.
2012-01-19 22:05:08 -05:00
Ryan Poplin
11982b5a34
We no longer calculate the population-level TDT statistic if there are fewer than 5 trios with full genotype likelihood information. When there is a high degree of missingness the results are skewed or in the worst case come out as NaN.
2012-01-18 09:42:41 -05:00
Ryan Poplin
a6886a4cc0
Initial commit of the Active Region Traversal. Not ready to be used by anyone yet.
2012-01-04 17:03:21 -05:00
Eric Banks
6d260ec6ae
Start printing traversal stats after 30 seconds. I can't stand waiting 2 minutes.
2011-12-22 15:40:59 -05:00
Mauricio Carneiro
e89ff063fc
GATKSAMRecord refactor
...
The GATK engine will now provide a GATKSAMRecord to all tools which incorporates the functionality used by the GATK to the bam file (ReadGroups, Reduced Reads, ...).
* No tools should create SAMRecord anymore, use GATKSAMRecord instead *
2011-11-03 15:43:26 -04:00
Mark DePristo
0b88af4af9
Counts of records failing filters are displayed sorted
...
-- Stops random ordering of the output, as the counts are returned sorted by string name of the class
-- Deleted now unused sh*tty assessors in Utils
2011-10-06 18:42:26 -07:00
Mark DePristo
a1b4cafe7a
Bug fix for NPE when timer wasn't initialized
2011-09-20 13:59:59 -04:00
Mark DePristo
430da23446
At least 2 minutes must pass before a status message is printed, further stabilizing time estimates
2011-09-07 13:13:07 -04:00
Mark DePristo
d23d620494
Pushing traversal engine timer start to as close to actual start as possible
...
-- Should make initial timings more accurate
2011-09-07 12:52:33 -04:00
Mark DePristo
569e1a1089
Walker.isDone() aborts execution early
...
-- Useful if you want to have a parameter like MAX_RECORDS that wants the walker to stop after some number of map calls without having to resort to the old System.exit() call directly.
2011-08-23 16:53:06 -04:00
Mark DePristo
39b4e76fde
Continuing refactoring of RefMetaDataTracker.
...
On the path towards converging getVariantContext() and getValues() in tracker so that we can have a single approach to get values from RODs with the new RodBinding() types
2011-07-28 17:48:28 -04:00
Mark DePristo
9992c373be
Optimize imports run on the whole project, public and private. I just got too tired of all of the unused imports floating around. Confirmed that the system builds after the changes.
2011-07-17 20:29:58 -04:00
David Roazen
3c9497788e
Reorganized the codebase beneath top-level public and private directories,
...
removing the playground and oneoffprojects directories in the process. Updated
build.xml accordingly.
2011-06-28 06:55:19 -04:00