Commit Graph

197 Commits (66fc8ea4446b14050104306d54088bd283356ef8)

Author SHA1 Message Date
aaron 66fc8ea444 GSA-182: Adding support for BED interval files.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1767 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-06 02:45:31 +00:00
aaron 7fc4472e6d A big fix for MergingSamRecordIterator, where we weren't correctly handling the comparisons of SAMRecords correctly (we weren't applying the new reference index first, so sometimes the MT contig would be ID 23, sometimes 24 in different records).
Also a fix to the GLF tests, and a correction to PrintReadsWalker to remove the close() on the output source, the source handles that itself (and you get a double close).

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1758 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-02 19:35:35 +00:00
ebanks 7249fade05 updated
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1756 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-02 18:10:34 +00:00
aaron e885cc4b21 changes for corrected GLF likelihood output, along with better tests
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1754 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-01 20:45:05 +00:00
aaron 2e4949c4d6 Rev'ing Picard, which includes the update to get all the reads in the query region (GSA-173). With it come a bunch of fixes, including retiring the FourBaseRecaller code, and updated md5 for some walker tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1751 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-30 20:37:59 +00:00
hanna 70e1aef550 Better integrate the @ArgumentCollection into the command-line argument parser. Walkers can now specify their own @ArgumentCollections. Also cleaned up a bit of the CommandLineProgram template method pattern to minimize duplicate code.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1746 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-29 22:23:19 +00:00
andrewk 5662a88ee1 Cosmetic change to list sampling functions: the typical usage of n and k were reversed. No change in functionality of the classes has been made and unit tests still pass.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1736 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-28 18:12:32 +00:00
aaron 130a01a40a delete the integration test temp files when the test is over
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1727 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-25 16:34:08 +00:00
aaron d262cbd41c changes to add VCF to the rod system, fix VCF output in VariantsToVCF, and some other minor changes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1715 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-24 15:16:11 +00:00
aaron eeb14ec717 a couple of light changes to GenomeLocSortedSet.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1708 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-23 20:38:53 +00:00
aaron 11c32b588f fixing VariantEvalWalkerIntegrationTest md5 sums, a couple comment changes, and a little bit of cleanup
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1690 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-22 20:54:47 +00:00
ebanks b0fa19a0b2 Fixed recal integration test
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1689 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-22 20:22:32 +00:00
ebanks 6780476fb5 updated to deal with new dbSNP rod
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1687 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-22 19:46:32 +00:00
aaron 83a9eebcc4 fixed a bug I checked in that Eric found, for intervals with no start or stop coordinate. Now I owe Eric a cookie, and Milk Street is so far away. Damn.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1679 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-22 04:34:18 +00:00
aaron 7bfb5fad27 fixing the dbSNP test. Also removing unnessasary comments from the GenomeLocParser, added some tests, and commented out the performance test
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1676 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-21 23:32:24 +00:00
aaron 39a47491a9 changes to make GenomeLoc string parsing 25% faster
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1675 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-21 22:37:47 +00:00
asivache a6bd509593 Changing the carpet under your feet!! New incremental update to th eROD system has arrived.
all the updated classes now make use of new SeekableRodIterator instead of RODIterator. RODIterator class deleted. This batch makes only trivial updates to tests dictated by the change in the ROD system interface. Few less trivial updates to follow. This is a partial commit; a few walkers also still need to be updated, hold on...

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1667 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-21 16:55:22 +00:00
aaron b6d7d6acc6 fix for the eval tests, and a change to the backedbygenotypes interface, more changes to come
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1661 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-18 22:25:16 +00:00
aaron 7b39aa4966 Adding the VCF ROD. Also changed the VCF objects to much more user friendly.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1658 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-18 20:19:34 +00:00
hanna 01a9b1c63b Fix for problem where err stream remapped to output stream in certain cases, (hopefully) completing Matt's hat trick of fail. Thanks, unit tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1634 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-16 08:33:56 +00:00
aaron eedf55e94d temp fix for a broken test, we'll fix the test tomorrow. We promise, we're engineers, we love our tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1633 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-16 04:36:42 +00:00
aaron 542d817688 more cleanup
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1631 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-15 21:42:03 +00:00
aaron b401929e41 incremental clean-up and changes for VariantEval, moved DiploidGenotype to a better home, and fixed a spelling error.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1624 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-15 04:48:42 +00:00
ebanks 6783fda42a Updated unit test to reflect changes to vcf output
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1623 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-15 01:56:08 +00:00
aaron e03fccb223 Changes to switch Variant Eval over to the new Variation system.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1611 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-14 05:34:33 +00:00
aaron 5b41ef5f70 rod DBSNP had a bug where the reference wasn't calculated correctly under certain conditions. Fixed getRefBasesFWD and getRefSnpFWD so that they were more in line with getAltBasesFWD and getAltSnpFWD. Also updated Variant Eval tests to reflect this change.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1609 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-13 23:48:58 +00:00
depristo 6e13a36059 Framework for ROD walkers -- totally experiment and not working right now
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1600 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-12 19:13:15 +00:00
depristo 17ab1d8b25 General purpose merging iterator implementation
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1593 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-12 19:06:15 +00:00
ebanks e24c8d00d5 So, the VCF spec allows for an optional meta field in the header representing the date. However, using this field means that integration tests run on the vcf file will fail the MD5 test (which is what happened to the VariantFiltration test this morning after working just fine yesterday).
After consulting our resident expert (Aaron), we're going to (temporarily) remove the date from the vcf output until we can come up with a better solution.  However, this shouldn't cause any short-term problems because the data truly is optional.
VF test's MD5s are updated.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1580 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-10 14:28:43 +00:00
aaron 5a64a80ab5 changes to the variation class, updates to SSG, updated tests based on changes to the SSGenotypeCall, and added the ability to run a single integration test from using the build script.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1577 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-10 04:31:33 +00:00
ebanks 1362a56227 Added fasta tests and small fix to cleaner test
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1575 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-10 03:13:11 +00:00
ebanks 8ca89279aa Added a test for VariantFiltration and the VECs
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1574 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-10 02:21:14 +00:00
ebanks bed646e4f6 Adding cleaner test
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1561 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-09 16:05:56 +00:00
depristo d9588e6083 bug fixes to LIBS and LIBH following ultra-aggressive regression testing across 454, solid, and solexa
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1558 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-09 15:36:12 +00:00
aaron 0df6a9da5c -Seperating out normal (unit) tests and integration tests. From now on if your test are more of an integration test (i.e. you're testing a walker and all the subunits it relies on) please name the test "______IntegrationTest.java" instead of "______Test.java".
-Bamboo will now run the integration tests once a day, and the normal units tests on each check-in.

-Also added a bunch of unit tests for VariantEval walker

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1555 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-09 15:01:40 +00:00
depristo eeb9b6eb13 GenotypeLikelhoods now support a cache per subclass, avoiding genotyping clashes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1554 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-09 10:39:14 +00:00
ebanks 0cc219c0df -Added unit test for walkers dealing with intervals for cleaning
-I also uncovered a corner case in the cleaner that for some reason was commented out but shouldn't have been.  Hooray for unit tests!



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1553 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-09 02:35:17 +00:00
depristo ec0f6f23c7 LocusIterationByState is now the system deafult. Fixed Aaron's build problem
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1552 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-09 01:28:05 +00:00
aaron ea6ffd3796 initial VariantEvalWalker test. More to be added soon...
Also fixed the case where MD5 sums had leading zero's clipped off

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1551 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-09 01:02:04 +00:00
depristo 1c3d67f0f3 Improvements to the CountCovariates and TableRecablirator, as well as regression tests for SLX and 454 data
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1539 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-04 22:26:57 +00:00
depristo 2b0d1c52b2 General WalkerTest framework. Includes some minor changes to GATK core to enable creation of true command-line like GATK modules in the code. Extensive first-pass tests for SSG
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1538 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-04 19:13:37 +00:00
aaron 0cc634ed5d -Renamed rodVariants to RodGeliText
-Remove KGenomesSNPROD
-Remove rodFLT
-Renamed rodGFF to RodGenotypeChipAsGFF
-Fixed a problem in SSGenotypeCall
-Added basic SSGenotype Test class
-Make VCFHeader constructors public

git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1536 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-04 18:40:43 +00:00
depristo a08c68362e Renaming error to getNegLog10PError(); added Cached clearing method to GL; SSG now has a CallResult that counts calls; No more Adding class to System.out, now to logger.info; First major testing piece (and general approach too) to unit testing of a walker -- SingleSampleGenotyper now knows how many calls to make on a particular 1mb region on NA12878 for each call type and counts the number of calls *AND* the compares the geli MD5 sum to the expected one!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1530 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-04 12:39:06 +00:00
depristo 49a7babb2c Better organization of Genotype likelihood calculations. NewHotness is now just GenotypeLikelihoods. There are 1, 3, and empirical base error models available as subclasses, along with a simple way to make this (see the factory).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1481 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-30 19:16:30 +00:00
depristo 8e129d76fd Support for original quality scores OQ flag. pQ flag in TableRecalibation to preserve quality scores below a threshold (defaulting to 5)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1474 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-28 14:14:21 +00:00
depristo 37a9b84276 corresponding test
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1470 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-28 00:17:42 +00:00
aaron 811503d67b vcf changes from Richards comments, fixed a test case
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1456 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-25 14:32:16 +00:00
hanna ccdb4a0313 General-purpose management of output streams.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1454 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-23 00:56:02 +00:00
aaron 6313c465fb we want the RMS of the reads qualities not the RMS of the RMS of the read qualities.
Also the VCF version tag seems to be standardized as VCR.  Updated the VCF code.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1447 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-20 21:56:29 +00:00
aaron 5725de56dc fixes in VCF, some changes to get it ready to move out of the GATK
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1441 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-19 23:31:03 +00:00