aaron
41a95cb3f0
fixing unified genotyper test for change: VCF output now emits no calls as ./.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1865 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-16 19:38:58 +00:00
ebanks
07b134a124
Added some integration tests for multiple samples
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1861 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-16 15:22:10 +00:00
aaron
a69ea9b57c
Cleaning up the VCF code, adding lots of tests for a variety of edge cases. Two issues are still outstanding: updating the no call string with the standard 1000g decided on today, and fixing Eric's issue where not all the VCF sample names are present initially.
...
also: their, I hope your happy Eric, from now on I'll try not to flout my awesomest grammer in the future accept when I need to illicit a strong response :-)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1858 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-16 04:11:34 +00:00
aaron
eb90e5c4d7
changes to VCF output, and updated MD5's in the integration tests
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1836 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-14 18:42:48 +00:00
ebanks
d89bc2c796
This class no longer outputs in sequenom format
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1832 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-14 17:16:36 +00:00
ebanks
0c06bf9dbc
Explicitly set output to GELI now that default is VCF
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1824 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-13 22:12:03 +00:00
aaron
98e3a0bf1a
VCF can now be emitted from SSG. The basic's are there (the genotype, read depth, our error estimate), but more fields need to be added for each record as nessasary.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1797 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-08 19:50:04 +00:00
ebanks
df8ea8f437
UG integration test. This was the old SSG test with MD5s updated.
...
I'll need to add some multi-sample tests in a bit...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1791 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-08 17:43:58 +00:00
ebanks
008455915a
One way of making the integration test stop failing is to remove it...
...
[waiting for Matt to cringe...]
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1789 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-08 17:08:41 +00:00
ebanks
04fe50cadd
*** We no longer have a separate model for the single-sample case. ***
...
For now, a single sample input will be special-cased in the EM model - but that will change when the EM model degenerates to the single sample output with a single sample as input. For now, the EM code for multi-samples isn't finished; I'm planning on checking that in soon.
The SingleSampleIntegrationTest now uses the UnifiedCaller instead of SSG, and so should all of you. More on that in a separate email.
Other minor cleanups added too.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1785 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-08 14:08:57 +00:00
aaron
f9a0eefe4b
GELI_BINARY is now functional, and can be used as a variant type in SSG (-vf=GELI_BINARY). Also fixed the max mapping quality column in both GELI output formats, we haven't been correctly outputing up until now.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1774 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-07 18:20:34 +00:00
aaron
7fc4472e6d
A big fix for MergingSamRecordIterator, where we weren't correctly handling the comparisons of SAMRecords correctly (we weren't applying the new reference index first, so sometimes the MT contig would be ID 23, sometimes 24 in different records).
...
Also a fix to the GLF tests, and a correction to PrintReadsWalker to remove the close() on the output source, the source handles that itself (and you get a double close).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1758 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-02 19:35:35 +00:00
ebanks
7249fade05
updated
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1756 348d0f76-0448-11de-a6fe-93d51630548a
2009-10-02 18:10:34 +00:00
aaron
2e4949c4d6
Rev'ing Picard, which includes the update to get all the reads in the query region (GSA-173). With it come a bunch of fixes, including retiring the FourBaseRecaller code, and updated md5 for some walker tests.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1751 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-30 20:37:59 +00:00
hanna
70e1aef550
Better integrate the @ArgumentCollection into the command-line argument parser. Walkers can now specify their own @ArgumentCollections. Also cleaned up a bit of the CommandLineProgram template method pattern to minimize duplicate code.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1746 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-29 22:23:19 +00:00
aaron
d262cbd41c
changes to add VCF to the rod system, fix VCF output in VariantsToVCF, and some other minor changes
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1715 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-24 15:16:11 +00:00
ebanks
b0fa19a0b2
Fixed recal integration test
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1689 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-22 20:22:32 +00:00
ebanks
6780476fb5
updated to deal with new dbSNP rod
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1687 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-22 19:46:32 +00:00
aaron
7bfb5fad27
fixing the dbSNP test. Also removing unnessasary comments from the GenomeLocParser, added some tests, and commented out the performance test
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1676 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-21 23:32:24 +00:00
asivache
a6bd509593
Changing the carpet under your feet!! New incremental update to th eROD system has arrived.
...
all the updated classes now make use of new SeekableRodIterator instead of RODIterator. RODIterator class deleted. This batch makes only trivial updates to tests dictated by the change in the ROD system interface. Few less trivial updates to follow. This is a partial commit; a few walkers also still need to be updated, hold on...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1667 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-21 16:55:22 +00:00
aaron
7b39aa4966
Adding the VCF ROD. Also changed the VCF objects to much more user friendly.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1658 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-18 20:19:34 +00:00
hanna
01a9b1c63b
Fix for problem where err stream remapped to output stream in certain cases, (hopefully) completing Matt's hat trick of fail. Thanks, unit tests.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1634 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-16 08:33:56 +00:00
aaron
eedf55e94d
temp fix for a broken test, we'll fix the test tomorrow. We promise, we're engineers, we love our tests.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1633 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-16 04:36:42 +00:00
aaron
b401929e41
incremental clean-up and changes for VariantEval, moved DiploidGenotype to a better home, and fixed a spelling error.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1624 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-15 04:48:42 +00:00
ebanks
6783fda42a
Updated unit test to reflect changes to vcf output
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1623 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-15 01:56:08 +00:00
aaron
e03fccb223
Changes to switch Variant Eval over to the new Variation system.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1611 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-14 05:34:33 +00:00
aaron
5b41ef5f70
rod DBSNP had a bug where the reference wasn't calculated correctly under certain conditions. Fixed getRefBasesFWD and getRefSnpFWD so that they were more in line with getAltBasesFWD and getAltSnpFWD. Also updated Variant Eval tests to reflect this change.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1609 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-13 23:48:58 +00:00
depristo
6e13a36059
Framework for ROD walkers -- totally experiment and not working right now
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1600 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-12 19:13:15 +00:00
ebanks
e24c8d00d5
So, the VCF spec allows for an optional meta field in the header representing the date. However, using this field means that integration tests run on the vcf file will fail the MD5 test (which is what happened to the VariantFiltration test this morning after working just fine yesterday).
...
After consulting our resident expert (Aaron), we're going to (temporarily) remove the date from the vcf output until we can come up with a better solution. However, this shouldn't cause any short-term problems because the data truly is optional.
VF test's MD5s are updated.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1580 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-10 14:28:43 +00:00
aaron
5a64a80ab5
changes to the variation class, updates to SSG, updated tests based on changes to the SSGenotypeCall, and added the ability to run a single integration test from using the build script.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1577 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-10 04:31:33 +00:00
ebanks
1362a56227
Added fasta tests and small fix to cleaner test
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1575 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-10 03:13:11 +00:00
ebanks
8ca89279aa
Added a test for VariantFiltration and the VECs
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1574 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-10 02:21:14 +00:00
ebanks
bed646e4f6
Adding cleaner test
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1561 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-09 16:05:56 +00:00
depristo
d9588e6083
bug fixes to LIBS and LIBH following ultra-aggressive regression testing across 454, solid, and solexa
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1558 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-09 15:36:12 +00:00
aaron
0df6a9da5c
-Seperating out normal (unit) tests and integration tests. From now on if your test are more of an integration test (i.e. you're testing a walker and all the subunits it relies on) please name the test "______IntegrationTest.java" instead of "______Test.java".
...
-Bamboo will now run the integration tests once a day, and the normal units tests on each check-in.
-Also added a bunch of unit tests for VariantEval walker
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1555 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-09 15:01:40 +00:00
depristo
eeb9b6eb13
GenotypeLikelhoods now support a cache per subclass, avoiding genotyping clashes
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1554 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-09 10:39:14 +00:00
ebanks
0cc219c0df
-Added unit test for walkers dealing with intervals for cleaning
...
-I also uncovered a corner case in the cleaner that for some reason was commented out but shouldn't have been. Hooray for unit tests!
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1553 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-09 02:35:17 +00:00
depristo
ec0f6f23c7
LocusIterationByState is now the system deafult. Fixed Aaron's build problem
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1552 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-09 01:28:05 +00:00
depristo
1c3d67f0f3
Improvements to the CountCovariates and TableRecablirator, as well as regression tests for SLX and 454 data
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1539 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-04 22:26:57 +00:00
depristo
2b0d1c52b2
General WalkerTest framework. Includes some minor changes to GATK core to enable creation of true command-line like GATK modules in the code. Extensive first-pass tests for SSG
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1538 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-04 19:13:37 +00:00
aaron
0cc634ed5d
-Renamed rodVariants to RodGeliText
...
-Remove KGenomesSNPROD
-Remove rodFLT
-Renamed rodGFF to RodGenotypeChipAsGFF
-Fixed a problem in SSGenotypeCall
-Added basic SSGenotype Test class
-Make VCFHeader constructors public
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1536 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-04 18:40:43 +00:00
depristo
a08c68362e
Renaming error to getNegLog10PError(); added Cached clearing method to GL; SSG now has a CallResult that counts calls; No more Adding class to System.out, now to logger.info; First major testing piece (and general approach too) to unit testing of a walker -- SingleSampleGenotyper now knows how many calls to make on a particular 1mb region on NA12878 for each call type and counts the number of calls *AND* the compares the geli MD5 sum to the expected one!
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1530 348d0f76-0448-11de-a6fe-93d51630548a
2009-09-04 12:39:06 +00:00
depristo
49a7babb2c
Better organization of Genotype likelihood calculations. NewHotness is now just GenotypeLikelihoods. There are 1, 3, and empirical base error models available as subclasses, along with a simple way to make this (see the factory).
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1481 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-30 19:16:30 +00:00
depristo
8e129d76fd
Support for original quality scores OQ flag. pQ flag in TableRecalibation to preserve quality scores below a threshold (defaulting to 5)
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1474 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-28 14:14:21 +00:00
depristo
37a9b84276
corresponding test
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1470 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-28 00:17:42 +00:00
hanna
ccdb4a0313
General-purpose management of output streams.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1454 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-23 00:56:02 +00:00
aaron
d101c20b30
added the ability to pass in a csv file of ROD triplets (one triplet per line) to the -B option
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1412 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-11 22:10:20 +00:00
aaron
d69ae60b69
fixed two tests affected by my previous commit
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1408 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-11 17:57:50 +00:00
hanna
dd228880ed
Partially implemented NewHotnessGenotypeLikelihoodsTest caused the tests to fail.
...
Ouch! So hot it burned me.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1403 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-10 20:45:44 +00:00
depristo
a864c2f025
Updated polarized reference priors, need DiploidGenotypePriors class that is directly used by the NewHotness genotypelikelihoods, more bug fixes and refactoring, etc.
...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@1390 348d0f76-0448-11de-a6fe-93d51630548a
2009-08-07 19:00:06 +00:00