90 Commits (32fc221ffe631fcd040fefd5df663b6fddd2d46c)

Author SHA1 Message Date
ebanks 8c28be5933 Fixing a VCF bug for Sendu: we weren't emitting flags (booleans) correctly in VCF3.3 (rev'ed tribble for this).
Updated dbsnp/hapmap membership info fields to be flags now instead of ints.
While I was there, I added the change in the Annotator for Jan to force reads to be from a specific sample.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3536 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-11 16:42:06 +00:00
aaron 871cf0f4f6 Call out ROD types by their record type, instead of the codec type (which was clumsy). So instead of:
@Requires(value={},referenceMetaData=@RMD(name="eval",type= VCFCodec.class))

you'd say:

@Requires(value={},referenceMetaData=@RMD(name="eval",type= VCFRecord.class))

Which is more in line with what was done before. All instances in the existing codebase should be switched over.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3457 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-28 14:52:44 +00:00
aaron a2fab07258 fixed the build problem: there were two copies of the AnnotatorInputTable Codec and Feature in two different spots.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3439 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-26 14:47:15 +00:00
ebanks 0607f76a15 commenting out this test until I can figure out what the hell is going on with the codecs.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3436 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-26 01:12:10 +00:00
ebanks 572b383fe2 Make VA annotate dbsnp again
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3345 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-11 14:06:53 +00:00
aaron a68f3b2e9c VCF moved over to tribble.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3302 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-05 17:28:48 +00:00
aaron ad11201235 adding more ROD pile-up tests
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3301 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-05 16:01:11 +00:00
aaron cbed0b1ade Adding GeliText tribble track as the first enabled Tribble track. This means 'Variants' is no longer valid for a ROD type; use GeliText instead. I've updated all the references in the codebase.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3271 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-29 22:50:17 +00:00
aaron 7fbfd34315 adding the GELI ROD validation
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3270 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-29 21:43:00 +00:00
ebanks df31eeff9f minor change
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3259 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-26 06:05:29 +00:00
ebanks e702bea99f Moving VE2 to core; calling it "VariantEval" (one more checkin coming)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3179 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-15 20:25:47 +00:00
weisburd b930dc52a5 Integration test for GenomicAnnotator
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3167 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-14 14:43:25 +00:00
aaron 4014a8a674 A long overdue correction; all unit tests now end in 'UnitTest'. This was something we wanted to do for a while, and now with the performance tests coming, it was a good time to clean-up. Please label any new test appropriately: *UnitTest and *IntegrationTest are the two valid file name patterns for tests.
Thanks!



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3135 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-08 06:14:15 +00:00
aaron 8fd59c8823 Modified the report system based on Ryan's feedback: tables are now created independently to avoid the permutation problem when they were all compressed in rows, and removed our dependency on FreeMarker. The Grep format stays the same.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3130 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-07 20:39:55 +00:00
ebanks 73a14a985b Moving VariantsToVCF to core.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3078 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-26 18:55:12 +00:00
ebanks 14bf6923a8 HapMap-to-VCF now works fine within Variants-to-VCF. Added integration test for it and removed old code.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3077 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-26 18:34:59 +00:00
ebanks 4398a8b370 Updated. Now uses VariantContext and is truly "variants" to vcf (i.e. not just GELI to vcf).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3074 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-25 04:53:31 +00:00
aaron 439c34ed38 clean-up before annotating VariantEval2 for output.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3055 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-22 07:39:20 +00:00
aaron 8a5f0b746e some cleanup for the output system.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3032 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-18 12:54:39 +00:00
ebanks 4340601c26 -Pushed base quals back down into SAMRecord; if -OQ is used, the SAMRecord quals get updated automatically
-Better integration test


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3020 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-17 16:00:10 +00:00
aaron 10e76abbbc adding some VE2 report infrastructure; work-in-progress.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3008 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-16 03:57:42 +00:00
ebanks 202231141c -Push the --use_original_qualities argument into the engine.
-Check that base and qual strings are the same lengths
-Fix one more bug in the clipper.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3006 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-16 02:06:11 +00:00
ebanks 411d25c8d1 -Integration tests for walkers that use original quals.
-framework for pushing -OQ into GATK (not done)


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3004 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-15 18:46:31 +00:00
ebanks 6e855809e1 Renaming and moving relevant tools into a sequenom directory
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2971 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-10 02:31:10 +00:00
ebanks e5475a7ba9 re-enabling PlinkToVCF integration tests
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2964 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-09 17:35:49 +00:00
ebanks 9f3b99c11b Moving UnifiedGenotyper and VariantAnnotator over to VariantContext system.
Removing obsolete genotyping classes.
First stage of removing dependence on old Genotype class.
More changes to come.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2960 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-09 03:41:07 +00:00
ebanks 5f3c80d9aa 1. To make indel calls, we need to get rid of the SNP-centricity of our code. First step is to have the reference be a String, not a char in the Genotype. Note that this is just a temporary patch until the genotype code is ported over to use VariantContext.
2. Significant refactoring of Plink code to work in the rods and use VariantContext.  More coming.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2913 348d0f76-0448-11de-a6fe-93d51630548a
2010-03-02 20:26:40 +00:00
hanna 9dbdfff786 Moved VariantEval to core. Updated integration test md5s to reflect new Analysis class names.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2762 348d0f76-0448-11de-a6fe-93d51630548a
2010-02-02 00:22:15 +00:00
depristo 1993472b38 Just like VariantFiltration but lets you match info fields out of the VCF instead of annotating them.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2736 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-29 15:38:03 +00:00
chartl 5b2a1e483e Renamed SequenomToVCF as PlinkToVCF. Wiki will be changed accordingly.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2649 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-21 17:35:20 +00:00
depristo 9e0ae993c7 -B 1kg_ceu,VCF,CEU.vcf -B 1kg_yri,VCF,YRI.vcf system supported to allow 1KG % (like dbSNP%)
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2632 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-19 21:33:13 +00:00
depristo d8e74c5795 Update to MD5s for old tests and added extensive VCF testing
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2615 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-16 20:22:58 +00:00
chartl 424d1b57f7 Sequenom to VCF now allows user to specify filters for QC, and they will appear in the filter field of the output VCF
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2577 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-13 23:22:37 +00:00
ebanks 8ca5bba738 We emit genotype data in the VCF record if the format string instructs us to (regardless of whether or not genotypes are provided - this was the wrong test).
SequenomToVCF now correctly has no-calls when probes fail.
Re-enabled SequenomToVCF integration test.


git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2572 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-13 15:40:27 +00:00
chartl 6d1107a4ed Update to SequenomToVCF
Output changing slightly so integration test disabled temporarily



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2571 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-13 15:32:05 +00:00
ebanks 040fdfee61 Cleaned up the interface to VCFRecord. It's now possible (and easy) to create records and then write them with a VCFWriter.
I've updated HapMap2VCF to use the new interface; Chris agreed to take care of Sequenom2VCF.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2558 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-11 21:42:12 +00:00
chartl dfa3c3b875 Added:
SequenomToVCF - Takes a sequenom ped file and converts it to a VCF file with the proper metrics for QC. It's currently a rough draft,
but is working as expected on a test ped file, which is included as an integration test.

Modified:

VCFGenotypeCall -- added a cloneCall() method that returns a clone of the call

Hapmap2VCF -- removed a VCFGenotypeCall object that gets instantiated and modified but never used
(caused me all kinds of confusion when I was basing SequenomToVCF off of it)



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2554 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-11 17:17:21 +00:00
ebanks dfcd5ce25b Fixed broken test
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2547 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-08 06:13:01 +00:00
depristo 7215526810 Fix to isReference() in VCFRecord. Change to VariantCounter to correctly count only non-genotype variants, as well as update to VariantEvalWalker
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2531 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-07 00:03:29 +00:00
andrewk 6c4ac9e663 Updated HapMap2VCF to use the VCFGenotypeWriterAdapter interface; fixed bug in VCFParameters that affects VariantsToVCF and HapMap2VCF when reference is lower-cased; added integration test for HapMap2VCF that checks for the lower-case issue by testing against Hg18 region that has lower-cased bases
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2530 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-06 21:27:11 +00:00
depristo 8d13597a27 Temporary command-line support to enable rod walkers; this is safe if you know what you are doing.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2505 348d0f76-0448-11de-a6fe-93d51630548a
2010-01-06 12:15:36 +00:00
depristo 7826e144a1 forgot to update md5s
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2473 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-29 20:31:29 +00:00
depristo 87e863b48d Removed unused routines in duputils; duplicatequals to archive; docs for new duplicate traversal code; general code cleanup; bug fixes for combineduplicates; integration tests for combine duplicates walker
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2468 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-29 19:46:29 +00:00
aaron a34c2442c0 moved hard-coded file paths to the oneKGLocation, validationDataLocation, and seqLocation variables setup in the BaseTest.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2460 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-29 07:40:48 +00:00
depristo 9d263b2565 Integration tests for count duplicates walker validated on a TCGA hybrid capture lane.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2459 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-28 23:57:25 +00:00
hanna 9e53c06328 First revision of command-line argument support for GenotypeWriter. Also, fixed the damn build.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2416 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-20 19:19:23 +00:00
chartl 1389ac6bdf Hurrr -- this uses power as part of its output. Changes to the power calculation broke the md5s RIGHT AFTER I HAD FIXED THEM arghflrg.
Will fix again.




git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2351 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-14 22:42:50 +00:00
chartl b42fc905e8 Added - new tests (Hapmap was re-added)
Modified - Hapmap now takes a -q command to filter out variants by quality
Modified - MathUtils - cumBinomialProbLog now uses BigDecimal to handle some numerical imprecisions
Modified - PowerBelowFrequency - returns 0.0 if called with a negative number (can't be done from inside the walker itself, but since it's called elsewhere one can't be too careful)



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2350 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-14 21:57:20 +00:00
ebanks e6f541fdca Forgot to update integration test last night
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2308 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-10 12:57:10 +00:00
ebanks e8822a3fb4 Stage 3 of Variation refactoring:
We are now VCF3.3 compliant.
(Only a few more stages left.  Sigh.)



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@2287 348d0f76-0448-11de-a6fe-93d51630548a
2009-12-08 21:43:28 +00:00