Commit Graph

5 Commits (b21f90aee0678e550746e101c6d0629ca962bce6)

Author SHA1 Message Date
ebanks c5325b03be 1) Removed hard-coded strings. Please let's use the fields defined in VCFConstants.
2) General code cleanup.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3856 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-23 01:49:47 +00:00
delangel 55b756f1cc First step in major cleanup/redo of VCF functionality. Specifically, now:
a) VCF track name can work again with 3.3 or 4.0 VCF's when specifying -B name,VCF,file. Code will read header and parse automatically the version. 
b) Old VCF codec is deprecated. Reader goes now direct from parsing VCF lines into producing VariantContext objects, with no intermediate VCF records. If anyone can't resist the urge to still input files using the old method, a new VCF3Codec is in place with the old code, but it will be eventually deleted.
c) VCF headers and VCF info fields no longer keep track of the version. They are parsed into an internal representation and will be output only in VCF4.0 format.
d) As a consequence, the existing GATK bug where files are produced with VCF4 body but VCF3.3 headers is solved.
e) Several VCF 4.0 writer bugs are now solved.
f) Integration test MD5's are changed, mostly because of corrected VCF4.0 headers and because validation data mostly uses now VCF4.0.
g) Several VCF files in the ValidationData/ directory have been converted to VCF 4.0 format. I kept the old versions, and the new versions have a .vcf4 extension.

Pending issues:
a) We are still not dealing with indels consistently or correctly when representing them. This will be a second part of the changes.
b) The VCF writer doesn't use VCFRecord but it does still use a lot of leftovers like VCFGenotypeEncoding, VCFGenotypeRecord, etc. This needs to be simplified and cleaned.



git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3813 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-16 22:49:16 +00:00
ebanks fb717fe128 First pass needed to remove old VCF code: moving all VCF-related constants into a single unified class
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3759 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-11 07:19:16 +00:00
depristo b934cc7554 Updates to fix some bugs in merger. Now able to merge into project wide indel VCF files. Integration teests coming tomorrow
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3727 348d0f76-0448-11de-a6fe-93d51630548a
2010-07-07 03:16:33 +00:00
aaron d3848745ab moving VCF 3.3 back into the GATK so Guillermo can make changes for VCF 4 output
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3639 348d0f76-0448-11de-a6fe-93d51630548a
2010-06-25 18:20:06 +00:00