Eric Banks
66c652d687
Added some extra error checks in the VCF codec. Now that we've moved this back into the GATK, changed some of the standard exceptions to be UserErrors (instead of TribbleExceptions).
2011-07-14 11:56:10 -04:00
Eric Banks
bb0e3a26fc
Added integration test for VCF writing. Also, bug fix for writing the GT-free records.
2011-07-13 14:57:21 -04:00
Eric Banks
6007eea3ff
Allowing VCF records without GTs in VCF 4.1
2011-07-13 09:56:08 -04:00
Eric Banks
a2597e7f00
This commit incorporates several different changes that each pretty much break all the VCF-based integration tests, so I bunched them all together. We now officially emit VCF4.1 files (woo hoo), which means that the VCF headers are now all different (the header version is 4.1, and the counts for some of the annotations are now 'A' or 'G'). Also, I've added a Read Filter for reads with MQ=255 ('unavailable' in the SAM spec) and have applied this to the UG and the RMS MQ annotation.
2011-07-12 14:11:53 -04:00
Ryan Poplin
329c3d8050
Merged bug fix from Stable into Unstable
2011-07-12 13:55:51 -04:00
Ryan Poplin
73735863b0
Fix for the case of requesting genotype for a sample that doesn't exist in a VariantContext
2011-07-12 13:55:21 -04:00
Eric Banks
d7d15019dd
Adding support for other simple header line types (e.g. ALT) and cleaning up the interface a bit.
2011-07-12 01:16:21 -04:00
Eric Banks
400b0d4422
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-11 23:38:57 -04:00
Eric Banks
e3748675db
Support for VCF 4.1 header counts
2011-07-11 17:40:45 -04:00
Christopher Hartl
d6517adb42
Merge branch 'master' of ssh://chartl@tin.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-07-11 16:16:37 -04:00
Christopher Hartl
86890c6357
N and K (in binomial probability) got switched in RFA Walker with the last commit. No longer will NaNs be produced.
Added: TableToVCF. Kind of a longer-term project, but there are lots of variant calls available in a weird tabular format. I used this to convert the Ju et al. small indels to VCF. I'll check against the 1000G ASN superpopulation calls to see if we see a good amount of recapitulation, and if so, I'll put them in unvalidated comparisons. Minor changes to the TableCodec and TableFeatures to allow for this (the codec can sometimes drop a column, and the feature now allows you to grab on to its header).
2011-07-11 16:16:15 -04:00
Guillermo del Angel
7fbc5987d0
Merge branch 'master' of ssh://delangel@nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable
2011-07-08 21:17:32 -04:00
Guillermo del Angel
224574424e
Bug fix: if we're genotyping a very long indel (>100 bp), fail gracefully instead of with an array out of bounds exception
2011-07-08 12:48:49 -04:00
Eric Banks
4cfe0dd857
Test for bad alleles so that we don't generate IndexOutOfBoundsExceptions
2011-07-07 23:01:03 -04:00
Eric Banks
14fee4ccbd
Patch from Bob to deal with symbolic alleles: these weren't getting padded but they should be.
2011-07-06 12:51:44 -04:00
Khalid Shakir
b6bc64a0c8
Cleanup of the utils.broad package.
Using Picard IoUtils on sample names.
2011-07-01 20:47:03 -04:00
David Roazen
d647ea4fdc
Long-delayed change to CachingIndexedFastaSequenceFile. Made the cache
non-static to avoid problems when multiple references are used within the same
thread (e.g., during integration tests). This should kill the intermittent
IndelRealignerIntegrationTest failures.
2011-07-01 16:04:30 -04:00
Eric Banks
761347b8d5
The VariantContext utility method used by SelectVariants wasn't checking the filter status (unfiltered vs. passing filters) and always returned a VC that was passing filters. This is fixed and the md5 from the VCF Streaming test has been re-updated.
2011-06-30 15:26:09 -04:00
Eric Banks
1f19afe1d9
Fixed bug in the IndelRealigner: now that variants are correctly typed in VariantContext, it is possible that a variant can be an indel but neither an insertion nor a deletion; added an isComplexIndel() method and now we check for such an event in the realigner (we don't use them to generate alternate consensuses). Also, added an isMNP() method while I was there so that it would be consistent with other variant types.
2011-06-29 15:54:09 -04:00
Guillermo del Angel
5b6d279a2e
Two bug fixes:
a) Modified the way clipped bases are dealt with in ReadPosRankSumTest when annotating indels. The CIGAR string cannot be trusted because BWA can clip good high-quality bases, and some sites get incorrect ReadPos annotations if BWA systematically clips at an indel breakpoint.
b) The PL header needs to specify "." as length. Otherwise we fail VCF validation if multiallelic sites are present.
2011-06-29 10:21:27 -04:00
David Roazen
139c6b84a1
Modified build.xml and the help extractor doclet to use the output of "git
describe" as an absolute version number (if the repository has at least one
tag), using the raw SHA-1 hash value as a fallback version number in the case
where there are no tags.
2011-06-28 08:37:05 -04:00
David Roazen
3c9497788e
Reorganized the codebase beneath top-level public and private directories,
removing the playground and oneoffprojects directories in the process. Updated
build.xml accordingly.
2011-06-28 06:55:19 -04:00