Commit Graph

22 Commits (9ca9cf52aca2fe7bc47c6cc5d7eb527a77c488a6)

Author SHA1 Message Date
Eric Banks 66c652d687 Added some extra error checks in the VCF codec. Now that we've moved this back into the GATK, changed some of the standard exceptions to be UserErrors (instead of TribbleExceptions). 2011-07-14 11:56:10 -04:00
Eric Banks bb0e3a26fc Added integration test for VCF writing. Also, bug fix for writing the GT-free records. 2011-07-13 14:57:21 -04:00
Eric Banks 6007eea3ff Allowing VCF records without GTs in VCF 4.1 2011-07-13 09:56:08 -04:00
Eric Banks a2597e7f00 This commit incorporates several different changes that each pretty much break all the VCF-based integration tests, so I bunched them all together. We now officially emit VCF4.1 files (woo hoo), which means that the VCF headers are now all different (header version is 4.1 plus counts for some of the annotations are 'A' or 'G'). Also, I've added a Read Filter for reads with MQ=255 ('unavailable' in the SAM spec) and have applied this to the UG and the RMS MQ annotation. 2011-07-12 14:11:53 -04:00
Ryan Poplin 329c3d8050 Merged bug fix from Stable into Unstable 2011-07-12 13:55:51 -04:00
Ryan Poplin 73735863b0 Fix for the case of requesting genotype for a sample that doesn't exist in a VariantContext 2011-07-12 13:55:21 -04:00
Eric Banks d7d15019dd Adding support for other simple header line types (e.g. ALT) and cleaning up the interface a bit. 2011-07-12 01:16:21 -04:00
Eric Banks 400b0d4422 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-11 23:38:57 -04:00
Eric Banks e3748675db Support for VCF 4.1 header counts 2011-07-11 17:40:45 -04:00
Christopher Hartl d6517adb42 Merge branch 'master' of ssh://chartl@tin.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-11 16:16:37 -04:00
Christopher Hartl 86890c6357 N and K (in binomial probability) got switched in RFA Walker with the last commit. No longer will NaNs be produced.
Added: TableToVCF. Kind of a longer-term project, but there are lots of variant calls available in a weird tabular format. I used this to convert Ju et al. small indels to VCF. I'll check against the 1000G ASN superpopulation calls to see if we see a good amount of recapitulation, and if so, I'll put them in unvalidated comparisons. Minor changes to the TableCodec and TableFeatures to allow for this (the codec can sometimes drop a column, and the feature now allows you to grab on to its header).
2011-07-11 16:16:15 -04:00
Guillermo del Angel 7fbc5987d0 Merge branch 'master' of ssh://delangel@nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable 2011-07-08 21:17:32 -04:00
Guillermo del Angel 224574424e Bug fix: if we're genotyping a very long indel (>100 bp) fail gracefully instead of with an array out of bounds exception 2011-07-08 12:48:49 -04:00
Eric Banks 4cfe0dd857 Test for bad alleles so that we don't generate IndexOutOfBoundsExceptions 2011-07-07 23:01:03 -04:00
Eric Banks 14fee4ccbd Patch from Bob to deal with symbolic alleles: these weren't getting padded but they should be. 2011-07-06 12:51:44 -04:00
Khalid Shakir b6bc64a0c8 Cleanup of the utils.broad package.
Using Picard IoUtils on sample names.
2011-07-01 20:47:03 -04:00
David Roazen d647ea4fdc Long-delayed change to CachingIndexedFastaSequenceFile. Made the cache
non-static to avoid problems when multiple references are used within the same
thread (e.g., during integration tests). This should kill the intermittent
IndelRealignerIntegrationTest failures.
2011-07-01 16:04:30 -04:00
Eric Banks 761347b8d5 The VariantContext utility method used by SelectVariants wasn't checking the filter status (unfiltered vs. passing filters) and always returned a VC that was passing filters. This is fixed and the md5 from the VCF Streaming test has been re-updated. 2011-06-30 15:26:09 -04:00
Eric Banks 1f19afe1d9 Fixed bug in the IndelRealigner: now that variants are correctly typed in VariantContext, it is possible that a variant can be an indel but neither an insertion nor a deletion; added an isComplexIndel() method and now we check for such an event in the realigner (we don't use them to generate alternate consensuses). Also, added an isMNP() method while I was there so that it would be consistent with other variant types. 2011-06-29 15:54:09 -04:00
Guillermo del Angel 5b6d279a2e Two bug fixes:
a) Modified the way clipped bases are dealt with in ReadPosRankSumTest when annotating indels. The CIGAR string cannot be trusted because BWA can clip good, high-quality bases, and some sites get incorrect ReadPos annotations if BWA systematically clips at an indel breakpoint.
b) PL header needs to specify "." as length. Otherwise we fail VCF validation if multiallelic sites are present.
2011-06-29 10:21:27 -04:00
David Roazen 139c6b84a1 Modified build.xml and the help extractor doclet to use the output of "git
describe" as an absolute version number (if the repository has at least one
tag), using the raw SHA-1 hash value as a fallback version number in the case
where there are no tags.
2011-06-28 08:37:05 -04:00
David Roazen 3c9497788e Reorganized the codebase beneath top-level public and private directories,
removing the playground and oneoffprojects directories in the process. Updated
build.xml accordingly.
2011-06-28 06:55:19 -04:00
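
The versioning rule described in commit 139c6b84a1 (use the output of "git describe" when the repository has at least one tag, otherwise fall back to the raw SHA-1) can be sketched as below. This is only an illustration of the fallback logic, not the actual build.xml or doclet code; the function names `resolve_version` and `_run_git` are hypothetical, and the use of `git rev-parse HEAD` to obtain the raw hash is an assumption.

```python
import subprocess


def _run_git(args):
    """Run a git command; return its stripped stdout, or None if it fails."""
    try:
        result = subprocess.run(
            ["git", *args], capture_output=True, text=True, check=True
        )
    except (subprocess.CalledProcessError, FileNotFoundError):
        return None
    return result.stdout.strip() or None


def resolve_version(run_git=_run_git):
    """Return the `git describe` output, or the raw SHA-1 when that fails.

    `run_git` is injectable (hypothetical design choice) so the fallback
    can be exercised without a real repository.
    """
    described = run_git(["describe"])
    if described is not None:
        return described
    # `git describe` fails when no tag is reachable: fall back to the
    # full commit hash (assumed here to come from `git rev-parse HEAD`).
    return run_git(["rev-parse", "HEAD"])
```

Calling `resolve_version()` inside a tagged checkout returns the describe string; in a checkout without tags it returns the full commit hash.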