Commit Graph

71 Commits (6a431da554634b164635b299e2ef472a35fd0739)

Author SHA1 Message Date
Eric Banks 6a431da554 Don't output source and ref header lines anymore. Short-term motivation for this is that I'd like this tool when run on a VCF to emit the exact same VCF. Long-term motivation is that these tags should be output by the VCF writer itself for all tools. 2011-07-13 14:40:01 -04:00
Eric Banks 969227c657 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-13 10:01:28 -04:00
Eric Banks 6007eea3ff Allowing VCF records without GTs in VCF4.1 2011-07-13 09:56:08 -04:00
Ryan Poplin 837fb8f689 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-12 15:39:26 -04:00
Ryan Poplin 5077c94d85 Adding MappingQualityUnavailableReadFilter to the SNP and indel CountCovariates 2011-07-12 15:39:07 -04:00
Mark DePristo 01fd6a6949 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-12 15:20:44 -04:00
Mark DePristo ccedd6ff4c Difference is now the general form -- used to be SummarizedDifference. The old Difference class is now a subclass of Difference that includes pointers to the specific master and test DiffElements.
Added a size() function that calculates the number of elements in the tree from a DiffElement.
2011-07-12 15:20:28 -04:00
Eric Banks a2597e7f00 This commit incorporates several different changes that each pretty much break all the VCF-based integration tests, so I bunched them all together. We now officially emit VCF4.1 files (woo hoo), which means that the VCF headers are now all different (header version is 4.1 plus counts for some of the annotations are 'A' or 'G'). Also, I've added a Read Filter for reads with MQ=255 ('unavailable' in the SAM spec) and have applied this to the UG and the RMS MQ annotation. 2011-07-12 14:11:53 -04:00
Ryan Poplin 329c3d8050 Merged bug fix from Stable into Unstable 2011-07-12 13:55:51 -04:00
Ryan Poplin 73735863b0 Fix for the case of requesting genotype for a sample that doesn't exist in a VariantContext 2011-07-12 13:55:21 -04:00
Guillermo del Angel c4c145afb9 Merged bug fix from Stable into Unstable 2011-07-12 13:44:48 -04:00
Guillermo del Angel cfe43e3971 Bug fix for Genotype given alleles: if we are in INDEL mode ignore SNPs and MNPs instead of emitting an empty site with alleles but no annotations 2011-07-12 13:43:46 -04:00
Mark DePristo 05212aea62 reader now takes an argument for the maximum number of elements to read from the file. 2011-07-12 08:53:19 -04:00
Mark DePristo 8056a3fe89 getElement() now uses O(1) get from hash instead of linear O(n) search. Enables us to read large files easily. 2011-07-12 08:52:31 -04:00
Eric Banks d7d15019dd Adding support for other simple header line types (e.g. ALT) and cleaning up the interface a bit. 2011-07-12 01:16:21 -04:00
Eric Banks 400b0d4422 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-11 23:38:57 -04:00
Mark DePristo d5056ad899 Merge branch 'master' into diffit 2011-07-11 23:16:15 -04:00
Mark DePristo 893cc2e103 Making the package public, so there are no dependencies from public -> private 2011-07-11 23:15:08 -04:00
Eric Banks e3748675db Support for VCF 4.1 header counts 2011-07-11 17:40:45 -04:00
Christopher Hartl d6517adb42 Merge branch 'master' of ssh://chartl@tin.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-11 16:16:37 -04:00
Christopher Hartl 86890c6357 N and K (in binomial probability) got switched in RFA Walker with the last commit. No longer will NaNs be produced.
Added: TableToVCF. Kind of a longer-term project, but there are lots of variant calls available in a weird tabular format. I used this to convert the Ju et al. small indels to VCF. I'll check against the 1000G ASN superpopulation calls to see if we see a good amount of recapitulation, and if so, I'll put them in unvalidated comparisons. Minor changes to the TableCodec and TableFeatures to allow for this (the codec can sometimes drop a column, and the feature now allows you to grab on to its header).
2011-07-11 16:16:15 -04:00
Guillermo del Angel 6e7b5e1e7a Merged bug fix from Stable into Unstable
Merge branch 'master' into unstable
2011-07-08 21:19:45 -04:00
Guillermo del Angel 7fbc5987d0 Merge branch 'master' of ssh://delangel@nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable 2011-07-08 21:17:32 -04:00
Mark DePristo bd29236684 Merge branch 'master' into diffengine 2011-07-08 14:08:17 -04:00
Guillermo del Angel 224574424e Bug fix: if we're genotyping a very long indel (>100 bp) fail gracefully instead of with an array out of bounds exception 2011-07-08 12:48:49 -04:00
Ryan Poplin 2a4b3ae4a2 Cleaning up / removing most of the monkeying around with annotation values that happens in VariantDataManager 2011-07-08 12:48:33 -04:00
Mark DePristo 8add2a3866 Merge branch 'master' into diffengine 2011-07-08 09:15:54 -04:00
Eric Banks cc143493e3 Merged bug fix from Stable into Unstable 2011-07-07 23:01:24 -04:00
Eric Banks 4cfe0dd857 Test for bad alleles so that we don't generate IndexOutOfBoundsExceptions 2011-07-07 23:01:03 -04:00
Mark DePristo 3d4f0e9dd7 Now supports the case where you have multiple AC values in the info field. 2011-07-07 17:21:15 -04:00
Ryan Poplin 212e9a1a0c Fixing unstable build after stable commit 2011-07-07 15:18:57 -04:00
Ryan Poplin 11d9a0473a Merged bug fix from Stable into Unstable 2011-07-07 15:03:58 -04:00
Ryan Poplin 50111db2b7 Fixing non-determinism in single-threaded VQSR by moving references to cern.Normal over to the static random generator available in GenomeAnalysisEngine 2011-07-07 15:02:48 -04:00
Eric Banks 52f6f9fdcc Merged bug fix from Stable into Unstable 2011-07-06 16:05:48 -04:00
Eric Banks 54121eb082 Catch malformed bams that cause the writer to run in infinite loops 2011-07-06 16:05:08 -04:00
Eric Banks 76a01a7453 Merged bug fix from Stable into Unstable 2011-07-06 12:53:09 -04:00
Eric Banks 14fee4ccbd Patch from Bob to deal with symbolic alleles: these weren't getting padded but they should be. 2011-07-06 12:51:44 -04:00
Ryan Poplin bdef233d4d Merged bug fix from Stable into Unstable 2011-07-06 10:05:02 -04:00
Ryan Poplin e8ed6b7f0f Adding more comments to main VQSR walker. Fixing copyright lines. Bug fix for default paths to now point to public/R/ instead of R/. Bug fix in VQSR for the path to the R scripts not ending in a slash. 2011-07-06 10:01:14 -04:00
Guillermo del Angel 8e8b901d12 Merged bug fix from Stable into Unstable
Merge branch 'master' into unstable
2011-07-06 09:57:55 -04:00
Guillermo del Angel 81a4d18468 Mark several indel-related arguments as @Hidden 2011-07-06 09:56:38 -04:00
Ryan Poplin fb315b5f8c Merge branch 'incoming' 2011-07-02 18:10:48 -04:00
Ryan Poplin 41d46059e7 fixing bad format statement 2011-07-02 18:09:17 -04:00
Ryan Poplin 3804afeb8a Merge branch 'incoming' 2011-07-02 17:55:39 -04:00
Ryan Poplin 781c0c33a4 Use the worst X% of calls in addition to the bad training sites list. Don't include the already added calls in the calculation of X% 2011-07-02 17:55:10 -04:00
Ryan Poplin 6b8af6afd8 Merge branch 'master' of ssh://gsa1.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-02 17:15:56 -04:00
Ryan Poplin fdc2ebb321 Adding ability to specify in VQSR a list of bad sites to use when training the negative model. Just add bad=true to the list of rod tags for your bad sites track. 2011-07-02 17:15:13 -04:00
Guillermo del Angel 09af6bbc6c Ugh - backed out experimental code, not for public consumption, that was unintentionally committed 2011-07-02 16:58:57 -04:00
Guillermo del Angel c6c0dba040 Merge branch 'master' of ssh://delangel@nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-02 16:45:34 -04:00
Ryan Poplin 4532a84314 Merged bug fix from Stable into Unstable 2011-07-02 10:48:55 -04:00