7509 Commits (68da555932761567f79f0b3bfdbc808dcf72ca5e)

Author SHA1 Message Date
Mark DePristo 68da555932 UnitTest for simpleMerge for alleles 2011-09-22 15:16:37 -04:00
Mark DePristo 8811bb8668 Merge branch 'stable' 2011-09-22 12:11:01 -04:00
Mark DePristo ba5f83fee2 start of VariantContextUtils UnitTest
-- tests rsID merging
2011-09-22 12:10:39 -04:00
Eric Banks 5e06a45628 Fix the AnalyzeCovariates packaging. 2011-09-22 11:55:40 -04:00
Mark DePristo 8ca4b5e938 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-22 11:42:19 -04:00
Mauricio Carneiro d3cc25454c Updating the MDCP 2011-09-22 11:27:40 -04:00
Mark DePristo 93dd1faa5f Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-22 11:20:10 -04:00
Mark DePristo a05c959e5a Empty unit tests for VariantContextUtils
-- will be expanded over the day
2011-09-22 11:20:07 -04:00
Mark DePristo 3fdee2b9ed Merge from stable into unstable 2011-09-22 11:19:43 -04:00
Mauricio Carneiro 623c49765d NO BAQ ON EXOMES!
says the boss.
2011-09-22 11:13:40 -04:00
Christopher Hartl 63758efc17 Adding in a qscript for running the ILG (as calculating the insert size distribution needs to happen first). 2011-09-22 11:02:26 -04:00
Christopher Hartl 4f4a0fc38a Merge branch 'master' of ssh://gsa2/humgen/gsa-scr1/chartl/dev/git 2011-09-22 11:01:58 -04:00
Christopher Hartl 982c47bfa7 Remove duplicate effort in ReadUtils (with apologies to Mauricio)
Big (but not major) cleanup of code in ILG - mostly excising the old likelihood model
Activated the early-abort check for ILG. I think it should be better this way.
2011-09-22 10:58:26 -04:00
Mark DePristo c514df6d18 Merge of stable into unstable 2011-09-22 10:34:27 -04:00
Mark DePristo f81a41b889 Updating MD5s for CombineVariants
-- Old version had broken RSIDs, new version is fixed.  No longer see rs1234,. as it is now just rs1234
2011-09-22 10:30:25 -04:00
Eric Banks b8ea9ceb68 Adding integration test that uses the -V:dbsnp binding to make sure it won't fail later on if someone messes with Tribble. 2011-09-21 22:43:31 -04:00
Eric Banks 8f8b59a932 My interpretation of the VCF spec is that the FORMAT field should only be present if there is genotype/sample data. So the VCFCodec now throws an exception when it encounters such a case. I had to fix one of the integration test VCFs. 2011-09-21 22:23:28 -04:00
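To illustrate the rule described in commit 8f8b59a932, here is a minimal Python sketch of a check that rejects a VCF data line carrying a FORMAT column with no sample columns after it. This is not the VCFCodec code; the function name `check_format_column` and the error messages are hypothetical, and the sketch assumes tab-separated data lines with the eight fixed VCF columns.

```python
def check_format_column(record_line):
    """Return the number of sample columns in a VCF data line.

    Raises ValueError if the line has a FORMAT column (column 9) but no
    sample columns after it. Hypothetical helper, not the GATK VCFCodec.
    """
    fields = record_line.rstrip("\r\n").split("\t")
    if len(fields) < 8:
        raise ValueError("expected at least 8 fixed columns, got %d" % len(fields))
    if len(fields) == 9:
        # Exactly nine columns means FORMAT is present with nothing to describe.
        raise ValueError("FORMAT column present without genotype/sample data")
    # Eight columns: sites-only record. Ten or more: FORMAT plus samples.
    return max(len(fields) - 9, 0)
```

A sites-only line such as `"1\t100\t.\tA\tG\t50\tPASS\tDP=10"` passes with zero samples, while the same line with a trailing `GT` column and no sample raises.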
Ryan Poplin e53cb79d42 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-21 20:26:54 -04:00
Ryan Poplin 5d0f284305 Fixing exome specific arguments to the VQSR in the methods development calling pipeline 2011-09-21 20:26:28 -04:00
Christopher Hartl dc96f6da79 Merge branch 'master' of ssh://chartl@gsa2/humgen/gsa-scr1/chartl/dev/git 2011-09-21 18:18:41 -04:00
Christopher Hartl f9cdc119af Added a method to ReadUtils that converts reads of the form 10S20M10S to 40M (just unclips the soft-clips).
Be careful when using this - if you're writing a bam file it will be potentially written out of order (since the previous alignment start was at the M, not the S).
2011-09-21 18:16:42 -04:00
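The unclipping described in commit f9cdc119af can be sketched roughly as follows. This is an illustrative Python sketch, not the ReadUtils method: the name `unclip_soft_clips` and the (CIGAR string, alignment start) representation are assumptions. It shows why the warning about ordering applies: the alignment start moves left by the length of the leading soft clip.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def unclip_soft_clips(cigar, alignment_start):
    """Convert soft clips (S) to matches (M) and shift the alignment start.

    Returns (new_cigar, new_alignment_start). Hypothetical helper; it does
    not guard against the new start falling before the contig start.
    """
    ops = [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]

    # A leading soft clip (possibly preceded by hard clips) moves the start left.
    new_start = alignment_start
    for length, op in ops:
        if op == "H":
            continue
        if op == "S":
            new_start -= length
        break

    # Turn every S into M, then merge neighbouring operators of the same type.
    merged = []
    for length, op in ops:
        op = "M" if op == "S" else op
        if merged and merged[-1][1] == op:
            merged[-1] = (merged[-1][0] + length, op)
        else:
            merged.append((length, op))

    return "".join("%d%s" % (length, op) for length, op in merged), new_start
```

For the example in the commit message, `unclip_soft_clips("10S20M10S", 100)` gives `("40M", 90)`. A read that previously sorted at position 100 now starts at 90, which is how output can end up out of order.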
Christopher Hartl faff6e4019 Failed to commit changes to the GATKReport required for easier access when using the files as data sources (read: histograms) for walkers 2011-09-21 18:15:23 -04:00
Mauricio Carneiro 96768c8a18 Sending latest bug fixes to Reduce Reads to the main repository 2011-09-21 17:43:11 -04:00
Mauricio Carneiro 70335b2b0a Hard clipping soft clipped reads to fix misalignments.
Pre-softclipped reads (with high qual) are a complicated event to deal with in the Reduced Reads environment. I chose to hard clip them out for now and added a todo item to bring them back on in the future, perhaps as a variant region.
2011-09-21 17:12:01 -04:00
Christopher Hartl 1b47dcb1b5 Removing old intron loss genotyper (though all of the development tree got rebased away, I hope); committing a new version of the likelihood calculation model and the genotyper, as well as the sequence simulator (and a QualityScoreHisotgramWalker to help with the simulation of read qualities). RFA will remain local for now. 2011-09-21 16:55:09 -04:00
Christopher Hartl ef05827c7b Merge branch 'master' of ssh://chartl@tin.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-21 16:40:47 -04:00
Christopher Hartl 3b51d9106a Adding in likelihood calculations for mendelian violations. Also fixing a minor and rare bug in SelectVariants when specifying family structure on the command line. 2011-09-21 16:40:29 -04:00
Mark DePristo 04968c88b3 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-21 15:43:25 -04:00
Mark DePristo c6ba944719 Adding bgzip vcf file for unit tests 2011-09-21 15:39:45 -04:00
Mark DePristo 6bcfce225f Fix for dynamic type determination for bgzip files
-- GZipInputStream handles bgzip files under linux, but not mac
-- Added BlockCompressedInputStream test as well, which works properly on bgzip files
2011-09-21 15:39:19 -04:00
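One platform-independent way to recognize a bgzip file is to inspect its header directly. The Python sketch below is not the GATK implementation from commit 6bcfce225f; it assumes the BGZF layout given in the SAM/BAM specification (a gzip member with the FEXTRA flag set and a `BC` extra subfield of length 2), and the function name `is_bgzip_header` is hypothetical.

```python
import struct

def is_bgzip_header(header):
    """Return True if the bytes look like the start of a BGZF (bgzip) block.

    Assumes the layout from the SAM/BAM specification: gzip magic bytes,
    deflate compression, the FEXTRA flag, and a 'BC' extra subfield of
    length 2. Hypothetical helper for illustration only.
    """
    if len(header) < 12:
        return False
    if header[0] != 0x1F or header[1] != 0x8B or header[2] != 8:
        return False  # not gzip with deflate compression
    if not header[3] & 0x04:
        return False  # FEXTRA not set, so plain gzip at best

    # XLEN sits at offset 10; the extra field follows at offset 12.
    (xlen,) = struct.unpack_from("<H", header, 10)
    extra = header[12:12 + xlen]

    # Walk the extra subfields looking for SI1='B', SI2='C', SLEN=2.
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2 = extra[pos], extra[pos + 1]
        (slen,) = struct.unpack_from("<H", extra, pos + 2)
        if si1 == ord("B") and si2 == ord("C") and slen == 2:
            return True
        pos += 4 + slen
    return False
```

Reading the first 18 bytes of a file is enough for the common case where the `BC` subfield comes first. An ordinary gzip file fails the FEXTRA test and is reported as not bgzip.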
Mark DePristo 9f6f0c443c Marginally cleaner isVCFStream() function
-- cleanup trying to debug minor bug.  Failed to fix the bug, but the code is nicer now
2011-09-21 15:25:01 -04:00
Ryan Poplin 5fef6dc5d0 Merged bug fix from Stable into Unstable 2011-09-21 15:23:06 -04:00
Ryan Poplin 2585fc3d6c Updating Rscript path doc text for Broad users 2011-09-21 15:22:26 -04:00
Mark DePristo 74f9ccf6dd Merge 2011-09-21 11:30:11 -04:00
Mark DePristo 6592972f82 Putative fix for BAQ array out of bounds
-- Old code required qual to be <64, which isn't strictly necessary.  Now uses the Picard SAMUtils.MAX_PHRED_SCORE constant
-- Unittest to enforce this behavior
2011-09-21 11:25:08 -04:00
Eric Banks 174859fc68 Don't allow whitespace in the INFO field 2011-09-21 11:14:54 -04:00
Mark DePristo ecc7f34774 Putative fix for BAQ problem. 2011-09-21 11:09:54 -04:00
Mark DePristo 7d11f93b82 Final bugfix for CombineVariants
-- Now handles multiple records at a site, so that you don't see records like set=dbsnp-dbsnp-dbsnp when combining something with dbsnp
-- Proper handling of ids.  If you are merging files with multiple ids for the same record, the ids are merged into a comma separated list
2011-09-21 10:58:32 -04:00
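The ID handling described in commit 7d11f93b82 (and the `rs1234,.` fix mentioned in f81a41b889) can be sketched as below. This is an illustrative Python sketch, not the CombineVariants code: the name `merge_ids` is hypothetical, and dropping duplicate IDs is an assumption that the commit messages do not confirm.

```python
def merge_ids(ids):
    """Merge VCF ID values into one comma-separated string.

    The missing value '.' is dropped so that merging 'rs1234' with '.'
    gives 'rs1234' rather than 'rs1234,.'. Duplicate removal is an
    assumption of this sketch. Returns '.' if no real ID remains.
    """
    merged = []
    for raw in ids:
        # An input may itself already be a comma-separated list.
        for part in raw.split(","):
            part = part.strip()
            if part and part != "." and part not in merged:
                merged.append(part)
    return ",".join(merged) if merged else "."
```

With this sketch, `merge_ids(["rs1234", "."])` returns `"rs1234"`, and `merge_ids(["rs1", "rs2"])` returns `"rs1,rs2"`.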
Mark DePristo b36d396c16 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-21 10:16:24 -04:00
Mark DePristo 34f435565c Accidentally committed unclean tribble jar to repo 2011-09-21 10:16:17 -04:00
Mark DePristo a91ac0c5db Intermediate commit of bugfixes to CombineVariants 2011-09-21 10:15:05 -04:00
Mauricio Carneiro ac4f2d6d34 Fixing choppy consensus reads
When the consensus read had holes in the middle, the consensus was being finalized but not properly reinitialized. It was restarting with the old coordinates of the finalized consensus, misaligning following bases.
2011-09-21 00:49:50 -04:00
Mark DePristo 48c413fee8 Now throws an error when the mismatch fraction is too high 2011-09-20 21:28:31 -04:00
Mark DePristo 3b9314aecf Max fraction of mismatch test for debugging
-- Useful example for individuals who want to compute mismatches between a read and the reference.
2011-09-20 20:42:18 -04:00
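For readers who want to compute mismatches between a read and the reference, as commit 3b9314aecf mentions, a minimal Python sketch follows. It is not the GATK test code: the function names and the 0.5 default threshold are hypothetical, and it assumes the read and reference strings are already aligned base for base with no indels.

```python
def mismatch_fraction(read_bases, ref_bases):
    """Fraction of positions where the read differs from the reference.

    Assumes both strings cover the same positions (no indels) and
    compares case-insensitively. Hypothetical helper for illustration.
    """
    if len(read_bases) != len(ref_bases):
        raise ValueError("read and reference must have the same length")
    if not read_bases:
        return 0.0
    pairs = zip(read_bases.upper(), ref_bases.upper())
    mismatches = sum(1 for read_base, ref_base in pairs if read_base != ref_base)
    return mismatches / len(read_bases)

def check_mismatch_fraction(read_bases, ref_bases, max_fraction=0.5):
    """Raise ValueError when the mismatch fraction exceeds max_fraction.

    The default threshold is an arbitrary placeholder, not a GATK value.
    """
    fraction = mismatch_fraction(read_bases, ref_bases)
    if fraction > max_fraction:
        raise ValueError(
            "mismatch fraction %.2f exceeds maximum %.2f" % (fraction, max_fraction)
        )
    return fraction
```

For example, `mismatch_fraction("ACGT", "ACGA")` is 0.25, since one of the four bases differs.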
David Roazen b04d8eab55 Merged bug fix from Stable into Unstable 2011-09-20 17:24:14 -04:00
Mauricio Carneiro 758ecf2d43 Bringing latest updates of ReduceReads to the master repository 2011-09-20 16:35:09 -04:00
David Roazen d9ea764611 SnpEff annotator now adds OriginalSnpEffVersion and OriginalSnpEffCmd lines to the header of the VCF output file.
This change is urgently required for production, which is why it's going into Stable+Unstable
instead of just Unstable.

The keys for the SnpEff version and command header lines in the VCF file output by
VariantAnnotator (OriginalSnpEffVersion and OriginalSnpEffCmd) are intentionally
different from the keys for those same lines in the SnpEff output file (SnpEffVersion
and SnpEffCmd), so that output files from VariantAnnotator won't be confused
with output files from SnpEff itself.
2011-09-20 16:30:55 -04:00
Mark DePristo bffd3cca6f Bug fix for reduced read; only adds regular bases for calculation
-- No longer passes on deletions for genotyping
2011-09-20 15:07:06 -04:00
Mark DePristo 83bb91020f Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-20 14:52:54 -04:00
Menachem Fromer a97e039a62 Thanks to Chris for instructing me to use VCFExtractIntervals to get proper scattering of Variant Annotation 2011-09-20 14:29:39 -04:00