gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Aaron McKenna	ced6775de3	Changes to allow for external tests. Changes to the build script that allow the external directory to have tests. This means groups like CGA don't have to reinvent the wheel on testing, and can instead use the GATK's unit and integration tests. Signed-off-by: David Roazen <droazen@broadinstitute.org>	2012-01-19 13:04:24 -05:00
Christopher Hartl	98f8431b07	Right. Forgot the = true. If only there were some way to silently commit this OH WAIT	2012-01-19 12:36:30 -05:00
Christopher Hartl	7f3ad25b01	Adding a mode to VariantFiltration to invalidate previously-applied filters to allow complete re-filtering of a VCF. T2D VQSR: re-calling now done with appropriate quality settings and using BAQ.	2012-01-19 10:54:48 -05:00
Ryan Poplin	ecdd07b748	updating HaplotypeCaller integration test	2012-01-19 09:31:22 -05:00
Ryan Poplin	7e082c7750	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-01-19 09:11:23 -05:00
Christopher Hartl	d1c8c38541	A QScript to generate a VQSR of union sites for T2D, using a broad set and a union site set as input.	2012-01-19 02:04:04 -05:00
Christopher Hartl	39e6df5aa9	Fix edge case for very small VCFs	2012-01-19 00:51:28 -05:00
Christopher Hartl	1e037a0ecf	Ensure second-to-last line printed	2012-01-19 00:33:08 -05:00
Christopher Hartl	9946853039	Remove duplicated line	2012-01-19 00:25:22 -05:00
Christopher Hartl	cf9b1d350a	Some minor changes to in-process functions that nobody else uses. CGL now properly ignores no-calls for external VCFs.	2012-01-19 00:20:49 -05:00
Eric Banks	ab8f499bc3	Annotate with FS even for filtered sites	2012-01-18 22:04:51 -05:00
Mauricio Carneiro	b0b0cd9aef	Conforming to the guru's recommendation on library usage ;-) thanks Khalid.	2012-01-18 21:19:16 -05:00
Guillermo del Angel	b123416c4c	Resolve stale merge changes	2012-01-18 20:56:36 -05:00
Guillermo del Angel	2eb45340e1	Initial, raw, mostly untested version of new pool caller that also does allele discovery. Still needs debugging/refining. Main modification is that there is a new operation mode, set by argument -ALLELE_DISCOVERY_MODE, which if true will determine optimal alt allele at each computable site and will compute AC distribution on it. Current implementation is not working yet if there's more than one pool and it will only output biallelic sites, no functionality for true multi-allelics yet	2012-01-18 20:54:10 -05:00
Ryan Poplin	0133d1a901	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-01-18 09:53:42 -05:00
Ryan Poplin	0268da7560	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-01-18 09:53:00 -05:00
Ryan Poplin	60024e0d7b	updating TDT integration test	2012-01-18 09:52:50 -05:00
David Roazen	b7c65cb089	Merged bug fix from Stable into Unstable	2012-01-18 09:52:47 -05:00
Ryan Poplin	11982b5a34	We no longer calculate the population-level TDT statistic if there are fewer than 5 trios with full genotype likelihood information. When there is a high degree of missingness the results are skewed or in the worst case come out as NaN.	2012-01-18 09:42:41 -05:00
Mark DePristo	ca11f68303	Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-01-18 08:29:03 -05:00
Mark DePristo	9e77facda5	More analyses for random forest test script forest.R	2012-01-18 08:28:47 -05:00
Mark DePristo	5bd1a45879	Usability improvements to analyzeRunReports -- Print out the name / db of SQL server, not a Python connection object -- Print out the ID, not a Python object, of XML record that fails to convert	2012-01-18 08:27:15 -05:00
Mark DePristo	b52db51599	Don't try to write log to a nonexistent file	2012-01-18 08:26:49 -05:00
Mark DePristo	763c81d520	No longer enforce MAX_ALLELE_SIZE in VCF codec -- Instead issue a warning when a large (>1MB) record is encountered -- Optimized ref.getBytes()[i] => (byte)ref.charAt(i), which avoids an implicit O(n) allocation each iteration through computeReverseClipping()	2012-01-18 07:35:11 -05:00
Mark DePristo	0c7865fdb5	UnitTest for reverseAlleleClipping -- No code modified yet, just implementing a unit test to ensure correctness of the existing code	2012-01-18 07:35:11 -05:00
David Roazen	d5199db8ec	Be explicit about setting the snpEff -onlyCoding option in the pipeline. When run without an explicit -onlyCoding option, as we've been doing up to now, snpEff automatically sets -onlyCoding to "true" provided that there is at least one transcript marked as "protein_coding", which will always be the case for us in practice (and indeed, all pipeline runs so far with snpEff 2.0.5 have run with -onlyCoding auto-set to "true"). However, given the disastrous effect on annotation quality setting "-onlyCoding false" has, we wish to be explicit with this option rather than relying on snpEff's auto-detection logic.	2012-01-17 20:04:27 -05:00
Christopher Hartl	9770250b72	Fix for Amy W - evidently binding defaults are not null but an unbound object, which caused the improper branch to be entered into.	2012-01-17 17:28:58 -05:00
Mark DePristo	b0560f9440	Rev. tribble to fix BED codec bug in tribble 51	2012-01-17 16:40:26 -05:00
Mark DePristo	62801e430a	Bugfix for unnecessary optimization -- don't cache the ref bytes	2012-01-17 16:40:26 -05:00
Mark DePristo	f2b0575dee	Detect unreasonably large allele strings (>2^16) and throw an error -- samtools can emit alleles where the ref is 42M Ns and this caused the GATK (via tribble) to hang in several places. -- Tribble was updated so we actually could read the line properly (rev. to 51 here). -- Still the parsing algorithms in the GATK aren't happy with such a long allele. Instead of optimizing the code around an improper use case I put in a limit of 2^16 bp for any allele, and throw a meaningful exception when encountered.	2012-01-17 16:40:26 -05:00
Menachem Fromer	816dcf9616	Finally got around to adding support for Eric's fix to permit annotation exclusion by VariantAnnotator	2012-01-17 16:35:16 -05:00
Ryan Poplin	8b0ddf0aaf	Adding notes to CountCovariates docs about using interval lists as database of known variation	2012-01-17 16:13:13 -05:00
Mauricio Carneiro	ff2fc514ae	Updated plots to CGL walker. A few updates on the CalibrateGenotypeLikelihoods walker output: * Fixed ggplot2 issue with dataset with poor coverage * Added jitter as default geometry * Dropped the cut by technology from the graphs	2012-01-17 15:14:47 -05:00
Ryan Poplin	56761297dd	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-01-17 15:03:32 -05:00
Ryan Poplin	75f87db468	Replacing Mills file with new gold standard indel set in the resource bundle for release with v1.5	2012-01-17 15:02:45 -05:00
Matt Hanna	40ebc17437	Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-01-17 14:49:17 -05:00
Matt Hanna	41d70abe4e	At chartl's request, add the bwa aln -N and bwa aln -m parameters to the bindings.	2012-01-17 14:47:53 -05:00
Mark DePristo	2390449f0f	Local and S3 archiving scripts now push data to MySQL as well	2012-01-17 14:42:48 -05:00
Ryan Poplin	ae259f81cc	Bug fixing for merging of read fragments when one fragment contained an indel	2012-01-17 14:39:27 -05:00
Menachem Fromer	80a1ae254b	Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-01-17 14:25:40 -05:00
Menachem Fromer	284a8e9ddc	Fixed to match recent minor updates by Khalid and Eric	2012-01-17 14:24:41 -05:00
Christopher Hartl	cde224746f	Bait Redesign supports baits that overlap, by picking only the start of intervals. CalibrateGenotypeLikelihoods supports using an external VCF as input for genotype likelihoods. Currently can be a per-sample VCF, but has un-implemented methods for allowing a read-group VCF to be used. Removed the old constrained genotyping code from UGE -- the trellis calculated is exactly the same as that done in the MLE AC estimate; so we should just re-use that one.	2012-01-17 13:51:05 -05:00
Ryan Poplin	8e23c98dd9	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-01-17 13:46:28 -05:00
Matt Hanna	32ccde374b	Merged bug fix from Stable into Unstable	2012-01-17 11:08:35 -05:00
Matt Hanna	3ba918aff1	Error message cleanup in BAM indexing code.	2012-01-17 11:05:42 -05:00
Mark DePristo	aa8a885a5b	Generalizing forest.R analysis script -- Support for N tree analyses -- Testing of NA omit and roughfix options -- Misc. analyses and refactoring	2012-01-16 09:33:41 -05:00
Mark DePristo	8ddac9a06f	Don't show individual jobs in queueStatus for gsaadm, just count	2012-01-16 09:33:05 -05:00
Mark DePristo	61f82f138f	Extract a high-level GATK version from the SVN / GIT full version numbers in analyzeRunReports -- Maps SVN versions 1.0.5988 for example to 0.5, 1.0.6134 to 0.6, etc -- Maps GIT versions 1.x-XXX to 1.x. Used in tableau analyses	2012-01-16 09:30:48 -05:00
Mauricio Carneiro	8272c8bd26	Added exceptions to CGL walker * Assert that a user provided a VCF not some other type of ROD * Assert that the VCF has samples * Assert that the samples in the BAM exist in the VCF * Warn the user if not all samples in the BAM are present in the VCF	2012-01-14 14:10:19 -05:00
Mauricio Carneiro	cec7107762	Better location for the downsampling of reads in PrintReads * using the filter() instead of map() makes for a cleaner walker. * renaming the unit tests to make more sense with the other unit and integration tests	2012-01-14 14:06:09 -05:00

8621 Commits (ced6775de3af5b5df967ea96ada5c73300aa3bc1)
