gatk-3.8

Author	SHA1	Message	Date
Matt Hanna	f7df8bdecc	Merged bug fix from Stable into Unstable	2011-10-27 11:31:17 -04:00
Matt Hanna	41ddc7bce7	Make sure we output a full stack trace when we encounter Tribble error messages on VCF header merge.	2011-10-27 11:30:04 -04:00
Eric Banks	44f905b5e5	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-10-26 23:31:11 -04:00
Eric Banks	68283b1651	Fixing docs and adding GATKdocs for the new interval functionality	2011-10-26 22:14:43 -04:00
Mark DePristo	c9978316a3	Merge branch 'FragmentUtils'	2011-10-26 19:51:49 -04:00
Mauricio Carneiro	add9ad97ec	No scatter gather for VQSR or ApplyVQSR. These walkers should not be scatter gatherable. Annotating them accordingly so that Queue doesn't allow a less than knowledgeable user to try and scatter/gather VQSR.	2011-10-26 16:35:44 -04:00
Ryan Poplin	74aeb22eeb	Merged bug fix from Stable into Unstable	2011-10-26 15:57:30 -04:00
Ryan Poplin	86871bd1e3	Throw a UserException in the BQSR when there is no data instead of creating an empty csv file	2011-10-26 15:56:41 -04:00
Mark DePristo	034a997d07	Generalized Reads -> Fragment calculation -- Supports ReadBackedPileup -> FragmentCollection as before -- Added support for List<SAMRecord> -> FragmentCollection for Ryan's haplotype caller -- General cleanup, renaming, move to separate package, more extensive unit tests, etc. -- Added toFragment() function to ReadBackedPileup interface	2011-10-26 15:54:38 -04:00
Eric Banks	2f21b6ecfb	Removed debugging output	2011-10-26 15:50:20 -04:00
Eric Banks	b39fcb1bea	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-10-26 15:44:25 -04:00
Eric Banks	b6ce6ed3f8	Go around the ROD system for now so that we can just call decodeLoc() for efficiency. Noted that we should go through the ROD system once it gets cleaned up. This means that currently gzipped files are not supported with -L.	2011-10-26 15:42:53 -04:00
Eric Banks	3273c20c98	Added integration tests for Tribble-based intervals and fixed up some of the other tests based on some method changes.	2011-10-26 15:29:18 -04:00
Eric Banks	9424e8b2ca	Initial working version of new interval system in which the argument for -L (and -XL) is allowed to be a rod file (e.g. VCF). Old samtools-style intervals still behave as before. BTI is no longer supported. The merging (union or intersection) of intervals is now consistently applied to all -L (or -XL) intervals, which is nice. More testing needed.	2011-10-26 14:11:49 -04:00
Mark DePristo	7fa943aef1	Renamed FragmentPileup to FragmentUtils	2011-10-26 14:01:45 -04:00
Mark DePristo	af3613cc5f	GATKSAMRecord commit branch summary First, I'm sure there's a better way to do this, but I wanted to create a single commit summarizing the changes from my branch SamRecordFactory. What's the best way to do this? Rebase? Now, on to the changes here: -- Picard added a SamRecordFactory that is used to create instances of the subclass SamRecord or BAMRecord. This factory allows us to have low-level picard readers (SamFileReader) create objects of type GATKSamRecord. The abomination of the extends and contains GATKSamRecord is now gone. GATKSamRecords are now produced by this factory, the GATK provides this factory to our SamFileReaders, and everything works with GATKSamRecord just extending BAMRecord. This results in up to a 2x performance improvement in writing BAMs and a ~10% improvement when reading BAM files. -- As a consequence of this, we no longer officially support SAM records. Attempting to create SAMRecord objects with the factory will throw a user exception. -- Created a standard NGSPlatform enum, and GATKSamRecords support efficiently obtaining this value. The real BQSR (not the copy indel version) got the efficient code to use this. Please add all future platforms to this enum. -- GATKSamRecord no longer supports using the OQ or defaultBaseQuality. This is performed in a wrapper iterator that's only added when these command line options are used. -- ReducedRead code has been moved from ReadUtils into efficient caching accessors in GATKSamRecord. -- ArtificialSamUtils creates GATKSamRecords now, not just SAMRecords. Added code here to create artificial pairs and using that code to create artificial ReadBackedPileups with specific properties -- New smarter algorithm for FragmentPileup. This new code is up to 3x faster than the previous version, and is lazy so is more efficient when no overlapping pairs are actually in the pileup. Created extensive DataProvider driven UnitTest. Added Caliper-based benchmarking system to characterize the performance differences between the old and new algorithms. TODO still remains to make an efficient version that works for non-pileups for the HaplotypeCaller	2011-10-25 20:52:56 -04:00
Mark DePristo	2822f0dc27	Merge branch 'SamRecordFactory'	2011-10-25 20:34:47 -04:00
Mark DePristo	1b722c21cf	merge master	2011-10-25 16:08:39 -04:00
Ryan Poplin	56fdf0b865	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-10-25 15:58:56 -04:00
Ryan Poplin	4a34c1862e	misc cleanup. We now filter out haplotypes when it is obvious that the assembly has failed to find a parsimonious event rather than use haplotypes with large numbers of SNPs and small indels on them.	2011-10-25 15:22:28 -04:00
David Roazen	2794e5c1d4	Modified the VCFJarClassLoadingUnitTest to play nice with the packaged-jar test targets.	2011-10-25 14:47:15 -04:00
Guillermo del Angel	b559936b7a	a)New variant eval stratification module for indel size. b) Next iteration on indel caller runtime optimization: when computing likelihood of each haplotype for a given read, many computations will be redundant since pieces of haplotypes will be common to both REF and ALT haplotypes. So, we keep HMM matrices from one haplotype to the next one and recompute starting at the part where either haplotype is different or GOP/GCP are different.	2011-10-25 09:56:43 -04:00
Khalid Shakir	fac9932938	Embedding gsalib source and queueJobReport R scripts in the dist and package jars. Moved gsalib and queueJobReport.R to embeddable namespaced locations. Updated packager dependencies/dir to add an @includes which filters the embedded fileset. RScriptExecutor can now JIT compile the gsalib. RScriptExecutor uses ProcessController and sends the Rscript output to java's stdout when run under -l DEBUG. Refactored ProcessController and IOUtils from Queue to Sting Utils. Added more unit tests to ProcessController along with a utility class to hard stop OutputStreams at a specified byte count. Replaced uses of some IOUtils with Apache Commons IO. ShellJobRunner refactored to use direct ProcessController and now kills jobs on shutdown. Better QGraph responsiveness on shutdown by using Object.wait() instead of Thread.sleep().	2011-10-24 15:58:34 -04:00
Khalid Shakir	89a581a66f	Added ability to specify arguments in files via -args/--arg_file Pushing back downsample and read filter args so they show up in getApproximateCommandLineArgs()	2011-10-24 15:58:34 -04:00
Mark DePristo	502592671d	Cleanup FragmentPileup before main repo commit -- removed intermediate functions. Now only original version and best optimized new version remain -- Moved general artificial read backed pileup creation code into ArtificialSamUtils	2011-10-24 14:40:05 -04:00
Mark DePristo	166174a551	Google caliper example execution script -- FragmentPileup with final performance testing	2011-10-24 14:04:53 -04:00
Mark DePristo	f6ccac889b	Merged bug fix from Stable into Unstable	2011-10-23 16:37:12 -04:00
Mark DePristo	585a45b7a3	Bug fix for ClipReadsWalker when stats output isn't provided -- See http://getsatisfaction.com/gsa/topics/clipreadswalker?utm_content=topic_link&utm_medium=email&utm_source=reply_notification	2011-10-23 16:36:48 -04:00
Ryan Poplin	f5d910b8a5	Haplotype caller now sends genotype likelihoods to the exact model to genotype the events found in the best haplotypes.	2011-10-23 13:29:08 -04:00
Mark DePristo	42bf9adede	Initial version of "fast" FragmentPileup code -- Uses mayOverlapRoutine in ReadUtils -- Attempts to be smart when doing overlap calculation, to avoid unnecessary allocations -- PileupElement now comparable (sorts on offset then on start) -- Caliper microbenchmark to assess performance	2011-10-22 21:36:37 -04:00
Mauricio Carneiro	4913f8a60f	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-10-21 17:45:07 -04:00
Mauricio Carneiro	102dafdcbc	Validation of GATKSamRecord in read filters Moved the validation of the GATKSamRecord to the MalformedReadFilter with the intent to make the read filter the ultimate validation location for sam records. This way we can opt to filter out malformed reads if we know what we are doing or blow up otherwise.	2011-10-21 17:40:43 -04:00
Guillermo del Angel	f4b409fa0d	CombineVariants bug fix: when merging records with disparate alleles we were leaving AC,AF fields intact. This had as a consequence that we could end up with a record with 3 alt alleles but only 2 values in AC,AF fields. Now, if alleles in combined vc are different from original, and if AC,AF fields can't be recomputed from genotypes, we remove attributes from vc map since they'll be invalid anyway. Integration test md5 changed since there were several badly merged records in result	2011-10-21 14:07:20 -04:00
Mark DePristo	b863390cb1	Moving reduced read functionality into GATKSAMRecord -- More functions take / produce GATKSAMRecords instead of SAMRecord	2011-10-21 13:28:05 -04:00
Mark DePristo	2403e96062	Renamed GATKSamRecord -> GATKSAMRecord for consistency. Better docs.	2011-10-21 09:59:24 -04:00
Mark DePristo	110e13bc1e	Merge branch 'master' into SamRecordFactory	2011-10-21 09:43:52 -04:00
Mark DePristo	be797a8a1f	Recalibrator now uses the much more efficient NGSPlatform in the cycle covariates system	2011-10-21 09:39:21 -04:00
Mark DePristo	ed74ebcfa1	GATKSamRecords with efficient NGSPlatform method	2011-10-21 09:38:41 -04:00
Mark DePristo	94e1898d8f	A canonical set of NGS platforms as enums with convenient manipulation methods	2011-10-21 09:37:45 -04:00
Mark DePristo	999a8998ae	Constructor for GATKSamRecord with header only, for unit testing	2011-10-19 17:51:48 -04:00
Mark DePristo	3227143a1c	Systematic test code for FragmentPileup -- Creates all combinations of overlapping and non-overlapping read pair pileups in all orientations and first/second pairings to validate fragment detection.	2011-10-19 17:50:27 -04:00
Mark DePristo	bba69701b5	Now creates GATKSamRecords, not SamRecords	2011-10-19 17:49:17 -04:00
Christopher Hartl	cd8a6d62bb	You know how the wiki has a big section on committing local changes to BRANCHES of the repository you clone it from? Yeah. It sucks if you don't do that. This commit contains: - IntronLossGenotyper is brought into its current incarnation - A couple of simple new filters (ReadName is super useful for debugging, MateUnmapped is useful for selecting out reads that may have a relevant unaligned mate) - RFA now matches my current local repository. It's in flux since I'm transitioning to the new traversal type. + the triggering read stash pilot required me to change the scope of some of the variables in the ReadClipping code, private -> protected. Those are all the changes there. - MendelianViolation restored to its former glory (and an annotator module that uses the likelihood calculation has been added) + use this rather than a hard GQ threshold if you're doing MV analyses. - Some miscellaneous QScripts	2011-10-19 17:42:37 -04:00
Mark DePristo	52345f0aec	Meaningful documentation string	2011-10-19 15:47:36 -04:00
Mark DePristo	1b38aa1a7e	Cleaning up reduced read code accessors	2011-10-19 15:46:44 -04:00
Eric Banks	d8d73fe4f2	Treat ./X genotypes as MIXED so that isHet, isHom, etc. still return the expected and correct values. Added docs to these accessors with contracts explicitly mentioned. Fixed case where NPE could be thrown.	2011-10-19 15:11:13 -04:00
Mark DePristo	7928b287fc	GATKSamRecord now produced by SAMFileReaders by default -- Removed all of the unnecessary caching operations in GATKSAMRecord -- GATKSAMRecord renamed to GATKSamRecord for consistency	2011-10-19 13:15:27 -04:00
Eric Banks	5a6468c11e	Allowing ./X genotypes and adding a unit test to ensure that this case is covered from now on (especially given that we may want to revert in the future). Reverting this change is really easy and entails uncommenting a few lines of code. But for now, despite Mark's objections, this case is allowed in the VCF spec and we are wrong not to allow it.	2011-10-19 11:52:05 -04:00
Eric Banks	48c4a8cb33	Make error messages clearer (even I was confused)	2011-10-19 11:49:16 -04:00
Eric Banks	6cadaa84c9	Just use validate() from super class since it does the same thing	2011-10-19 11:48:23 -04:00
Mark DePristo	df3e4e1abd	First working code to use SamRecordFactory to produce objects of our own design in SAMFileReader	2011-10-19 11:22:35 -04:00
Mauricio Carneiro	c27e2fb676	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-10-18 15:23:05 -04:00
Mark DePristo	f77f2eeb7d	Fix for new ID structure	2011-10-18 13:04:43 -04:00
Mark DePristo	1a92ee3593	No longer adds a binding of ID -> . when the ID field is dot in the VCF -- Really we should make ID a primary key in VariantContext. Putting it into the attributes is just annoying now	2011-10-18 10:57:02 -04:00
Ryan Poplin	e45fcb66eb	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-10-17 15:56:19 -04:00
Ryan Poplin	1e6794c539	fixing typo in VariantsToTable docs	2011-10-17 15:56:02 -04:00
Mark DePristo	0de8550f17	Merged bug fix from Stable into Unstable	2011-10-17 15:29:53 -04:00
Mark DePristo	c1329c4dde	Fixing a binary to logical or	2011-10-17 15:29:45 -04:00
Mark DePristo	9e4963efc8	Merged bug fix from Stable into Unstable	2011-10-17 15:27:38 -04:00
Mark DePristo	ec911ce5bb	Even better error messages	2011-10-17 15:27:22 -04:00
Mark DePristo	d065bf1715	Merged bug fix from Stable into Unstable	2011-10-17 15:25:47 -04:00
Mark DePristo	a7cf9cdc67	Fixing error message typo	2011-10-17 15:25:35 -04:00
Ryan Poplin	589df6b7cf	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-10-17 14:35:14 -04:00
Ryan Poplin	6b02354d84	Adding a new getter in VariantsToTable to extract the indel event length.	2011-10-17 14:34:52 -04:00
Mark DePristo	3550798c4c	Merged bug fix from Stable into Unstable	2011-10-17 13:58:56 -04:00
Mark DePristo	4108a294f7	Better error message when a RodBinding file doesn't exist	2011-10-17 13:58:46 -04:00
Mark DePristo	cc76826f78	Merged bug fix from Stable into Unstable	2011-10-17 13:38:11 -04:00
Mark DePristo	09a09cacef	Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/stable	2011-10-17 13:38:00 -04:00
Mark DePristo	fd4540cd32	Fixed extraordinarily subtle race condition with contracts invariant -- all of the methods in the class must be synchronized or the internal state can be inconsistent with the contract invariant when entering the class in a non-synchronized method, even when that method doesn't care about the object's internal state	2011-10-17 13:37:55 -04:00
David Roazen	88d6b8bc1f	Merged bug fix from Stable into Unstable	2011-10-14 20:13:38 -04:00
David Roazen	bd8bb93811	Split RScriptExecutorUnitTest into public and private test classes. We can't have a public test that depends on both public and private code/data -- the new release system needs to do public-only tests, and will catch this sort of thing.	2011-10-14 20:04:42 -04:00
David Roazen	4f01a742cb	Merged bug fix from Stable into Unstable	2011-10-13 21:39:52 -04:00
David Roazen	edfd6f8a06	Removing a public -> private dependency from the test suite. The public integration test VariantContextIntegrationTest was dependent on the private walker TestVariantContextWalker. Moved this walker to public/java/test (NOT public/java/src, since this walker is only used by the test suite) to avoid errors during public-only tests.	2011-10-13 21:32:52 -04:00
Mark DePristo	404ef741f1	Merged bug fix from Stable into Unstable	2011-10-13 18:02:06 -04:00
Mark DePristo	2ebdff074c	Update MD5s for SOLiD recalibration -- MD5 db had spelling error; fixed -- Bug in AlignmentUtils resulted in some bases not being color space corrected. The integration test caught the change, and it's clear that the new version is correct, as the prev. version was not considering the last the N qualities for reads with a ND operation.	2011-10-13 18:01:51 -04:00
Mark DePristo	5a881360df	Merged bug fix from Stable into Unstable	2011-10-13 15:54:43 -04:00
Mark DePristo	7cab6f6bb0	Bug fixes for thread unsafe simple timer and bad Ns treatment in AlignmentUtils -- SimpleTimer is now threadsafe using synchronized method keywords -- Bug fix for alignmentToByteArray() where the N case was refPos++ not the now correct refPos += elementLength	2011-10-13 15:53:12 -04:00
Mauricio Carneiro	e12ffb6547	Updating docs for GCContentByInterval This walker does not take any BAMs. It only walks over the reference.	2011-10-13 13:27:00 -04:00
Eric Banks	9aecd50473	Adding ability to exclude annotations from the VA and UG lists. As described in the docs, this argument trumps all others (including -all) so that we can get around the SnpEff issue brought up by Menachem. Added integration test for it.	2011-10-12 15:44:54 -04:00
Mauricio Carneiro	e53a952aeb	Added ION Torrent support to CountCovariates.	2011-10-12 01:57:02 -04:00
Mauricio Carneiro	a2733a451f	Added NotCalled feature to GAV Added "not called" and "no status" to the truth table. Very useful.	2011-10-11 19:31:45 -04:00
David Roazen	ae83420637	Merged bug fix from Stable into Unstable	2011-10-11 12:26:08 -04:00
David Roazen	794f275871	SnpEff is now marked as a RodRequiringAnnotation instead of an ExperimentalAnnotation. Having SnpEff grouped with the Experimental annotations was proving problematic, since it requires a rod. Placing it in its own group should improve the situation somewhat, making it easier to request "all annotations except for SnpEff".	2011-10-11 12:08:56 -04:00
David Roazen	cfd0ac8410	Merged bug fix from Stable into Unstable Conflicts: public/java/test/org/broadinstitute/sting/gatk/walkers/genotyper/UnifiedGenotyperIntegrationTest.java	2011-10-11 12:03:51 -04:00
David Roazen	24b72334b3	UnifiedGenotyper now correctly initializes the VariantAnnotator engine. This allows the annotation classes to perform any necessary initialization/validation. For example, it allows the SnpEff annotator to (among other things) validate its rod binding. This will prevent a NullPointerException when SnpEff annotation is requested but no rod binding is present. Added an integration test to cover this case so that it doesn't break again.	2011-10-11 12:02:05 -04:00
Guillermo del Angel	0429b38021	Merged bug fix from Stable into Unstable	2011-10-11 11:19:38 -04:00
Guillermo del Angel	1c485d8b5e	Forgot that no matter how trivial a change it's a good idea to compile first	2011-10-11 11:18:41 -04:00
Guillermo del Angel	6418f4d69b	Merged bug fix from Stable into Unstable	2011-10-11 11:13:18 -04:00
Guillermo del Angel	1975de1b32	Second try: hide --do_indel_quality in AnalyzeCovariates	2011-10-11 11:11:29 -04:00
Guillermo del Angel	6506ea83e8	Revert "Hide --do_indel_quality argument in AnalyzeCovariates. This shouldn't be documented nor used by external users"... a hidden passenger change made it through. This reverts commit 70e10ccb1be90dcff8f4485ae6ee036db2d1ac86.	2011-10-11 11:03:12 -04:00
Guillermo del Angel	4c1d8c8d44	Hide --do_indel_quality argument in AnalyzeCovariates. This shouldn't be documented nor used by external users	2011-10-11 11:01:06 -04:00
Eric Banks	77c983c5b5	No one claimed this walker and it doesn't have integration tests or GATKdocs so it doesn't belong in public.	2011-10-10 15:17:54 -04:00
Mark DePristo	fb72bcf732	DiffObjects no longer prints out the file name in the status so MD5 are stable	2011-10-10 15:10:57 -04:00
Mark DePristo	e3ff4f4266	Failing MD5 because output now contains absolute path	2011-10-10 11:05:02 -04:00
Mark DePristo	3e6c16d961	CombineVariants preserves allele order	2011-10-10 11:04:38 -04:00
Mark DePristo	a4bb842958	RankSum tests have slightly different MD5 results based on allele order -- UG GENOTYPE_GIVEN_ALLELES now uses the order of alleles in the VCF, so this changes the MD5	2011-10-10 11:04:07 -04:00
Mark DePristo	46e7370128	this.allele, getAlleles(), and getAltAlleles() now return List not set -- Changes associated code throughout the codebase -- Updated necessary (but minimal) UnitTests to reflect new behavior -- Much better makealleles() function in VC.java that enforces a lot of key constraints in VC	2011-10-09 11:45:55 -07:00
Mark DePristo	822654b119	UnitTests for allele getting functions in VC in prep for move from set to list	2011-10-09 10:36:14 -07:00
Mark DePristo	c67f6c076b	simpleMerge now preserves allele order -- UnitTests for dangerous PL merging cases in the multi-allelic case. The new behavior is correct	2011-10-08 17:39:53 -07:00
Mark DePristo	e94e6ba101	A UnitTest to ensure that the order of alleles is maintained -> A, C, T and A, T, C are different and must be maintained. The constructors were doing this appropriately, so nothing needed to be changed	2011-10-08 08:47:58 -07:00
Mark DePristo	ec14a4a606	Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-10-07 08:38:50 -07:00
Matt Hanna	6fbd41724a	Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-10-07 11:20:00 -04:00
Matt Hanna	4514bc350f	More reliable way of finding the Tribble jar.	2011-10-07 11:19:29 -04:00
Eric Banks	181c76750e	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-10-06 22:38:55 -04:00
Eric Banks	ca9cd9b688	Minor fix for merging intervals which hadn't been necessary when only merging from the left to right. Added integration tests to cover the parallelization of RTC.	2011-10-06 22:38:44 -04:00
Khalid Shakir	f91b015e0e	Made the BaseTest.testDir absolute	2011-10-06 22:33:21 -04:00
Mark DePristo	c7864c7256	Filter application order is now deterministic, in the order defined by the walker -- For no apparent reason we were using a HashSet to store the ReadFilters, so the order of operations was really arbitrarily applied. The order now is (1) the order of the walker intrinsic filters (2) read group black list (if provided) (3) command line filters (if provided)	2011-10-06 18:51:40 -07:00
Mark DePristo	0b88af4af9	Counts of records failing filters are displayed sorted -- Stops random ordering of the output, as the counts are returned sorted by string name of the class -- Deleted now unused sh*tty accessors in Utils	2011-10-06 18:42:26 -07:00
Mark DePristo	d1e70d6ec2	Removed Nx counting of reads in metrics with -nt > 1	2011-10-06 18:29:26 -07:00
Eric Banks	c61804a450	Rename the long version of the argument name to more accurately reflect its purpose.	2011-10-06 16:14:04 -04:00
Eric Banks	61a3dfae24	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-10-06 15:58:04 -04:00
Eric Banks	6eb87bf58a	RTC now caches all intervals as GenomeLocs (which is expected to take < 1Gb whole genome based on back of the envelope calculations with Matt) so that 1) we don't have to worry about emitting outside of the leaves in the hierarchical reductions and 2) we can emit the intervals in sorted order which is a big performance plus for the realigner. Integration tests change only because intervals whose start=stop are now printed as chr:start instead of chr:start-stop.	2011-10-06 15:57:49 -04:00
Mark DePristo	6d9c210460	Updating MD5s for updated BAM with read groups	2011-10-06 12:15:48 -07:00
Mark DePristo	ab357ef900	Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-10-06 10:50:02 -07:00
Eric Banks	1b0735f0a3	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-10-06 13:41:45 -04:00
Eric Banks	c4dfc1fb8b	Temporary commit of parallelization support for RealignerTargetCreator. Tim begged us for this and I got assurances from Khalid/Matt that this would also be extremely helpful for the whole genome calling pipeline, so I spent a while working on this. Needs to be fixed up though because apparently only the leaves in the hierarchical reduce get their output aggregated. Worked out a better solution with Matt.	2011-10-06 13:41:36 -04:00
Matt Hanna	3961733590	Merged bug fix from Stable into Unstable	2011-10-06 12:54:52 -04:00
Matt Hanna	4fa5045e84	Abandoning classfileset/rootfileset approach due to difficulty managing classloading of bcel.jar/ant-apache-bcel.jar. Switching instead to manually specifying a minimal set of packages/classes to include in the vcf.jar via build.xml, and adding a unit test which creates a limited classloader only aware of vcf.jar and tribble.jar and tries to use it to load the core classes in the vcf jar. Hopefully third time's the charm.	2011-10-06 12:49:51 -04:00
Mark DePristo	73f9d1f217	GATK read group requirement iron hand -- The GATK will now throw a user exception if it opens a SAM/BAM file that doesn't have at least one RG defined -- LIBS again throws an error if the complete list of samples isn't provided -- Updating ExmpleCountLociPipeline test to use the well-formatted versions of the exampleBAM and exampleFASTA files in testdata, instead of the old broken ones in validation_data. -- Convenience constructors for UserExceptions.MalformedBAM	2011-10-06 08:40:35 -07:00
Mark DePristo	23845ac798	Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-10-06 08:17:08 -07:00
Mark DePristo	4b5b9155a9	Fixed bad expected value in PedReaderUnitTest	2011-10-06 08:16:47 -07:00
Mark DePristo	daa5999489	Fixed typo in argument description	2011-10-06 08:16:25 -07:00
Guillermo del Angel	8a474e38ff	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-10-06 10:08:39 -04:00
Guillermo del Angel	93f7e632bd	Minor fix/enhancement for VariantEval: if a vcf has symbolic alleles, program would crash ungracefully - now we'll just skip record without processing. This is a big issue since we can't process 1000G integration files with code as is.	2011-10-06 10:07:46 -04:00
Mark DePristo	190be4d0d1	Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-10-05 21:27:11 -07:00
Mark DePristo	8e6845806a	Allowing empty samples list in LIBS -- Right now we cannot process BAM files without read groups because we enforce the samples list to not be empty when there's a SAM record. Now if there are reads and there are no samples we add the "null" sample so that LIBS walks the reads properly	2011-10-05 21:26:21 -07:00
Matt Hanna	180c8f286f	Merged bug fix from Stable into Unstable	2011-10-05 20:37:43 -04:00
Matt Hanna	55b9f06527	Ensure that IndelRealigner n-way out option supports MD5 generation.	2011-10-05 20:36:28 -04:00
Mark DePristo	be2d29ce69	Final PED documentation	2011-10-05 15:17:41 -07:00
Mark DePristo	3226d5dc0d	Merge branch 'master' into ped	2011-10-05 15:03:09 -07:00
Mark DePristo	6a573437af	Details documentation arguments for -ped	2011-10-05 15:00:58 -07:00
Mark DePristo	e7c80f7c45	Renaming quantitative trait to OtherPhenotype which is now a String not a double -- we can now use PED file to represent population data or other arbitrary phenotype data, not just doubles	2011-10-05 12:26:33 -07:00
Mark DePristo	51ecc20867	getFamily() and associated methods implemented and tested -- Sample no longer serializable -- Sample now implements Comparable	2011-10-05 09:55:05 -07:00
Mark DePristo	f4bac58f14	Merged bug fix from Stable into Unstable	2011-10-04 21:00:34 -07:00
Mark DePristo	d1d39943d0	Updating MD5 for BAMs that I added a read group to, part 2	2011-10-04 21:00:15 -07:00
Mark DePristo	9bd3ba4c7e	Missed one MD5	2011-10-04 16:04:52 -07:00
Mark DePristo	ffdfdcde3f	Updating MD5s -- Interval test now uses RG containing BAM -- DoC sample name ordering has changed.	2011-10-04 15:54:45 -07:00
Mark DePristo	a45d985818	TODO method stubs	2011-10-04 15:54:09 -07:00
Mark DePristo	463eab7604	All MD5 mismatches for test are shown -- Now for tests like DoC, with 20 output md5s, you see all of the differences before failing.	2011-10-04 15:53:52 -07:00
Mark DePristo	c642a080d4	Merged bug fix from Stable into Unstable	2011-10-04 14:08:41 -07:00
Mark DePristo	941317167e	Updating MD5 for BAMs that I added a read group to	2011-10-04 14:08:00 -07:00
Mark DePristo	e1d6c7a50a	Updating MD5 that have changed due to sample ordering differences	2011-10-04 09:33:23 -07:00
Mark DePristo	343a7b6b2f	Updating UG integration tests for arbitrary impact of sample order changes on downsampling	2011-10-04 08:14:00 -07:00
Mark DePristo	fee89e47ff	Only throws an error when there are no samples but there are reads -- Handles the case when you are running a ROD traversal and yet the LIBS is still used to return null everywhere.	2011-10-04 06:50:54 -07:00
Mark DePristo	f552aede42	Only provide the sample names in the BAM file for efficiency	2011-10-04 06:50:12 -07:00
Mark DePristo	a27641e1fc	Cleaned up imports	2011-10-04 06:28:36 -07:00
Mark DePristo	b20689ff55	No longer supports extraProperties -- the underlying data structure is still present, but until I decide what to do for the extensible system I've completely disabled the subsystem -- Added code to merge Samples, so that a mostly full record can be merged with a consistent empty record. If the two records are inconsistent, an error is thrown -- addSample() in Sample.class now invokes mergeSample() when appropriate -- Validation types are now only STRICT or SILENT -- Validation code implemented in SampleDBBuilder -- Extensive unit tests for SampleDBBuilder	2011-10-03 19:20:33 -07:00
Mark DePristo	867a7476c1	Systematic unit tests for the sample object	2011-10-03 19:09:02 -07:00
Mauricio Carneiro	3837aa45b4	Fixing conflicts Conflicts: public/java/test/org/broadinstitute/sting/utils/clipreads/ReadClipperUnitTest.java	2011-10-03 19:07:59 -07:00
Mark DePristo	2e3dc52088	Minor function renaming	2011-10-03 14:41:13 -07:00
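
Illustration for commit 7cab6f6bb0 (the alignmentToByteArray() fix, where the N case advanced refPos by one instead of by elementLength). The following is a simplified sketch of that class of bug, not the GATK code itself: the function names, the (length, operator) tuple representation of a CIGAR, and the example read are all assumptions made for illustration.

```python
# Simplified CIGAR walk; NOT the GATK implementation.
# Each CIGAR element is a hypothetical (length, operator) tuple.
REF_CONSUMING = {"M", "D", "N", "=", "X"}  # operators that advance the reference

def ref_span_buggy(cigar):
    """Mimics the reported bug: an N element advances the reference by 1."""
    ref_pos = 0
    for length, op in cigar:
        if op == "N":
            ref_pos += 1          # bug: ignores the element length
        elif op in REF_CONSUMING:
            ref_pos += length
    return ref_pos

def ref_span_fixed(cigar):
    """Mimics the fix: every reference-consuming element advances by its length."""
    ref_pos = 0
    for length, op in cigar:
        if op in REF_CONSUMING:
            ref_pos += length     # N is treated like any other skipped region
    return ref_pos

# A read with 10 matched bases, a 100-base skipped region, then 5 matched bases.
cigar = [(10, "M"), (100, "N"), (5, "M")]
print(ref_span_buggy(cigar))  # 16
print(ref_span_fixed(cigar))  # 115
```

With the buggy variant, every base after the N element would be paired with the wrong reference position, which is consistent with the MD5 changes reported in the follow-up commit 2ebdff074c.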

1114 Commits (cd3146f4cff4bc81aee7fd728dfb9c9a080e4247)