gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Mark DePristo	0376d73ece	Improved, public version of ErrorRateByCycle -- A cleaner table output (molten). For those interested in seeing how this can be done with GATKReports look here for a nice clean example -- Integration tests -- Minor improvements to GATKReportTable with methods to getPrimaryKeys	2012-03-07 13:10:08 -05:00
Mark DePristo	569be953b9	Bugfix for VariantEval -- We weren't properly handling the case where a site had both a SNP and indel in both eval and comp. These would naturally pair off as SNP x SNP and INDEL x INDEL in eval, but we'd still invoke update2 with (null, SNP) and (null, INDEL), resulting most conspicuously in incorrect false negatives in the validation report. -- Updating misc. integration tests, as the counting of comps (in particular for dbSNP) was inflated because of this effect.	2012-03-06 16:56:59 -05:00
David Roazen	811f871f78	Do not fail tests that require the GATK private key if the user does not have permission to read it. Several of the unit tests for the new key authorization feature require read access to the GATK master private key file. Since this file is only readable by members of the group gsagit, this makes it hard for people outside the group to run the test suite. Now, we skip tests that require the master private key if the private key exists (since not existing would be a true error) but is not readable by the user running the test suite. Bamboo, of course, will always be able to run these tests.	2012-03-06 15:57:02 -05:00
Ryan Poplin	46b470cc69	Minor misc updates	2012-03-06 10:14:45 -05:00
David Roazen	0702ee1587	Public-key authorization scheme to restrict use of NO_ET -Running the GATK with the -et NO_ET or -et STDOUT options now requires a key issued by us. Our reasons for doing this, and the procedure for our users to request keys, are documented here: http://www.broadinstitute.org/gsa/wiki/index.php/Phone_home -A GATK user key is an email address plus a cryptographic signature signed using our private key, all wrapped in a GZIP container. User keys are validated using the public key we now distribute with the GATK. Our private key is kept in a secure location. -Keys are cryptographically secure in that valid keys definitely came from us and keys cannot be fabricated, however keys are not "copy-protected" in any way. -Includes private, standalone utilities to create a new GATK user key (GenerateGATKUserKey) and to create a new master public/private key pair (GenerateKeyPair). Usage of these tools will be documented on the internal wiki shortly. -Comprehensive unit/integration tests, including tests to ensure the continued integrity of the GATK master public/private key pair. -Generation of new user keys and the new unit/integration tests both require access to the GATK private key, which can only be read by members of the group "gsagit".	2012-03-06 00:09:43 -05:00
Ryan Poplin	f6905630bb	Adding Unit test for Haplotype class. Used in HC's genotype given alleles mode.	2012-03-05 21:08:07 -05:00
Ryan Poplin	14a77b1e71	Getting rid of redundant methods in MathUtils. Adding unit tests for approximateLog10SumLog10 and normalizeFromLog10. Increasing the precision of the Jacobian approximation used by approximateLog10SumLog which changes the UG+HC integration tests ever so slightly.	2012-03-05 12:28:32 -05:00
Ryan Poplin	f879daa7d0	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-03-05 08:29:08 -05:00
Ryan Poplin	d6871967ae	Adding more unit tests and contracts to PairHMM util class. Updating HaplotypeCaller to use the new PairHMM util class. Now that the HMM result isn't dependent on the length of the haplotype, there is no reason to ensure all haplotypes have the same length, which simplifies the code considerably.	2012-03-05 08:28:42 -05:00
Mark DePristo	69611af7d3	Workaround for bug in Picard in ReadGroupProperties -- NPE caused when you call getRunDate on a read group without a date.	2012-03-02 18:53:45 -05:00
Mark DePristo	2f334a57c2	ReadGroupProperties mk2 -- Includes paired end status (T/F) -- Includes count of reads used in calculation -- Includes simple read type (2x76 for example) -- Better handling of insert size, read length when there's no data, or the data isn't paired end by emitting NA not 0	2012-03-01 18:43:53 -05:00
Mauricio Carneiro	29f74b658b	Unit tests for the context covariate. This is simple, but it's the infrastructure to start messing around with the context.	2012-03-01 17:56:45 -05:00
Mark DePristo	aff508e091	ReadGroupProperties walker and associated infrastructure -- ReadGroupProperties: Emits a GATKReport containing read group, sample, library, platform, center, median insert size and median read length for each read group in every BAM file. -- Median tool that collects up to a given maximum number of elements and returns the median of the elements. -- Unit and integration tests for everything. -- Making name of TestProvider protected so subclasses can override name more easily	2012-03-01 15:01:11 -05:00
Mauricio Carneiro	d379c3763a	DNA Sequence to BitSet and vice-versa conversion tools * Turns DNA sequences (for context covariates) into bit sets for maximum compression * Allows variable context size representation guaranteeing uniqueness. * Works with long precision, so it is limited to a context size of 31 bases (can be extended with BigNumber precision if necessary). * Unit Tests added	2012-02-29 19:25:20 -05:00
Mark DePristo	ca0931c01f	Adding test for reading samtools VCF file	2012-02-27 17:05:50 -05:00
Eric Banks	bd944ab04f	Another test where we no longer print out 'NaN' for the AF.	2012-02-27 15:19:08 -05:00
Eric Banks	52871187d7	Adding integration test for file with no GTs. Also updated md5 for one other test (since we no longer print out 'NaN' for the AF).	2012-02-27 15:09:56 -05:00
Eric Banks	1ea34058c2	Updating integration tests now that standard annotations support multiple alleles	2012-02-27 11:32:26 -05:00
Guillermo del Angel	16122bea8d	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-02-25 13:57:54 -05:00
Guillermo del Angel	dea35943d1	a) Bug fix in calling new functions that give indel bases and length from regular pileup in LocusIteratorByState, b) Added unit test to cover these.	2012-02-25 13:57:28 -05:00
Mark DePristo	c8a06e53c1	DoC now properly handles reference N bases + misc. additional cleanups -- DoC now by default ignores bases with reference Ns, so these are not included in the coverage calculations at any stage. -- Added option --includeRefNSites that will include them in the calculation -- Added integration tests that ensures the per base tables (and so all subsequent calculations) work with and without reference N bases included -- Reorganized command line options, tagging advanced options with @Advanced	2012-02-25 11:32:50 -05:00
Mark DePristo	50de1a3eab	Fixing bad VCFIntegration tests -- Left disabled a test that should have been enabled -- Didn't add the md5 to the test I actually added -- Now VCFIntegrationTests should be working!	2012-02-25 11:26:36 -05:00
Mark DePristo	e0c189909f	Added support for breakpoint alleles -- See https://getsatisfaction.com/gsa/topics/support_vcf_4_1_structural_variation_breakend_alleles?utm_content=topic_link&utm_medium=email&utm_source=new_topic -- Added integrationtest to ensure that we can parse and write out breakpoint example	2012-02-23 12:14:48 -05:00
Mauricio Carneiro	75783af6fc	int <-> BitSet conversion utils for MathUtils * added unit tests.	2012-02-21 14:10:36 -05:00
David Roazen	85d31f80a2	Merged bug fix from Stable into Unstable	2012-02-13 16:37:11 -05:00
David Roazen	03e5184741	Fix serious engine bug that could cause reads to be dropped under certain circumstances. When aggregating raw BAM file spans into shards, the IntervalSharder tries to combine file spans when it can. Unfortunately, the method that combines two BAM file spans was seriously flawed, and would produce a truncated union if the file spans overlapped in certain ways. This could cause entire regions of the BAM file containing reads within the requested intervals to be dropped. Modified GATKBAMFileSpan.union() to correct this problem, and added unit tests to verify that the correct union is produced regardless of how the file spans happen to overlap. Thanks to Khalid, who did at least as much work on this bug as I did.	2012-02-13 16:25:21 -05:00
Eric Banks	ad90af94ed	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-02-13 15:10:10 -05:00
Eric Banks	0920a1921e	Minor fixes to splitting multi-allelic records (as regards printing indel alleles correctly); minor code refactoring; adding integration tests to cover +/- splitting multi-allelics.	2012-02-13 15:09:53 -05:00
Eric Banks	14981bed10	Cleaning up VariantsToTable: added docs for supported fields; removed one-off hidden arguments for multi-allelics; default behavior is now to include multi-allelics in one record; added option to split multi-allelics into separate records.	2012-02-13 14:32:03 -05:00
Ryan Poplin	41ffd08d53	On the fly base quality score recalibration now happens up front in a SAMIterator on input instead of in a lazy-loading fashion if the BQSR table is provided as an engine argument. On the fly recalibration is now completely hooked up and live.	2012-02-13 12:35:09 -05:00
Eric Banks	f52f1f659f	Multiallelic implementation of the TDT should be a pairwise list of values as per Mark Daly. Integration tests change because the count in the header is now A instead of 1.	2012-02-10 14:15:59 -05:00
Eric Banks	5e18020a5f	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-02-10 11:08:33 -05:00
Eric Banks	f53cd3de1b	Based on Ryan's suggestion, there's a new contract for genotyping multiple alleles. Now the requester submits alleles in any arbitrary order - rankings aren't needed. If the Exact model decides that it needs to subset the alleles because too many were requested, it does so based on PL mass (in other words, I moved this code from the SNPGenotypeLikelihoodsCalculationModel to the Exact model). Now subsetting alleles is consistent.	2012-02-10 11:07:32 -05:00
Mauricio Carneiro	5af373a3a1	BQSR with indels integrated! * added support to base before deletion in the pileup * refactored covariates to operate on mismatches, insertions and deletions at the same time * all code is in private so original BQSR is still working as usual in public * outputs a molten CSV with mismatches, insertions and deletions, time to play! * barely tested, passes my very simple tests... haven't tested edge cases.	2012-02-09 18:46:45 -05:00
Eric Banks	7a937dd1eb	Several bug fixes to new genotyping strategy. Update integration tests for multi-allelic indels accordingly.	2012-02-09 16:14:22 -05:00
Mauricio Carneiro	d561914d4f	Revert "First implementation of GATKReportGatherer" premature push from my part. Roger is still working on the new format and we need to update the other tools to operate correctly with the new GATKReport. This reverts commit aea0de314220810c2666055dc75f04f9010436ad.	2012-02-08 23:28:55 -05:00
Eric Banks	2f800b078c	Changes to default behavior of UG: multi-allelic mode is always on; max number of alternate alleles to genotype is 3; alleles in the SNP model are ranked by their likelihood sum (Guillermo will do this for indels); SB is computed again.	2012-02-08 15:27:16 -05:00
Mauricio Carneiro	337819e791	disabling the test while we fix it	2012-02-07 19:22:32 -05:00
Roger Zurawicki	c0c676590b	First implementation of GATKReportGatherer - Added the GATKReportGatherer - Added private methods in GATKReport to combine Tables and Reports - It is very conservative and it will only gather if the table columns match. - At the column level it uses the (redundant) row ids to add new rows. It will throw an exception if it is overwriting data. Added the gatherer functions to CoverageByRG. Also added the scatterCount parameter in the Interval Coverage script. Made some more GATKReport methods public. The UnitTest included shows that the merging methods work. Added a getter for the PrimaryKeyName. Fixed bugs that prevented the gatherer from working. Working GATKReportGatherer. Has only the functionality to addLines. The input file parser assumes that the first column is the primary key. Signed-off-by: Mauricio Carneiro <carneiro@broadinstitute.org>	2012-02-07 18:14:47 -05:00
Mauricio Carneiro	e1d69e4060	Make the size of a GenomeLoc int instead of long. It will never be bigger than an int, and it's actually useful for it to be an int so we can use it as a parameter to array/list/hash size creation.	2012-02-03 17:12:42 -05:00
Mauricio Carneiro	d5d4fa8a88	Fixed discordance bug reported by Brad Chapman. Discordance now reports discordance between genotypes as well (just like concordance).	2012-01-30 09:50:45 -05:00
Mauricio Carneiro	2a565ebf90	embarrassing fix-up, thanks Khalid.	2012-01-26 19:58:42 -05:00
Mauricio Carneiro	246e085ec9	Unit tests for GATKSAMRecord class * new unit tests for the alignment shift properties of reduce reads * moved unit tests from ReadUtils that were actually testing GATKSAMRecord, not any of the ReadUtils to it. * cleaned up ReadUtilsUnitTest	2012-01-26 17:06:36 -05:00
Ryan Poplin	cdff23269d	HaplotypeCaller now uses insertions and softclipped bases as possible triggers. LocusIteratorByState tags pileup elements with the required info to make this calculation efficient. The days of the extended event pileup are coming to a close.	2012-01-26 15:56:33 -05:00
Eric Banks	ddaf51a50f	Updated one integration test for indels	2012-01-25 19:18:51 -05:00
Eric Banks	e349b4b14b	Allow appending with the dbSNP ID even if a (different) ID is already present for the variant rod.	2012-01-25 11:35:54 -05:00
Mauricio Carneiro	ffd61f4c1c	Refactor the Pileup Element with regards to indels. Eric reported this bug due to the reduced reads failing with an index out of bounds on what we thought was a deletion, but turned out to be a read starting with insertion. * Refactored PileupElement to distinguish clearly between deletions and read starting with insertion * Modified ExtendedEventPileup to correctly distinguish elements with deletion when creating new pileups * Refactored most of the lazyLoadNextAlignment() function of the LocusIteratorByState for clarity and to create clear separation between what is a pileup with a deletion and what's not one. Got rid of many useless if statements. * Changed the way LocusIteratorByState creates extended event pileups to differentiate between insertions in the beginning of the read and deletions. * Every deletion now has an offset (start of the event) * Fixed bug when LocusIteratorByState found a read starting with insertion that happened to be a reduced read. * Separated the definitions of deletion/insertion (in the beginning of the read) in all UG annotations (and the annotator engine). * Pileup depth of coverage for a deleted base will now return the average coverage around the deletion. * Indel ReadPositionRankSum test now uses the deletion true offset from the read, changed all appropriate md5's * The extra pileup elements now properly read by the Indel mode of the UG made any subsequent call have a different random number and therefore all RankSum tests have slightly different values (in the 10^-3 range). Updated all appropriate md5s after extremely careful inspection -- Thanks Ryan! phew!	2012-01-24 16:07:21 -05:00
Khalid Shakir	c18beadbdb	Device files like /dev/null are now tracked as special by Queue and are not used to generate .out file paths, scattered into a temporary directory, gathered, deleted, etc. Attempted workaround for xdr_resourceInfoReq unsatisfied link during loading of libbat.so.	2012-01-23 16:17:04 -05:00
Mark DePristo	02450e4b12	Merged bug fix from Stable into Unstable	2012-01-23 12:08:39 -05:00
Mark DePristo	80a4ce0edf	Bugfix for incorrect error messages for missing BAMs and VCFs -- Missing BAMs were appearing as StingExceptions -- Missing VCFs were showing up as CommandLineErrors, but it's clearer for them to be CouldNotReadInputFile exceptions -- Added integration tests to ensure missing BAMs, VCFs, and -L files are properly thrown as CouldNotReadInputFile exceptions -- Added path to standard b37 BAM to BaseTest -- Cleaned up code in SAMDataSource, removing my parallel loading code as this just didn't prove to be useful.	2012-01-23 09:52:07 -05:00
Christopher Hartl	4a08e8ca6e	Minor tweaks to T2D-related qscripts. Replacing old md5s from the BeagleIntegrationTest. All differences boiled down either to the accounting of genotypes changed (./. --> 0/0 is no longer a "changed" genotype, and original genotypes that were ./. are represented as OG=. rather than OG=./. .) This is somewhat of an arbitrary decision, and is negotiable. I could see treating GT:PL ./.:. differently from GT:PL .:0,3,6 but am not sure the worth of doing so.	2012-01-23 08:25:34 -05:00
Eric Banks	ab8f499bc3	Annotate with FS even for filtered sites	2012-01-18 22:04:51 -05:00
Ryan Poplin	0268da7560	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-01-18 09:53:00 -05:00
Ryan Poplin	60024e0d7b	updating TDT integration test	2012-01-18 09:52:50 -05:00
Mark DePristo	0c7865fdb5	UnitTest for reverseAlleleClipping -- No code modified yet, just implementing a unit test to ensure correctness of the existing code	2012-01-18 07:35:11 -05:00
Mauricio Carneiro	cec7107762	Better location for the downsampling of reads in PrintReads * using the filter() instead of map() makes for a cleaner walker. * renaming the unit tests to make more sense with the other unit and integration tests	2012-01-14 14:06:09 -05:00
Mauricio Carneiro	28aa353501	Added "unbiased" downsampling parameter to PrintReads * also cleaned up and updated part of the unit tests for print reads. Needs a more thorough cleaning.	2012-01-12 16:33:55 -05:00
Matt Hanna	2c3176eb80	Merged bug fix from Stable into Unstable	2012-01-12 13:31:10 -05:00
Matt Hanna	cd43f016ce	Fixed NPE in getNextOverlappingBAMScheduleEntry() when mixed mapped/unmapped interval lists are used. Added integrationtest to verify behavior.	2012-01-12 13:29:11 -05:00
Mauricio Carneiro	77a03c9709	Patching special case in the adaptor clipping * if the adaptor boundary is more than MAXIMUM_ADAPTOR_SIZE bases away from the read, then let's not clip anything and consider the fragment to be undetermined for this read pair. * updated md5's accordingly	2012-01-11 17:47:44 -05:00
Eric Banks	c5320ef1af	Resolving changes in integration test during merge	2012-01-10 12:14:16 -05:00
Eric Banks	0f36f6947e	Resolving merge conflicts	2012-01-10 11:44:16 -05:00
Eric Banks	f2cecce10f	Much better implementation of the approximate summing of an array of log10 values (including more efficient rounding). Now effectively takes 0% of UG runtime on T2D GENES (as opposed to 11% previously).	2012-01-10 11:34:23 -05:00
Mark DePristo	dd80ffbbbe	Merged bug fix from Stable into Unstable	2012-01-05 21:51:48 -05:00
Mark DePristo	c96fee477c	Bug fix for VariantSummary -- Call sets with indels > 50 bp in length are tagged as CNVs in the tag (following the 1000 Genomes convention) and were unconditionally checking whether the CNV is already known, by looking at the known cnvs file, which is optional. Fixed. Has the annoying side effect that indels > 50bp in size are not counted as indels, and so are subtracted from both the novel and known counts for indels. C'est la vie -- Added integration test to check for this case, using Mauricio's most recent VCF file for NA12878 which has many large indels. Using this more recent and representative file is probably a good idea for future tests in VE and other tools. File is NA12878.HiSeq.WGS.b37_decoy.indel.recalibrated.vcf in Validation_Data	2012-01-05 21:51:06 -05:00
Guillermo del Angel	58d4539304	Enabled banded indel computation by default. Reversed logic in input UG argument so that we can still disable it if required. Minor changes to integration tests due to minor differences in GL's and in annotations	2012-01-04 15:28:26 -05:00
David Roazen	621ee2b613	Merged bug fix from Stable into Unstable	2012-01-03 16:56:49 -05:00
David Roazen	ea6e718cb8	SnpEff 2.0.5 support. Re-enabled SnpEff in the HybridSelectionPipeline. For now, we recommend only running with the GRCh37.64 database.	2012-01-03 15:18:36 -05:00
David Roazen	4984ca5e31	Merged bug fix from Stable into Unstable	2012-01-03 11:03:30 -05:00
David Roazen	f3f01da1af	Enforce serial dependencies in RecalibrationWalkersIntegrationTest. Some tests in this class were intermittently not being executed due to being randomly scheduled before tests whose results they depend on. Now the serial dependencies are enforced to avoid problematic orderings.	2012-01-03 10:42:41 -05:00
Eric Banks	ab8d47d9a5	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-01-03 09:38:49 -05:00
Mauricio Carneiro	cd68cc239b	Added knuth-shuffle (KS) and randomSubset using KS to MathUtils * Knuth-shuffle is a simple, yet effective array permutator (hope this is good english). * added a simple randomSubset that returns a random subset without repeats of any given array with the same probability for every permutation. * added unit tests to both functions	2012-01-03 09:29:46 -05:00
Mauricio Carneiro	94791a2a75	Add support for reads starting with insertion * Modified cleanCigarShift to allow insertions in the beginning and end of the read * Allowed cigars starting/ending in insertions in the systematic ReadClipper tests * Updated all ReadClipper unit tests * ReduceReads does not hard clip leading insertions by default anymore * SlidingWindow adjusts start location if read starts with insertion * SlidingWindow creates an empty element with insertions to the right * Fixed all potential divide by zero with totalCount() (from BaseCounts) * Updated all Integration tests * Added new integration test for multiple interval reducing	2012-01-03 09:29:45 -05:00
Mauricio Carneiro	1b6d52817e	fixing adaptor clipping effect on recalibration integration test	2012-01-01 22:20:06 -05:00
Eric Banks	393993e0c7	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-12-31 20:42:46 -05:00
Mauricio Carneiro	55cfa76cf3	Updated integration tests for the new adaptor clipping fix.	2011-12-30 18:47:14 -05:00
Mauricio Carneiro	c7d0a9ebee	Forgot to test for inter-chromosomal mates in the adaptor clipping * Fixing bug caught by Eric (and Kristian)	2011-12-30 00:19:53 -05:00
Eric Banks	1a45ea5a05	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-12-29 11:37:15 -05:00
Eric Banks	d20a25d681	A much better way of choosing the alternate allele(s) to genotype in the SNP model of UG: instead of looking at the sum of base qualities (which can and did lead to us over-genotyping esp. when allowing multiple alternate alleles), we look at the likelihoods themselves (free since we are already calculating likelihoods for all 10 genotypes). Now, even if the base quals exceed some arbitrary threshold, we only bother genotyping an alternate allele when there's a sample for which it is more likely than ref/ref (I can generate weird edge cases where this falls apart, but none that model truly variable sites that we actually want to call). This leads to a huge efficiency improvement esp. for exomes (and esp. for many samples) where we almost always were trying to genotype all 3 alternate alleles. Integration tests change only because ref calls have slight QUAL differences (because the best alt allele is still chosen arbitrarily, but differently).	2011-12-27 16:50:38 -05:00
Mauricio Carneiro	17bfe48d5e	Made all class methods private in the ReadClipper * ReadClipperUnitTest now uses static methods * Haplotype caller now uses static methods * Exon Junction Genotyper now uses static methods	2011-12-27 02:11:32 -05:00
David Roazen	506c0e9c97	Disabling SnpEff support in the GATK and SnpEff annotation in the HybridSelectionPipeline. SnpEff support will remain disabled until SnpEff 2.0.4 has been officially released and we've verified the quality of its annotations.	2011-12-23 19:12:57 -05:00
David Roazen	510c71158c	Merged bug fix from Stable into Unstable	2011-12-22 10:49:52 -05:00
David Roazen	32cdef9682	Rename PerformanceTest test classes to LargeScaleTest. This is in preparation for the installation of the new performance test suite in Bamboo. Note that "ant performancetest" is now "ant largescaletest"	2011-12-22 10:38:49 -05:00
Mauricio Carneiro	731a463415	Updated IntegrationTests with new adaptor clipper phew!	2011-12-20 17:48:52 -05:00
Mauricio Carneiro	cadff40247	getRefCoordSoftUnclippedStart and End refactor. These functions are methods of the read, and supplement getAlignmentStart() and getUnclippedStart() by calculating the unclipped start counting only soft clips. * Removed from ReadUtils * Added to GATKSAMRecord * Changed name to getSoftStart() and getSoftEnd() * Updated third party code accordingly.	2011-12-20 17:48:51 -05:00
Mauricio Carneiro	f73ad1c2e2	Bugfix/Rewrite: Algorithm to determine adaptor boundaries. The algorithm wasn't accounting for the case where the read is on the reverse strand and the insert size is negative. * Fixed and rewrote for more clarity (with Ryan, Mark and Eric). * Restructured the code to handle GATKSAMRecords only * Cleaned up the other structures and functions around it to minimize clutter and potential for error. * Added unit tests for all 4 cases of adaptor boundaries.	2011-12-20 17:48:39 -05:00
Mauricio Carneiro	78d9bf7196	Added REVERT_SOFTCLIPPED_BASES capability to ReadClipper * New ClippingOp REVERT_SOFTCLIPPED_BASES turns soft clipped bases into matches. * Added functionality to clipping op to revert all soft clip bases in a read into matches * Added revertSoftClipBases function to the ReadClipper for public use * Wrote systematic unit tests	2011-12-20 00:04:30 -05:00
Laurent Francioli	16cc2b864e	- Corrected bug causing cases where both parents are HET to be accounted twice in the TDT calculation - Adapted TDT Integration test to corrected version of TDT Signed-off-by: Ryan Poplin <rpoplin@broadinstitute.org>	2011-12-19 10:30:59 -05:00
Eric Banks	3069a689fe	Bug fix: if there are multiple records at a given position, it turns out that SelectVariants would drop all variants that follow after one that fails filters (instead of dropping just the failing one). Added an integration test to cover this case.	2011-12-19 10:04:33 -05:00
Mauricio Carneiro	5b678e3b94	Remove ClippingOp UnitTests * all testing functionality is in the ReadClipperUnitTest, no need to double test. * class and package naming cleanup	2011-12-19 07:49:26 -05:00
Eric Banks	76bd13a1ed	Forgot to update the unit test	2011-12-18 01:13:49 -05:00
Eric Banks	07f9d14d9f	Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2011-12-18 00:43:15 -05:00
Eric Banks	c5ffe0ab04	No reason to sum the normalized posteriors array to get Pr(AF>0) given that we can just compute 1.0 - array[0]. Integration tests change only because of trivial precision artifacts for reference calls using EMIT_ALL_SITES.	2011-12-18 00:31:47 -05:00
Eric Banks	6dc52d42bf	Implemented the proper QUAL calculation for multi-allelic calls. Integration tests pass except for the ones making multi-allelic calls (duh) and one of the SLOD tests (which used to print 0 when one of the LODs was NaN but now we just don't print the SB annotation for that record).	2011-12-18 00:01:42 -05:00
Khalid Shakir	6059ca76e8	Removing cruft that snuck in last commit.	2011-12-16 23:00:16 -05:00
Khalid Shakir	7486696c07	When using bam list mode in HSP, deriving VCF name from bam list instead of requiring an additional parameter. Creating a single temporary directory per ant test run instead of putting temp files across all runs in the same directory. Updated various tests for above items and other small fixes.	2011-12-16 18:09:25 -05:00
Mauricio Carneiro	e5df9e0684	Cleaner test output. Cleaned up the debug "pass" messages in the unit tests.	2011-12-16 18:04:00 -05:00
Mauricio Carneiro	fcc21180e8	Added hardClipLeadingInsertions UnitTest for the ReadClipper. Fixed issue where a read starting with an insertion followed by a deletion would break; the clipper can now safely clip the insertion and the deletion if that's the case. Note: test is turned off until contract changes to allow hanging insertions (left/right).	2011-12-16 18:02:47 -05:00
Mauricio Carneiro	075be52adc	Added hardClipByReferenceCoordinates (left and right tails) UnitTest for the ReadClipper	2011-12-16 18:01:33 -05:00
Mauricio Carneiro	5bba44d693	Added hardClipByReferenceCoordinates UnitTest for the ReadClipper * fixed edge case when requested to hard clip beginning of a read that had hanging soft clipped bases on the left tail. * fixed edge case when requested to hard clip end of a read that had hanging soft clipped bases on the right tail. * fixed AlignmentStart of a clipped read that results in only hard clips and soft clips note: added tests to all these beautiful cases...	2011-12-16 18:01:33 -05:00


631 Commits (8a9fb514b67ae1a66f9e1fa907ae75573e777b87)