Commit Graph

969 Commits (f4b409fa0df0ffc601cbf9efc718ab58c19b9b5c)

Author SHA1 Message Date
Mark DePristo a27641e1fc Cleaned up imports 2011-10-04 06:28:36 -07:00
Mark DePristo b20689ff55 No longer supports extraProperties
-- the underlying data structure is still present, but until I decide what to do for the extensible system I've completely disabled the subsystem
-- Added code to merge Samples, so that a mostly full record can be merged with a consistent empty record.  If the two records are inconsistent, an error is thrown
-- addSample() in Sample.class now invokes mergeSample() when appropriate
-- Validation types are now only STRICT or SILENT
-- Validation code implemented in SampleDBBuilder
-- Extensive unit tests for SampleDBBuilder
2011-10-03 19:20:33 -07:00
Mark DePristo 867a7476c1 Systematic unit tests for the sample object 2011-10-03 19:09:02 -07:00
Mauricio Carneiro 3837aa45b4 Fixing conflicts
Conflicts:
	public/java/test/org/broadinstitute/sting/utils/clipreads/ReadClipperUnitTest.java
2011-10-03 19:07:59 -07:00
Mark DePristo 2e3dc52088 Minor function renaming 2011-10-03 14:41:13 -07:00
Mark DePristo dd71884b0c On path to SampleDB engine integration
-- PedReader tag parser
-- Separation of SampleDBBuilder from SampleDB (now immutable)
-- Removed old sample engine arguments
2011-10-03 12:08:07 -07:00
Eric Banks c3eff7451a Found a small inefficiency while profiling: we were still using String.split instead of ParsingUtils.split to break up array values in the INFO field. There was a noticeable (albeit not big) difference in the change when reading sites only files. 2011-10-03 14:20:39 -04:00
Mark DePristo 8ee0f91904 Remove residual processing tracker arguments 2011-10-03 09:50:01 -07:00
Mark DePristo 89ac50e86e SampleDataSource -> SampleDB 2011-10-03 09:33:30 -07:00
Mark DePristo 93fba06cb5 Support for whitespace only lines 2011-10-03 09:30:10 -07:00
Mark DePristo 0604ce55d1 PedReader support for ; separated lines, not only newline 2011-10-03 09:19:58 -07:00
Mark DePristo 52f670c8b8 100% version of PedReader
-- Passes all unit tests
-- Added unit tests for missing fields
2011-10-03 06:12:58 -07:00
Roger Zurawicki bf6a3a6532 Added framework to do batch CigarClip Testing
*NOTE: This commit has not been compiled!
2011-10-02 22:33:46 -04:00
Mark DePristo dd75ad9f49 95% PedReader
-- Passes significant unit tests
-- Implicit sample creation for mom / dad when you create single samples
-- Continuing cleanup of Sample and SampleDataSource
2011-09-30 18:03:34 -04:00
Andrey Sivachenko c7898a9be7 inconsequential change in string constants printed into the vcf which no one uses anyway... 2011-09-30 16:40:21 -04:00
Mark DePristo 010899f886 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-30 15:51:09 -04:00
Mark DePristo 84160bd83f Reorganization of Sample
-- Moved Gender and Afflication to separate public enums
-- PedReader 90% implemented
-- Improve interface cleanup to XReadLines and UserException
2011-09-30 15:50:54 -04:00
Mauricio Carneiro 05fba6f23a Clipping ends inside deletion and before insertion
fixed.
2011-09-30 15:44:43 -04:00
Mark DePristo c1cf6bc45a PEDReader should be in samples 2011-09-30 14:22:19 -04:00
Mark DePristo 56f10b40a8 Fixing test bugs for WindowMaker that required empty sample list 2011-09-30 14:18:27 -04:00
Ryan Poplin af6c053435 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-30 13:33:31 -04:00
Mark DePristo 810e8ad011 Removed getXByReaders() function from the engine
-- These could be simplified in their downstream uses
-- Or they could be replaced with a generic getSAMFileHeaders() function and then apply the getSamples(header) as desired downstream
2011-09-30 10:43:51 -04:00
Mark DePristo 178ba24c27 Move getSamplesForSamFile to SampleUtils
-- A nearly identical piece of code already lived in SampleUtils.  Now there are two functions, one taking a regular header and another grabbing the merged header from the GATK engine itself.  Much cleaner
2011-09-30 10:28:18 -04:00
Mark DePristo 30d23942b1 Renamed ReadBackedPileup getXSampleName() functions to getXSample
-- now that we don't have Sample objects floating around we don't have to have all of the Name extensions on our functions
2011-09-30 10:02:57 -04:00
Mark DePristo 3289a325fc Removed final use of Sample in RBP 2011-09-30 09:57:39 -04:00
Mark DePristo a69a4dda2f SamplesDB no longer has null sample
-- Updated getSamples().size() == 2 test in CallableLociWalker that really ensured there was one sample in the system
2011-09-30 09:56:23 -04:00
Mark DePristo e055a78f6e LIBS now requires at least one sample be present
-- UnitTest provides a "null" sample for matching the reads without read groups
2011-09-30 09:49:35 -04:00
Mark DePristo 9860a2c989 Merge branch 'master' into ped 2011-09-30 09:28:18 -04:00
Mark DePristo d901fed617 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-30 08:41:44 -04:00
Mauricio Carneiro cabacf028d Intermediate commit to fix interval skipping
may need additional testing.
2011-09-29 18:45:12 -04:00
Mark DePristo b71b51751e Bug fix for UnitTest
-- Provide the null sample to the LIBS, as this seems to be required for correctly passing this unit test
-- Will be fixed in a future update
2011-09-29 17:30:01 -04:00
Mark DePristo 1765fbeb6b Merge branch 'master' into ped 2011-09-29 17:18:51 -04:00
Mark DePristo 98ecaf8aa0 Support for ReducedReads with reduced counts and average quals
-- ReadUtils and UnitTest updated to support new byte[] style
-- Removed unnecessary read transformer in PairHMM
2011-09-29 17:18:39 -04:00
Mauricio Carneiro 9508220157 fixed hard clipping both ends inside deletion
If both ends of the interval fall within a deletion in the read then hardClipBothEnds would cut the right tail first including the entire deletion, then fail to cut the left tail because there would not be any bases there anymore. Fixed.
2011-09-29 15:36:49 -04:00
Mark DePristo 9458f01409 Test cleanup of Sample object 2011-09-29 15:13:05 -04:00
Mark DePristo 625ffb6a07 LocusIteratorByState and ReadBackedPileups no longer use Sample 2011-09-29 14:52:11 -04:00
Mark DePristo b3a2371925 Merge branch 'master' into ped 2011-09-29 14:32:17 -04:00
Mark DePristo 68761a6e28 Removed sample from header 2011-09-29 14:13:05 -04:00
Mauricio Carneiro a5e75cd14c Outputting both consensus base qualities and counts
The base qualities of a consensus read are now the average quality of the bases forming the consensus base (most common base) and the consensus quality tag now carries an array with the counts of each base in the consensus. This should increase file size but improve calling sensitivity/specificity.
2011-09-29 12:54:41 -04:00
Mark DePristo 505416b6c0 Merge branch 'master' into ped 2011-09-29 12:22:39 -04:00
Mauricio Carneiro 4086fa768f Disabling all ReadClipperUnitTests 2011-09-29 12:20:35 -04:00
Mark DePristo 9536845e35 Cleaning up unused code in MV 2011-09-29 12:20:07 -04:00
Mark DePristo 5043d76c3d Removing more bad uses of SampleDataSource creation 2011-09-29 12:16:34 -04:00
Mark DePristo 5c9227cf5e Further cleanup of Sample database
-- Removing more and more unnecessary code
-- Partial removal of type safe Sample usage.  On the road to SampleDB only
2011-09-29 11:50:05 -04:00
Mark DePristo 2a0cd556d3 Further cleanup of Sample
-- Cleaned up interface functions in GAE
-- Added Walker.getSampleDB() function which is an easier option for tools to get the samples db
2011-09-29 10:34:51 -04:00
Mark DePristo e76f381628 Moved sample package from DataSources to gatk, and renamed it samples
-- All associated changes to the codebase are just header updates
2011-09-29 09:57:15 -04:00
Mark DePristo e197dcd1f3 Pre-cleanup commit of Sample and SampleDataSource
-- SampleDataSource has all reader functionality disabled
2011-09-29 09:44:18 -04:00
Mark DePristo 4d31673cc5 No longer supporting YAML file allows us to delete 75% of the sample's codebase 2011-09-29 09:43:31 -04:00
Ryan Poplin e366ee18bc Adding ability to read in and make use of kmer quality tables during HMM evaluation 2011-09-29 07:46:19 -04:00
Mauricio Carneiro fc86cd6fd8 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/carneiro/gatk/RR into rr 2011-09-29 00:12:15 -04:00
Roger Zurawicki 4fd5630f6a Added ReadClipper Unit Test
* Includes tests that include HardClip to Read and Reference Coords.
* Changed ReadUtils.HardClipByReferenceCoordinates from private to protected to allow for testing
2011-09-28 23:13:50 -04:00
Matt Hanna 9272ed03b5 Merged bug fix from Stable into Unstable 2011-09-28 21:26:43 -04:00
Matt Hanna 0acaf2df65 Fix an embarrassing issue where a specific configuration of minimal coverage
over small intervals could cause reads to be dropped from the pileup.  Nothing
to see here...
2011-09-28 21:23:01 -04:00
Guillermo del Angel c8d3a720f9 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-28 18:17:34 -04:00
Guillermo del Angel 7e3cb45093 Further performance optim in banded hmm, about 60% speed improvement over current implementation now 2011-09-28 16:27:28 -04:00
Ryan Poplin 1b1ca80df2 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-28 16:17:39 -04:00
Ryan Poplin 3b73dc89fe Making several esoteric arguments in the BQSR @Hidden. Adding basic support for Complete Genomics machine cycle. 2011-09-28 16:17:31 -04:00
Mauricio Carneiro ff2f4df043 Fixed hardclipping inside indel (right tail)
when hard clipping the right tail of a read falls inside a deletion, clipping should fall back to the last base before the deletion to follow the ReadClipper's contract.
2011-09-28 16:07:34 -04:00
Mauricio Carneiro 3c7b7f74ef Optimized interval iteration
Using a TreeSet to manipulate getToolkit.getIntervals() and being smart about which intervals to test makes interval clipping O(1) instead of O(n).
2011-09-28 16:07:34 -04:00
Mauricio Carneiro 5c9b659c02 clipping both ends of the reads was modifying the original read
This goes against the ReadClipper contract, and was affecting the second part of the read that spans over multiple intervals. Fixed.
2011-09-28 16:07:34 -04:00
Guillermo del Angel fe23e4d10c Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-28 15:53:11 -04:00
Guillermo del Angel e2b9030e93 First mostly fully functional implementation of banded pair HMM likelihood computation for indel caller. More experimentation to follow but it right now works in small data sets and at least it doesn't break existing things. Disabled by default at this point 2011-09-28 15:51:48 -04:00
Eric Banks 1b45f21774 Removing this command-line tool. Purposely not doing this in stable so that users who may still use it have time to find other options. But the docs are no longer on the wiki. 2011-09-28 13:18:32 -04:00
Eric Banks 1f0e354fae Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-28 13:13:21 -04:00
Eric Banks bb619a9a3c Fixing docs 2011-09-28 13:13:03 -04:00
Mark DePristo 5812004e06 Merge branch 'stable' 2011-09-28 11:36:40 -04:00
Mark DePristo a5006831d7 Shows "" not empty space when default string value is "" 2011-09-28 11:35:52 -04:00
Mark DePristo 1e32281a15 Fix to not show -null when missing short name argument 2011-09-28 11:31:20 -04:00
Mauricio Carneiro 89544c209c Fixing contracts
changed return type to Pair, changing contracts accordingly.
2011-09-28 11:19:17 -04:00
Eric Banks eacbee3fe5 Merged bug fix from Stable into Unstable 2011-09-27 20:35:18 -04:00
Eric Banks 43b0c98298 Fix docs 2011-09-27 20:34:46 -04:00
Eric Banks 232a6df11c Add longhand form to the error message. 2011-09-27 20:29:31 -04:00
Eric Banks 1d6fcb6eb1 Revert "Add longhand form to the error message to prevent users from posting borderline dumb posts to GS."
This reverts commit 75b2600527cfce05ae683cb394290ff2a80e8552.
2011-09-27 20:27:00 -04:00
Eric Banks 269b9826b6 Add longhand form to the error message to prevent users from posting borderline dumb posts to GS. 2011-09-27 20:26:36 -04:00
Mauricio Carneiro 3b6e43b7c4 Use reads that span multiple intervals
* RR will now compress reads that span across multiple intervals correctly and output them in the correct order.
* Fixed bug in getReadCoordinateForReferenceCoordinate where if the requested reference coordinate fell inside a deletion in the read the read would be clipped up to one element past the deletion.
2011-09-27 18:39:06 -04:00
Khalid Shakir 84bd355690 Merged bug fix from Stable into Unstable 2011-09-27 14:34:39 -04:00
Khalid Shakir b090751f62 Fixed Ant / PluginManager issue where reflections was picking up all class files under current working directory due to "." in jar manifest classpaths.
Updates to HybridSelectionPipeline:
- Added annotations back via snpEff
- Minor updates to VQSR paths and lowered memory
2011-09-27 14:33:57 -04:00
Eric Banks 26e71f6688 The Omni files have multiple records (with the same ALT) at a particular location, with one PASSing and the other(s) filtered. Chris, this is why using this file as both eval and comp leads to ref/no-call cells in the GenotypeConcordance table. However, this led to non-determinism in VE because the VCs were placed in a HashSet; we use a LinkedHashMap instead to bring back determinism. 2011-09-27 11:03:17 -04:00
Guillermo del Angel ceffefa6a6 Intermediate version with banded pair HMM 2011-09-27 10:18:58 -04:00
Mark DePristo e99ff3caae Removed lots of old, and not to be used, HMM options
-- resulted in massive code cleanup
-- GdA will integrate his new banded algorithm here
-- Removed: DO_CONTEXT_DEPENDENT_PENALTIES, GET_GAP_PENALTIES_FROM_DATA, INDEL_RECAL_FILE, dovit, GSA_PRODUCTION_ONLY
2011-09-27 10:08:40 -04:00
Mark DePristo fa0efbc4ca Refactoring of PairHMM to support reduced reads 2011-09-26 13:28:56 -04:00
Mark DePristo a6b65d6347 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-26 13:26:21 -04:00
Mark DePristo 4f09453470 Refactored reduced read utilities
-- UnitTests for key functions on reduced reads
-- PileupElement calls static functions in ReadUtils
-- Simple routine that takes a reduced read and fills in its quals with its reduced qual
2011-09-26 12:58:31 -04:00
Eric Banks 234b74dd05 Merged bug fix from Stable into Unstable 2011-09-26 11:47:23 -05:00
Eric Banks 317b95fa57 Fixing some annotator docs 2011-09-26 11:46:45 -05:00
Mauricio Carneiro b76dbc72f0 Fixed interval navigation bug.
If a read was hard clipped away from the current interval, all subsequent reads within that interval (not hardclipped) would be filtered out. Fixed.
2011-09-26 08:13:44 -04:00
Guillermo del Angel 9afccd11b1 Minor refactoring: add ability to MathUtils.normalizeFromLog10 to not go to linear domain but just subtract max value from log values and return. Use this function in snp and indel GL computation. 2011-09-25 21:18:56 -04:00
Guillermo del Angel 3eef800889 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-24 21:20:11 -04:00
Guillermo del Angel 4707ab4a7d Added unit tests to test genotype merges with PL's 2011-09-24 21:17:15 -04:00
Guillermo del Angel 203517fbb7 a) Cleanups/bug fixes to previous commit to CombineVariants.
b) Change md5 to reflect records that are now merged correctly.
c) Change unit merge alleles test to reflect the fact that a null non-variant vc object is not valid and not supported because there's no way to codify such object in a vcf. The code correctly converts this to a non-variant single-base event with whatever the reference is at that location.
2011-09-24 19:08:00 -04:00
Mauricio Carneiro c31f4cb2f6 Cleaning leading insertions
With the current implementation, a read cannot start with a deletion or an insertion. Maybe this will change in the future, but for now, chop the leading insertion off.
2011-09-24 14:33:32 -04:00
Guillermo del Angel cd058dd10f a) Fixed md5 for legit change in UG output that now also no-calls genotypes w/0,0,0 in PL's in SNP case.
b) First reimplementation of new vc merger of different types. Previous version did it in two steps, first merging all vc's per type and then trying to see if resulting vc's would be merged if alleles of one type were a subset of another, but this won't work when uniquifying genotypes since sample names would be messed up and GT sample names wouldn't match VC sample names. Now, it's actually simpler: when splitting vc's by type before merging, we check for alleles of one vc being a subset of alleles of vc of another type and if so we put them together in same list.
2011-09-24 13:40:11 -04:00
Mark DePristo bb11951255 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-24 09:26:45 -04:00
Mark DePristo 8d9e136bba Merge branch 'stable' 2011-09-24 09:26:28 -04:00
Mark DePristo 6804ab6d2f Bug fix for NPE in very short GATK runs
-- Was already in unstable, but not stable...
2011-09-24 09:25:29 -04:00
Mark DePristo 92acff46e5 Moved Haplotype into Utils root 2011-09-24 09:14:05 -04:00
Mark DePristo f792353dcd Framework for genotype unit test 2011-09-24 08:56:45 -04:00
Mark DePristo c0bb0cb465 Make DiploidGenotype enum private to walkers.genotyper 2011-09-24 08:48:33 -04:00
Guillermo del Angel 3a4469a236 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-23 21:58:34 -04:00
Guillermo del Angel 0e74cc3c74 a) Treat SNP genotype likelihoods just as indels, in the sense that they're always normalized as PL's so one of them will always be zero. This creates minor numerical differences in Qual and annotations due to numerical approximations in AF computation.
b) Intermediate CombineVariants fixes, not ready yet
2011-09-23 21:58:20 -04:00
Khalid Shakir 1803bd6ae2 Merged bug fix from Stable into Unstable 2011-09-23 21:05:00 -04:00
Khalid Shakir 8ceb93b8ac Fixed an integration test which crashed on the out of date LSF DRMAA library when run against the obsolete LSF dotkit instead of .combined_LSF_SGE 2011-09-23 21:03:22 -04:00
Mauricio Carneiro 7cac75ae1d Merged bug fix from Stable into Unstable 2011-09-23 19:00:43 -04:00
Mauricio Carneiro fbe3c1e0b3 Adding warning on HardClipping
Hard Clipping is still under heavy development and should not be used by anyone less prepared than MacGyver.
2011-09-23 19:00:19 -04:00
Mark DePristo b66841f179 Static cache for binomial probability
-- Very low level performance optimization
2011-09-23 17:29:34 -04:00
Mauricio Carneiro 1a45c331b2 bringing the latest bug fixes to Reduce Reads 2011-09-23 16:40:06 -04:00
Mauricio Carneiro 9ea40f2e41 Deletions/Insertions in hard clip and bug fixes
* Deletions now count as hard clipped bases in order to recover the original alignment start of a clipped read.
* Insertions do not count as hard clipped bases for the same reason.
* This created a bug in the previous cigar cleaning function. Fixed.
2011-09-23 16:37:08 -04:00
David Roazen 40202c85e0 Merged bug fix from Stable into Unstable 2011-09-23 16:35:55 -04:00
David Roazen e1cb5f6459 SnpEff annotator now assigns a functional class to each effect and distinguishes between actual effects and mere modifiers.
-We now assign a functional class (nonsense, missense, silent, or none) to each SnpEff effect, and add a
 SNPEFF_FUNCTIONAL_CLASS annotation to the INFO field of the output VCF.
-Effects are now prioritized according to both biological impact and functional class, instead of impact only.
-Many of SnpEff's "low-impact" effects are now classified as "modifiers" with lower priority than every
 other effect. This includes such "effects" as DOWNSTREAM, UPSTREAM, INTRON, GENE, EXON, and others that
 really describe the location of the variant rather than its biological effect.

This code will be short-lived (likely 1.2-only), as the next version of SnpEff will include most of these
features directly.

Checking this change into Stable+Unstable instead of Unstable because the current functional class stratification
in VariantEval is basically broken and urgently needs to be fixed for production purposes.
2011-09-23 16:06:52 -04:00
Matt Hanna e388c357ca Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-23 14:53:28 -04:00
Matt Hanna cc23b0b8a9 Fix for recent change modelling unmapped shards: don't invoke optimization to combine mapped and unmapped shards. 2011-09-23 14:52:31 -04:00
Mark DePristo e3d4efb283 Remove N2 EXACT model code, which should never be used 2011-09-23 11:55:21 -04:00
Mark DePristo 27ce3c822e Merge branch 'stable' 2011-09-23 09:04:52 -04:00
Mark DePristo 2bb77a7978 Docs for all VariantAnnotator annotations 2011-09-23 09:04:16 -04:00
Mark DePristo dd65ba5bae @Hidden for DocumentationTest and GATKDocsExample 2011-09-23 09:03:37 -04:00
Mark DePristo dfce301beb Looks for @Hidden annotation on all classes and excludes them from the docs 2011-09-23 09:03:04 -04:00
Mark DePristo 106a26c42d Minor file cleanup 2011-09-23 08:25:20 -04:00
Mark DePristo a9f073fa68 Genotype merging unit tests for simpleMerge
-- Remaining TODOs are all for GdA
2011-09-23 08:24:49 -04:00
Mark DePristo 4397ce8653 Moved removePLs to VariantContextUtils 2011-09-23 08:24:20 -04:00
Eric Banks a8e0fb26ea Updating md5 because the file changed 2011-09-23 07:33:20 -04:00
Mark DePristo c49cc623de Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-22 17:26:21 -04:00
Mark DePristo dab7232e9a simpleMerge UnitTest for not annotating and annotating to different info key 2011-09-22 17:26:11 -04:00
Mark DePristo 30ab3af0c8 A few more simpleMerge UnitTest tests for filtered vcs 2011-09-22 17:14:59 -04:00
Mark DePristo 5cf82f9236 simpleMerge UnitTest tests filtered VC merging 2011-09-22 17:05:12 -04:00
Mark DePristo 46ca33dc04 TestDataProvider now can be named 2011-09-22 17:04:32 -04:00
Mauricio Carneiro 96c875399c Merging many bug fixes to reduce reads 2011-09-22 17:04:11 -04:00
Mauricio Carneiro 39b54211d0 Fixed hard clipping soft clipped bases after hard clips
if soft clipped bases were after a hard clipped section of the read, the hard clip was clipping the left soft clip tail as if it were a right tail. Mayhem.
2011-09-22 15:46:55 -04:00
Mark DePristo 68da555932 UnitTest for simpleMerge for alleles 2011-09-22 15:16:37 -04:00
Mauricio Carneiro 1acf7945c5 Fixed hard clipped cigar and alignment start
* Hard clipped Cigar now includes all insertions that were hard clipped and not the deletions.
* The alignment start is now recalculated according to the new hard clipped cigar representation
2011-09-22 14:51:14 -04:00
Eric Banks 80d7300de4 Unit test was passing in FORMAT as one of the sample names. There used to be a hack in the VCFHeader to check for this and remove it and I couldn't figure out why, but now I know. Hack was removed and now the unit test passes in only the sample names as per the contract. 2011-09-22 13:28:42 -04:00
Mauricio Carneiro 4e9020c9f7 Fixed alignment start for hard clipping insertions 2011-09-22 13:28:25 -04:00
Eric Banks 9c1728416c Revert "Updating md5 for fixed file" because this was fixed properly in unstable (but will break SnpEff if put into Stable).
This reverts commit 6b4182c6ab3e214da4c73bc6f3687ac6d1c0b72c.
2011-09-22 13:16:42 -04:00
Eric Banks 888d8697b1 Merged bug fix from Stable into Unstable 2011-09-22 13:16:31 -04:00
Eric Banks 15a410b24b Updating md5 for fixed file 2011-09-22 13:15:41 -04:00
Mark DePristo ba5f83fee2 start of VariantContextUtils UnitTest
-- tests rsID merging
2011-09-22 12:10:39 -04:00
Mark DePristo 93dd1faa5f Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-22 11:20:10 -04:00
Mark DePristo a05c959e5a Empty unit tests for VariantContextUtils
-- will be expanded over the day
2011-09-22 11:20:07 -04:00
Mark DePristo 3fdee2b9ed Merge from stable into unstable 2011-09-22 11:19:43 -04:00
Christopher Hartl 4f4a0fc38a Merge branch 'master' of ssh://gsa2/humgen/gsa-scr1/chartl/dev/git 2011-09-22 11:01:58 -04:00
Christopher Hartl 982c47bfa7 Remove duplicate effort in ReadUtils (with apologies to Mauricio)
Big (but not major) cleanup of code in ILG - mostly excising the old likelihood model
Activated the early-abort check for ILG. I think it should be better this way.
2011-09-22 10:58:26 -04:00
Mark DePristo c514df6d18 Merge of stable into unstable 2011-09-22 10:34:27 -04:00
Mark DePristo f81a41b889 Updating MD5s for CombineVariants
-- Old version had broken RSIDs, new version is fixed.  No longer see rs1234,. as it is now just rs1234
2011-09-22 10:30:25 -04:00
Eric Banks b8ea9ceb68 Adding integration test that uses the -V:dbsnp binding to make sure it won't fail later on if someone messes with Tribble. 2011-09-21 22:43:31 -04:00
Eric Banks 8f8b59a932 My interpretation of the VCF spec is that the FORMAT field should only be present if there is genotype/sample data. So the VCFCodec now throws an exception when it encounters such a case. I had to fix one of the integration test VCFs. 2011-09-21 22:23:28 -04:00
Christopher Hartl dc96f6da79 Merge branch 'master' of ssh://chartl@gsa2/humgen/gsa-scr1/chartl/dev/git 2011-09-21 18:18:41 -04:00
Christopher Hartl f9cdc119af Added a method to ReadUtils that converts reads of the form 10S20M10S to 40M (just unclips the soft-clips).
Be careful when using this - if you're writing a bam file it will be potentially written out of order (since the previous alignment start was at the M, not the S).
2011-09-21 18:16:42 -04:00
Christopher Hartl faff6e4019 Failed to commit changes to the GATKReport required for easier access when using the files as data sources (read: histograms) for walkers 2011-09-21 18:15:23 -04:00
Mauricio Carneiro 96768c8a18 Sending latest bug fixes to Reduce Reads to the main repository 2011-09-21 17:43:11 -04:00
Mauricio Carneiro 70335b2b0a Hard clipping soft clipped reads to fix misalignments.
Pre-softclipped reads (with high qual) are a complicated event to deal with in the Reduced Reads environment. I chose to hard clip them out for now and added a todo item to bring them back on in the future, perhaps as a variant region.
2011-09-21 17:12:01 -04:00
Christopher Hartl ef05827c7b Merge branch 'master' of ssh://chartl@tin.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-21 16:40:47 -04:00