Commit Graph

10548 Commits (bfbf1686cd0f71c94dea59c84b6c74c71f0ae1af)

Author SHA1 Message Date
Matt Hanna 4ebdc52b0d Merged bug fix from Stable into Unstable 2011-10-04 13:16:20 -04:00
Matt Hanna 88c2fad64f Change vcf jar to use a classfileset to pull all dependencies. Should save
Jim Robinson some detective work in the long run.
2011-10-04 13:14:39 -04:00
Mark DePristo e1d6c7a50a Updating MD5 that have changed due to sample ordering differences 2011-10-04 09:33:23 -07:00
Mark DePristo 343a7b6b2f Updating UG integration tests for arbitrary impact of sample order changes on downsampling 2011-10-04 08:14:00 -07:00
Mark DePristo fee89e47ff Only throws an error when there are no samples but there are reads
-- Handles the case when you are running a ROD traversal and yet the LIBS is still used to return null everywhere.
2011-10-04 06:50:54 -07:00
Mark DePristo f552aede42 Only provide the sample names in the BAM file for efficiency 2011-10-04 06:50:12 -07:00
Mark DePristo a27641e1fc Cleaned up imports 2011-10-04 06:28:36 -07:00
Mauricio Carneiro 8b4a27092d Merge branch 'master' into rr 2011-10-03 19:46:04 -07:00
Mauricio Carneiro 6ca162ae9a Not pre-sorted.
Turns out it is a complicated solution when going multi-sample, so I'll leave it off for now.
2011-10-03 19:45:46 -07:00
Mark DePristo b20689ff55 No longer supports extraProperties
-- the underlying data structure is still present, but until I decide what to do for the extensible system I've completely disabled the subsystem
-- Added code to merge Samples, so that a mostly full record can be merged with a consistent empty record.  If the two records are inconsistent, an error is thrown
-- addSample() in Sample.class now invokes mergeSample() when appropriate
-- Validation types are now only STRICT or SILENT
-- Validation code implemented in SampleDBBuilder
-- Extensive unit tests for SampleDBBuilder
2011-10-03 19:20:33 -07:00
Mark DePristo 867a7476c1 Systematic unit tests for the sample object 2011-10-03 19:09:02 -07:00
Mauricio Carneiro 3837aa45b4 Fixing conflicts
Conflicts:
	public/java/test/org/broadinstitute/sting/utils/clipreads/ReadClipperUnitTest.java
2011-10-03 19:07:59 -07:00
Mark DePristo 2e3dc52088 Minor function renaming 2011-10-03 14:41:13 -07:00
Mark DePristo dd71884b0c On path to SampleDB engine integration
-- PedReader tag parser
-- Separation of SampleDBBuilder from SampleDB (now immutable)
-- Removed old sample engine arguments
2011-10-03 12:08:07 -07:00
Eric Banks c3eff7451a Found a small inefficiency while profiling: we were still using String.split instead of ParsingUtils.split to break up array values in the INFO field. There was a noticeable (albeit not big) difference in the change when reading sites only files. 2011-10-03 14:20:39 -04:00
Mark DePristo 8ee0f91904 Remove residual processing tracker arguments 2011-10-03 09:50:01 -07:00
Mark DePristo 89ac50e86e SampleDataSource -> SampleDB 2011-10-03 09:33:30 -07:00
Mark DePristo 93fba06cb5 Support for whitespace only lines 2011-10-03 09:30:10 -07:00
Mark DePristo 0604ce55d1 PedReader support for ; separated lines, not only newline 2011-10-03 09:19:58 -07:00
Mark DePristo 52f670c8b8 100% version of PedReader
-- Passes all unit tests
-- Added unit tests for missing fields
2011-10-03 06:12:58 -07:00
Roger Zurawicki bf6a3a6532 Added framework to do batch CigarClip Testing
*NOTE: This commit has not been compiled!
2011-10-02 22:33:46 -04:00
Mark DePristo dd75ad9f49 95% PedReader
-- Passes significiant unit tests
-- Implicit sample creation for mom / dad when you create single samples
-- Continuing cleanup of Sample and SampleDataSource
2011-09-30 18:03:34 -04:00
Andrey Sivachenko c7898a9be7 inconsequential change in string constants printed into the vcf which noone uses anyway... 2011-09-30 16:40:21 -04:00
Mark DePristo 010899f886 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-30 15:51:09 -04:00
Mark DePristo 84160bd83f Reorganization of Sample
-- Moved Gender and Afflication to separate public enums
-- PedReader 90% implemented
-- Improve interface cleanup to XReadLines and UserException
2011-09-30 15:50:54 -04:00
Mauricio Carneiro 05fba6f23a Clipping ends inside deletion and before insertion
fixed.
2011-09-30 15:44:43 -04:00
Mauricio Carneiro e0b771c233 Pre-sorting output
sorting variable regions independently to make the output pre-sorted. Major speedup.
2011-09-30 15:44:43 -04:00
Mauricio Carneiro b5bdea1cb9 Fix: hardclip first interval overlap but overlap subsequent intervals
If a read had the left tail clipped (that would overlap the current interval) but the right tail overlaps other intervals, this was triggering out of order reads to get processed before they should. Fixed.
2011-09-30 15:44:43 -04:00
Mauricio Carneiro d211b3b7f0 Fixing interval overlap logic
Complete re-write of the optimized interval overlap logic. Added lots of exceptions to trap  future wrongdoings.
2011-09-30 15:44:42 -04:00
Mark DePristo c1cf6bc45a PEDReader should be in samples 2011-09-30 14:22:19 -04:00
Mark DePristo 56f10b40a8 Fixing test bugs for WindowMaker that required empty sample list 2011-09-30 14:18:27 -04:00
Ryan Poplin 73577b72cf Misc minor updates in HaplotypeCaller 2011-09-30 13:41:24 -04:00
Ryan Poplin af6c053435 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-30 13:33:31 -04:00
Mark DePristo 810e8ad011 Removed getXByReaders() function from the engine
-- These could be simplied in their downstream uses
-- Or they could be replaced with a generic getSAMFileHeaders() function and then apply the getSamples(header) as desired downstream
2011-09-30 10:43:51 -04:00
Guillermo del Angel f75573dd54 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-30 10:35:26 -04:00
Guillermo del Angel 6637cd9dd5 Fixes to CalibrateGenotypeLikelihoods: a) Fix up the walker itself which was broken since the new rod binding system, b) added ability to do indels and to optionally test banded implementation, c) experimental ability to do multiple samples (may need more work) 2011-09-30 10:35:10 -04:00
Mark DePristo 178ba24c27 Move getSamplesForSamFile to SampleUtils
-- A nearly identical piece of code already lived in SampleUtils.  Now there are two functions, one taking a regular header and another grabbing the merged header from the GATK engine itself.  Much cleaner
2011-09-30 10:28:18 -04:00
Mark DePristo 30d23942b1 Renamed ReadBackedPileup getXSampleName() functions to getXSample
-- now that we don't have Sample objects floating around we don't have to have all of the Name extensions on our functions
2011-09-30 10:02:57 -04:00
Mark DePristo 3289a325fc Removed final use of Sample in RBP 2011-09-30 09:57:39 -04:00
Mark DePristo a69a4dda2f SamplesDB no longer has null sample
-- Updated getSamples().size() == 2 test in CallableLociWalker that really ensured there was one sample in the system
2011-09-30 09:56:23 -04:00
Mark DePristo e055a78f6e LIBS now requires at least one sample be present
-- UnitTest provides a "null" sample for matching the reads without read groups
2011-09-30 09:49:35 -04:00
Mark DePristo 9860a2c989 Merge branch 'master' into ped 2011-09-30 09:28:18 -04:00
Mark DePristo a881d6f145 Now only generates the poly VCF with select variants if the file doesn't exist 2011-09-30 08:42:09 -04:00
Mark DePristo d901fed617 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-09-30 08:41:44 -04:00
Mauricio Carneiro cabacf028d Intermediate commit to fix interval skipping
may need additional testing.
2011-09-29 18:45:12 -04:00
Mark DePristo b71b51751e Bug fix for UnitTest
-- Provide the null sample to the LIBS, as this seems to be required for correctly passing this unit test
-- Will be fixed in a future update
2011-09-29 17:30:01 -04:00
Mark DePristo 1765fbeb6b Merge branch 'master' into ped 2011-09-29 17:18:51 -04:00
Mark DePristo 98ecaf8aa0 Support for ReducedReads with reduced counts and average quals
-- ReadUtils and UnitTest updated to support new byte[] style
-- Removed unnecessary read transformer in PairHMM
2011-09-29 17:18:39 -04:00
Mauricio Carneiro 9508220157 fixed hard clipping both ends inside deletion
If both ends of the interval falls within a deletion in the read then hardClipBothEnds would cut the right tail first including the entire deletion, then fail to cut the left tail because there would not be any bases there anymore. Fixed.
2011-09-29 15:36:49 -04:00
Mark DePristo 9458f01409 Test cleanup of Sample object 2011-09-29 15:13:05 -04:00