Commit Graph

6981 Commits (008aa281e7fd1a59900a5ab7880efbb51a1bd3ff)

Author SHA1 Message Date
Chris Hartl 008aa281e7 add interval list 2011-08-16 15:24:05 -04:00
Christopher Hartl 7cb42efccb Merge branch 'master' of ssh://chartl@tin.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-16 15:23:13 -04:00
Chris Hartl 99e7e53744 script it 2011-08-16 14:37:49 -04:00
Chris Hartl b36b047eb8 val --> var 2011-08-16 14:36:09 -04:00
Chris Hartl edd39da2ea Merge branch 'master' of ssh://gsa2/humgen/gsa-scr1/chartl/dev/git 2011-08-16 14:32:48 -04:00
Chris Hartl 880de6ccb0 RFCombine alteration 2011-08-16 14:32:06 -04:00
Chris Hartl 632cd9bed1 script to scatter-gather left align variants 2011-08-16 14:31:12 -04:00
Ryan Poplin 2d5bbecd9e Merged bug fix from Stable into Unstable 2011-08-16 14:19:04 -04:00
Ryan Poplin 9d4add3268 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable 2011-08-16 14:18:03 -04:00
Ryan Poplin 170d1ff7b6 Fix in UG for trying to call indels at IUPAC code bases when in EMIT_ALL_SITES mode 2011-08-16 14:17:46 -04:00
Andrey Sivachenko 9f3328db53 fixing read group name collision: before writing the read into respective stream in nway-out mode we now retrieve the original rg, not the merged/modified one 2011-08-16 13:45:40 -04:00
Andrey Sivachenko a684233265 Merge branch 'master' of ssh://cga1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-16 13:40:59 -04:00
Andrey Sivachenko c71a4e1832 this is a bug fix; reverting in unstable and pushing from stable instead 2011-08-16 13:40:35 -04:00
Christopher Hartl 801659c1d7 Merge branch 'master' of ssh://chartl@tin.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-16 13:24:51 -04:00
Christopher Hartl 637f3e0756 Commenting out the @Argument so queue will build -- still unsure where the bug is, but this walker can't be brought properly onto the new ROD system yet 2011-08-16 13:24:30 -04:00
Eric Banks 335421820e Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-16 13:08:25 -04:00
Eric Banks ab0b56ed11 Minor doc fixes 2011-08-16 12:55:45 -04:00
Eric Banks 125ad0bcfa Added docs to RTC 2011-08-16 12:46:48 -04:00
Eric Banks ef9216011e Added docs to IR 2011-08-16 12:24:53 -04:00
Christopher Hartl b9725d6ce5 Merge branch 'incoming' 2011-08-16 12:15:54 -04:00
Chris Hartl 9528cb4c33 A nicer version of the Intron Loss Genotyper that breaks out an engine for future modifications. 2011-08-16 12:14:30 -04:00
Christopher Hartl ea3dfcfb4f Merge branch 'master' of ssh://chartl@tin.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-16 11:04:21 -04:00
Andrey Sivachenko 9e1d443c47 fixing read group name collision: before writing the read into respective stream in nway-out mode we now retrieve the original rg, not the merged/modified one 2011-08-16 10:55:51 -04:00
Mark DePristo d0d2e9eb2a Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-16 10:44:52 -04:00
Mark DePristo ee4be4e5cc Detailed docs. Index loaded only once. Linear threads
-- Final testing version
-- Detailed docs on all arguments
-- Runs nThreads linearly from 1, 2, ..., maxThreads
-- Only loads a single index
2011-08-16 10:44:36 -04:00
Chris Hartl 6c0b9a7fc7 Moving RFCombine to the new ROD system 2011-08-16 09:47:57 -04:00
Eric Banks 21afdbfff0 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-16 01:03:12 -04:00
Eric Banks ab1e3d6a98 Use the right set of sample names 2011-08-16 01:03:05 -04:00
Mark DePristo fd37da13af Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-15 21:44:16 -04:00
Mark DePristo c3c11b2462 A simple utility that evaluates parallel performance in tribble
Tests low-level multi-threading performance of Tribble. Creates a thread pool that reads the input VCF file in parallel with N threads, from 1 to maxThreads in powers of 2, and emits the wall time needed to process the entire file. Assumes the VCF file has a chromosome named 1 that spans at least 250 Mb.

The output is a table showing the performance of Tribble:

  THREAD: 1-250000001 read 2288096 objects
TIME: 1 thread runtime 12.58
  THREAD: 125000001-250000001 read 1098468 objects
  THREAD: 1-125000001 read 1189628 objects
TIME: 2 thread runtime 8.66
  THREAD: 124000001-186000001 read 406935 objects
  THREAD: 62000001-124000001 read 569656 objects
  THREAD: 1-62000001 read 619972 objects
  THREAD: 186000001-248000001 read 678173 objects
TIME: 4 thread runtime 8.57
  THREAD: 124000001-155000001 read 101636 objects
  THREAD: 93000001-124000001 read 271390 objects
  THREAD: 62000001-93000001 read 298266 objects
  THREAD: 155000001-186000001 read 305299 objects
  THREAD: 31000001-62000001 read 297745 objects
  THREAD: 1-31000001 read 322227 objects
  THREAD: 217000001-248000001 read 334136 objects
  THREAD: 186000001-217000001 read 344037 objects
TIME: 8 thread runtime 9.68
2011-08-15 21:42:23 -04:00
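
The benchmark described in the commit above can be sketched roughly as follows. This is a minimal Python illustration, not the committed utility: the window arithmetic (whole-megabase steps of 250 // N Mb) is inferred from the THREAD lines in the output, the half-open window bounds are an assumption, and `partition`, `count_in_window`, and `benchmark` are hypothetical names. Counting sorted in-memory positions stands in for reading VCF records through Tribble.

```python
import time
from bisect import bisect_left
from concurrent.futures import ThreadPoolExecutor

CHROM_MB = 250  # the commit message assumes chromosome 1 spans at least 250 Mb


def partition(n_threads, chrom_mb=CHROM_MB):
    """Split the chromosome into n_threads equal windows of whole megabases."""
    step = (chrom_mb // n_threads) * 1_000_000
    return [(1 + i * step, 1 + (i + 1) * step) for i in range(n_threads)]


def count_in_window(positions, window):
    """Stand-in for reading records: count sorted positions in [start, stop)."""
    start, stop = window
    return bisect_left(positions, stop) - bisect_left(positions, start)


def benchmark(positions, max_threads):
    """Run with 1, 2, 4, ... threads; return (n_threads, counts, seconds) tuples."""
    results = []
    n = 1
    while n <= max_threads:
        windows = partition(n)
        t0 = time.perf_counter()
        with ThreadPoolExecutor(max_workers=n) as pool:
            counts = list(pool.map(lambda w: count_in_window(positions, w), windows))
        results.append((n, counts, time.perf_counter() - t0))
        n *= 2  # powers of 2, as in the commit message
    return results
```

With four threads, `partition(4)` yields windows starting at 1, 62000001, 124000001, and 186000001, which matches the window boundaries printed in the output above.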
David Roazen 3e9ef0622d Revert "1) RFCombine switched to the new ROD system"
This reverts commit cf989bd3cfae119ba9011873c5f5d5b80e37f67b.
2011-08-15 18:45:38 -04:00
David Roazen 1968b65ca8 Revert "Remove merge-added ======'s so this compiles"
This reverts commit be028b6513a129f81aa6f3593ea7d396c0e8fc25.
2011-08-15 18:45:21 -04:00
Christopher Hartl 5aa61fefec Remove merge-added ======'s so this compiles 2011-08-15 16:53:05 -04:00
Christopher Hartl cf3e826a69 1) RFCombine switched to the new ROD system
2) TreeReduce added to useful RODWalkers, but doesn't help very much due to scaling problems
3) RFA refactored, and a genotype-free calculation model added to calculate skew in a genotype-free way (still needs generalization to any ploidy)
4) Added walker to genotype intron loss events, calls into the UG engine to do so. This is very much a first-pass walker.
5) Documentation added for ValidationAmplicons
2011-08-15 16:37:31 -04:00
Eric Banks 36c7f83208 Refactoring VE stratifications so that they don't pass around bulky data; instead just pull needed data from the VE parent. This allows us to stop using deprecated features of the rod system. 2011-08-15 16:31:57 -04:00
Eric Banks 1246b89049 Forgot to initialize variants on the merge 2011-08-15 16:00:43 -04:00
Eric Banks 045e8a045e Updating random walkers to new rod system; removing unused GenotypeAndValidateWalker 2011-08-15 14:05:23 -04:00
Eric Banks fc2c21433b Updating random walkers to new rod system 2011-08-15 13:29:31 -04:00
Eric Banks 3d56bbf087 Resolving merge conflicts 2011-08-15 12:28:05 -04:00
Eric Banks 9ddbfdcb9f Check filtered status before applying to alt reference 2011-08-15 12:25:23 -04:00
Mauricio Carneiro ca9bb841c3 disabling ReduceReads integration tests
the new version of the walker is not yet ready for integration tests.
2011-08-14 17:35:30 -04:00
Mauricio Carneiro c7b69a4574 Fixed integration tests 2011-08-14 16:38:20 -04:00
Mauricio Carneiro 6ae3f9e322 Wrapped clipping op information
The clipping op extra information being kept by this walker was specific to the walker, not to the read clipper. Created a wrapper ReadClipperWithData class that keeps the extra information and leaves the ReadClipper slim.

(This is a quick commit to unbreak the build; integration tests are running, and further commits will follow if necessary.)
2011-08-14 15:44:48 -04:00
Mauricio Carneiro 8a51732049 Fixes to ReadClipper and added Reference Coordinate clipping.
* Added reference-coordinate-based hard clipping functions. These let you set a hard cut at the reference position where the read must be trimmed, regardless of indels.
* Soft clipping was corrupting the CIGAR string if there was already a hard clip at the beginning of the read. Fixed.
* Hard clipping now works with previously hard-clipped reads.
2011-08-14 14:54:33 -04:00
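
Clipping at a reference coordinate "despite indels" requires mapping that coordinate to an offset in the read by walking the CIGAR. The sketch below shows the general idea under stated assumptions; it is not the ReadClipper code. The function name `read_index_at` and the representation of a CIGAR as a list of `(length, op)` pairs are hypothetical, and the operator semantics follow the usual SAM convention (M/=/X consume read and reference, I/S consume only the read, D/N consume only the reference, H/P consume neither).

```python
def read_index_at(cigar, aln_start, ref_pos):
    """Return the 0-based read offset aligned to ref_pos, or None if there is none.

    cigar     -- list of (length, op) pairs, e.g. [(3, "M"), (2, "I"), (4, "M")]
    aln_start -- reference position of the first aligned (non-clipped) base
    ref_pos   -- reference position to look up
    """
    ref, idx = aln_start, 0
    for length, op in cigar:
        if op in "M=X":  # consumes both read and reference
            if ref <= ref_pos < ref + length:
                return idx + (ref_pos - ref)
            ref += length
            idx += length
        elif op in "IS":  # consumes read bases only
            idx += length
        elif op in "DN":  # consumes reference only
            if ref <= ref_pos < ref + length:
                return None  # position falls inside a deletion or skip
            ref += length
        # H and P consume neither read nor reference
    return None  # position lies outside the alignment
```

A caller could then trim the read sequence at the returned offset; how to treat a cut point that falls inside a deletion is a policy choice this sketch leaves open.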
Mauricio Carneiro 291d8c7596 Fixed HardClipping and Interval containment
* Hard clipping was wrongly hard clipping unmapped reads while soft clipping and then hard clipping mapped reads. Now we throw an exception if we try to hard/soft clip unmapped reads and use the soft->hard clip procedure for every mapped read.

* Interval containment needed a <= and >= to make sure it caught the borders right.
2011-08-14 14:54:33 -04:00
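
The border fix mentioned in the commit above comes down to using inclusive comparisons on both ends. A minimal sketch, assuming closed intervals with inclusive start and stop; `Interval` and `contains` are hypothetical names, not the committed code:

```python
from typing import NamedTuple


class Interval(NamedTuple):
    start: int  # first covered position, inclusive
    stop: int   # last covered position, inclusive


def contains(outer, inner):
    """True if inner lies entirely within outer, shared borders included."""
    return outer.start <= inner.start and outer.stop >= inner.stop
```

With strict `<` and `>` instead, an interval that shares a border with its container, or is identical to it, would be reported as not contained.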
Mauricio Carneiro 0be1dacddb Refactored interval clipping utility
Reads are now clipped in map(), and almost all cases are covered. The one case left out is a read that stretches across two intervals; it will need special treatment later.
2011-08-14 14:54:33 -04:00
Mauricio Carneiro e921230e72 Compressor class variables private 2011-08-14 14:54:33 -04:00
Mauricio Carneiro 3d73b6fe9d Getting rid of minBpForConsensus
The minimum is now 1 with the sliding reads, no need for the MBRC parameter anymore.
2011-08-14 14:54:33 -04:00
Mauricio Carneiro ec6b1731e8 First implementation of the Sliding Window
This is the first attempt at the sliding window approach to reduce the memory footprint and increase the performance of the ReduceReads walker.
The sliding window creates a running consensus as it traverses the reads, without saving them all in memory. It also generalizes the treatment of variable regions.
2011-08-14 14:54:33 -04:00
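
The idea of a running consensus that never holds all reads in memory can be illustrated with a toy example. This is a sketch under simplifying assumptions, not the ReduceReads implementation: reads are `(start, sequence)` pairs sorted by start, there are no indels or base qualities, variable regions are not handled, and the consensus is a plain majority vote. The name `sliding_consensus` is hypothetical.

```python
from collections import Counter


def sliding_consensus(reads):
    """Yield (position, consensus_base) for reads sorted by start position."""
    window = {}  # position -> Counter of bases observed at that position
    for start, seq in reads:
        # No later read can cover positions left of this start, so emit and
        # discard them; only the active window stays in memory.
        for pos in sorted(p for p in window if p < start):
            yield pos, window.pop(pos).most_common(1)[0][0]
        for offset, base in enumerate(seq):
            window.setdefault(start + offset, Counter())[base] += 1
    # Flush whatever remains once the input is exhausted.
    for pos in sorted(window):
        yield pos, window[pos].most_common(1)[0][0]
```

Memory use is bounded by the span of overlapping reads rather than by the total number of reads, which is the property the commit message is after.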
Mark DePristo ed2df49743 Rename so filename matches class 2011-08-14 10:44:28 -04:00