Mark DePristo
ee4be4e5cc
Detailed docs. Index loaded only once. Linear threads
...
-- Final testing version
-- Detailed docs on all arguments
-- Runs nThreads linearly from 1, 2, ..., maxThreads
-- Only loads a single index
2011-08-16 10:44:36 -04:00
Mark DePristo
fd37da13af
Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-15 21:44:16 -04:00
Mark DePristo
c3c11b2462
A simple utility that evaluates parallel performance in tribble
...
Tests low-level multi-threading performance of tribble. Creates a thread pool that reads the input VCF file in parallel with N threads, for N from 1 to maxThreads in powers of 2, and emits the wall time needed to process the entire file. Assumes the VCF file has a chromosome named 1 that spans at least 250 Mb.
The output is a nice table showing performance of Tribble:
THREAD: 1-250000001 read 2288096 objects
TIME: 1 thread runtime 12.58
THREAD: 125000001-250000001 read 1098468 objects
THREAD: 1-125000001 read 1189628 objects
TIME: 2 thread runtime 8.66
THREAD: 124000001-186000001 read 406935 objects
THREAD: 62000001-124000001 read 569656 objects
THREAD: 1-62000001 read 619972 objects
THREAD: 186000001-248000001 read 678173 objects
TIME: 4 thread runtime 8.57
THREAD: 124000001-155000001 read 101636 objects
THREAD: 93000001-124000001 read 271390 objects
THREAD: 62000001-93000001 read 298266 objects
THREAD: 155000001-186000001 read 305299 objects
THREAD: 31000001-62000001 read 297745 objects
THREAD: 1-31000001 read 322227 objects
THREAD: 217000001-248000001 read 334136 objects
THREAD: 186000001-217000001 read 344037 objects
TIME: 8 thread runtime 9.68
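The procedure described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the utility's actual code: `ThreadScalingSketch`, `RangeReader`, `splitRanges`, and `readAll` are hypothetical names, the reader is a stand-in rather than a tribble call, and the bin arithmetic (integer division in Mb, then scaling by 1,000,000) is inferred from the ranges printed above, not taken from the source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadScalingSketch {
    /** Hypothetical stand-in for an indexed query over one range of chromosome 1. */
    interface RangeReader {
        long countObjects(long start, long end);
    }

    /** Splits the first totalMb megabases into nThreads equal bins (assumed arithmetic). */
    static List<long[]> splitRanges(int totalMb, int nThreads) {
        long binSize = (totalMb / nThreads) * 1_000_000L; // integer division, as the printed ranges suggest
        List<long[]> ranges = new ArrayList<>();
        for (int i = 0; i < nThreads; i++) {
            long start = 1 + i * binSize;
            ranges.add(new long[] {start, start + binSize});
        }
        return ranges;
    }

    /** Reads every bin on its own pool thread and returns the total object count. */
    static long readAll(RangeReader reader, int totalMb, int nThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (long[] range : splitRanges(totalMb, nThreads)) {
                Callable<Long> task = () -> reader.countObjects(range[0], range[1]);
                futures.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : futures) {
                total += f.get(); // blocks until that bin has been read
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int maxThreads = 8;
        // Placeholder reader: a real run would query an indexed VCF here.
        RangeReader fake = (start, end) -> (end - start) / 100;
        for (int n = 1; n <= maxThreads; n *= 2) { // powers of 2
            long t0 = System.nanoTime();
            readAll(fake, 250, n);
            double seconds = (System.nanoTime() - t0) / 1e9;
            System.out.println(String.format("TIME: %d thread runtime %.2f", n, seconds));
        }
    }
}
```

With 250 Mb and 4 threads, the assumed split yields bins of 62 Mb each, which reproduces the 1-62000001 through 186000001-248000001 ranges shown in the table.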
2011-08-15 21:42:23 -04:00
David Roazen
3e9ef0622d
Revert "1) RFCombine switched to the new ROD system"
...
This reverts commit cf989bd3cfae119ba9011873c5f5d5b80e37f67b.
2011-08-15 18:45:38 -04:00
David Roazen
1968b65ca8
Revert "Remove merge-added ======'s so this compiles"
...
This reverts commit be028b6513a129f81aa6f3593ea7d396c0e8fc25.
2011-08-15 18:45:21 -04:00
Christopher Hartl
5aa61fefec
Remove merge-added ======'s so this compiles
2011-08-15 16:53:05 -04:00
Christopher Hartl
cf3e826a69
1) RFCombine switched to the new ROD system
...
2) TreeReduce added to useful RODWalkers, but doesn't help very much due to scaling problems
3) RFA refactored, and a genotype-free calculation model added to calculate skew in a genotype-free way (still needs generalization to any ploidy)
4) Added walker to genotype intron loss events, calls into the UG engine to do so. This is very much a first-pass walker.
5) Documentation added for ValidationAmplicons
2011-08-15 16:37:31 -04:00
Eric Banks
1246b89049
Forgot to initialize variants on the merge
2011-08-15 16:00:43 -04:00
Eric Banks
045e8a045e
Updating random walkers to new rod system; removing unused GenotypeAndValidateWalker
2011-08-15 14:05:23 -04:00
Eric Banks
fc2c21433b
Updating random walkers to new rod system
2011-08-15 13:29:31 -04:00
Eric Banks
3d56bbf087
Resolving merge conflicts
2011-08-15 12:28:05 -04:00
Eric Banks
9ddbfdcb9f
Check filtered status before applying to alt reference
2011-08-15 12:25:23 -04:00
Mauricio Carneiro
ca9bb841c3
disabling ReduceReads integration tests
...
the new version of the walker is not yet ready for integration tests.
2011-08-14 17:35:30 -04:00
Mauricio Carneiro
c7b69a4574
Fixed integration tests
2011-08-14 16:38:20 -04:00
Mauricio Carneiro
6ae3f9e322
Wrapped clipping op information
...
The clipping op extra information being kept by this walker was specific to the walker, not to the read clipper. Created a wrapper ReadClipperWithData class that keeps the extra information and leaves the ReadClipper slim.
(This is a quick commit to unbreak the build. I am running the integration tests and will make further commits if necessary.)
2011-08-14 15:44:48 -04:00
Mauricio Carneiro
8a51732049
Fixes to ReadClipper and added Reference Coordinate clipping.
...
* Added reference coordinate based hard clipping functions. This allows you to set a hard cut on where you need the read to be trimmed despite indels.
* Soft clipping was corrupting the CIGAR string if there was already a hard clip at the beginning of the read. Fixed.
* Hard clipping now works with previously hard-clipped reads.
2011-08-14 14:54:33 -04:00
Mauricio Carneiro
291d8c7596
Fixed HardClipping and Interval containment
...
* Hard clipping was wrongly hard clipping unmapped reads while soft clipping and then hard clipping mapped reads. Now we throw an exception if we try to hard/soft clip unmapped reads, and we use the soft->hard clip procedure for every mapped read.
* Interval containment needed a <= and >= to make sure it caught the borders right.
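The border fix in the second bullet amounts to using inclusive comparisons. A minimal sketch, assuming closed (inclusive) coordinates on both ends; the class and method names are hypothetical and not taken from the walker:

```java
class IntervalContainmentSketch {
    /**
     * Returns true when the read lies entirely inside the interval.
     * Both ends are treated as inclusive, so a read that starts or stops
     * exactly on an interval border still counts as contained.
     */
    static boolean contains(int intervalStart, int intervalStop, int readStart, int readStop) {
        return readStart >= intervalStart && readStop <= intervalStop;
    }

    public static void main(String[] args) {
        System.out.println(contains(100, 200, 100, 200)); // read sits exactly on both borders
        System.out.println(contains(100, 200, 99, 200));  // read starts one base before the interval
    }
}
```

With strict `>` and `<` instead, the first call would wrongly report that the read is not contained.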
2011-08-14 14:54:33 -04:00
Mauricio Carneiro
0be1dacddb
Refactored interval clipping utility
...
Reads are clipped in map(), and we now cover almost all cases. The one case left behind is a read that stretches across two intervals, which will need special treatment later.
2011-08-14 14:54:33 -04:00
Mauricio Carneiro
e921230e72
Compressor class variables private
2011-08-14 14:54:33 -04:00
Mauricio Carneiro
3d73b6fe9d
Getting rid of minBpForConsensus
...
The minimum is now 1 with the sliding reads, so the MBRC parameter is no longer needed.
2011-08-14 14:54:33 -04:00
Mauricio Carneiro
ec6b1731e8
First implementation of the Sliding Window
...
This is the first attempt at the sliding window approach, which aims to reduce the memory footprint and increase the performance of the ReduceReads walker.
The sliding window creates a running consensus as it traverses the reads without saving them all in memory. It also generalizes the treatment of variable regions.
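The idea of a running consensus that does not keep every read in memory can be illustrated with a small sketch. Everything here is an assumption made for illustration: the class name, the majority-vote rule, and the requirement that reads arrive sorted by start position. It is not the ReduceReads implementation and ignores qualities, indels, and variable regions.

```java
import java.util.TreeMap;

class SlidingConsensusSketch {
    private static final String BASES = "ACGT";
    // Per-position base counts for positions still inside the window.
    private final TreeMap<Integer, int[]> counts = new TreeMap<>();
    private final StringBuilder consensus = new StringBuilder();

    /** Adds one read; reads are assumed to arrive sorted by start position. */
    void addRead(int start, String bases) {
        // No later read can cover positions left of this start, so finalize them.
        while (!counts.isEmpty() && counts.firstKey() < start) {
            emit(counts.pollFirstEntry().getValue());
        }
        for (int i = 0; i < bases.length(); i++) {
            int idx = BASES.indexOf(bases.charAt(i));
            if (idx >= 0) { // bases other than A, C, G, T are skipped in this sketch
                counts.computeIfAbsent(start + i, k -> new int[4])[idx]++;
            }
        }
    }

    /** Flushes the remaining window and returns the consensus over covered positions. */
    String finish() {
        while (!counts.isEmpty()) {
            emit(counts.pollFirstEntry().getValue());
        }
        return consensus.toString();
    }

    // Majority vote; ties go to the earliest base in A, C, G, T order.
    private void emit(int[] c) {
        int best = 0;
        for (int i = 1; i < 4; i++) {
            if (c[i] > c[best]) best = i;
        }
        consensus.append(BASES.charAt(best));
    }
}
```

Only the counts for positions that a future read could still overlap are retained, which is what keeps memory bounded by the window size rather than by the number of reads.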
2011-08-14 14:54:33 -04:00
Mark DePristo
ed2df49743
Rename so filename matches class
2011-08-14 10:44:28 -04:00
Mark DePristo
fc10685709
Measure nt scaling with/without ROD system for CC
2011-08-14 10:41:05 -04:00
David Roazen
9d2cda3d41
Removed a public -> private dependency in our test suite.
2011-08-12 17:29:10 -04:00
Eric Banks
f55ac40f2b
Moving HLA code into the archive.
2011-08-12 16:06:42 -04:00
David Roazen
bb4ced3201
SnpEff-related fixes.
...
- To correctly handle indels and MNPs, only consider features that start at the current locus, rather than features that span the current locus, when selecting the most significant effect.
- Throw a UserException when a SnpEff rodbinding is not provided instead of simply not adding any annotations and silently returning.
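The first fix can be pictured as a filter on feature start position. The sketch below is hypothetical: the `Feature` class and `startingAt` method are invented for illustration and are not the SnpEff annotation code.

```java
import java.util.ArrayList;
import java.util.List;

class StartAtLocusSketch {
    /** Hypothetical minimal feature: a closed interval plus an effect label. */
    static final class Feature {
        final int start;
        final int end;
        final String effect;

        Feature(int start, int end, String effect) {
            this.start = start;
            this.end = end;
            this.effect = effect;
        }
    }

    /** Keeps only features that begin exactly at the locus, dropping ones that merely span it. */
    static List<Feature> startingAt(List<Feature> overlapping, int locus) {
        List<Feature> kept = new ArrayList<>();
        for (Feature f : overlapping) {
            if (f.start == locus) {
                kept.add(f);
            }
        }
        return kept;
    }
}
```

Under this rule, a deletion record that begins upstream and spans the current locus no longer competes with the record that actually starts there when the most significant effect is chosen.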
2011-08-12 15:26:24 -04:00
Mauricio Carneiro
10e873d9c6
Merge branch 'repval'
2011-08-12 15:24:31 -04:00
Mauricio Carneiro
b3b9b74d2c
got rid of the useless parameter classes
2011-08-12 14:21:30 -04:00
Guillermo del Angel
31dc831531
Merged bug fix from Stable into Unstable
2011-08-12 13:26:41 -04:00
Menachem Fromer
9121b8ed65
Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-12 12:24:19 -04:00
Menachem Fromer
7ed120361d
Fixed bug that required symbolic alleles to be padded with reference base and added integration test to test parsing and output of symbolic alleles
2011-08-12 12:23:44 -04:00
Eric Banks
7ea9196321
Better error message for name/type clashes.
2011-08-12 11:18:14 -04:00
Eric Banks
27f0748b33
Renaming the HapMap codec and feature to RawHapMap so that we don't get esoteric errors when trying to bind a rod with the name 'hapmap' (since it was also a feature).
2011-08-12 11:11:56 -04:00
Eric Banks
f5b2cc4977
I'm really starting to hate this pipeline test.
2011-08-12 10:45:58 -04:00
Eric Banks
005bd71be3
Working too quickly earlier. Fixing syntax.
2011-08-12 10:29:36 -04:00
Menachem Fromer
c7ca33cbff
Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-12 10:12:09 -04:00
Eric Banks
639a01f382
Updating integration test now that VE has been updated
2011-08-12 07:15:08 -04:00
Eric Banks
41f3da75d7
Implementation in VE was confusing 'variant' status vs. 'polymorphic' status. This led to issues because we now match types of eval and comp; specifically, subsetting a VC to a monomorphic sample can't change the 'variant' status of the VC (it's still a variant site or otherwise we'll never match the comps, which breaks GenotypeConcordance). CountVariants really got this wrong. Fixed. VE now passes all integration tests.
2011-08-12 02:22:44 -04:00
Eric Banks
45f973ab1f
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-12 00:40:18 -04:00
Eric Banks
eba316621d
Finish moving VE over to new rod system and fixing up the type inconsistency between eval and comp rods. Now the novel count is always 0 under the known stratification. :)
2011-08-12 00:40:08 -04:00
David Roazen
6ee8a3a8dd
Fixing the javadoc/scaladoc targets in build.xml
...
Usable targets are now:
ant javadoc (public-only)
ant javadoc.private (public + private)
ant scaladoc (public-only)
ant scaladoc.private (public + private)
As documented in the comments, you need to set the ANT_OPTS environment
variable to -Xmx1G before using the scaladoc targets.
Will modify bamboo to auto-generate these and post them to the web after
successful builds.
2011-08-11 19:24:41 -04:00
Menachem Fromer
9de06560df
Update to new RodBinding system
2011-08-11 17:54:16 -04:00
Ryan Poplin
f408d5ea93
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-11 17:04:18 -04:00
Ryan Poplin
f1d1252be2
Fixing syntax of BQSR and UG performance tests.
2011-08-11 17:04:09 -04:00
David Roazen
bd5cdb8a43
The tribble dependency is now handled through ivy. Revved tribble to r18 and removed obsolete build targets in build.xml
2011-08-11 16:38:29 -04:00
Ryan Poplin
902eb0c61e
Adding dbsnp annotation back into the UG integration tests
2011-08-11 13:55:03 -04:00
Eric Banks
90771b74b4
When matching eval to comps, try to choose the one with the same alt allele.
2011-08-11 13:55:01 -04:00
Eric Banks
200f73b008
No reason to warn the user anymore because it's no longer possible for them to specify a dbsnp file on the command-line.
2011-08-11 13:44:07 -04:00
Eric Banks
e93538cdf7
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-11 13:39:36 -04:00
Eric Banks
265c3d744b
Fixing VariantEval logic and having it use the new rod system.
2011-08-11 13:39:34 -04:00