aaron
ca386439be
only emit a warning if the tribble index is out of date, don't remove and replace it for them. Added a test case where the log4j appender checks the logging messages for the appropriate output.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3393 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-19 15:12:48 +00:00
hanna
017ab6b690
Experimental versions of downsampler and Ryan's deduper are now available either
as walker attributes or from the command-line. Not ready yet! Downsampling/deduping
works in a general sense, but this approach has not been completely optimized or validated.
Use with caution.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3392 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-19 05:40:05 +00:00
aaron
7cfb9ff3dc
updates for Tribble 82, fixes for Ryan's case where multiple processes would attempt to read/write to the same index, and a couple other Tribble-centric bug fixes.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3382 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-18 19:34:45 +00:00
chartl
e016491a3d
Major refactoring of Depth of Coverage to allow for more extensible partitions of data (now can do read group, sample, and library; in any combination; adding more is fairly easy). Changed the by-gene code to use clones of stats objects, rather than munging the interval DoCs. (Fix for Avinash. Who, hilariously, thinks my name is Carl.) Added sorting methods to ensure static ordering of header and body fields.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3377 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-18 16:58:13 +00:00
hanna
0791beab8f
Checking in downsampling iterator alongside LocusIteratorByState, and removing
the reference implementation. Also implemented a heap size monitor that can
be used to programmatically report the current heap size.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3367 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-17 21:00:44 +00:00
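A heap size monitor of the kind this commit describes could be sketched as follows. This is only an illustration built on the standard `java.lang.management` API; the class name `HeapSizeSketch` and its method names are hypothetical and are not taken from the commit.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical sketch: reports the JVM heap size programmatically.
public class HeapSizeSketch {
    // Bytes of heap currently occupied by objects (live or not yet collected).
    public static long usedHeapBytes() {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = bean.getHeapMemoryUsage();
        return heap.getUsed();
    }

    // Bytes of heap the JVM has currently reserved from the operating system.
    public static long committedHeapBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getCommitted();
    }

    public static void main(String[] args) {
        System.out.printf("heap used: %d bytes, committed: %d bytes%n",
                usedHeapBytes(), committedHeapBytes());
    }
}
```

Calling `usedHeapBytes()` before and after a memory-heavy step gives a rough view of that step's footprint, although garbage collection timing makes the numbers approximate.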
chartl
b7d21627ab
Changes to DepthOfCoverage (JIRA items) and added back an integration test to cover it. Alterations to the design file generator to output all transcripts (rather than choosing one at random).
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3366 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-17 17:23:00 +00:00
ebanks
32389dc0a9
Fixed GQ estimate when chosen genotype isn't the most likely according to the GLs.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3362 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-14 19:17:46 +00:00
hanna
88bd7a2045
Reenabling UG parallelization performance tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3360 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-13 16:28:08 +00:00
hanna
0490909285
Fixed epic generic paths fail.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3359 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-13 15:59:57 +00:00
hanna
7ef87e5126
An integration test based on validating pileup to test parallelism in reads, reference, and RODs. This test runs in less
than a minute and fell over instantly in the case of the Tribble parallelism issue.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3358 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-13 15:40:43 +00:00
hanna
ceec525420
Got rid of stray unicode characters in copyright message.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3357 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-13 14:47:39 +00:00
ebanks
c81b910f73
Commenting out the parallelization test which is failing
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3354 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-12 18:39:53 +00:00
aaron
cac98ba5ef
a couple of small documentation fixes
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3353 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-12 17:40:27 +00:00
aaron
2c55ac1374
fixes for parallel processing problems with Tribble, a small bug in the resource pool, and some more documentation.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3349 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-12 06:13:26 +00:00
ebanks
34969f304c
Adding dbsnp to all UG performance tests
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3347 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-11 15:48:05 +00:00
ebanks
140e43b93b
Checking in to see whether it fails. If I start getting bombarded with Bamboo error reports, I'm commenting it out...
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3346 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-11 15:39:42 +00:00
ebanks
572b383fe2
Make VA annotate dbsnp again
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3345 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-11 14:06:53 +00:00
depristo
64ccaa4c6a
Walkers and integration tests that calculate and compare callable bases
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3328 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-07 21:33:47 +00:00
aaron
7d2df3f511
example windowed ROD walker for Kristian, and updates to Tribble
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3325 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-07 17:12:50 +00:00
rpoplin
57f254b13a
VE integration test
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3324 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-07 13:58:25 +00:00
aaron
78409dca0d
turned off the progress output from tribble when making an index, and fixed a case where the index file isn't writable; we now make the index in memory instead.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3312 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-06 16:36:58 +00:00
aaron
a0d71540df
speed-up for VCF, adding code to the VCF reader to automagically make an index if one doesn't already exist, and a change to the VCF writer unit test
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3305 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-05 20:19:42 +00:00
aaron
a68f3b2e9c
VCF moved over to tribble.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3302 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-05 17:28:48 +00:00
aaron
ad11201235
adding more ROD pile-up tests
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3301 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-05 16:01:11 +00:00
aaron
f497213933
DbSNP moved over to tribble
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3288 348d0f76-0448-11de-a6fe-93d51630548a
2010-05-03 06:02:35 +00:00
ebanks
9dff578706
Added PG tag to bam header to let people know it's been cleaned.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3284 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 17:30:30 +00:00
ebanks
850f36aa61
Changes to the Unified Genotyper's arguments:
1. User can specify 4 confidence thresholds: for calling vs. emitting and at standard vs. 'trigger' sites.
2. User can cap the base quality by the read's mapping quality (not done yet).
3. Default confidence threshold is now Q30.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3281 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-30 16:44:24 +00:00
aaron
cbed0b1ade
Adding GeliText tribble track as the first enabled Tribble track. This means 'Variants' is no longer valid for a ROD type; use GeliText instead. I've updated all the references in the codebase.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3271 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-29 22:50:17 +00:00
aaron
7fbfd34315
adding the GELI ROD validation
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3270 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-29 21:43:00 +00:00
depristo
5dce16a8f1
Better genotype concordance module. Code refactoring for clarity (please see below/after for educational purposes). Now reports variant sensitivity, concordance, and genotype error rate by default. Also aggregates this data across all samples, so you get per-sample and overall stats for each of these in the allSamples row.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3265 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-28 13:10:11 +00:00
ebanks
df31eeff9f
minor change
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3259 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-26 06:05:29 +00:00
depristo
7f4d5d9973
Ti/Tv by AC
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3252 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-23 17:56:29 +00:00
rpoplin
e7c0ded40e
Fixed long-standing bug in GenotypeConcordance module of VariantEval which caused incorrect numbers to be displayed in the concordance table. The format of the concordance table has changed. Added a concordance summary table which gives overall genotype concordance summary stats by sample. None of the VE integration tests contained genotype information so I added a comp track with genotypes to one of the tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3247 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-23 15:48:41 +00:00
aaron
f050beada6
make sure we do delete the temp file we create
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3244 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-23 05:32:49 +00:00
aaron
536f22f3bd
adding VC adaptor for GELI, along with unit tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3243 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-23 05:28:39 +00:00
hanna
32d86cf457
Rev the reservoir downsampler to support partitioning through a functor.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3232 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-21 19:50:26 +00:00
ebanks
e9e844fbf5
1. Reverting: dbsnp automatically is a comp
2. Fixing logic for min Qscore calculation
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3230 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-21 18:51:35 +00:00
asivache
532263ea25
Oooops, forgot to update the test
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3229 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-21 18:38:24 +00:00
ebanks
4abd3b0b7b
Fixing known/novel calc now that dbsnp isn't a default comp track
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3223 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-21 05:43:59 +00:00
ebanks
3b5673d967
1. Removed -all; by default all modules are used; use -none for no modules.
2. Don't make dbsnp track be a comp by default (to cut back on output). Please let me know if someone wants this back for some reason.
3. Cleaned up dbsnp module output to print the right numbers.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3220 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-21 02:46:42 +00:00
aaron
4e18c54bb8
fixing a couple of commented out portions of the VCFReader test
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3219 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 22:20:35 +00:00
aaron
80c4f88a72
removing the Variation interface.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3216 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 18:56:45 +00:00
hanna
c1e53d407d
The copyright tag that I copied/pasted from a LaTeX document into IntelliJ had
unicode quote characters embedded in it. These characters were invisible inside
IntelliJ but cause compile warnings for Ryan and Aaron, who for whatever reason
have a different default charset. Fixed.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3203 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 15:26:32 +00:00
aaron
b5f6f54968
Almost done removing any trace of the old Variation and Genotype interfaces.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3202 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-20 14:52:15 +00:00
hanna
1bc26f69e9
An attempt to clean up the Utils directory. Email to follow.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3198 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-19 23:00:08 +00:00
hanna
c08936d6f4
Added a reservoir downsampler which can sample elements in an iterator uniformly
from a stream (see Vitter 1985). Thanks to Eric and Andrey for the pointer.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3197 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-19 20:48:14 +00:00
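For readers unfamiliar with reservoir sampling, the idea of drawing a uniform sample from an iterator of unknown length can be sketched as below. This is the simple "Algorithm R" form of the technique, written as an illustration only; the commit does not say which variant was implemented, and the class name `ReservoirSketch` and its `sample` signature are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of reservoir sampling (Algorithm R), not the committed code.
public class ReservoirSketch {
    // Returns up to k elements chosen uniformly at random from the stream,
    // reading the stream once and holding at most k elements in memory.
    public static <T> List<T> sample(Iterator<T> stream, int k, Random rng) {
        List<T> reservoir = new ArrayList<>(k);
        long seen = 0;
        while (stream.hasNext()) {
            T item = stream.next();
            seen++;
            if (reservoir.size() < k) {
                // Fill phase: the first k elements are always kept.
                reservoir.add(item);
            } else {
                // Draw j uniformly from [0, seen); the new element replaces
                // slot j when j < k, which happens with probability k / seen.
                long j = (long) (rng.nextDouble() * seen);
                if (j < k) {
                    reservoir.set((int) j, item);
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(sample(data.iterator(), 3, new Random()));
    }
}
```

Because each element is examined exactly once, the approach suits iterators over reads where the total count is not known in advance.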
ebanks
c44f63c846
Fixing the performance tests: we need to catch the RuntimeException (not samtools' RuntimeIOException). Also, CountCovariates doesn't need the catch.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3196 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-19 14:28:12 +00:00
ebanks
abf48cee05
Moving over to VariantContext from Variation
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3195 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-19 06:56:29 +00:00
ebanks
d73c63a99a
Redoing the conversion to VariantContext: instead of walkers passing in a ref allele, they pass in the ref context and the adaptors create the allele. This is the right way of doing it.
Also, adding some more useful integration tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3194 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-19 05:47:17 +00:00
aaron
be7cbf948b
adding a catch for the exception thrown by samtools when it attempts to close /dev/null in the performance tests.
git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@3186 348d0f76-0448-11de-a6fe-93d51630548a
2010-04-16 17:41:48 +00:00