Khalid Shakir
b8b7f28aa9
Revving Picard to pick up new SamFileHeaderMerger.
...
Updated ReadFilter abstract class to implement (via UnsupportedOperationException) the new SamRecordFilter.filterOut().
Updated IndelRealignerIntegrationTest for the Picard fixes to SAMRecord.getInferredInsertSize() in svn r1115 & r1124.
- Ran FixMates to create a new input BAM, since running IR with a variable maxReadsInMemory means not all reads were realigned, leading to different outputs.
- Updated md5s to match new expectations after looking at TLEN diff engine output.
2012-05-02 16:47:28 -04:00
Mauricio Carneiro
b32f09b949
some more updates to the BQSR scala script
2012-05-02 16:23:02 -04:00
Mauricio Carneiro
f51a1d0d61
Better error message to the BAMScheduler
...
In the case where the BAM file was aligned using a reference but analysis is being attempted with a different reference.
2012-05-02 16:10:00 -04:00
Mauricio Carneiro
a5d17e02c7
quick lua script to merge recalibration reports by hand.
2012-05-02 16:06:04 -04:00
Mauricio Carneiro
940029fa5d
Fixing on-the-fly recalibration (caught by Ryan)
...
low quality bases in the tails were being turned to N's in the final read.
2012-05-02 16:06:04 -04:00
Eric Banks
623b36fbc4
Add header lines for AC, AF, and AN tags
2012-05-02 15:33:34 -04:00
Guillermo del Angel
6fac8f2c70
More test coverage on PoolAFCalculationModel: add more tests for multiallelic case with higher ploidy
2012-05-02 14:12:02 -04:00
Joel Thibault
bb756447e2
Move mongodb package to a location where walkers will be visible from the command line
2012-05-02 11:58:06 -04:00
Guillermo del Angel
429800a192
Fix corner-case rounding issue in MathUtils unit test: 10^logFactorial(4) was 23.999999..., which yielded 23 if cast directly. Pre-round to ensure a correct integer result if the caller casts the value.
2012-05-02 09:57:06 -04:00
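A minimal sketch of the truncation hazard this commit describes, written in Python rather than the project's Java. The function names are hypothetical and are not MathUtils methods; the point is only that a value recovered from log space should be rounded before it is converted to an integer.

```python
import math


def log10_factorial(n: int) -> float:
    """Hypothetical helper: log10(n!) as a sum of logs."""
    return sum(math.log10(i) for i in range(1, n + 1))


def factorial_from_log(n: int) -> int:
    """Recover n! from log space, rounding BEFORE converting to int.

    10 ** log10_factorial(n) may land slightly below the true integer
    (for example 23.999999 instead of 24), and a plain int() conversion
    truncates toward zero.
    """
    return int(round(10 ** log10_factorial(n)))


# Truncation versus rounding on a value just below the true integer.
almost_24 = 23.999999
truncated = int(almost_24)         # truncates toward zero
rounded = int(round(almost_24))    # rounds to nearest integer
```

Rounding is only safe while the floating-point error stays well below 0.5, which holds for the small n used in a unit test like this one.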
Guillermo del Angel
76a95fdedf
Full implementation of multiallelic exact model for pools. Still super-linear so not usable at scale, but it should be a gold standard to compare to. Unit tests are not exhaustive yet and will be expanded to provide better test coverage. Small inconsequential optimization in MathUtils: we're already caching log10(factorial(n)) for large n, so might as well use the cached values to compute binomial and multinomial coefficients instead of the more expensive log-gamma approximation (doesn't seem to save much time in either PoolCaller or UG, though).
2012-05-02 09:24:28 -04:00
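The MathUtils optimization mentioned above can be sketched roughly as follows, in Python rather than the project's Java. The cache size and all names here are assumptions for illustration, not the actual MathUtils API: once log10(n!) is tabulated, binomial and multinomial coefficients in log space reduce to a few table lookups and subtractions.

```python
import math

MAX_CACHED_N = 1000  # hypothetical cache bound, chosen for illustration

# LOG10_FACTORIAL[n] == log10(n!), built incrementally.
LOG10_FACTORIAL = [0.0] * (MAX_CACHED_N + 1)
for _n in range(1, MAX_CACHED_N + 1):
    LOG10_FACTORIAL[_n] = LOG10_FACTORIAL[_n - 1] + math.log10(_n)


def log10_binomial(n: int, k: int) -> float:
    """log10 of C(n, k) using only cached factorials."""
    if not 0 <= k <= n <= MAX_CACHED_N:
        raise ValueError("arguments outside the cached range")
    return LOG10_FACTORIAL[n] - LOG10_FACTORIAL[k] - LOG10_FACTORIAL[n - k]


def log10_multinomial(n: int, counts: list) -> float:
    """log10 of n! / (c1! * c2! * ...), where the counts sum to n."""
    if sum(counts) != n or n > MAX_CACHED_N or min(counts) < 0:
        raise ValueError("counts must be non-negative and sum to n")
    return LOG10_FACTORIAL[n] - sum(LOG10_FACTORIAL[c] for c in counts)


def log10_binomial_lgamma(n: int, k: int) -> float:
    """Reference path via log-gamma, for comparison with the cached path."""
    natural_log = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return natural_log / math.log(10)
```

Both paths should agree to within floating-point error; the cached one just avoids repeated log-gamma evaluations.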
Joel Thibault
4d732fa586
Move all MongoDB files into private/java/src/org/broadinstitute/sting/mongodb
2012-05-01 18:23:51 -04:00
Mauricio Carneiro
bdf6d1f109
updates to BQSR queue script
2012-05-01 17:36:33 -04:00
Eric Banks
619a69a5f1
As promised in the release notes for 1.6, I am removing the old deprecated genotyping framework revolving around the misordering of alleles and have moved the fixed version in its place in preparation for release 1.7 (or 2.0?).
2012-05-01 16:18:24 -04:00
Joel Thibault
c255dd5917
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-05-01 16:10:38 -04:00
Ryan Poplin
51af61b5d7
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-05-01 16:07:23 -04:00
Ryan Poplin
cc646690d6
updating HaplotypeCaller integration tests
2012-05-01 16:07:18 -04:00
Ryan Poplin
fc55dcec3c
Unfortunately the reverse trimming of alleles still doesn't work with mixed records in some corner cases. Turning it off for now.
2012-05-01 16:02:36 -04:00
Ryan Poplin
2187d71bb2
Adding some quick debugging, custom annotations to the calls coming out of the HaplotypeCaller.
2012-05-01 15:55:14 -04:00
Ryan Poplin
20a0078f23
Merging active regions across shard boundaries if they are contiguous, have the same active status and don't grow too big.
2012-05-01 15:51:36 -04:00
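The three merge conditions in this commit (contiguous, same active status, bounded size) can be sketched as below. This is an illustrative Python sketch, not the engine's Java code: the Region type, the inclusive-coordinate convention, and the max_size parameter are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Region:
    """Hypothetical region with inclusive start/stop coordinates."""
    start: int
    stop: int
    is_active: bool


def merge_regions(regions: List[Region], max_size: int) -> List[Region]:
    """Merge neighbours that are contiguous, share the same active
    status, and whose union would not exceed max_size bases."""
    merged: List[Region] = []
    for region in sorted(regions, key=lambda r: r.start):
        if merged:
            prev = merged[-1]
            contiguous = prev.stop + 1 == region.start
            same_status = prev.is_active == region.is_active
            fits = region.stop - prev.start + 1 <= max_size
            if contiguous and same_status and fits:
                merged[-1] = Region(prev.start, region.stop, prev.is_active)
                continue
        merged.append(region)
    return merged


# Regions as they might arrive from two adjacent shards.
example = [
    Region(1, 100, True),
    Region(101, 150, True),
    Region(151, 200, False),
    Region(201, 260, False),
]
loose = merge_regions(example, max_size=200)
tight = merge_regions(example, max_size=120)
```

With the larger limit both same-status pairs collapse into one region each; with the smaller limit the two active regions stay separate because their union would span 150 bases.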
Eric Banks
0f3af9555b
Adding an option to SelectVariants which allows the user to re-genotype through the exact model (if PLs are present) the samples in order to recalculate the QUAL and genotypes. This is really the correct way to select a subset of samples, especially when originally called from low coverage data. Also added integration test to cover this case.
2012-05-01 14:58:06 -04:00
Joel Thibault
aa4d41cce0
Minor cleanup before push
2012-05-01 14:16:44 -04:00
Joel Thibault
b101b9c30b
Add Mongo switch
2012-05-01 14:00:48 -04:00
Joel Thibault
1b609e9075
Move Mongo to server couchdb
2012-05-01 13:59:47 -04:00
Joel Thibault
fd57d27f45
Move MongoDB connection handling to a separate class
2012-05-01 13:59:37 -04:00
Joel Thibault
db3cd1abd5
Use 2 MongoDB collections (tables): one for INFO/attributes, one for samples/genotypes.
2012-05-01 13:57:23 -04:00
Joel Thibault
04e1be9106
Better handling of Mongo errors + exceptions
2012-05-01 13:57:23 -04:00
Joel Thibault
ca737479cf
Query for stop locations because we don't have that information in the reference
2012-05-01 13:57:23 -04:00
Joel Thibault
1cda87a4ad
Set ROD priority list to input
2012-05-01 13:57:23 -04:00
Joel Thibault
a7fe847faf
Set the priority list and don't bother combining if not needed
2012-05-01 13:57:23 -04:00
Joel Thibault
f739305f43
Combine the variants found at a location
2012-05-01 13:57:23 -04:00
Joel Thibault
020f884d5a
Use new key of source ROD plus alleles
2012-05-01 13:57:23 -04:00
Joel Thibault
221ce9c3d6
Add alleles to the primary key
2012-05-01 13:57:23 -04:00
Joel Thibault
3198ce5471
Can have multiple variants at a location
2012-05-01 13:57:22 -04:00
Joel Thibault
11ed8e61c9
Add referenceBaseForIndel to the Mongo VariantContext objects
2012-05-01 13:53:44 -04:00
Joel Thibault
7ed0ee7ed0
Skip locations with no genotypes instead of throwing a NPE
2012-05-01 13:53:44 -04:00
Joel Thibault
4bdfeacdaa
Handle multiple samples/genotypes per location
...
TODO: sample selection
2012-05-01 13:53:43 -04:00
Joel Thibault
1f7c628796
Insert the ROD filename into MongoDB as part of the primary key
2012-05-01 13:53:43 -04:00
Joel Thibault
bb8a6e9b0a
Initial test of write and read from MongoDB
2012-05-01 13:53:43 -04:00
Joel Thibault
d93a413f2e
Add MongoDB dependency
2012-05-01 13:53:43 -04:00
Mark DePristo
0cf3603c73
Merged bug fix from Stable into Unstable
2012-05-01 13:39:27 -04:00
Mark DePristo
c2b74eca64
Remove unnecessary and obscure usage of old R
2012-05-01 13:39:09 -04:00
David Roazen
c0084c741b
Pilot BCF2 Implementation: Checkpointing the code
...
* Not working yet, still very much a work-in-progress with lots of placeholders
* Needed to check this in to enable possible collaboration, since it's
going slower than anticipated and the conference deadline looms.
2012-05-01 12:23:10 -04:00
Eric Banks
fdffe1d61b
Merged bug fix from Stable into Unstable
2012-05-01 11:04:46 -04:00
Eric Banks
0c8e801021
Removing public to private dependency
2012-05-01 11:04:11 -04:00
Eric Banks
e964d17518
Removing public to private dependency
2012-05-01 11:02:28 -04:00
Eric Banks
ef082356e9
Merge remote-tracking branch 'unstable/master'
2012-05-01 08:47:08 -04:00
Mauricio Carneiro
462450c3e3
disabling all BQSR unit tests
...
with the changes to the cycle covariate, some tests need updates, others need to be completely re-written.
2012-04-30 14:39:55 -04:00
Mauricio Carneiro
825ad30477
Adding readgroup filter option to BQSR queue script
2012-04-30 14:39:55 -04:00
Guillermo del Angel
e185632013
Exhaustive unit tests for Pool SNP genotype likelihoods:
...
a) Add ability for ErrorModel to be specified by external log-probability vector for testing.
b) For a given depth and ploidy (=2*samples/pool), create an artificial high-quality pileup, test from AC=0 to AC=ploidy, and check that pool GLs have the expected content. Misc. refactorings and cleanups.
c) Misc. cleanups and beautification.
2012-04-30 14:29:46 -04:00
Christopher Hartl
7d029b9a28
Merge branch 'master' of ssh://ni.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-04-30 12:16:30 -04:00