Jacob Silterra
1cc0b48caa
Abstract connection to MongoDB so we can specify it through JSON file. Include 2 JSON spec files in GenomeAnalysisTK.jar
Create MongoDBManager, which keeps track of connections based on the Locator class. Locators can be instantiated directly, or read from JSON files (NA12878DBArgumentCollection uses the Gson library)
2012-11-27 17:44:55 -05:00
Menachem Fromer
31069ffced
Add HC pruning parameter option, as per Ryan's advice
2012-11-27 17:21:22 -05:00
depristo
6f1eb65ec8
Merge pull request #1 from jsilter/master
Modifications to NA12878KB classes so they can more easily be used as a library
2012-11-27 12:35:18 -08:00
Jacob Silterra
b15edd9eb3
Modifications so these classes can more easily be used as a library. In particular:
0. Add additional create method to MongoVariantContext as convenience, if we want a custom TruthStatus (and change "type" to less ambiguous "truthStatus")
1. Have NA12878KnowledgeBase return WriteResults from insert methods, so caller can know if there's an error
2. Provide constructors for NA12878DBArgumentCollection, since we need to be able to create this class for NA12878DBKnowledgeBase
2012-11-27 14:49:56 -05:00
Mark DePristo
d10b858e0b
Finalizing setupNA12878kb script for use in cron
2012-11-27 14:44:08 -05:00
Eric Banks
b40d3eb8aa
Merged bug fix from Stable into Unstable
2012-11-27 14:41:07 -05:00
Eric Banks
01abcc3e0f
Tests didn't like my note to Geraldine in the output logs; apparently it's tested in integration tests
2012-11-27 14:40:49 -05:00
Mark DePristo
ffb232bdf0
NA12878 Knowledge Base modules use DEV by default if they modify the database, while accessors use PRODUCTION
-- Added script that starts KB server
2012-11-27 14:26:23 -05:00
Mark DePristo
7e4b9c9e6e
Fix failing unit tests for VariantContextUtilsUnitTest
-- Previous version was adding multiple samples with the same name to the variant context
2012-11-27 14:26:23 -05:00
Mark DePristo
4281498c2c
Improvements to NA12878KnowledgeBase system
-- Cleaned up code for SiteIterator.
-- Added a generic error handling system for the SiteIterator. Created approaches to simply throw errors when invalid records are found, to log them, and to remove them from the sites collection.
-- By default getCalls() produces a SiteIterator that removes incorrectly formatted records from the DB
-- Created NA12878KnowledgeBaseServer GATK walker that (1) continually finds newly added records to the sites database and rebuilds the consensus as needed and (2) archives the reviewed sites to a VCF file upon server termination
-- More, better unit tests everywhere
-- Adding infrastructure to find only newly added sites to the NA12878KnowledgeBase. Uses MongoDB's ordering of _id to obtain the records (and the sites) of variants newly added to the sites collection. This is essential infrastructure to write an NA12878KnowledgeBase server that continually keeps the consensus records updated as new sites are added to the database
2012-11-27 14:26:23 -05:00
Joel Thibault
9bfe39411e
Equal overlap should match right/later region
2012-11-27 13:03:13 -05:00
Joel Thibault
d83ad906ef
Add profile range contract
2012-11-27 13:03:13 -05:00
Joel Thibault
cc550b4145
Add a read and interval on a different contig
2012-11-27 13:03:13 -05:00
Eric Banks
9531e58445
Merged bug fix from Stable into Unstable
2012-11-27 11:00:50 -05:00
Eric Banks
4543ece088
Fixing parsing of genomelocs that contain colons in the contig names (which is allowed by the spec) as reported on the forum. Added unit test for this case.
2012-11-27 11:00:33 -05:00
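A minimal sketch of the idea behind the contig-name fix above, assuming interval strings of the form contig:start or contig:start-stop in which the position part is always present. The class and method names are hypothetical, and this is not the GATK parser itself. Splitting on the last colon rather than the first leaves any colons inside the contig name intact.

```java
// Hypothetical illustration; not the GATK GenomeLocParser implementation.
final class IntervalStringSketch {
    /** Returns {contig, start, stop} parsed from "contig:start" or "contig:start-stop". */
    static String[] parse(String s) {
        // Use the LAST colon as the separator so colons in the contig name survive.
        int colon = s.lastIndexOf(':');
        if (colon <= 0 || colon == s.length() - 1) {
            throw new IllegalArgumentException("Expected contig:start[-stop], got: " + s);
        }
        String contig = s.substring(0, colon);
        String range = s.substring(colon + 1);
        // Only the position part is searched for '-', so dashes in the contig are safe.
        int dash = range.indexOf('-');
        String start = dash < 0 ? range : range.substring(0, dash);
        String stop = dash < 0 ? range : range.substring(dash + 1);
        // Validate that both positions are numeric (throws NumberFormatException otherwise).
        Long.parseLong(start);
        Long.parseLong(stop);
        return new String[] { contig, start, stop };
    }

    public static void main(String[] args) {
        String[] loc = parse("HLA-A*01:01:01:01:100-200");
        System.out.println(loc[0] + " " + loc[1] + " " + loc[2]);
    }
}
```

A contig name given without any position would still be ambiguous under this assumption; resolving that case needs a lookup against the known contig names, which the sketch leaves out.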
Eric Banks
a82ec7ad80
Merged bug fix from Stable into Unstable
2012-11-27 10:27:08 -05:00
Eric Banks
e199562c25
I have pulled out all of the documentation URLs and put them into the HelpUtils class as static variables; this way, Appistry can change links as needed to point commercial users to their own internal forum without having to muck things up all over our source. Added some TODOs for Geraldine to update links in the GATK docs that still point to the old wiki. Sorry that I am pushing into stable, but that's what Appistry is pulling from for their release next week (and unstable has been failing forever).
2012-11-27 10:26:17 -05:00
Mauricio Carneiro
97fd5de260
Merging latest CMI updates with UNSTABLE
2012-11-27 09:08:00 -05:00
Eric Banks
b1969a66bd
Update docs
2012-11-27 08:24:41 -05:00
Eric Banks
cc72aaefeb
Minor efficiency: use >= instead of > in test
2012-11-27 01:11:23 -05:00
Eric Banks
405f3c675d
Fix for GSA-649: GenomeLocSortedSet.overlaps is crazy slow. Also improved GenomeLocSortedSet.sizeBeforeLoc.
2012-11-27 01:07:00 -05:00
Ryan Poplin
e27d677c13
Merge branch 'master' of github.com:broadinstitute/gsa-unstable
2012-11-26 12:20:32 -05:00
Ryan Poplin
59cef880d1
Updating HC integration tests because experimental, HC-specific annotations have been removed.
2012-11-26 12:20:07 -05:00
Ryan Poplin
c3b7dd1374
Misc cleanup in the HaplotypeCaller. Cleaning up unused arguments after recent changes to HC-GenotypingEngine
2012-11-26 12:19:11 -05:00
Eric Banks
4f7fa3009a
I forget why I thought that the VariantAnnotator couldn't run multi-threaded because it works just fine. Now you can specify -nt with VA.
2012-11-26 11:34:59 -05:00
Mauricio Carneiro
c0261f75ce
Merging master and develop together
(because I forgot to do so when I merged on Nov 14th, now develop has a few extra commits not present in master).
2012-11-26 11:31:47 -05:00
Mauricio Carneiro
a3f5932501
Fixed null pointer exception in Integration Tests
When running Utils.setupWriter with NO_PG_TAG set, the writer was attempting to create a program record with the null pointer. Fixed.
2012-11-26 11:12:27 -05:00
Eric Banks
b15b62157a
Use correct path in imports
2012-11-26 10:09:13 -05:00
Menachem Fromer
3784bb5258
Fixes to process all SNPs and indels simultaneously (even those at same site)
2012-11-26 03:59:36 -05:00
Ryan Poplin
fedc4fde6c
Merged bug fix from Stable into Unstable
2012-11-25 21:55:55 -05:00
Ryan Poplin
d978cfe835
Soft clipped bases shouldn't be counted in the delocalized BQSR.
2012-11-25 21:55:29 -05:00
Eric Banks
9719ba7adc
Remove -number example from the docs since it's no longer supported.
2012-11-22 21:53:42 -05:00
Menachem Fromer
2306518ab6
Fix to deal with 'proper' options of casting
2012-11-22 01:45:18 -05:00
Menachem Fromer
d33a412b5f
Merge branch 'master' of github.com:broadinstitute/gsa-unstable
2012-11-22 01:42:29 -05:00
Mark DePristo
48f271c5bd
Adding 80% support for multi-allelic variants
-- Multi-allelic variants are split into their bi-allelic version, trimmed, and we attempt to provide a meaningful genotype for NA12878 here. It's not perfect and needs some discussion on how to handle het/alt variants
-- Adding splitInBiallelic function to VariantContextUtils as well as extensive unit tests that also indirectly test reverseTrimAlleles (which worked perfectly FYI)
2012-11-21 17:24:59 -05:00
Mark DePristo
e14bfa9f5c
Update reviews.vcf to have genotypes, and downstream changes to tools to support this
2012-11-21 17:24:59 -05:00
Mark DePristo
5d2ee32936
Documentation and validation for NA12878 KB tools
-- MongoVariantContexts and MongoGenotype have a validate() function that ensures that the information is consistent, in anticipation of potential problems with the data coming in from reviews via IGV
-- Divide the world into production, development, and test DB, via the NA12878DBArgumentCollection
2012-11-21 17:24:59 -05:00
Joel Thibault
c68bc95db6
Initial read mapping tests
- Failing tests are commented out
2012-11-21 17:16:46 -05:00
Joel Thibault
3ad9128800
Add some reads
- Move intervals and reads to init
- Update intervals and reads
2012-11-21 17:16:46 -05:00
Joel Thibault
3fa3b00f4a
Add ActiveRegion tests and refactor
2012-11-21 17:16:45 -05:00
Joel Thibault
e8defcb20d
Test multiple bases and intervals
2012-11-21 17:16:45 -05:00
Joel Thibault
c08b782743
Count isActive calls directly
2012-11-21 17:16:45 -05:00
Eric Banks
7580a487f3
Merge branch 'master' of github.com:broadinstitute/gsa-unstable
2012-11-21 16:20:49 -05:00
Eric Banks
4f2229d399
As per the TODO message, I removed a check that was no longer necessary. Now ID is an allowable INFO field key.
2012-11-21 16:01:26 -05:00
Menachem Fromer
a8c7edca05
Fixed fragment handling in DepthOfCoverage
2012-11-21 16:01:10 -05:00
Menachem Fromer
06261b58c2
Merge branch 'master' of github.com:broadinstitute/gsa-unstable
2012-11-21 15:57:08 -05:00
Eric Banks
ed50814ccb
Finally found a case where user errors were being masked behind other errors and could debug. It turns out that the checkForMaskedUserErrors() method needs to run recursively over all levels (calling exception.getCause()) to check for the original cause.
2012-11-21 15:57:05 -05:00
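The masked-error check described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual checkForMaskedUserErrors() code: UserError is a hypothetical stand-in for the toolkit's user-exception type, and the walk over getCause() is written as a bounded loop rather than recursion.

```java
// Hypothetical sketch; UserError stands in for the toolkit's user-exception type.
final class CauseChainSketch {
    static class UserError extends RuntimeException {
        UserError(String msg) { super(msg); }
    }

    /** Walks the getCause() chain and returns the first UserError found, or null. */
    static UserError findUserError(Throwable t) {
        // Bound the walk so a cyclic cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 100; depth++, t = t.getCause()) {
            if (t instanceof UserError) {
                return (UserError) t;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // A user error buried two levels below the exception that actually surfaced.
        Throwable wrapped = new RuntimeException("outer",
                new IllegalStateException("middle", new UserError("bad argument")));
        UserError found = findUserError(wrapped);
        System.out.println(found == null ? "no user error" : found.getMessage());
    }
}
```

Checking only the top-level exception would miss the user error in this example, because it sits two levels down the cause chain.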
Menachem Fromer
c8be7c3102
Keep SNPs and indels separately for batch merging; Add options to DepthOfCoverage to count fragments (to not double-count overlapping reads of same fragment); DepthOfCoverage should now support ReducedReads; Replace recursion with loop in DoC/package.scala (for lists longer than 5000 elements)
2012-11-21 15:56:53 -05:00
Ryan Poplin
17d1c9ed53
Updating NA12878 reviews with Mark.
2012-11-21 12:47:04 -05:00
Douglas Voet
d3817e789c
made dedup setup not intermediate
2012-11-21 09:07:08 -05:00