11168 Commits (1cc0b48caab07426a3d54b34db3043ca96a28a4e)

Author SHA1 Message Date
Jacob Silterra 1cc0b48caa Abstract connection to MongoDB so we can specify it through JSON file. Include 2 JSON spec files in GenomeAnalysisTK.jar
Create MongoDBManager, which keeps track of connections based on Locator class. Locators can be instantiated directly, or read from JSON files (NA12878DBArgumentCollection uses the Gson library)
2012-11-27 17:44:55 -05:00
Menachem Fromer 31069ffced Add HC pruning parameter option, as per Ryan's advice 2012-11-27 17:21:22 -05:00
depristo 6f1eb65ec8 Merge pull request #1 from jsilter/master
Modifications to NA12878KB classes so they can more easily be used as a library
2012-11-27 12:35:18 -08:00
Jacob Silterra b15edd9eb3 Modifications so these classes can more easily be used as a library. In particular:
0. Add additional create method to MongoVariantContext as convenience, if we want a custom TruthStatus (and change "type" to less ambiguous "truthStatus")
1. Have NA12878KnowledgeBase return WriteResults from insert methods, so caller can know if there's an error
2. Provide constructors for NA12878DBArgumentCollection, since we need to be able to create this class for NA12878DBKnowledgeBase
2012-11-27 14:49:56 -05:00
Mark DePristo d10b858e0b Finalizing setupNA12878kb script for use in cron 2012-11-27 14:44:08 -05:00
Eric Banks b40d3eb8aa Merged bug fix from Stable into Unstable 2012-11-27 14:41:07 -05:00
Eric Banks 01abcc3e0f Tests didn't like my note to Geraldine in the output logs; apparently it's tested in integration tests 2012-11-27 14:40:49 -05:00
Mark DePristo ffb232bdf0 NA12878 Knowledge Base modules use DEV by default if they modify the database, while accessors use PRODUCTION
-- Added script that starts KB server
2012-11-27 14:26:23 -05:00
Mark DePristo 7e4b9c9e6e Fix failing unit tests for VariantContextUtilsUnitTest
-- Previous version was adding multiple samples with the same name to the variant context
2012-11-27 14:26:23 -05:00
Mark DePristo 4281498c2c Improvements to NA12878KnowledgeBase system
-- Cleaned up code for SiteIterator.
-- Added a generic error handling system for the SiteIterator.  Created approaches to simply throw errors when invalid records are found, to log them, and to remove them from the sites collection.
-- By default getCalls() produces a SiteIterator that removes incorrectly formatted records from the DB
-- Created NA12878KnowledgeBaseServer GATK walker that (1) continually finds newly added records to the sites database and rebuilds the consensus as needed and (2) archives the reviewed sites to a VCF file upon server termination
-- More, better unit tests everywhere
-- Adding infrastructure to find only newly added sites to the NA12878KnowledgeBase. Uses Mongo's ordering of _id to obtain the records (and the sites) of variants newly added to the sites collection. This is essential infrastructure to write an NA12878KnowledgeBase server that continually keeps the consensus records updated as new sites are added to the database
2012-11-27 14:26:23 -05:00
Joel Thibault 9bfe39411e Equal overlap should match right/later region 2012-11-27 13:03:13 -05:00
Joel Thibault d83ad906ef Add profile range contract 2012-11-27 13:03:13 -05:00
Joel Thibault cc550b4145 Add a read and interval on a different contig 2012-11-27 13:03:13 -05:00
Eric Banks 9531e58445 Merged bug fix from Stable into Unstable 2012-11-27 11:00:50 -05:00
Eric Banks 4543ece088 Fixing parsing of genomelocs that contain colons in the contig names (which is allowed by the spec) as reported on the forum. Added unit test for this case. 2012-11-27 11:00:33 -05:00
Eric Banks a82ec7ad80 Merged bug fix from Stable into Unstable 2012-11-27 10:27:08 -05:00
Eric Banks e199562c25 I have pulled out all of the documentation URLs and put them into the HelpUtils class as static variables; this way, Appistry can change links as needed to point commercial users to their own internal forum without having to muck things up all over our source. Added some TODOs for Geraldine to update links in the GATK docs that still point to the old wiki. Sorry that I am pushing into stable, but that's what Appistry is pulling from for their release next week (and unstable has been failing forever). 2012-11-27 10:26:17 -05:00
Mauricio Carneiro 97fd5de260 Merging latest CMI updates with UNSTABLE 2012-11-27 09:08:00 -05:00
Eric Banks b1969a66bd Update docs 2012-11-27 08:24:41 -05:00
Eric Banks cc72aaefeb Minor efficiency: use >= instead of > in test 2012-11-27 01:11:23 -05:00
Eric Banks 405f3c675d Fix for GSA-649: GenomeLocSortedSet.overlaps is crazy slow. Also improved GenomeLocSortedSet.sizeBeforeLoc. 2012-11-27 01:07:00 -05:00
Ryan Poplin e27d677c13 Merge branch 'master' of github.com:broadinstitute/gsa-unstable 2012-11-26 12:20:32 -05:00
Ryan Poplin 59cef880d1 Updating HC integration tests because experimental, HC-specific annotations have been removed. 2012-11-26 12:20:07 -05:00
Ryan Poplin c3b7dd1374 Misc cleanup in the HaplotypeCaller. Cleaning up unused arguments after recent changes to HC-GenotypingEngine 2012-11-26 12:19:11 -05:00
Eric Banks 4f7fa3009a I forget why I thought that the VariantAnnotator couldn't run multi-threaded because it works just fine. Now you can specify -nt with VA. 2012-11-26 11:34:59 -05:00
Mauricio Carneiro c0261f75ce Merging master and develop together
(because I forgot to do so when I merged on Nov 14th, now develop has a few extra commits not present in master).
2012-11-26 11:31:47 -05:00
Mauricio Carneiro a3f5932501 Fixed null pointer exception in Integration Tests
When running Utils.setupWriter with NO_PG_TAG set, the writer was attempting to create a program record with the null pointer. Fixed.
2012-11-26 11:12:27 -05:00
Eric Banks b15b62157a Use correct path in imports 2012-11-26 10:09:13 -05:00
Menachem Fromer 3784bb5258 Fixes to process all SNPs and indels simultaneously (even those at same site) 2012-11-26 03:59:36 -05:00
Ryan Poplin fedc4fde6c Merged bug fix from Stable into Unstable 2012-11-25 21:55:55 -05:00
Ryan Poplin d978cfe835 Soft clipped bases shouldn't be counted in the delocalized BQSR. 2012-11-25 21:55:29 -05:00
Eric Banks 9719ba7adc Remove -number example from the docs since it's no longer supported. 2012-11-22 21:53:42 -05:00
Menachem Fromer 2306518ab6 Fix to deal with 'proper' options of casting 2012-11-22 01:45:18 -05:00
Menachem Fromer d33a412b5f Merge branch 'master' of github.com:broadinstitute/gsa-unstable 2012-11-22 01:42:29 -05:00
Mark DePristo 48f271c5bd Adding 80% support for multi-allelic variants
-- Multi-allelic variants are split into their bi-allelic version, trimmed, and we attempt to provide a meaningful genotype for NA12878 here.  It's not perfect and needs some discussion on how to handle het/alt variants
-- Adding splitInBiallelic function to VariantContextUtils as well as extensive unit tests that also indirectly test reverseTrimAlleles (which worked perfectly FYI)
2012-11-21 17:24:59 -05:00
Mark DePristo e14bfa9f5c Update reviews.vcf to have genotypes, and downstream changes to tools to support this 2012-11-21 17:24:59 -05:00
Mark DePristo 5d2ee32936 Documentation and validation for NA12878 KB tools
-- MongoVariantContexts and MongoGenotype have a validate() function that ensures that the information is consistent, in anticipation of potential problems with the data coming in from reviews via IGV
-- Divide the world into production, development, and test DB, via the NA12878DBArgumentCollection
2012-11-21 17:24:59 -05:00
Joel Thibault c68bc95db6 Initial read mapping tests
- Failing tests are commented out
2012-11-21 17:16:46 -05:00
Joel Thibault 3ad9128800 Add some reads
- Move intervals and reads to init
- Update intervals and reads
2012-11-21 17:16:46 -05:00
Joel Thibault 3fa3b00f4a Add ActiveRegion tests and refactor 2012-11-21 17:16:45 -05:00
Joel Thibault e8defcb20d Test multiple bases and intervals 2012-11-21 17:16:45 -05:00
Joel Thibault c08b782743 Count isActive calls directly 2012-11-21 17:16:45 -05:00
Eric Banks 7580a487f3 Merge branch 'master' of github.com:broadinstitute/gsa-unstable 2012-11-21 16:20:49 -05:00
Eric Banks 4f2229d399 As per the TODO message, I removed a check that was no longer necessary. Now ID is an allowable INFO field key. 2012-11-21 16:01:26 -05:00
Menachem Fromer a8c7edca05 Fixed fragment handling in DepthOfCoverage 2012-11-21 16:01:10 -05:00
Menachem Fromer 06261b58c2 Merge branch 'master' of github.com:broadinstitute/gsa-unstable 2012-11-21 15:57:08 -05:00
Eric Banks ed50814ccb Finally found a case where user errors were being masked behind other errors and could debug. It turns out that the checkForMaskedUserErrors() method needs to run recursively over all levels (calling exception.getCause()) to check for the original cause. 2012-11-21 15:57:05 -05:00
Menachem Fromer c8be7c3102 Keep SNPs and indels separately for batch merging; Add options to DepthOfCoverage to count fragments (to not double-count overlapping reads of same fragment); DepthOfCoverage should now support ReducedReads; Replace recursion with loop in DoC/package.scala (for lists longer than 5000 elements) 2012-11-21 15:56:53 -05:00
Ryan Poplin 17d1c9ed53 Updating NA12878 reviews with Mark. 2012-11-21 12:47:04 -05:00
Douglas Voet d3817e789c made dedup setup not intermediate 2012-11-21 09:07:08 -05:00