11230 Commits (ef87b18e09d64dda2e483c77573b792af46d4f93)

Author SHA1 Message Date
Eric Banks ef87b18e09 In retrospect, it wasn't a good idea to have FisherStrand handle reduced reads since they are always on the forward strand. For now, FS ignores reduced reads but I've added a note (and JIRA) to make this work once the RR het compression is enabled (since we will have directionality in reads then). 2012-12-05 02:00:35 -05:00
Eric Banks 726332db79 Disabling the testNoCmdLineHeaderStdout test in UG because it keeps crashing when I run it locally 2012-12-05 00:54:00 -05:00
Randal Moore 8d2d0253a2 Introduce a level of indirection for the forum URLs: this new function gives me a place to morph the URL into something that is supported by Confluence
Signed-off-by: Eric Banks <ebanks@broadinstitute.org>
2012-12-03 22:33:02 -05:00
Eric Banks 1af41754e3 Merge branch 'master' of github.com:broadinstitute/gsa-unstable 2012-12-03 22:01:11 -05:00
Eric Banks bca860723a Updating tests to handle bad validation data files (that used the wrong qual score encoding); overrides push from stable. 2012-12-03 22:01:07 -05:00
Eric Banks 387c0defed don't change md5 here because I am handling it separately from unstable with a better command-line in the test 2012-12-03 21:49:45 -05:00
Eric Banks ef95757311 Fix MD5 because of a need to fix a busted bam file in our validation directory (it used the wrong quality score encoding...) 2012-12-03 21:46:46 -05:00
Menachem Fromer 472381245a Allow for more refined control of memory and queues to run with 2012-12-03 17:07:03 -05:00
Eric Banks 67932b357d Bug fix for RR: don't let the softclip start position be less than 1 2012-12-03 15:59:14 -05:00
Ryan Poplin d5ed184691 Updating the HC integration test md5s. According to the NA12878 knowledge base this commit cuts down the FP rate by more than 50 percent with no loss in sensitivity. 2012-12-03 15:38:59 -05:00
Ryan Poplin a47da9bb2f Merge branch 'master' of github.com:broadinstitute/gsa-unstable 2012-12-03 14:30:14 -05:00
Ryan Poplin 156d6a5e0b misc minor bug fixes to GenotypingEngine. 2012-12-03 12:47:35 -05:00
Eric Banks 5fed9df295 Quick fix: base qual array in the GATKSAMRecord stores the actual phred values (-33) and not the original bytes (duh). 2012-12-03 12:18:20 -05:00
Eric Banks b6839b3049 Added checking in the GATK for mis-encoded quality scores.
The check is performed by a Read Transformer that samples (currently set to once
every 1000 reads so that we don't hurt overall GATK performance) from the input
reads and checks to make sure that none of the base quals is too high (> Q60). If
we encounter such a base then we fail with a User Error.

* Can be over-ridden with --allow_potentially_misencoded_quality_scores.
* Also, the user can choose to fix his quals on the fly (presumably using PrintReads
  to write out a fixed bam) with the --fix_misencoded_quality_scores argument.

Added unit tests.
2012-12-03 11:18:41 -05:00
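The check described in the commit above can be sketched roughly as follows. This is a minimal Python illustration, not the GATK Read Transformer itself: the function name, the error class, and the list-of-ints read representation are hypothetical, and the shift of 31 used by the fix is an assumption (the gap between a Phred+64 and a Phred+33 encoding). Only the sampling interval (1000 reads), the Q60 threshold, and the two override options come from the commit message.

```python
from typing import Iterable, Iterator, List

MAX_PLAUSIBLE_QUAL = 60   # from the commit message: a base qual > Q60 is an error
SAMPLING_INTERVAL = 1000  # from the commit message: check once every 1000 reads
ENCODING_SHIFT = 31       # assumption: Phred+64 data decoded as Phred+33 reads 31 too high


class MisencodedQualityError(ValueError):
    """Hypothetical stand-in for a user-facing error."""


def transform_reads(
    reads: Iterable[List[int]],
    allow_misencoded: bool = False,
    fix_misencoded: bool = False,
) -> Iterator[List[int]]:
    """Yield each read's base qualities, checking a sample of reads on the way.

    Each read is modelled as a list of Phred values (already offset-corrected).
    """
    for index, quals in enumerate(reads):
        if fix_misencoded:
            # Fix mode: shift every quality of every read back down.
            yield [q - ENCODING_SHIFT for q in quals]
            continue
        sampled = index % SAMPLING_INTERVAL == 0
        if sampled and not allow_misencoded:
            if any(q > MAX_PLAUSIBLE_QUAL for q in quals):
                raise MisencodedQualityError(
                    f"read {index} has a base quality above Q{MAX_PLAUSIBLE_QUAL}"
                )
        yield list(quals)
```

Because only every 1000th read is inspected, a mis-encoded read that falls between samples passes through unchecked; that is the trade-off the commit accepts to avoid hurting overall performance.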
Ryan Poplin 18b002c99c Merge branch 'master' of github.com:broadinstitute/gsa-unstable 2012-12-03 10:08:56 -05:00
Eric Banks 6f523a1ea0 Merge branch 'master' of github.com:broadinstitute/gsa-unstable 2012-12-03 08:41:21 -05:00
Eric Banks 59fc7456cf Updated expectations for novel TiTv in HSP after Mark's fixes to the exact model 2012-12-03 08:41:13 -05:00
Mark DePristo f0a4710247 Callset summary now includes a table for the consensus itself 2012-12-02 16:40:12 -05:00
Mark DePristo ce9a323c04 NA12878 knowledge base automatically filters duplicate records out in the SiteIterator
-- Now it doesn't matter if there are duplicate records (all fields equal up to the date) in the knowledge base
2012-12-02 14:21:29 -05:00
Ryan Poplin 1bdf17ef53 Reworking of how the likelihood calculation is organized in the HaplotypeCaller to facilitate the inclusion of per allele downsampling. We now use the downsampling for both the GL calculations and the annotation calculations. 2012-12-02 11:58:32 -05:00
Mark DePristo 1828d33a5a Bugfix to AssessNA12878
-- SiteIterator.getSitesBefore wasn't handling indel overlaps correctly: it considered both start and stop (not the correct behavior), so it only returned sites up to the first record whose stop overlapped the requested start and incorrectly skipped variants underlying indels.
2012-12-02 11:09:15 -05:00
Eric Banks d7b951b6f3 Finished up my reviews for megabase chr20:10M-11M. Fixed out of order record from earlier. 2012-12-01 23:35:21 -05:00
Mark DePristo 2849889af5 Updating md5 for UG 2012-12-01 14:24:19 -05:00
depristo 3105f13df3 Merge pull request #4 from jsilter/master
Remove validate, add note to put it back in when public gatk catches up
2012-11-30 13:24:44 -08:00
Mark DePristo 1100f0733b Reviews for all unique omni poly sites on chr20
Updated setup script to include these and ebanks' reviews as well.  Eric -- your file is currently not sorted, fyi
2012-11-30 16:23:27 -05:00
Jacob Silterra 02e98fa516 Remove validate, add note to put it back in when public gatk catches up 2012-11-30 16:08:00 -05:00
Mark DePristo 8020ba14db Minor cleanup of SAMDataSource as part of my system review
-- Changed a few functions from public to protected, as they are only used by the package contents, to simplify the SAMDataSource interface
2012-11-30 15:04:41 -05:00
Mark DePristo 66bbe46e5b MongoDBManager prints out meaningful information with toString 2012-11-30 15:04:41 -05:00
Mark DePristo 3248ca3f91 Validate MongoVariantContext on creation 2012-11-30 15:04:40 -05:00
Mark DePristo 79dbcc205c Minor cleanup for working version of igv 2012-11-30 15:04:40 -05:00
Mark DePristo 6b6a14cc6d Moving ConsensusSummarizer to its appropriate home in core of NA12878KB 2012-11-30 15:04:40 -05:00
Mauricio Carneiro db2a045321 Useful walker to establish minimum depth necessary for confident calling of different types of variants 2012-11-30 00:42:05 -05:00
Mauricio Carneiro fc7fab5f3b Fixed ReadBackedPileup downsampling
Downsampling in the PerSampleReadBackedPileup was broken: it didn't downsample anything, always returning a copy of the original pileup.
2012-11-30 00:42:05 -05:00
Eric Banks 0e1287a843 Adding reviews for 1st 400kb of my target megabase (10-11) on chr20 2012-11-29 16:15:45 -05:00
Joel Thibault 97d29f203e Add walltime changes to LSF
- Check whether the specified attribute is available
- Add pipeline test (disabled due to missing attribute)
2012-11-29 15:23:37 -05:00
Johan Dahlberg daf6269b65 Setting the walltime
Signed-off-by: Joel Thibault <thibault@broadinstitute.org>
2012-11-29 15:23:36 -05:00
Mark DePristo f837e6ced7 Refactored entire NA12878KB to allow us to easily build a na12878kb.jar for IGV integration
-- Just separated infrastructure into core package, away from the walkers themselves.
-- Added na12878kb.jar target that builds a jar that can run a test main function (see testNA12878kbJar.csh)
2012-11-29 14:38:09 -05:00
Mark DePristo 52a6df4f1a Add SummarizeConsensus walker that spits out information about the callsets in the KB
-- Added summary to update consensus too, so you can see what's been added
2012-11-29 13:07:46 -05:00
depristo ed7a89c0c7 Merge pull request #3 from jsilter/master
Fix NA12878DBArgumentCollectionUnitTest
2012-11-29 08:52:38 -08:00
Jacob Silterra d9e8a414ef Fix NA12878DBArgumentCollectionUnitTest so it uses testng, and testCompareLocalRemoteLocators compares the right things 2012-11-29 11:03:21 -05:00
David Roazen df2c26b554 Rename NA12878DBArgumentCollectionTest to NA12878DBArgumentCollectionUnitTest
Otherwise this test won't get run as part of the test suite...
2012-11-28 22:57:04 -05:00
David Roazen b06e71cedf Use build jars in test classpaths by default
-Allows packaged resource files to be accessed within tests

-Guards against packaging errors in dist/ jars by testing the
jars that actually get run rather than unpackaged class files.
Previously we were only protected against packaging errors in the
monolithic jars posted to our website, not the dist/ jars used in
everyday runs.

-"ant fasttest" still uses the unpackaged class files for speed
(don't want to have to rebuild the jars in fasttest). Relies on
dubious methods to get at the resource files that would end up
in the jars.

-Eliminated the stupid separate "test" ivy config. Now we only
invoke ivy ONCE during an ant build that includes tests.
2012-11-28 22:57:04 -05:00
Eric Banks add1ab5d0e Fix status of largeScaleValidationPools for NA12878-KB 2012-11-28 20:34:13 -05:00
Mark DePristo b9be8850e2 Bugfixes to NA12878DBArgumentCollection and JSON and the GATK argument value injection system
-- Functions that depend on the value of variables that have GATK injection values must be initialized lazily, not at object creation time.  The previous version broke the dbToUse and useLocal arguments.  Fixed.
2012-11-28 19:02:07 -05:00
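The lazy-initialization point in the commit above can be illustrated with a small sketch. It is written in Python rather than the project's own code, and every name in it (EagerCollection, LazyCollection, inject, and the snake_case fields modelled on useLocal and dbToUse) is hypothetical. It only assumes what the commit states: argument values are injected after the object is created, so anything derived from them must be computed on first use.

```python
from functools import cached_property


class EagerCollection:
    """Broken pattern: the derived value is fixed at construction time."""

    def __init__(self):
        self.use_local = False             # default, before any injection
        self.db_to_use = self._pick_db()   # computed too early, sees the default

    def _pick_db(self):
        return "local" if self.use_local else "remote"


class LazyCollection:
    """Fixed pattern: the derived value is computed on first access."""

    def __init__(self):
        self.use_local = False             # default, before any injection

    @cached_property
    def db_to_use(self):
        # Runs after injection, so it sees the user-supplied value.
        return "local" if self.use_local else "remote"


def inject(collection, values):
    """Hypothetical stand-in for an argument injection step."""
    for name, value in values.items():
        setattr(collection, name, value)
```

With both classes, calling inject(obj, {"use_local": True}) after construction leaves the eager version still pointing at "remote", while the lazy version resolves to "local" when db_to_use is first read.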
Mark DePristo 7b74bf6677 Excluding large scale validation callsets from KB until further reviewed, rebuilding production server now 2012-11-28 18:41:49 -05:00
Mark DePristo 4729f0858d ExtractConsensusSites -include and -exclude callsets now work on supporting callsets, not the actual name
-- Allows you to include / exclude callsets that appear in other callsets (as one would expect)
2012-11-28 18:41:16 -05:00
Mark DePristo 65357d26bc New walker ExtractConsensusSites that extracts a VCF from the NA12878 Knowledge Base meeting criteria
-- See @link http://gatkforums.broadinstitute.org/discussion/1848/using-the-na12878-knowledge-base for more information
2012-11-28 18:13:07 -05:00
Mark DePristo de7049463c New walker ExtractConsensusSites that extracts a VCF from the NA12878 Knowledge Base meeting criteria
-- See @link http://gatkforums.broadinstitute.org/discussion/1848/using-the-na12878-knowledge-base for more information
2012-11-28 17:19:22 -05:00
Eric Banks ff8b3904e2 Added many new resources to the NA12878 KB truth set 2012-11-28 17:18:24 -05:00
David Roazen b2e699169c Update GATK packaging settings to package arbitrary resources
With the newly-added support for packaging arbitrary resources, the
resources were getting packaged in a normal build but not when
creating a standalone GATK jar. This corrects this oversight.
2012-11-28 15:26:05 -05:00