Eric Banks
ef87b18e09
In retrospect, it wasn't a good idea to have FisherStrand handle reduced reads since they are always on the forward strand. For now, FS ignores reduced reads but I've added a note (and JIRA) to make this work once the RR het compression is enabled (since we will have directionality in reads then).
2012-12-05 02:00:35 -05:00
Eric Banks
726332db79
Disabling the testNoCmdLineHeaderStdout test in UG because it keeps crashing when I run it locally
2012-12-05 00:54:00 -05:00
Randal Moore
8d2d0253a2
Introduce a level of indirection for the forum URLs - this new function gives me a place to morph the URL into something that Confluence supports
...
Signed-off-by: Eric Banks <ebanks@broadinstitute.org>
2012-12-03 22:33:02 -05:00
Eric Banks
1af41754e3
Merge branch 'master' of github.com:broadinstitute/gsa-unstable
2012-12-03 22:01:11 -05:00
Eric Banks
bca860723a
Updating tests to handle bad validation data files (that used the wrong qual score encoding); overrides push from stable.
2012-12-03 22:01:07 -05:00
Eric Banks
387c0defed
don't change md5 here because I am handling it separately from unstable with a better command-line in the test
2012-12-03 21:49:45 -05:00
Eric Banks
ef95757311
Fix MD5 because of a need to fix a busted bam file in our validation directory (it used the wrong quality score encoding...)
2012-12-03 21:46:46 -05:00
Menachem Fromer
472381245a
Allow for more refined control of memory and queues to run with
2012-12-03 17:07:03 -05:00
Eric Banks
67932b357d
Bug fix for RR: don't let the softclip start position be less than 1
2012-12-03 15:59:14 -05:00
Ryan Poplin
d5ed184691
Updating the HC integration test md5s. According to the NA12878 knowledge base this commit cuts down the FP rate by more than 50 percent with no loss in sensitivity.
2012-12-03 15:38:59 -05:00
Ryan Poplin
a47da9bb2f
Merge branch 'master' of github.com:broadinstitute/gsa-unstable
2012-12-03 14:30:14 -05:00
Ryan Poplin
156d6a5e0b
misc minor bug fixes to GenotypingEngine.
2012-12-03 12:47:35 -05:00
Eric Banks
5fed9df295
Quick fix: base qual array in the GATKSAMRecord stores the actual phred values (-33) and not the original bytes (duh).
2012-12-03 12:18:20 -05:00
Eric Banks
b6839b3049
Added checking in the GATK for mis-encoded quality scores.
...
The check is performed by a Read Transformer that samples (currently set to once
every 1000 reads so that we don't hurt overall GATK performance) from the input
reads and checks to make sure that none of the base quals is too high (> Q60). If
we encounter such a base then we fail with a User Error.
* Can be overridden with --allow_potentially_misencoded_quality_scores.
* Also, the user can choose to fix their quals on the fly (presumably using PrintReads
to write out a fixed bam) with the --fix_misencoded_quality_scores argument.
Added unit tests.
2012-12-03 11:18:41 -05:00
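A minimal sketch of the sampled quality check described in b6839b3049, written as standalone Java rather than as the actual Read Transformer. The class and method names are hypothetical. Inspecting the first read of each 1000 and the fix offset of 31 (the gap between Phred+64 and Phred+33 encodings) are assumptions; the commit message states only the sampling rate and the Q60 threshold.

```java
// Hypothetical sketch; these names are illustrative and are not the GATK's classes.
final class MisencodedQualChecker {
    static final int SAMPLING_INTERVAL = 1000; // inspect one read in every 1000
    static final int MAX_REASONABLE_QUAL = 60; // a base qual above Q60 is treated as mis-encoded
    static final int ENCODING_FIX = 31;        // assumption: Phred+64 minus Phred+33

    private long readsSeen = 0;

    /**
     * Returns true if this read was sampled and passed the check, false if it
     * was skipped. Throws if a sampled read contains a base qual above Q60.
     */
    boolean check(byte[] quals) {
        if (readsSeen++ % SAMPLING_INTERVAL != 0) {
            return false; // not a sampled read
        }
        for (byte q : quals) {
            if (q > MAX_REASONABLE_QUAL) {
                throw new IllegalStateException(
                        "Base quality " + q + " looks mis-encoded (> Q" + MAX_REASONABLE_QUAL + ")");
            }
        }
        return true;
    }

    /** Returns a copy of the quals shifted down by the assumed encoding offset. */
    static byte[] fix(byte[] quals) {
        byte[] fixed = new byte[quals.length];
        for (int i = 0; i < quals.length; i++) {
            int shifted = quals[i] - ENCODING_FIX;
            if (shifted < 0) {
                throw new IllegalArgumentException("Quality " + quals[i] + " is too low to be Phred+64");
            }
            fixed[i] = (byte) shifted;
        }
        return fixed;
    }
}
```

Sampling keeps the cost of the check negligible, at the price of possibly missing a mis-encoded file whose high quals happen to fall only in unsampled reads.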
Ryan Poplin
18b002c99c
Merge branch 'master' of github.com:broadinstitute/gsa-unstable
2012-12-03 10:08:56 -05:00
Eric Banks
6f523a1ea0
Merge branch 'master' of github.com:broadinstitute/gsa-unstable
2012-12-03 08:41:21 -05:00
Eric Banks
59fc7456cf
Updated expectations for novel TiTv in HSP after Mark's fixes to the exact model
2012-12-03 08:41:13 -05:00
Mark DePristo
f0a4710247
Callset summary now includes a table for the consensus itself
2012-12-02 16:40:12 -05:00
Mark DePristo
ce9a323c04
NA12878 knowledge base automatically filters duplicate records out in the SiteIterator
...
-- Now it doesn't matter if there are duplicate records (all fields equal up to the date) in the knowledge base
2012-12-02 14:21:29 -05:00
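One way to picture the duplicate filtering from ce9a323c04 is the sketch below, which treats two records as duplicates when every field except the date matches. The record fields and class names are hypothetical and are not taken from the knowledge base code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical record type; the real knowledge base fields are not shown in the log.
final class KbRecord {
    final String chr;
    final int start;
    final String ref;
    final String alt;
    final String callSet;
    final long date;

    KbRecord(String chr, int start, String ref, String alt, String callSet, long date) {
        this.chr = chr;
        this.start = start;
        this.ref = ref;
        this.alt = alt;
        this.callSet = callSet;
        this.date = date;
    }

    /** Identity of the record with the date deliberately left out. */
    String keyIgnoringDate() {
        return chr + ":" + start + ":" + ref + ":" + alt + ":" + callSet;
    }
}

final class DuplicateFilter {
    /** Keeps the first record seen for each key, preserving input order. */
    static List<KbRecord> dedupe(List<KbRecord> records) {
        Set<String> seen = new HashSet<>();
        List<KbRecord> unique = new ArrayList<>();
        for (KbRecord r : records) {
            if (seen.add(r.keyIgnoringDate())) { // add() returns false for a repeat key
                unique.add(r);
            }
        }
        return unique;
    }
}
```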
Ryan Poplin
1bdf17ef53
Reworking of how the likelihood calculation is organized in the HaplotypeCaller to facilitate the inclusion of per allele downsampling. We now use the downsampling for both the GL calculations and the annotation calculations.
2012-12-02 11:58:32 -05:00
Mark DePristo
1828d33a5a
Bugfix to AssessNA12878
...
-- SiteIterator.getSitesBefore wasn't handling indel overlaps correctly: it considered both start and stop (not the correct behavior), so it only returned sites up to the first record whose stop overlapped the requested start, incorrectly skipping variants underlying indels.
2012-12-02 11:09:15 -05:00
Eric Banks
d7b951b6f3
Finished up my reviews for megabase chr20:10M-11M. Fixed out of order record from earlier.
2012-12-01 23:35:21 -05:00
Mark DePristo
2849889af5
Updating md5 for UG
2012-12-01 14:24:19 -05:00
depristo
3105f13df3
Merge pull request #4 from jsilter/master
...
Remove validate, add note to put it back in when public gatk catches up
2012-11-30 13:24:44 -08:00
Mark DePristo
1100f0733b
Reviews for all unique omni poly sites on chr20
...
Updated setup script to include these and ebanks' reviews as well. Eric -- your file is currently not sorted, fyi
2012-11-30 16:23:27 -05:00
Jacob Silterra
02e98fa516
Remove validate, add note to put it back in when public gatk catches up
2012-11-30 16:08:00 -05:00
Mark DePristo
8020ba14db
Minor cleanup of SAMDataSource as part of my system review
...
-- Changed a few functions from public to protected, as they are only used by the package contents, to simplify the SAMDataSource interface
2012-11-30 15:04:41 -05:00
Mark DePristo
66bbe46e5b
MongoDBManager prints out meaningful information with toString
2012-11-30 15:04:41 -05:00
Mark DePristo
3248ca3f91
Validate MongoVariantContext on creation
2012-11-30 15:04:40 -05:00
Mark DePristo
79dbcc205c
Minor cleanup for working version of igv
2012-11-30 15:04:40 -05:00
Mark DePristo
6b6a14cc6d
Moving ConsensusSummarizer to its appropriate home in core of NA12878KB
2012-11-30 15:04:40 -05:00
Mauricio Carneiro
db2a045321
Useful walker to establish minimum depth necessary for confident calling of different types of variants
2012-11-30 00:42:05 -05:00
Mauricio Carneiro
fc7fab5f3b
Fixed ReadBackedPileup downsampling
...
Downsampling in the PerSampleReadBackedPileup was broken: it didn't downsample anything, always returning a copy of the original pileup.
2012-11-30 00:42:05 -05:00
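For illustration, the behavior that fc7fab5f3b restores (actually reducing the pileup instead of copying it) could look like the sketch below. It is a generic random downsampler over a list, not the PerSampleReadBackedPileup code; the class name, method signature, and the choice to keep the surviving elements in their original order are all assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper; it works on any list and stands in for a pileup.
final class PileupDownsampler {
    /**
     * Returns at most desiredCoverage elements chosen uniformly at random,
     * in their original order. The input list is never modified.
     */
    static <T> List<T> downsample(List<T> pileup, int desiredCoverage, Random rng) {
        if (desiredCoverage < 0) {
            throw new IllegalArgumentException("desiredCoverage must be >= 0");
        }
        if (pileup.size() <= desiredCoverage) {
            return new ArrayList<>(pileup); // nothing to remove
        }
        // Shuffle the indices and keep the first desiredCoverage of them.
        List<Integer> indices = new ArrayList<>(pileup.size());
        for (int i = 0; i < pileup.size(); i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, rng);
        List<Integer> kept = new ArrayList<>(indices.subList(0, desiredCoverage));
        Collections.sort(kept); // restore original ordering of the survivors

        List<T> result = new ArrayList<>(desiredCoverage);
        for (int i : kept) {
            result.add(pileup.get(i));
        }
        return result;
    }
}
```

A test that asserts the output size equals the requested coverage would have caught the original bug, since a plain copy always has the input's size.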
Eric Banks
0e1287a843
Adding reviews for 1st 400kb of my target megabase (10-11) on chr20
2012-11-29 16:15:45 -05:00
Joel Thibault
97d29f203e
Add walltime changes to LSF
...
- Check whether the specified attribute is available
- Add pipeline test (disabled due to missing attribute)
2012-11-29 15:23:37 -05:00
Johan Dahlberg
daf6269b65
Setting the walltime
...
Signed-off-by: Joel Thibault <thibault@broadinstitute.org>
2012-11-29 15:23:36 -05:00
Mark DePristo
f837e6ced7
Refactored entire NA12878KB to allow us to easily build a na12878kb.jar for IGV integration
...
-- Just separated infrastructure into core package, away from the walkers themselves.
-- Added na12878kb.jar target that builds a jar that can run a test main function (see testNA12878kbJar.csh)
2012-11-29 14:38:09 -05:00
Mark DePristo
52a6df4f1a
Add SummarizeConsensus walker that spits out information about the callsets in the KB
...
-- Added summary to update consensus as well, so you can see what's been added
2012-11-29 13:07:46 -05:00
depristo
ed7a89c0c7
Merge pull request #3 from jsilter/master
...
Fix NA12878DBArgumentCollectionUnitTest
2012-11-29 08:52:38 -08:00
Jacob Silterra
d9e8a414ef
Fix NA12878DBArgumentCollectionUnitTest so it uses testng, and testCompareLocalRemoteLocators compare the right things
2012-11-29 11:03:21 -05:00
David Roazen
df2c26b554
Rename NA12878DBArgumentCollectionTest to NA12878DBArgumentCollectionUnitTest
...
Otherwise this test won't get run as part of the test suite...
2012-11-28 22:57:04 -05:00
David Roazen
b06e71cedf
Use build jars in test classpaths by default
...
-Allows packaged resource files to be accessed within tests
-Guards against packaging errors in dist/ jars by testing the
jars that actually get run rather than unpackaged class files.
Previously we were only protected against packaging errors in the
monolithic jars posted to our website, not the dist/ jars used in
everyday runs.
-"ant fasttest" still uses the unpackaged class files for speed
(don't want to have to rebuild the jars in fasttest). Relies on
dubious methods to get at the resource files that would end up
in the jars.
-Eliminated the stupid separate "test" ivy config. Now we only
invoke ivy ONCE during an ant build that includes tests.
2012-11-28 22:57:04 -05:00
Eric Banks
add1ab5d0e
Fix status of largeScaleValidationPools for NA12878-KB
2012-11-28 20:34:13 -05:00
Mark DePristo
b9be8850e2
Bugfixes to NA12878DBArgumentCollection and JSON and the GATK argument value injection system
...
-- Values that depend on variables set by GATK argument injection must be initialized lazily, not at object creation time. The previous version broke the dbToUse and useLocal arguments. Fixed
2012-11-28 19:02:07 -05:00
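The lazy-initialization point in b9be8850e2 can be illustrated with a small sketch. The field names dbToUse and useLocal come from the commit message; the class name, the default values, and the locator format are hypothetical, and the fields are set by hand here where the GATK would inject them after construction.

```java
// Hypothetical stand-in for an argument collection whose fields are injected
// after the object has been constructed.
final class DbArgumentsSketch {
    // Defaults that injection may overwrite once the constructor has finished.
    boolean useLocal = false;
    String dbToUse = "production";

    // Computing this in a field initializer or constructor would capture the
    // defaults above, not the injected values. Compute it on first use instead.
    private String locator;

    String getLocator() {
        if (locator == null) {
            String host = useLocal ? "localhost" : "remote-host"; // illustrative host names
            locator = host + "/" + dbToUse;
        }
        return locator;
    }
}
```

Because the value is cached on first access, the getter must not be called before injection has completed.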
Mark DePristo
7b74bf6677
Excluding large scale validation callsets from KB until further reviewed, rebuilding production server now
2012-11-28 18:41:49 -05:00
Mark DePristo
4729f0858d
ExtractConsensusSites -include and -exclude callsets now work on supporting callsets, not the actual name
...
-- Allows you to include / exclude callsets that appear in other callsets (as one would expect)
2012-11-28 18:41:16 -05:00
Mark DePristo
65357d26bc
New walker ExtractConsensusSites that extracts a VCF from the NA12878 Knowledge Base meeting criteria
...
-- See @link http://gatkforums.broadinstitute.org/discussion/1848/using-the-na12878-knowledge-base for more information
2012-11-28 18:13:07 -05:00
Mark DePristo
de7049463c
New walker ExtractConsensusSites that extracts a VCF from the NA12878 Knowledge Base meeting criteria
...
-- See @link http://gatkforums.broadinstitute.org/discussion/1848/using-the-na12878-knowledge-base for more information
2012-11-28 17:19:22 -05:00
Eric Banks
ff8b3904e2
Added many new resources to the NA12878 KB truth set
2012-11-28 17:18:24 -05:00
David Roazen
b2e699169c
Update GATK packaging settings to package arbitrary resources
...
With the newly-added support for packaging arbitrary resources, the
resources were getting packaged in a normal build but not when
creating a standalone GATK jar. This corrects this oversight.
2012-11-28 15:26:05 -05:00