Mauricio Carneiro
7c7ca0d799
fixing bug with fastq extension
...
* PPP only recognized .fasta and .fq, failing when the user provided a .fastq file. Fixed.
2012-01-24 11:02:15 -05:00
Mark DePristo
0a3172a9f1
Fix for ref 0 bases for Chris
...
-- Disturbingly, fixing this bug doesn't actually cause any test failures.
-- Wrote a new QCRefWalker to actually check in detail that the reference bases coming into the RefWalker are all correct when comparing against a clean uncached load of the contig bases directly.
-- However, I cannot run this tool due to some kind of weird BAM error -- sending this on to Matt
2012-01-24 10:55:09 -05:00
Mauricio Carneiro
945cf03889
IntelliJ ate my import!
2012-01-23 21:46:45 -05:00
Mauricio Carneiro
2bb9525e7f
Don't set base qualities if fastQ is provided
...
* Pacbio Processing pipeline now works with the new fastQ files produced by the Pacbio instrument
2012-01-23 17:57:29 -05:00
Mark DePristo
b6c816fe12
Turn off unnecessary printing in analyzeRunReports
2012-01-23 17:49:37 -05:00
Mark DePristo
1172517abb
Bugfix for version parsing.
...
-- Now maps anything that doesn't exactly fit our git / svn schemes to unknown
-- Added max records and specific id options
2012-01-23 17:49:35 -05:00
Mark DePristo
ceca7e0b37
Bugfix to now separate completed, sting and user exceptions. Added dry run mode
2012-01-23 17:49:34 -05:00
Mark DePristo
1f620c79e6
Add busers and bugroup information to queueStatus
2012-01-23 17:49:32 -05:00
Mark DePristo
10bc26079d
bugfix to actually run correct python script
2012-01-23 17:49:31 -05:00
Mark DePristo
4b17fc3cc1
Parallel implementation of random forest training. Very cool (and easy) example of parallel processing in R
2012-01-23 17:49:29 -05:00
Mark DePristo
bb203ccf0a
combined analyses of snps and indels.
2012-01-23 17:49:28 -05:00
Mark DePristo
0ec6f86c21
Tests for event length, combined snps and indels. Partial infrastructure to train and eval trees.
2012-01-23 17:49:26 -05:00
Khalid Shakir
c18beadbdb
Device files like /dev/null are now tracked as special by Queue and are not used to generate .out file paths, scattered into a temporary directory, gathered, deleted, etc.
...
Attempted workaround for xdr_resourceInfoReq unsatisfied link during loading of libbat.so.
2012-01-23 16:17:04 -05:00
Christopher Hartl
cc4ba7372f
Why is reference_bases even an option anymore?
2012-01-23 15:18:59 -05:00
Christopher Hartl
3392d67c1a
Maybe a switch to reference bases will fix this
2012-01-23 15:10:03 -05:00
Christopher Hartl
15c0c294c1
Adding in this walker to try to debug the 0-byte ref bases
2012-01-23 14:51:24 -05:00
Mark DePristo
02450e4b12
Merged bug fix from Stable into Unstable
2012-01-23 12:08:39 -05:00
Christopher Hartl
798596257b
Enable the Genotype Phasing Evaluator. Because it didn't have the same argument structure as the base class, update2 of VariantEvaluator was being called, rather than update2 of the actual module.
2012-01-23 10:50:16 -05:00
Mark DePristo
80a4ce0edf
Bugfix for incorrect error messages for missing BAMs and VCFs
...
-- Missing BAMs were appearing as StingExceptions
-- Missing VCFs were showing up as CommandLineErrors, but it's clearer for them to be CouldNotReadInputFile exceptions
-- Added integration tests to ensure missing BAMs, VCFs, and -L files are properly thrown as CouldNotReadInputFile exceptions
-- Added path to standard b37 BAM to BaseTest
-- Cleaned up code in SAMDataSource, removing my parallel loading code as this just didn't prove to be useful.
2012-01-23 09:52:07 -05:00
Guillermo del Angel
31d2f04368
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-01-23 09:23:03 -05:00
Guillermo del Angel
966387ca0b
Next intermediate commit in the pool caller. Lots of bug fixes, and now we can emit true VCFs with calls in discovery mode (still of unknown quality). The old validation mode is temporarily broken and will be fixed in the next refactoring.
2012-01-23 09:22:31 -05:00
Christopher Hartl
4a08e8ca6e
Minor tweaks to T2D-related qscripts. Replacing old md5s from the BeagleIntegrationTest. All differences boiled down to changes in the accounting of genotypes: ./. --> 0/0 is no longer a "changed" genotype, and original genotypes that were ./. are represented as OG=. rather than OG=./.
...
This is somewhat of an arbitrary decision, and is negotiable. I could see treating
GT:PL ./.:.
differently from
GT:PL .:0,3,6
but am not sure it is worth doing so.
2012-01-23 08:25:34 -05:00
Ryan Poplin
4d6312d4ea
HaplotypeCaller is now an ActiveRegionWalker.
2012-01-22 14:31:01 -05:00
Christopher Hartl
3b1aad4f17
After a minor and abject freakout, alter the T2D script to seek out truth sensitivities between 80 and 100, rather than between 0.8 and 1. Also, don't consider a genotype "changed by beagle" if the initial genotype is a no-call.
2012-01-20 23:43:51 -05:00
Christopher Hartl
9b4f6afa21
Alterations to scripts for better performance. Grid search now expands the sens/spec tradeoff (90 was far too aggressive against hapmap chr20), and 20 max gaussians was too many and caused errors. For consensus genotypes: remember to gunzip the beagle outputs before converting to VCF. Also, beagle can in fact create 'null' alleles in certain circumstances. I'm not sure what exactly those circumstances are, but those sites should be ignored. When it does, all alleles appear to be set to null, so this should not affect the actual phasing in the output VCF.
2012-01-20 23:07:59 -05:00
Christopher Hartl
f3564bbf43
Ugh. Darn IntelliJ not telling me I was missing an import statement.
2012-01-20 13:25:11 -05:00
Christopher Hartl
b902d778ca
.
2012-01-20 13:22:46 -05:00
Christopher Hartl
7c6a9471e8
After ensuring MultiplyLikelihoods does what I want it to do, add a quick and simple integration test to ensure I don't break it.
2012-01-20 13:20:13 -05:00
Christopher Hartl
e245cde47f
A new beagle script for generating a reference panel from lowpass, exome, and chip data. This is for T2D, but potentially useful.
2012-01-20 12:48:32 -05:00
Christopher Hartl
a91dd5d137
Merge branch 'master' of ssh://tin.broadinstitute.org/humgen/gsa-scr1/chartl/dev/unstable
2012-01-20 12:45:16 -05:00
Christopher Hartl
3fe73f155c
Merge branch 'master' of ssh://chartl@tin.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-01-20 12:44:22 -05:00
Ryan Poplin
4b18786b5d
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-01-19 22:05:20 -05:00
Ryan Poplin
ace9333068
Active region walkers can now see the reads in a buffer around their active regions. This buffer size is specified as a walker annotation. Intervals are internally extended by this buffer size so that the extra reads make their way through the traversal engine but the walker author only needs to see the original interval. Also, several corner-case bug fixes in active region traversal.
2012-01-19 22:05:08 -05:00
Christopher Hartl
cd38110b7b
GQs are not always purged with this method of modifying attributes. To drop them, create the Genotype anew.
2012-01-19 20:11:20 -05:00
Christopher Hartl
b9f7103d09
Fix edge case where DP annotations (format) were creeping in
2012-01-19 19:41:43 -05:00
Christopher Hartl
72cd0a2450
And do it conditional on having likelihoods in the first place
2012-01-19 18:52:06 -05:00
Christopher Hartl
ed5302667b
Oops. Let's actually retain the genotype likelihoods.
2012-01-19 18:44:39 -05:00
Christopher Hartl
0644b75089
Remove attribute data from VariantContext and genotypes.
2012-01-19 18:30:32 -05:00
Menachem Fromer
fda29ebcbd
Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-01-19 18:22:04 -05:00
Menachem Fromer
253d6483e1
Updated Batch-merge to retain ALL sites in input (SNPs, indels, regardless of their filtering status), and also optionally go back to the BAMs to perform VariantAnnotation
2012-01-19 18:21:22 -05:00
Menachem Fromer
066da80a3d
Added KEEP_UNCONDTIONAL option which permits even sites with only filtered records to be included as unfiltered sites in the output
2012-01-19 18:19:58 -05:00
Christopher Hartl
6e30d715cf
Minor changes to T2D VQSR. Adding in a small walker for multiplying likelihoods for generation of a consensus panel.
2012-01-19 18:00:07 -05:00
Aaron McKenna
ced6775de3
Changes to allow for external tests
...
Changes to the build script that allow the external directory to have tests.
This means groups like CGA don't have to reinvent the wheel on testing, and
can instead use the GATK's unit and integration tests.
Signed-off-by: David Roazen <droazen@broadinstitute.org>
2012-01-19 13:04:24 -05:00
Christopher Hartl
98f8431b07
Right. Forgot the = true. If only there were some way to silently commit this OH WAIT
2012-01-19 12:36:30 -05:00
Christopher Hartl
7f3ad25b01
Adding a mode to VariantFiltration to invalidate previously-applied filters to allow complete re-filtering of a VCF.
...
T2D VQSR: re-calling now done with appropriate quality settings and using BAQ.
2012-01-19 10:54:48 -05:00
Ryan Poplin
ecdd07b748
updating HaplotypeCaller integration test
2012-01-19 09:31:22 -05:00
Ryan Poplin
7e082c7750
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-01-19 09:11:23 -05:00
Christopher Hartl
d1c8c38541
A QScript to generate a VQSR of union sites for T2D, using a broad set and a union site set as input.
2012-01-19 02:04:04 -05:00
Christopher Hartl
39e6df5aa9
Fix edge case for very small VCFs
2012-01-19 00:51:28 -05:00
Christopher Hartl
1e037a0ecf
Ensure second-to-last line printed
2012-01-19 00:33:08 -05:00