6297 Commits (f19862a643504ebe99d71b778ebd85e5a00726f0)

Author SHA1 Message Date
Mauricio Carneiro f19862a643 Fixing conflicts. 2011-07-14 17:13:31 -04:00
Mauricio Carneiro 43c6a8565b looks better now. 2011-07-14 17:10:44 -04:00
Mauricio Carneiro 09ffe277ae Added a qscripts util package with some utility functions commonly shared across queue scripts. Refactored some of my public scripts to use it in an effort to make queue scripts more reusable and "supportable". 2011-07-14 17:09:35 -04:00
Mauricio Carneiro 4f8230c750 Merged bug fix from Stable into Unstable 2011-07-14 16:44:57 -04:00
Mauricio Carneiro 9f5180ab05 Recalibrates a list of bam files allowing multiple bams to be recalibrated out of a single 'mother' queue job. 2011-07-14 16:42:17 -04:00
Mark DePristo c0bbeb23ba Now providing more information when the index on the fly isn't equal to the one created by reading the file from disk. 2011-07-14 15:12:28 -04:00
Mark DePristo 5ffeddd3b1 better to use _ instead of ., as this is a special case later. 2011-07-14 14:45:16 -04:00
Eric Banks 9540df6998 Oops, forgot to update unit test 2011-07-14 14:00:19 -04:00
Eric Banks ed6beae1f3 Adding headers to diffable reading for VCFs 2011-07-14 13:55:35 -04:00
Eric Banks 57a90173f3 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-14 11:56:20 -04:00
Eric Banks 66c652d687 Added some extra error checks in the VCF codec. Now that we've moved this back into the GATK, changed some of the standard exceptions to be UserErrors (instead of TribbleExceptions). 2011-07-14 11:56:10 -04:00
Mark DePristo ad373ffa7e Broke down and implemented parallel get. Now running 10x parallel no problems on gsa1. 2011-07-14 09:12:43 -04:00
Mark DePristo 1390f354dc Giving up on parallel s3 fetching. The solution here is to do incremental processing, and not try to download 1.5M records at once, but obtain the ~10K per day that are appearing. 2011-07-13 23:12:19 -04:00
Mark DePristo ae044af678 Forgot to comment out a debug 2011-07-13 22:39:58 -04:00
Mark DePristo caa3629467 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-13 22:38:27 -04:00
Mark DePristo d6c1540e57 Parallel version. 2011-07-13 22:37:49 -04:00
Mark DePristo 63c58c5f7e Log file no longer optional. 2011-07-13 21:41:15 -04:00
Mark DePristo d163a2106a Now uses the log file to determine which files exist locally, avoiding the potentially expensive check on disk 2011-07-13 21:36:02 -04:00
Mauricio Carneiro a4ab19d040 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-13 15:14:06 -04:00
Eric Banks 0c54c796ed Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-13 14:57:33 -04:00
Eric Banks bb0e3a26fc Added integration test for VCF writing. Also, bug fix for writing the GT-free records. 2011-07-13 14:57:21 -04:00
Mauricio Carneiro df996a1a73 more progress report for the Data Processing Pipeline. Bam lists can now have empty lines, comments and whitespaces anywhere. 2011-07-13 14:53:58 -04:00
Eric Banks 6a431da554 Don't output source and ref header lines anymore. Short-term motivation for this is that I'd like this tool when run on a VCF to emit the exact same VCF. Long-term motivation is that these tags should be output by the VCF writer itself for all tools. 2011-07-13 14:40:01 -04:00
Mauricio Carneiro e2f2917bd2 Merged bug fix from Stable into Unstable 2011-07-13 13:00:55 -04:00
Mauricio Carneiro ff4e31c554 Changing the file names as per Kris's request. 2011-07-13 12:59:18 -04:00
Menachem Fromer 74aa49e423 Merged bug fix from Stable into Unstable 2011-07-13 12:12:42 -04:00
Menachem Fromer fa3ff53508 Filters should only be applied to the new VC if the old VC had filters applied 2011-07-13 11:58:16 -04:00
Mark DePristo e86113e537 Minor fixes for logging. Now logs even if you don't need to download the file (it's a complete log now). Bug fix for del logging. 2011-07-13 10:27:24 -04:00
Mark DePristo b04b8b20a5 Added support for -p log so that we can track progress without trying to ls a directory with 1.5M entries in it. 2011-07-13 10:22:32 -04:00
Mark DePristo 6450b48951 manageGATKS3Logs now downloads and deletes files in groups of 100 files by default. This should significantly improve the performance of the S3 log synchronization. 2011-07-13 10:14:38 -04:00
Eric Banks 969227c657 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-13 10:01:28 -04:00
Eric Banks 797c50e689 Fixing integration tests I broke yesterday; removing batch merging test since we don't support that anymore. 2011-07-13 10:01:23 -04:00
Eric Banks 6007eea3ff Allowing VCF records without GTs in VCF4.1 2011-07-13 09:56:08 -04:00
Christopher Hartl 95040d95b9 Adding in check for filtered site (Sorry Mark, looks like it wasn't checking the validated rod, only the mask). Also by allowing user to lowercase SNPs, could miss 'SNP_TOO_NEAR_PROBE', now we properly check for that. 2011-07-12 19:21:26 -04:00
Christopher Hartl 61dad4f090 Merge branch 'master' of ssh://chartl@tin.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-12 18:33:30 -04:00
Christopher Hartl 30768eccbb Big change to PSP2: Amplicon sequence no longer lower-cased for repetitiveness, but instead for non-uniqueness via alignment by bwa. Performance heavily dependent on length of sequence (duh), with size=30 a good balance, but default is 20 because that's the default length of a sequenom primer. Indentation changes to other stuff. 2011-07-12 18:33:12 -04:00
Mauricio Carneiro 60870b360a Fixed the calculation of the allele count prior, it was using ac instead of maxAlleleCount. 2011-07-12 17:18:52 -04:00
Mauricio Carneiro d21388f0cb Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-12 15:56:38 -04:00
Mauricio Carneiro 775d2c2598 Added VQSR to the ReducedBAM evaluation script. Our tests need to address the annotations used by VQSR so we can actually measure the impact of changing the parameters in the ReduceReads walker (especially context size). 2011-07-12 15:56:24 -04:00
Ryan Poplin c944019678 Adding dev qscript I used to perform the exome t2d+1kg calling experiment 2011-07-12 15:44:45 -04:00
Ryan Poplin 837fb8f689 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-12 15:39:26 -04:00
Ryan Poplin 5077c94d85 Adding MappingQualityUnavailableReadFilter to the SNP and indel CountCovariates 2011-07-12 15:39:07 -04:00
Mark DePristo 2092fb439c Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-12 15:29:52 -04:00
Roger Zurawicki 6ee8a86197 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-12 15:26:13 -04:00
Roger Zurawicki 7991d69ba4 I added an experimental function in the ReducedBAMEvaluation. At the beginning of the script you can specify a list of values to test and it will process the ReduceReadsWalker for your parameter. I also added a method to convert the number when naming files to be sort-friendly. 2011-07-12 15:25:43 -04:00
Mark DePristo 01fd6a6949 Merge branch 'master' of ssh://gsa1/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-07-12 15:20:44 -04:00
Mark DePristo ccedd6ff4c Difference is now the general form -- used to be SummarizedDifference. The old Difference class is now a subclass of Difference that includes pointers to the specific master and test DiffElements. Added a size() function that calculates the number of elements in the tree from a DiffElement. 2011-07-12 15:20:28 -04:00
Eric Banks a2597e7f00 This commit incorporates several different changes that each pretty much break all the VCF-based integration tests, so I bunched them all together. We now officially emit VCF4.1 files (woo hoo), which means that the VCF headers are now all different (header version is 4.1 plus counts for some of the annotations are 'A' or 'G'). Also, I've added a Read Filter for reads with MQ=255 ('unavailable' in the SAM spec) and have applied this to the UG and the RMS MQ annotation. 2011-07-12 14:11:53 -04:00
Ryan Poplin 329c3d8050 Merged bug fix from Stable into Unstable 2011-07-12 13:55:51 -04:00
Ryan Poplin 73735863b0 Fix for the case of requesting genotype for a sample that doesn't exist in a VariantContext 2011-07-12 13:55:21 -04:00