64 Commits (5e832254a4e024378f7fdee252abf7df9e289c6a)

Author SHA1 Message Date
Eric Banks 85626e7a5d We no longer want people to use the August 2010 Dindel calls for indel realignment but instead Guillermo's new whole genome bi-allelic indel calls; updating the bundle accordingly. Also, there was some confusion by the 1000G data processing folks as to exactly what these indel files are, so I've renamed them so that it's clear. Wiki updated too. 2011-09-19 12:24:05 -04:00
Khalid Shakir 33967a4e0c Fixed issue reported by chartl where cloned functions lost tags on @Inputs.
Updated ExampleUnifiedGenotyper.scala with new syntax.
2011-09-16 12:46:07 -04:00
Ryan Poplin 981b78ea50 Changing the VQSR command line syntax back to the parsed tags approach. This cleans up the code and makes sure we won't be parsing the same rod file multiple times. I've tried to update the appropriate qscripts. 2011-09-12 12:17:43 -04:00
Mauricio Carneiro 7f9000382e Making indel calls default in the MDCP
You can turn off indel calling by using -noIndels.
2011-09-09 14:09:26 -04:00
Mauricio Carneiro ee9d599558 Just cleaning up
clean up old commented code from the data processing pipeline.
2011-09-07 13:32:40 -04:00
Mauricio Carneiro 28d782b4c7 Allowing multiple dbsnp and indel files in the DPP 2011-09-02 13:38:47 -04:00
Mauricio Carneiro ad4ea0b80b Merged bug fix from Stable into Unstable 2011-09-01 18:14:45 -04:00
Mauricio Carneiro e253f6f05d Fixing typo in DPP
platform and library were exchanged when rebuilding the read group information
2011-09-01 18:13:52 -04:00
Mauricio Carneiro d2a33beff7 Added WGS/WEX b37-decoy CEU trio datasets 2011-09-01 13:14:40 -04:00
Mauricio Carneiro 16caca0822 BLASR BAMs and new BWA parameters
* Added the functions to turn a BLASR generated BAM file into a usable BAM file.
* Modified the bwa parameters according to test results from NA12878 pb2k dataset.
2011-08-24 17:04:07 -04:00
Mauricio Carneiro dc8398e165 fixing bai output for indel cleaning. 2011-08-24 15:58:34 -04:00
Mauricio Carneiro cd12f7f286 Fixed list dependency
Instead of creating a bam list file, I dynamically create a scala list and pass as parameters. This way the intermediate bam files don't get deleted before they should.
2011-08-24 11:12:46 -04:00
Mauricio Carneiro 219252a566 Adapting to the new RodBinding framework 2011-08-24 11:12:46 -04:00
Mauricio Carneiro 136f0eb685 Creating sample-bam list instead of joining
This should save us at least one day in the trio decoy processing.
2011-08-22 18:03:39 -04:00
Mauricio Carneiro 04d8bcaf19 Fixed bai removal on picard tools
BAM index files were not being deleted because picard replaces the name of the file with bai instead of appending to it.
2011-08-22 18:03:39 -04:00
Mauricio Carneiro caebc88e9a Consensus mode and new RodBinding framework.
The DPP was not using the parameter correctly. It didn't matter for the default option (which is the only one we have been testing) but it would not work for knowns only or smith waterman. It is fixed now.

It now complies with the new rod binding framework.
2011-08-22 18:03:39 -04:00
Ryan Poplin f93a554b01 updating exome specific parameters in MDCP 2011-08-21 10:25:36 -04:00
Ryan Poplin b008676878 fixing the previous fix 2011-08-20 21:21:55 -04:00
Ryan Poplin 539e157ecd Fixing misc parameters in MDCP. The pipeline now does VariantEval of output by default. Fix for NaN vqslod values in VQSR 2011-08-20 11:28:48 -04:00
Ryan Poplin ddb5045e14 Updating the methods development calling pipeline for the new rod binding syntax and the new best practices. 2011-08-19 19:29:51 -04:00
Mauricio Carneiro b0ff5b1ff7 a better name for the pacbio processing pipeline 2011-08-10 16:16:53 -04:00
Mauricio Carneiro 481630da00 BWA parameters added 2011-08-09 17:05:24 -04:00
Mauricio Carneiro 22d2563823 added BWA SW alignment
The pipeline now accepts fasta/fastq files and aligns them using BWA SW, adds default base qualities, creates read groups and performs BQSR.
2011-08-09 17:05:24 -04:00
Mauricio Carneiro bd1cf4c7bc Pacbio Pipeline
Added the base quality "filling" step to allow the pipeline to handle raw pacbio BAM files. This is the first step towards a generic pacbio data processing pipeline.
2011-08-09 17:05:24 -04:00
Ryan Poplin 8072bd9831 Updating resource bundle generation qscript for changeover to git 2011-08-08 12:35:39 -04:00
Mauricio Carneiro 2fd101135c Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable 2011-08-08 10:49:43 -04:00
Mauricio Carneiro 4d6cb33612 removing temporary bam index
The clean bai file was left behind after the data processing pipeline was done
2011-08-08 10:49:28 -04:00
Ryan Poplin 21dc9a5543 Adding mills/devine indel dataset to the resource bundle 2011-08-04 12:31:28 -04:00
Mauricio Carneiro aff681e407 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable 2011-08-04 11:05:25 -04:00
Mauricio Carneiro 23ec5b94cf fixed a missing check for null
There was a missed check for the case when you don't provide an indels vcf for the cleaner.
2011-08-04 09:50:02 -04:00
Mauricio Carneiro 8981367307 Updating memory usage for picard programs 2011-08-03 15:48:28 -04:00
Khalid Shakir a587f38808 Fixed example unified genotyper pipeline to wrap filter expressions with quotes and use rod binding name "variant" instead of "vcf". 2011-08-03 02:21:01 -04:00
Mauricio Carneiro 2d94037ad0 Remove temporary index files (*.bai)
some temporary index files were not being removed.
2011-07-30 02:05:22 -04:00
Mauricio Carneiro dcf21f379a Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable 2011-07-23 12:59:53 -04:00
Mauricio Carneiro f0a6dd27a1 Renaming the plot output directory names. 2011-07-23 12:59:37 -04:00
Mauricio Carneiro 4f78025b0b Merged bug fix from Stable into Unstable 2011-07-22 14:42:04 -04:00
Mauricio Carneiro 4080e2cd88 * Added the decoy reference to the bundle under the b37 resources.
* Updated the -svn argument to -ver since we don't use svn anymore (also updated the wiki).
2011-07-22 14:41:22 -04:00
Mauricio Carneiro 9ad5c7dfa4 Resolving simple conflicts in the data processing pipeline.
Conflicts:
	public/scala/qscript/org/broadinstitute/sting/queue/qscripts/DataProcessingPipeline.scala
2011-07-19 08:05:11 -04:00
Mauricio Carneiro 7688bda1a6 better progress report for the DPP 2011-07-18 23:39:47 -04:00
Mauricio Carneiro 2b465ab43b * added optional 'no validation' for the Data Processing pipeline.
* some simplifications on the picard classes
2011-07-18 23:30:31 -04:00
Mauricio Carneiro 4cf7a2af23 Removed broad specific default paths so people from outside the broad can use it. 2011-07-18 23:25:21 -04:00
Mauricio Carneiro 5cb5a4ec75 Merged bug fix from Stable into Unstable 2011-07-16 00:23:59 -04:00
Mauricio Carneiro dd92a14b40 Made extra indel VCF optional but DBSNP mandatory. 2011-07-16 00:23:35 -04:00
Mauricio Carneiro 2fa5dbb0fe Merged bug fix from Stable into Unstable 2011-07-16 00:15:19 -04:00
Mauricio Carneiro ed55182a4c Removing Broad specific paths from parameters and making them required. This should make it unambiguous for people inside and outside the Broad to use the DataProcessingPipeline (as per request in the GetSatisfaction) 2011-07-16 00:09:00 -04:00
Mauricio Carneiro 43bd45fcad Merged bug fix from Stable into Unstable 2011-07-15 19:40:02 -04:00
Mauricio Carneiro fd1df31ef0 changing the output directory names for Analyze Covariates 2011-07-15 19:39:42 -04:00
Mauricio Carneiro aa30f416a3 Resolving conflicts
Conflicts:
	private/scala/qscript/depristo/ExomePostQCEval.scala
	private/scala/qscript/depristo/PostCallingQC.scala
	private/scala/qscript/org/broadinstitute/sting/queue/qscripts/archive/ExomePostQCEval.scala
2011-07-15 16:21:42 -04:00
Mauricio Carneiro 7b7d40d5d9 A better name for the qscript utilities. Throw here every method you find yourself repeatedly implementing in your qscripts!
Refactoring appropriately.
2011-07-15 14:34:50 -04:00
Mauricio Carneiro a670d6420a Refactoring Qscript utils into queue general utils package. 2011-07-15 14:31:43 -04:00