Mauricio Carneiro
86305a5dcf
Adjusting the memory limits of the MDCP
...
Indel caller needs more than 3G for large datasets.
2011-10-21 17:41:52 -04:00
Mauricio Carneiro
c9d8b22092
Added BWASW support to the pipeline
...
Data Processing Pipeline can now use BWASW for realigning the reads. Useful for Ion Torrent data.
2011-10-20 18:36:28 -04:00
Mauricio Carneiro
093cd95c5d
Merged bug fix from Stable into Unstable
2011-10-20 17:03:22 -04:00
Mauricio Carneiro
d7367c152a
Fixing 'revert' when not realigning
...
RevertSam was reverting the alignment information and that was screwing up the pipeline if you didn't want to run it with BWA. Fixed.
2011-10-20 17:01:54 -04:00
Mauricio Carneiro
ed402588cc
Adding the "gold standard NA12878" target
2011-10-20 16:19:13 -04:00
Mauricio Carneiro
0939d16a8d
String not empty bug
...
Apparently var X: String = _ is not the same as var X: String = "". :(
2011-10-13 13:22:05 -04:00
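Note on the commit above: in Scala, `= _` on a field is the default initializer, which gives `null` for a reference type such as `String`, so an "is it empty?" check behaves differently from a field initialized with `""`. The sketch below is only an illustration of that difference. The class and field names are hypothetical and are not taken from the pipeline script, and the `= _` syntax is assumed to be compiled with Scala 2.

```scala
object DefaultInitDemo {
  // Hypothetical argument holder, for illustration only.
  class PipelineArgs {
    var unsetName: String = _   // default initializer: null for reference types
    var emptyName: String = ""  // explicit empty string
  }

  // Null-safe check that treats both null and "" as "not provided".
  def isMissing(s: String): Boolean = s == null || s.isEmpty

  def main(args: Array[String]): Unit = {
    val a = new PipelineArgs
    println(a.unsetName == null)     // the `= _` field holds null
    println(a.emptyName == "")       // the "" field holds an empty string
    println(isMissing(a.unsetName))  // the null-safe check covers both cases
  }
}
```

Calling `a.unsetName.isEmpty` directly would throw a `NullPointerException`, which is the kind of surprise the commit message describes.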
Mauricio Carneiro
66b5646f95
Adding hidden options to the DPP
...
The default platform parameter passed to Count Covariates and the number of scatter-gather jobs to generate can now be controlled through hidden parameters.
2011-10-11 13:56:00 -04:00
Mark DePristo
a91509e7dd
Shouldn't be public
2011-10-05 15:22:57 -07:00
Mauricio Carneiro
d3cc25454c
Updating the MDCP
2011-09-22 11:27:40 -04:00
Mauricio Carneiro
623c49765d
NO BAQ ON EXOMES!
...
says the boss.
2011-09-22 11:13:40 -04:00
Ryan Poplin
5d0f284305
Fixing exome specific arguments to the VQSR in the methods development calling pipeline
2011-09-21 20:26:28 -04:00
Mauricio Carneiro
758ecf2d43
Bringing latest updates of ReduceReads to the master repository
2011-09-20 16:35:09 -04:00
Mauricio Carneiro
08ffb18b96
Renaming datasets in the MDCP
...
Making dataset names and files generated by the MDCP more uniform.
2011-09-20 11:02:51 -04:00
Eric Banks
ba150570f3
Updating to use new rod system syntax plus name change for CountRODs
2011-09-19 13:30:32 -04:00
Eric Banks
85626e7a5d
We no longer want people to use the August 2010 Dindel calls for indel realignment but instead Guillermo's new whole genome bi-allelic indel calls; updating the bundle accordingly. Also, there was some confusion by the 1000G data processing folks as to exactly what these indel files are, so I've renamed them so that it's clear. Wiki updated too.
2011-09-19 12:24:05 -04:00
Khalid Shakir
33967a4e0c
Fixed issue reported by chartl where cloned functions lost tags on @Inputs.
...
Updated ExampleUnifiedGenotyper.scala with new syntax.
2011-09-16 12:46:07 -04:00
Ryan Poplin
981b78ea50
Changing the VQSR command line syntax back to the parsed tags approach. This cleans up the code and makes sure we won't be parsing the same rod file multiple times. I've tried to update the appropriate qscripts.
2011-09-12 12:17:43 -04:00
Mauricio Carneiro
7f9000382e
Making indel calls default in the MDCP
...
You can turn off indel calling by using -noIndels.
2011-09-09 14:09:26 -04:00
Mauricio Carneiro
ee9d599558
Just cleaning up
...
clean up old commented-out code from the data processing pipeline.
2011-09-07 13:32:40 -04:00
Mauricio Carneiro
28d782b4c7
Allowing multiple dbsnp and indel files in the DPP
2011-09-02 13:38:47 -04:00
Mauricio Carneiro
ad4ea0b80b
Merged bug fix from Stable into Unstable
2011-09-01 18:14:45 -04:00
Mauricio Carneiro
e253f6f05d
Fixing typo in DPP
...
platform and library were exchanged when rebuilding the read group information
2011-09-01 18:13:52 -04:00
Mauricio Carneiro
d2a33beff7
Added WGS/WEX b37-decoy CEU trio datasets
2011-09-01 13:14:40 -04:00
Mauricio Carneiro
16caca0822
BLASR BAMs and new BWA parameters
...
* Added the functions to turn a BLASR generated BAM file into a usable BAM file.
* Modified the bwa parameters according to test results from NA12878 pb2k dataset.
2011-08-24 17:04:07 -04:00
Mauricio Carneiro
dc8398e165
fixing bai output for indel cleaning.
2011-08-24 15:58:34 -04:00
Mauricio Carneiro
cd12f7f286
Fixed list dependency
...
Instead of creating a bam list file, I dynamically create a scala list and pass as parameters. This way the intermediate bam files don't get deleted before they should.
2011-08-24 11:12:46 -04:00
Mauricio Carneiro
219252a566
Adapting to the new RodBinding framework
2011-08-24 11:12:46 -04:00
Mauricio Carneiro
136f0eb685
Creating sample-bam list instead of joining
...
This should save us at least one day in the trio decoy processing.
2011-08-22 18:03:39 -04:00
Mauricio Carneiro
04d8bcaf19
Fixed bai removal on picard tools
...
BAM index files were not being deleted because picard replaces the name of the file with bai instead of appending to it.
2011-08-22 18:03:39 -04:00
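Note on the "Fixed bai removal on picard tools" commit above: the two index naming conventions it describes can be sketched as follows. This is a minimal illustration, not the pipeline's code. The object and function names are hypothetical, and the only behaviour assumed is what the commit message states (the index name replaces the `.bam` extension rather than being appended to it).

```scala
import java.io.File

object BaiNames {
  // Replacement convention described in the commit: sample.bam -> sample.bai
  def replacedIndex(bam: File): File =
    new File(bam.getPath.stripSuffix(".bam") + ".bai")

  // Appending convention: sample.bam -> sample.bam.bai
  def appendedIndex(bam: File): File =
    new File(bam.getPath + ".bai")

  // Cleanup code that wants to catch either form can check both candidates.
  def candidateIndexes(bam: File): Seq[File] =
    Seq(replacedIndex(bam), appendedIndex(bam))

  def main(args: Array[String]): Unit = {
    val bam = new File("sample.bam")
    candidateIndexes(bam).foreach(f => println(f.getName))
  }
}
```

A cleanup step that only looks for the appended name would leave the replaced-name index behind, which matches the symptom in the commit message.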
Mauricio Carneiro
caebc88e9a
Consensus mode and new RodBinding framework.
...
The DPP was not using the parameter correctly. It didn't matter for the default option (which is the only one we have been testing) but it would not work for knowns only or smith waterman. It is fixed now.
It now complies with the new rod binding framework.
2011-08-22 18:03:39 -04:00
Ryan Poplin
f93a554b01
updating exome specific parameters in MDCP
2011-08-21 10:25:36 -04:00
Ryan Poplin
b008676878
fixing the previous fix
2011-08-20 21:21:55 -04:00
Ryan Poplin
539e157ecd
Fixing misc parameters in MDCP. The pipeline now does VariantEval of output by default. Fix for NaN vqslod values in VQSR
2011-08-20 11:28:48 -04:00
Ryan Poplin
ddb5045e14
Updating the methods development calling pipeline for the new rod binding syntax and the new best practices.
2011-08-19 19:29:51 -04:00
Mauricio Carneiro
b0ff5b1ff7
a better name for the pacbio processing pipeline
2011-08-10 16:16:53 -04:00
Mauricio Carneiro
481630da00
BWA parameters added
2011-08-09 17:05:24 -04:00
Mauricio Carneiro
22d2563823
added BWA SW alignment
...
The pipeline now accepts fasta/fastq files and aligns them using BWA SW, adds default base qualities, creates read groups and performs BQSR.
2011-08-09 17:05:24 -04:00
Mauricio Carneiro
bd1cf4c7bc
Pacbio Pipeline
...
Added the base quality "filling" step to allow the pipeline to handle raw pacbio BAM files. This is the first step towards a generic pacbio data processing pipeline.
2011-08-09 17:05:24 -04:00
Ryan Poplin
8072bd9831
Updating resource bundle generation qscript for changeover to git
2011-08-08 12:35:39 -04:00
Mauricio Carneiro
2fd101135c
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable
2011-08-08 10:49:43 -04:00
Mauricio Carneiro
4d6cb33612
removing temporary bam index
...
The clean bai file was left behind after the data processing pipeline was done
2011-08-08 10:49:28 -04:00
Ryan Poplin
21dc9a5543
Adding mills/devine indel dataset to the resource bundle
2011-08-04 12:31:28 -04:00
Mauricio Carneiro
aff681e407
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable
2011-08-04 11:05:25 -04:00
Mauricio Carneiro
23ec5b94cf
fixed a missing check for null
...
There was a missed check for the case when you don't provide an indels vcf for the cleaner.
2011-08-04 09:50:02 -04:00
Mauricio Carneiro
8981367307
Updating memory usage for picard programs
2011-08-03 15:48:28 -04:00
Khalid Shakir
a587f38808
Fixed example unified genotyper pipeline to wrap filter expressions with quotes and use rod binding name "variant" instead of "vcf".
2011-08-03 02:21:01 -04:00
Mauricio Carneiro
2d94037ad0
Remove temporary index files (*.bai)
...
some temporary index files were not being removed.
2011-07-30 02:05:22 -04:00
Mauricio Carneiro
dcf21f379a
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/stable
2011-07-23 12:59:53 -04:00
Mauricio Carneiro
f0a6dd27a1
Renaming the plot output directory names.
2011-07-23 12:59:37 -04:00
Mauricio Carneiro
4f78025b0b
Merged bug fix from Stable into Unstable
2011-07-22 14:42:04 -04:00