Guillermo del Angel
b961f78f49
Temp fixes
2012-10-12 16:14:43 -04:00
Kristian Cibulskis
661fa5b98c
added support for indel calling (with non-VCF format output)
2012-10-12 16:02:05 -04:00
Eric Banks
a8efa5451a
Protect against bad bases when users have screwy data (or try to use zipped references)
2012-10-12 15:05:03 -04:00
Guillermo del Angel
7e1657d243
Merge branch 'unstable' of github.com:broadinstitute/cmi-gatk into unstable
2012-10-12 14:49:37 -04:00
Mauricio Carneiro
274ac4836f
Allowing the GATK to have non-required outputs
...
Modified the SAMFileWriterArgumentTypeDescriptor to accept output bam files that are null if they're not required (in the @Output annotation).
This change enables the nWayOut parameter for the IndelRealigner and ReduceReads to operate optionally while maintaining the original single way out.
[#DEV-10 transition:31 resolution:1]
2012-10-12 14:49:16 -04:00
Mauricio Carneiro
05111eeaef
Making nContigs parameter hidden in ReduceReads
...
For now, the het reduction should only be performed for diploids (n=2). We haven't really tested it for other ploidies, so it should remain hidden until someone braves it out.
2012-10-12 14:49:15 -04:00
Guillermo del Angel
32e377a0db
Fix bugs so that we can pass in 2 simultaneous samples in metadata (no co-cleaning yet but at least we don't need to run pipeline twice) to produce 2 bams. Pasted temp mutect so it's also run at the end of the run
2012-10-12 14:39:28 -04:00
David Roazen
da1cffbfca
Run performance tests in gsa-engineering queue on gsa4 rather than gsa queue
...
Running the performance tests on the farm wasn't working out very well --
it's been too long since they've run to completion. Switching back to
running them on gsa4 for now.
2012-10-12 14:21:27 -04:00
Guillermo del Angel
dc03a09722
Merge branch 'develop' into unstable
2012-10-12 14:19:42 -04:00
Kristian Cibulskis
c1706ef0ef
upgraded mutation caller with VCF output
...
raw indel calls (non-filtered, non-VCF)
2012-10-12 14:18:12 -04:00
Guillermo del Angel
5971006678
Bug fix when running nondiploid mode in UG with EMIT_ALL_SITES: if site was reference-only, QUAL is produced OK but genotypes were being set to no-call because of unnecessary likelihood normalization. May change integration test md5 which I'll fix later today
2012-10-12 12:45:55 -04:00
Eric Banks
81532a0529
Missing files are user errors.
2012-10-12 09:48:12 -04:00
Eric Banks
fa77a83783
Update the out of space error to include another permutation
2012-10-12 09:38:12 -04:00
Eric Banks
85525d9e6e
Make Geraldine's life easier: from now on we treat problems where a temp file cannot be found when running the GATK with multiple threads as User Errors (since they are 99.9% of the time). This is an extremely large class of errors in Tableau and on the forums. Helpful error message tells users exactly what we tell them on the forums anyways (Geraldine: feel free to edit).
2012-10-12 09:19:50 -04:00
Eric Banks
ad60300bee
Catch malformed BAM files at the source since this is the largest class of errors in Tableau.
2012-10-12 09:07:57 -04:00
Eric Banks
593c8065d9
Fix docs for BadMateFilter
2012-10-12 08:35:45 -04:00
Christopher Hartl
6b9987cf1b
Merge branch 'master' of gsa2:/humgen/gsa-scr1/chartl/dev/unstable
2012-10-12 00:48:42 -04:00
Christopher Hartl
c1211ad3a1
Full test suite of LD-corrected GRM calculation. The correctness of this code is now largely verified. Matches GCTA when no correction is used (up to 6 decimal places). Bed reading relies on a particular test directory that is still local. The rest is all generated in unit test fashion.
2012-10-12 00:46:02 -04:00
David Roazen
3861212dab
Fix inefficiency in FilePointer GenomeLoc validation
...
Validation of GenomeLocs in the FilePointer class was extremely inefficient
when the GenomeLocs were added one at a time rather than all at once.
Appears to mostly fix GSA-604
2012-10-11 19:55:14 -04:00
Guillermo del Angel
47e9d967fe
Merging in from cmi-develop branch - staying in this branch for now
2012-10-11 15:35:43 -04:00
Guillermo del Angel
77949ec740
Some fixes to QC commands in pipeline, and workaround for critical engine bug in GATK that makes it hang when doing small targeted BAMs with a whole exome interval list
2012-10-11 15:08:30 -04:00
Ami Levy Moonshine
ef3882f439
PhaseByTransmission: small typo.
variantCallQC_summaryTablesOnly.R: small changes (more to come).
GeneralCallingPipeline.scala: the new pipeline script. It is not as clean as I want it to be, but it works. I am still going to work on it a little more. It does not yet include: (1) the RR step, (2) a better eval step, (3) other targets (currently it works on the CEU Trio).
2012-10-11 14:51:41 -04:00
Guillermo del Angel
af5a6fdace
Resolve [DEV-7]: add single-sample VCF calling at end of FASTQ-BAM pipeline. Initial steps of [DEV-4]: queue extensions for Picard QC metrics
2012-10-11 11:09:49 -04:00
Mark DePristo
9b19f5ce99
No longer include stack traces for user exceptions in GATK logs
...
-- Was taking a shockingly large amount of space on the server, and slowing down Tableau so much all stack traces had to be disabled
2012-10-10 20:41:03 -04:00
Ryan Poplin
08b8ce6903
Fixing merge conflicts related to the comment formatting in the BQSR.
2012-10-10 16:03:58 -04:00
Ryan Poplin
45717349dc
Fixing BQSR bug reported on the forum for reads that begin with insertions.
2012-10-10 16:01:37 -04:00
David Roazen
40a3b5bfe2
Revert "Testing github auto-mirroring attempt #2 ; please ignore"
...
This reverts commit aacbe369446af8d7901820bf828ed15d72497005.
2012-10-10 15:28:50 -04:00
David Roazen
fba6a084e4
Testing github auto-mirroring attempt #2 ; please ignore
2012-10-10 15:28:13 -04:00
David Roazen
267d1ff59c
Revert "Testing the new github auto-mirroring; please ignore"
...
This reverts commit bd8b321132167f6f393f234ea0e93edcfd8701ff.
2012-10-10 15:07:48 -04:00
David Roazen
66ee3f230f
Testing the new github auto-mirroring; please ignore
2012-10-10 15:06:50 -04:00
Mauricio Carneiro
e9eaa33c0b
adding some directories to gitignore
2012-10-10 13:26:13 -04:00
Mauricio Carneiro
29195cd3aa
Removed the intellij files from the root and made an example package for new users. This allows users to start at the same page and then change it as they see fit without interfering with the repo (thanks guillermo!)
2012-10-10 13:25:38 -04:00
Mauricio Carneiro
fdf29503fb
removing annoying xml from IDEA configuration
2012-10-10 13:25:38 -04:00
Mauricio Carneiro
e29bcab42e
Updating Intellij environment and adding Scala
2012-10-10 13:25:38 -04:00
Mauricio Carneiro
f085f5d46a
Adding default intellij configuration files
2012-10-10 13:25:38 -04:00
Mauricio Carneiro
88297606f0
Adding intellij example configuration files
2012-10-10 13:20:30 -04:00
Guillermo del Angel
c0b7d53170
a) Initial raw version of CMI BAM->VCF pipeline (most likely not working yet, but at least compiles and produces reasonable command lines), b) rename FASTQ->BAM script so name is more descriptive
2012-10-10 13:19:05 -04:00
Kristian Cibulskis
2311606de4
initial cancer pipeline with mutations and partial indel support
2012-10-10 13:19:04 -04:00
Guillermo del Angel
45aa59a31c
BAM pipeline fixes: a) temp workaround for DEV-9: -nWayOut argument in IndelRealigner is broken, for now things will only really work in single sample mode, b) correct extension of RealignerTargetCreator output, previous extension caused an error
2012-10-10 13:19:04 -04:00
Guillermo del Angel
b8c721e6ec
Minor tweaks to CMIProcessing Pipeline: a) don't hard-code job mem limit to 4 G since it's too much for most AWS instances, leave it instead as input argument, b) minor doc cleanups
2012-10-10 13:19:04 -04:00
Mauricio Carneiro
ca055d8804
Reimplementation of the BAM processing pipeline using the metadata information file.
...
Pipeline runs end-to-end using example metadata and has been tested only for cases where everything is ideal.
Next step is to bring this to the cloud and test all the different scenarios (multiple tumors, single ended, missing parameters, etc.).
Parallel next step is to add QC metrics.
2012-10-10 13:19:04 -04:00
Mauricio Carneiro
25ff934e5a
New version of the pipeline starting from an ALIGNED bam going all the way to reducing using n-way out cleaning
2012-10-10 13:19:04 -04:00
Mauricio Carneiro
7d4adea183
Revised implementation of the RAWBAM => BAM pipeline
...
stripped out all the FQ pipeline and tumor/normal information.
2012-10-10 13:19:03 -04:00
Mauricio Carneiro
e413b9fe51
First implementation of the CMI data processing pipeline, handling both germline and cancer BAM/FQ => BAM.
...
Not ready for prime time yet, need more work!
2012-10-10 13:19:03 -04:00
Mauricio Carneiro
0c17709223
First implementation of a generic 'bundled' Data Processing Pipeline for germline and cancer.
...
not ready for prime time yet!
2012-10-10 13:19:03 -04:00
Mauricio Carneiro
08b6d1559c
Reverting the DPP to the original version, going to create a new simplified version for CMI in private.
2012-10-10 13:19:03 -04:00
Mauricio Carneiro
f9095c7ab7
Generic input file name recognition (still need to implement support for FastQ, but it now can at least accept it)
2012-10-10 13:19:03 -04:00
Ryan Poplin
15b405d458
Merge branch 'master' of ssh://gsa2.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2012-10-10 10:47:40 -04:00
Ryan Poplin
2a9ee89c19
Turning on allele trimming for the haplotype caller.
2012-10-10 10:47:26 -04:00
Christopher Hartl
7381d5c243
Since this GRM now matches GCTA output for uncorrected intervals, implement and start proofing methods for LD-correction for genome partitioning. Very rudimentary tests just to solidify current position.
...
Wish I could do this in the GATK, but it has to run on bed files natively. Phooey.
2012-10-10 01:59:13 -04:00