Mark DePristo
99ad7b2d71
GeneralPloidyExact should use indel max alt alleles
2012-10-15 07:53:54 -04:00
Mark DePristo
bf276baca0
Don't try to compute full exact model for > 100 samples
2012-10-15 07:53:54 -04:00
Mark DePristo
b924e9ebb4
Add OptimizedDiploidExactAF to PerformanceTesting framework
2012-10-15 07:53:54 -04:00
Mark DePristo
f800f3fb88
Optimized diploid exact AF calculation uses maxACs to stop the calculation at the max AC for each allele
...
-- Added unit tests to ensure the approximation isn't so far from our reference implementation (DiploidExactAFCalculation)
2012-10-15 07:53:54 -04:00
Mark DePristo
efad215edb
Greedy version of function to compute the max achievable AC for each alt allele
...
-- walks over the genotypes in VC, and computes for each alt allele the maximum AC we need to consider in that alt allele dimension. Does the calculation based on the PLs in each genotype g, choosing to update the max AC for the alt alleles corresponding to that PL. Only takes the first lowest PL, if there are multiple genotype configurations with the same PL value. It takes values in the order of the alt alleles.
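
A minimal sketch of the greedy pass described above, not the actual GATK code. It assumes diploid genotypes and the standard VCF PL ordering (0/0, 0/1, 1/1, 0/2, 1/2, 2/2, ...). The function names `pl_index_to_alleles` and `greedy_max_acs` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def pl_index_to_alleles(index: int) -> Tuple[int, int]:
    """Map a diploid PL index to its allele pair (j, k), j <= k.

    Assumes VCF ordering, where genotype j/k sits at index k*(k+1)/2 + j.
    """
    k = 0
    while (k + 1) * (k + 2) // 2 <= index:
        k += 1
    j = index - k * (k + 1) // 2
    return j, k


def greedy_max_acs(genotype_pls: Sequence[Sequence[int]],
                   n_alt_alleles: int) -> List[int]:
    """Greedy estimate of the max AC to consider for each alt allele.

    For each genotype, take the first lowest PL (ties resolved by
    position, so earlier alt alleles win) and credit the alt alleles
    of the corresponding genotype configuration.
    """
    max_acs = [0] * n_alt_alleles
    for pls in genotype_pls:
        # min() returns the first minimal index when several PLs tie
        best = min(range(len(pls)), key=pls.__getitem__)
        for allele in pl_index_to_alleles(best):
            if allele > 0:  # allele 0 is the reference
                max_acs[allele - 1] += 1
    return max_acs


# Three samples, two alt alleles (hypothetical PL values)
example_pls = [
    [0, 10, 20, 10, 20, 30],   # best is 0/0: no alt alleles credited
    [10, 0, 20, 0, 20, 30],    # 0/1 and 0/2 tie: the first (0/1) wins
    [30, 20, 10, 20, 5, 0],    # best is 2/2: alt allele 2 credited twice
]
example_max_acs = greedy_max_acs(example_pls, 2)
```

With these example PLs the per-allele maxima come out as 1 for the first alt allele and 2 for the second, so an exact calculation bounded this way would not need to explore higher AC values in either dimension.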
2012-10-15 07:53:54 -04:00
Mark DePristo
7666a58773
Function to compute the max achievable AC for each alt allele
...
-- Additional minor cleanup of ExactAFCalculation
2012-10-15 07:53:53 -04:00
Mark DePristo
b3cb33a416
simple script to run nano schedule main[]
2012-10-15 07:52:02 -04:00
Guillermo del Angel
a4767a20be
Bug fixes for temp mutect integration
2012-10-13 22:03:41 -04:00
Guillermo del Angel
e3a8ed2151
Further bug fixes to merge cancer/germline fastq-bam pipelines
2012-10-13 11:16:14 -04:00
Guillermo del Angel
b961f78f49
Temp fixes
2012-10-12 16:14:43 -04:00
Kristian Cibulskis
661fa5b98c
added support for indel calling (with non-VCF format output)
2012-10-12 16:02:05 -04:00
Eric Banks
a8efa5451a
Protect against bad bases when users have screwy data (or try to use zipped references)
2012-10-12 15:05:03 -04:00
Guillermo del Angel
7e1657d243
Merge branch 'unstable' of github.com:broadinstitute/cmi-gatk into unstable
2012-10-12 14:49:37 -04:00
Mauricio Carneiro
274ac4836f
Allowing the GATK to have non-required outputs
...
Modified the SAMFileWriterArgumentTypeDescriptor to accept output bam files that are null if they're not required (in the @Output annotation).
This change enables the nWayOut parameter for the IndelRealigner and ReduceReads to operate optionally while maintaining the original single way out.
[#DEV-10 transition:31 resolution:1]
2012-10-12 14:49:16 -04:00
Mauricio Carneiro
05111eeaef
Making nContigs parameter hidden in ReduceReads
...
For now, the het reduction should only be performed for diploids (n=2). We haven't really tested it for other ploidies, so it should remain hidden until someone braves it out.
2012-10-12 14:49:15 -04:00
Guillermo del Angel
32e377a0db
Fix bugs so that we can pass in 2 simultaneous samples in metadata (no co-cleaning yet but at least we don't need to run pipeline twice) to produce 2 bams. Pasted temp mutect so it's also run at the end of the run
2012-10-12 14:39:28 -04:00
David Roazen
da1cffbfca
Run performance tests in gsa-engineering queue on gsa4 rather than gsa queue
...
Running the performance tests on the farm wasn't working out very well --
it's been too long since they've run to completion. Switching back to
running them on gsa4 for now.
2012-10-12 14:21:27 -04:00
Guillermo del Angel
dc03a09722
Merge branch 'develop' into unstable
2012-10-12 14:19:42 -04:00
Kristian Cibulskis
c1706ef0ef
upgraded mutation caller with VCF output
...
raw indel calls (non-filtered, non-VCF)
2012-10-12 14:18:12 -04:00
Guillermo del Angel
5971006678
Bug fix when running nondiploid mode in UG with EMIT_ALL_SITES: if site was reference-only, QUAL is produced OK but genotypes were being set to no-call because of unnecessary likelihood normalization. May change integration test md5 which I'll fix later today
2012-10-12 12:45:55 -04:00
Eric Banks
81532a0529
Missing files are user errors.
2012-10-12 09:48:12 -04:00
Eric Banks
fa77a83783
Update the out of space error to include another permutation
2012-10-12 09:38:12 -04:00
Eric Banks
85525d9e6e
Make Geraldine's life easier: from now on we treat problems where a temp file cannot be found when running the GATK with multiple threads as User Errors (since they are 99.9% of the time). This is an extremely large class of errors in Tableau and on the forums. Helpful error message tells users exactly what we tell them on the forums anyways (Geraldine: feel free to edit).
2012-10-12 09:19:50 -04:00
Eric Banks
ad60300bee
Catch malformed BAM files at the source since this is the largest class of errors in Tableau.
2012-10-12 09:07:57 -04:00
Eric Banks
593c8065d9
Fix docs for BadMateFilter
2012-10-12 08:35:45 -04:00
Christopher Hartl
6b9987cf1b
Merge branch 'master' of gsa2:/humgen/gsa-scr1/chartl/dev/unstable
2012-10-12 00:48:42 -04:00
Christopher Hartl
c1211ad3a1
Full test suite of LD-corrected GRM calculation. The correctness of this code is now largely verified. Matches GCTA when no correction is used (up to 6 decimal places). Bed reading relies on a particular test directory that is still local. The rest is all generated in unit test fashion.
2012-10-12 00:46:02 -04:00
David Roazen
3861212dab
Fix inefficiency in FilePointer GenomeLoc validation
...
Validation of GenomeLocs in the FilePointer class was extremely inefficient
when the GenomeLocs were added one at a time rather than all at once.
Appears to mostly fix GSA-604
2012-10-11 19:55:14 -04:00
Guillermo del Angel
47e9d967fe
Merging in from cmi-develop branch - staying in this branch for now
2012-10-11 15:35:43 -04:00
Guillermo del Angel
77949ec740
Some fixes to QC commands in pipeline, and workaround for critical engine bug in GATK that makes it hang when doing small targeted BAMs with a whole exome interval list
2012-10-11 15:08:30 -04:00
Ami Levy Moonshine
ef3882f439
PhaseByTransmission: small typo. variantCallQC_summaryTablesOnly.R: small changes (more to come). GeneralCallingPipeline.scala: the new pipeline script. It is not as clean as I want it to be, but it works. I am still going to work on it a little more. It does not yet include: (1) the RR step, (2) a better eval step, (3) other targets (currently it works on the CEU Trio)
2012-10-11 14:51:41 -04:00
Guillermo del Angel
af5a6fdace
Resolve [DEV-7]: add single-sample VCF calling at end of FASTQ-BAM pipeline. Initial steps of [DEV-4]: queue extensions for Picard QC metrics
2012-10-11 11:09:49 -04:00
Mark DePristo
9b19f5ce99
No longer include stack traces for user exceptions in GATK logs
...
-- Was taking a shockingly large amount of space on the server, and slowing down Tableau so much that all stack traces had to be disabled
2012-10-10 20:41:03 -04:00
Ryan Poplin
08b8ce6903
Fixing merge conflicts related to the comment formatting in the BQSR.
2012-10-10 16:03:58 -04:00
Ryan Poplin
45717349dc
Fixing BQSR bug reported on the forum for reads that begin with insertions.
2012-10-10 16:01:37 -04:00
David Roazen
40a3b5bfe2
Revert "Testing github auto-mirroring attempt #2 ; please ignore"
...
This reverts commit aacbe369446af8d7901820bf828ed15d72497005.
2012-10-10 15:28:50 -04:00
David Roazen
fba6a084e4
Testing github auto-mirroring attempt #2 ; please ignore
2012-10-10 15:28:13 -04:00
David Roazen
267d1ff59c
Revert "Testing the new github auto-mirroring; please ignore"
...
This reverts commit bd8b321132167f6f393f234ea0e93edcfd8701ff.
2012-10-10 15:07:48 -04:00
David Roazen
66ee3f230f
Testing the new github auto-mirroring; please ignore
2012-10-10 15:06:50 -04:00
Mauricio Carneiro
e9eaa33c0b
adding some directories to gitignore
2012-10-10 13:26:13 -04:00
Mauricio Carneiro
29195cd3aa
Removed the intellij files from the root and made an example package for new users. This allows users to start at the same page and then change it as they see fit without interfering with the repo (thanks guillermo!)
2012-10-10 13:25:38 -04:00
Mauricio Carneiro
fdf29503fb
removing annoying xml from IDEA configuration
2012-10-10 13:25:38 -04:00
Mauricio Carneiro
e29bcab42e
Updating Intellij environment and adding Scala
2012-10-10 13:25:38 -04:00
Mauricio Carneiro
f085f5d46a
Adding default intellij configuration files
2012-10-10 13:25:38 -04:00
Mauricio Carneiro
88297606f0
Adding intellij example configuration files
2012-10-10 13:20:30 -04:00
Guillermo del Angel
c0b7d53170
a) Initial raw version of CMI BAM->VCF pipeline (most likely not working yet, but at least compiles and produces reasonable command lines), b) rename FASTQ->BAM script so name is more descriptive
2012-10-10 13:19:05 -04:00
Kristian Cibulskis
2311606de4
initial cancer pipeline with mutations and partial indel support
2012-10-10 13:19:04 -04:00
Guillermo del Angel
45aa59a31c
BAM pipeline fixes: a) temp workaround for DEV-9: -nWayOut argument in IndelRealigner is broken, for now things will only really work in single sample mode, b) correct extension of RealignerTargetCreator output, previous extension caused an error
2012-10-10 13:19:04 -04:00
Guillermo del Angel
b8c721e6ec
Minor tweaks to CMIProcessing Pipeline: a) don't hard-code job mem limit to 4 G since it's too much for most AWS instances, leave it instead as input argument, b) minor doc cleanups
2012-10-10 13:19:04 -04:00
Mauricio Carneiro
ca055d8804
Reimplementation of the BAM processing pipeline using the metadata information file.
...
Pipeline runs end-to-end using example metadata and has been tested only for cases where everything is ideal.
Next step is to bring this to the cloud, test all different scenarios (multiple tumors, single ended, missing parameters etc).
Parallel next step is to add QC metrics.
2012-10-10 13:19:04 -04:00