Commit Graph

7157 Commits (e5008aba00a8c0d89a4622b2f66920b0f5484256)

Author SHA1 Message Date
Ryan Poplin e5008aba00 Output the top two haplotypes as a variant call by running Smith-Waterman alignment against the reference and calling any difference as variation. This is the first version that runs end-to-end by taking in reads as a BAM file and writing out variant calls in VCF. 2011-08-24 15:18:44 -04:00
Ryan Poplin f37875600a Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-24 09:02:44 -04:00
Khalid Shakir 1ecbf05aae Avoid segfaults due to out-of-date and possibly abandoned LSF DRMAA implementation when using LSF instead of .combined_LSF_SGE 2011-08-23 23:49:36 -04:00
Ryan Poplin 08e5503a60 Likelihood engine now gives non-zero likelihoods. Using HMM function that can handle context specific gap open and gap continuation penalties 2011-08-23 13:43:43 -04:00
Ryan Poplin a1a1fac9e4 Likelihood engine now gives non-zero likelihoods. Using HMM function that can handle context specific gap open and gap continuation penalties 2011-08-23 13:43:07 -04:00
Ryan Poplin 3c37d841db Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-23 13:05:12 -04:00
Guillermo del Angel 6e2552a9ef Merge fix 2011-08-23 12:40:43 -04:00
Guillermo del Angel 8b7a0b3b62 Two new arguments to SelectVariants to exclude either multiallelic or biallelic sites from input vcf 2011-08-23 12:40:01 -04:00
Guillermo del Angel bcc0cae89e Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-23 10:51:24 -04:00
Guillermo del Angel bd26de9741 Resurrect IndelCountCovariates walker, update to new rod system. IndelCountCovariate (singular) seems broken, not to be used for now (but it's not very useful anyway) 2011-08-23 10:50:55 -04:00
Mauricio Carneiro feeab6075f Merging ReduceReads development with unstable repo
It is time to bring the ReadClipper class to the main repo. Read Clipper has tested functionality for soft and hard clipping reads. I will prepare thorough documentation for it as it will be very useful for the assembler and the GATK in general.
2011-08-22 23:03:03 -04:00
Guillermo del Angel af4db34407 Apparently git can't deal with renaming a file that only changes capitalization? 2011-08-22 22:00:04 -04:00
Guillermo del Angel ee68713267 Further bug fixes to CountVariants: stratifications were wrong when genotypes had no-calls. For example, if we stratified by sample and a sample had a no-call, the no-call was considered a true variant and counts were incorrectly increased 2011-08-22 20:42:47 -04:00
Guillermo del Angel c270384b2e Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-22 20:39:32 -04:00
Guillermo del Angel 8ae24912f4 a) Misc fixes in Phase1 indel vqsr script,
b) More R-friendly VariantsToTable printing of AC in case of multiple alt alleles
c) Rename FixPLOrderingWalker to FixGenotypesWalker and rewrite it: the older code is no longer needed and is replaced with code that replaces genotypes with all-zero PLs with a no-call.
2011-08-22 20:39:06 -04:00
Mauricio Carneiro 136f0eb685 Creating sample-bam list instead of joining
This should save us at least one day in the trio decoy processing.
2011-08-22 18:03:39 -04:00
Mauricio Carneiro 04d8bcaf19 Fixed bai removal on picard tools
BAM index files were not being deleted because picard replaces the file extension with bai instead of appending bai to the file name.
2011-08-22 18:03:39 -04:00
Mauricio Carneiro 8aed151a71 Created RevertSam queue class
Class for the picard tool RevertSam with all the options for queue scripts.
2011-08-22 18:03:39 -04:00
Mauricio Carneiro caebc88e9a Consensus mode and new RodBinding framework.
The DPP was not using the parameter correctly. It didn't matter for the default option (which is the only one we have been testing) but it would not work for knowns only or Smith-Waterman. It is fixed now.

It now complies with the new rod binding framework.
2011-08-22 18:03:39 -04:00
Mark DePristo 85c5a6f890 Merge branch 'rodTesting'
Conflicts:
	private/java/src/org/broadinstitute/sting/gatk/walkers/performance/ProfileRodSystem.java
2011-08-22 17:43:47 -04:00
Mark DePristo e728e407d9 First pass Queue script to test progress on ROD refactoring towards Eric+Mark goals 2011-08-22 17:40:35 -04:00
Mark DePristo fe7d3ee236 A bit of refinement of the modes 2011-08-22 17:40:02 -04:00
Mark DePristo 649f134536 Added mode argument that lets you test separately tribble IO and GATK 2011-08-22 17:33:41 -04:00
Mark DePristo 1eab9be35d Now with accurate javadoc 2011-08-22 17:25:15 -04:00
Mark DePristo 3612a3501d info, not warn, about dynamic type determination 2011-08-22 17:24:51 -04:00
Roger Zurawicki 9fb208c8f0 First Working Implementation
Used ReadClipper to hardClip to VariableRegion
Runs 9,000,000-9,050,000
close() not tested
Does not use toSAMRecord() method
*Commented out debug info
2011-08-22 16:46:24 -04:00
Eric Banks dc42571dd9 Only create the genotype map when necessary 2011-08-22 15:40:36 -04:00
Khalid Shakir c4c90c8826 Updates to JobRunners from the Queue developer community and from running the WholeGenomePipeline:
- Ability to pass a different resident memory reservation and limits. Useful for large pileups of low pass genome data that sometimes need high -Xmx6g but usually don't exceed 2-3g in actual heap size.
- Fixed jobPriority to work for all job runners. Now must be an integer between 0 and 100, even for GridEngine, and will be mapped to the correct values.
- Passing parallel environment and job resource requests to LSF and GridEngine. Useful for passing tokens like iodine_io=1 and -pe pe_slots 8
- Refactored GridEngine JobRunner to also provide basic support for other job dispatchers with DRMAA implementations such as Torque/PBS. Should work for basic running but advanced users must pass their own jobNativeArgs from the command line or in customized QScripts until someone maps properties like jobQueue, jobPriority, residentRequest, etc. into a Torque/PBS/etc. dispatcher.
2011-08-22 15:13:27 -04:00
Eric Banks 4ea1ce3c1a Check for null VCs 2011-08-22 15:11:37 -04:00
Eric Banks 2c24b68a96 Working implementation of DecodeLoc for VCF parsing. Makes indexing 3x faster. 2011-08-22 15:11:21 -04:00
Eric Banks 518b3dd291 Don't let the genotypes map be null 2011-08-22 15:10:30 -04:00
Roger Zurawicki 17ce38dd63 Partially Fixed bug where the 'H' operator was counted extra times
NOTE: bug still exists
Invalid CIGAR strings are still made, e.g. 79H20M16H
2011-08-22 13:56:12 -04:00
Ryan Poplin 600456f880 Merging initial minor fixes / modifications to the assembler 2011-08-22 09:22:44 -04:00
Ryan Poplin a4cf34ef38 Initial commit of the haplotype caller 2011-08-22 08:52:31 -04:00
Roger Zurawicki d111903322 Added trimmedLeft/Right field, and output clones SAMRecord
This should help with Cigar strings
Output can now clone the object so it no longer edits the original data
2011-08-21 17:57:38 -04:00
Ryan Poplin f93a554b01 updating exome specific parameters in MDCP 2011-08-21 10:25:36 -04:00
Ryan Poplin dbff84c54e Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-21 10:09:19 -04:00
Khalid Shakir 22ca44c015 Fixed Queue's tagging of RodBindings.
Fixed argument definition names.
2011-08-21 02:34:20 -04:00
Menachem Fromer 249614e6bb Record all genes spanning this interval, not just those that start in this interval 2011-08-21 00:23:28 -04:00
Menachem Fromer 9bcceed706 Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-20 23:36:33 -04:00
Menachem Fromer 188f28d02b Increase memory required for merging 2011-08-20 23:35:53 -04:00
Eric Banks a8cbced71b Bug fix for Ryan: check for no context 2011-08-20 22:49:51 -04:00
Eric Banks 0ccd173967 Fixing the recent SelectVariants fix 2011-08-20 21:30:08 -04:00
Ryan Poplin b008676878 fixing the previous fix 2011-08-20 21:21:55 -04:00
Guillermo del Angel 782453235a Updated VariantEvalIntegrationTest since there's a new column separating nMixed and nComplex in CountVariants
Misc updates to WholeGenomeIndelCalling.scala
Bug fix in VariantEval (may be temporary, need more investigation): if -disc option is used in sites-only vcf's then a null pointer exception is produced, caused by recent introduction of -xl_sf options.
2011-08-20 12:24:22 -04:00
Guillermo del Angel 0de49f5752 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-20 12:20:38 -04:00
Ryan Poplin 539e157ecd Fixing misc parameters in MDCP. The pipeline now does VariantEval of output by default. Fix for NaN vqslod values in VQSR 2011-08-20 11:28:48 -04:00
Ryan Poplin 824583e007 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-20 11:10:15 -04:00
Mark DePristo c0227b5cd7 mutt just is broken on gsa2 2011-08-20 10:13:07 -04:00
Guillermo del Angel 4939648fd4 Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-20 08:50:43 -04:00