Ryan Poplin
e5008aba00
Output the top two haplotypes as a variant call by running Smith-Waterman alignment against the reference and calling any difference as variation. This is the first version that runs end-to-end, taking in reads as a BAM file and writing out variant calls in VCF.
2011-08-24 15:18:44 -04:00
Ryan Poplin
f37875600a
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-24 09:02:44 -04:00
Khalid Shakir
1ecbf05aae
Avoid segfaults due to the out-of-date and possibly abandoned LSF DRMAA implementation when use'ing LSF instead of .combined_LSF_SGE
2011-08-23 23:49:36 -04:00
Ryan Poplin
08e5503a60
Likelihood engine now gives non-zero likelihoods. Using HMM function that can handle context specific gap open and gap continuation penalties
2011-08-23 13:43:43 -04:00
Ryan Poplin
a1a1fac9e4
Likelihood engine now gives non-zero likelihoods. Using HMM function that can handle context specific gap open and gap continuation penalties
2011-08-23 13:43:07 -04:00
Ryan Poplin
3c37d841db
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-23 13:05:12 -04:00
Guillermo del Angel
6e2552a9ef
Merge fix
2011-08-23 12:40:43 -04:00
Guillermo del Angel
8b7a0b3b62
Two new arguments to SelectVariants to exclude either multiallelic or biallelic sites from input vcf
2011-08-23 12:40:01 -04:00
Guillermo del Angel
bcc0cae89e
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-23 10:51:24 -04:00
Guillermo del Angel
bd26de9741
Resurrect IndelCountCovariates walker, update to new rod system. IndelCountCovariate (singular) seems broken, not to be used for now (but it's not very useful anyway)
2011-08-23 10:50:55 -04:00
Mauricio Carneiro
feeab6075f
Merging ReduceReads development with unstable repo
...
It is time to bring the ReadClipper class to the main repo. ReadClipper has tested functionality for soft and hard clipping of reads. I will prepare thorough documentation for it, as it will be very useful for the assembler and the GATK in general.
2011-08-22 23:03:03 -04:00
Guillermo del Angel
af4db34407
Apparently git can't deal with renaming a file that only changes capitalization?
2011-08-22 22:00:04 -04:00
Guillermo del Angel
ee68713267
Further bug fixes to CountVariants: stratifications were wrong when genotypes had no-calls. For example, if we stratified by sample and a sample had a no-call, the no-call was considered a true variant and counts were incorrectly increased
2011-08-22 20:42:47 -04:00
Guillermo del Angel
c270384b2e
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-22 20:39:32 -04:00
Guillermo del Angel
8ae24912f4
a) Misc fixes in Phase1 indel vqsr script,
...
b) More R-friendly VariantsToTable printing of AC in case of multiple alt alleles
c) Renamed FixPLOrderingWalker to FixGenotypesWalker and rewrote it: the older code is no longer needed and is replaced with code that replaces genotypes having all-zero PLs with a no-call.
2011-08-22 20:39:06 -04:00
Mauricio Carneiro
136f0eb685
Creating sample-bam list instead of joining
...
This should save us at least one day in the trio decoy processing.
2011-08-22 18:03:39 -04:00
Mauricio Carneiro
04d8bcaf19
Fixed bai removal on picard tools
...
BAM index files were not being deleted because picard replaces the name of the file with bai instead of appending to it.
2011-08-22 18:03:39 -04:00
Mauricio Carneiro
8aed151a71
Created RevertSam queue class
...
Class for the picard tool RevertSam with all the options for queue scripts.
2011-08-22 18:03:39 -04:00
Mauricio Carneiro
caebc88e9a
Consensus mode and new RodBinding framework.
...
The DPP was not using the parameter correctly. It didn't matter for the default option (which is the only one we have been testing) but it would not work for knowns only or smith waterman. It is fixed now.
It now complies with the new rod binding framework.
2011-08-22 18:03:39 -04:00
Mark DePristo
85c5a6f890
Merge branch 'rodTesting'
...
Conflicts:
private/java/src/org/broadinstitute/sting/gatk/walkers/performance/ProfileRodSystem.java
2011-08-22 17:43:47 -04:00
Mark DePristo
e728e407d9
First pass Queue script to test progress on ROD refactoring towards Eric+Mark goals
2011-08-22 17:40:35 -04:00
Mark DePristo
fe7d3ee236
A bit of refinement of the modes
2011-08-22 17:40:02 -04:00
Mark DePristo
649f134536
Added mode argument that lets you test separately tribble IO and GATK
2011-08-22 17:33:41 -04:00
Mark DePristo
1eab9be35d
Now with accurate javadoc
2011-08-22 17:25:15 -04:00
Mark DePristo
3612a3501d
info, not warn, about dynamic type determination
2011-08-22 17:24:51 -04:00
Roger Zurawicki
9fb208c8f0
First Working Implementation
...
Used ReadClipper to hardClip to VariableRegion
Runs 9,000,000-9,050,000
close() not tested
Does not use toSAMRecord() method
*Commented out debug info
2011-08-22 16:46:24 -04:00
Eric Banks
dc42571dd9
Only create the genotype map when necessary
2011-08-22 15:40:36 -04:00
Khalid Shakir
c4c90c8826
Updates to JobRunners from the Queue developer community and from running the WholeGenomePipeline:
...
- Ability to pass a different resident memory reservation and limits. Useful for large pileups of low pass genome data that sometimes need high -Xmx6g but usually don't exceed 2-3g in actual heap size.
- Fixed jobPriority to work for all job runners. It must now be an integer between 0 and 100, even for GridEngine, and will be mapped to the correct values.
- Passing parallel environment and job resource requests to LSF and GridEngine. Useful for passing tokens like iodine_io=1 and -pe pe_slots 8
- Refactored GridEngine JobRunner to also provide basic support for other job dispatchers with DRMAA implementations such as Torque/PBS. Should work for basic running but advanced users must pass their own jobNativeArgs from the command line or in customized QScripts until someone maps properties like jobQueue, jobPriority, residentRequest, etc. into a Torque/PBS/etc. dispatcher.
2011-08-22 15:13:27 -04:00
Eric Banks
4ea1ce3c1a
Check for null VCs
2011-08-22 15:11:37 -04:00
Eric Banks
2c24b68a96
Working implementation of DecodeLoc for VCF parsing. Makes indexing 3x faster.
2011-08-22 15:11:21 -04:00
Eric Banks
518b3dd291
Don't let the genotypes map be null
2011-08-22 15:10:30 -04:00
Roger Zurawicki
17ce38dd63
Partially Fixed bug where the 'H' operator was counted extra times
...
NOTE: bug still exists
Invalid cigar strings are made, ex: 79H20M16H
2011-08-22 13:56:12 -04:00
Ryan Poplin
600456f880
Merging initial minor fixes / modifications to the assembler
2011-08-22 09:22:44 -04:00
Ryan Poplin
a4cf34ef38
Initial commit of the haplotype caller
2011-08-22 08:52:31 -04:00
Roger Zurawicki
d111903322
Added trimmedLeft/Right field, and output clones SAMRecord
...
This should help with Cigar strings
Output can now clone the object so it no longer edits the original data
2011-08-21 17:57:38 -04:00
Ryan Poplin
f93a554b01
updating exome specific parameters in MDCP
2011-08-21 10:25:36 -04:00
Ryan Poplin
dbff84c54e
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-21 10:09:19 -04:00
Khalid Shakir
22ca44c015
Fixed Queue's tagging of RodBindings.
...
Fixed argument definition names.
2011-08-21 02:34:20 -04:00
Menachem Fromer
249614e6bb
Record all genes spanning this interval, not just those that start in this interval
2011-08-21 00:23:28 -04:00
Menachem Fromer
9bcceed706
Merge branch 'master' of ssh://copper.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-20 23:36:33 -04:00
Menachem Fromer
188f28d02b
Increase memory required for merging
2011-08-20 23:35:53 -04:00
Eric Banks
a8cbced71b
Bug fix for Ryan: check for no context
2011-08-20 22:49:51 -04:00
Eric Banks
0ccd173967
Fixing the recent SelectVariants fix
2011-08-20 21:30:08 -04:00
Ryan Poplin
b008676878
fixing the previous fix
2011-08-20 21:21:55 -04:00
Guillermo del Angel
782453235a
Updated VariantEvalIntegrationTest since there's a new column separating nMixed and nComplex in CountVariants
...
Misc updates to WholeGenomeIndelCalling.scala
Bug fix in VariantEval (may be temporary, needs more investigation): if the -disc option is used with sites-only VCFs, a null pointer exception is produced, caused by the recent introduction of the -xl_sf options.
2011-08-20 12:24:22 -04:00
Guillermo del Angel
0de49f5752
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-20 12:20:38 -04:00
Ryan Poplin
539e157ecd
Fixing misc parameters in MDCP. The pipeline now does VariantEval of output by default. Fix for NaN vqslod values in VQSR
2011-08-20 11:28:48 -04:00
Ryan Poplin
824583e007
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-20 11:10:15 -04:00
Mark DePristo
c0227b5cd7
mutt is just broken on gsa2
2011-08-20 10:13:07 -04:00
Guillermo del Angel
4939648fd4
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-20 08:50:43 -04:00