7430 Commits (5e832254a4e024378f7fdee252abf7df9e289c6a)

Author SHA1 Message Date
Mark DePristo 08fb21f127 Removing hostname 2011-08-24 16:45:50 -04:00
Mauricio Carneiro d50474f14c Merged bug fix from Stable into Unstable
resolved conflicts by maintaining all the changes in UNSTABLE where this bug had already been fixed.

Conflicts:
	public/scala/qscript/org/broadinstitute/sting/queue/qscripts/DataProcessingPipeline.scala
2011-08-24 16:06:36 -04:00
Mauricio Carneiro dc8398e165 fixing bai output for indel cleaning. 2011-08-24 15:58:34 -04:00
Ryan Poplin da5e6b52e7 Refactoring the smith-waterman and genotyping-related pieces into their own engine 2011-08-24 15:54:17 -04:00
Mark DePristo 06e30a81d1 Fixes throughout for getting job information
-- no more hostname -- it's just not going to be important
2011-08-24 15:30:09 -04:00
Ryan Poplin 29c7b10f7b Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-24 15:18:58 -04:00
Ryan Poplin e5008aba00 Output the top two haplotypes as a variant call by running smith-waterman alignment against the reference and calling any difference as variation. This is the first version that runs end-to-end by taking in reads as a BAM file and writing out variant calls in VCF. 2011-08-24 15:18:44 -04:00
Mark DePristo 4918519a58 No more NPE in getRuntime() when you Ctrl-C out of Queue 2011-08-24 14:14:01 -04:00
Mark DePristo 16d8360592 QJobReport is now the official capability name 2011-08-24 13:59:14 -04:00
Mark DePristo d047c19ad1 Writes output to file 2011-08-24 13:52:05 -04:00
Guillermo del Angel 61f5968807 Change indel/snp selection to new arguments 2011-08-24 13:42:23 -04:00
Mark DePristo 3ae68e2397 JobLogging trait now writes out GATKReport log of jobs 2011-08-24 13:36:39 -04:00
Guillermo del Angel e618cb1e79 a) Renamed/expanded SelectVariants arguments that choose particular kinds of variants and particular allelic types. Instead of -Indels or -SNPs, we can now specify, for example, -selectType [MIXED|INDEL|SNP|MNP|SYMBOLIC]. To select biallelic or multiallelic variants, use -restrictAllelesTo [BIALLELIC|MULTIALLELIC]. Corresponding gatkdocs changes.
b) More useful AC,AF logging in VariantsToTable with multiallelic sites: instead of logging comma-separated values, log the max value by default. Hidden, experimental argument -logACSum logs the sum of ACs instead. This is due to the extreme slowness of R in parsing strings to tokens and computing max/sum itself (~100x slower than GATK).
c) Added integration test for new SelectVariants commands
2011-08-24 12:25:50 -04:00
Mauricio Carneiro cd12f7f286 Fixed list dependency
Instead of creating a BAM list file, I dynamically create a Scala list and pass it as a parameter. This way the intermediate BAM files don't get deleted before they should.
2011-08-24 11:12:46 -04:00
Mauricio Carneiro 219252a566 Adapting to the new RodBinding framework 2011-08-24 11:12:46 -04:00
Mark DePristo 28ee6dac41 Fixed spelling mistake 2011-08-24 10:14:45 -04:00
Ryan Poplin f37875600a Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-24 09:02:44 -04:00
Khalid Shakir 1ecbf05aae Avoid segfaults due to out-of-date and possibly abandoned LSF DRMAA implementation when use'ing LSF instead of .combined_LSF_SGE 2011-08-23 23:49:36 -04:00
Mark DePristo b8bc03bb42 JobRunInfo improvements
-- dry-run now adds some info, for testing
-- InProcessRunner adds some, but not all, of the information we want
2011-08-23 17:11:22 -04:00
Mark DePristo 569e1a1089 Walker.isDone() aborts execution early
-- Useful if you want to have a parameter like MAX_RECORDS that wants the walker to stop after some number of map calls without having to resort to the old System.exit() call directly.
2011-08-23 16:53:06 -04:00
Mark DePristo 31ec6e316c First implementation of JobRunInfo
-- onExecutionDone(Map(QFunction, JobRunInfo)) is the new signature, so that you can walk over your jobs and inspect their success/failure and runtime characteristics
2011-08-23 16:51:54 -04:00
Roger Zurawicki 7d653c48b1 Consensus Reads call close() when crossing a gap
addToAlignment returns a collection of SAMRecords
Not fully working yet
2011-08-23 16:07:10 -04:00
Ryan Poplin 08e5503a60 Likelihood engine now gives non-zero likelihoods. Using HMM function that can handle context specific gap open and gap continuation penalties 2011-08-23 13:43:43 -04:00
Ryan Poplin a1a1fac9e4 Likelihood engine now gives non-zero likelihoods. Using HMM function that can handle context specific gap open and gap continuation penalties 2011-08-23 13:43:07 -04:00
Ryan Poplin 3c37d841db Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-23 13:05:12 -04:00
Guillermo del Angel 6e2552a9ef Merge fix 2011-08-23 12:40:43 -04:00
Guillermo del Angel 8b7a0b3b62 Two new arguments to SelectVariants to exclude either multiallelic or biallelic sites from input vcf 2011-08-23 12:40:01 -04:00
Roger Zurawicki ac36271457 Fixed extra reads showing up in Variable Sites
Reads that were not hard clipped for the variable site no longer show up in output file
Walker now uses unclippedStart of Read to determine position in the sliding Window
2011-08-23 11:26:00 -04:00
Mark DePristo 6d6feb5540 Better error message when you cannot determine a ROD type because the file doesn't exist or cannot be read 2011-08-23 10:56:37 -04:00
Guillermo del Angel bcc0cae89e Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-23 10:51:24 -04:00
Guillermo del Angel bd26de9741 Resurrect IndelCountCovariates walker, update to new rod system. IndelCountCovariate (singular) seems broken, not to be used for now (but it's not very useful anyway) 2011-08-23 10:50:55 -04:00
Mark DePristo a9ba945595 onExecutionDone(jobs, successFlag) added to QScript.
-- This function is called when the QScript ends, so scripts can override this function if they want to run some code after all of the jobs have completed
2011-08-23 10:09:51 -04:00
Roger Zurawicki 9597d6edad Commented out debug info
Bases of incoming reads no longer print on screen
2011-08-23 10:04:11 -04:00
Mauricio Carneiro feeab6075f Merging ReduceReads development with unstable repo
It is time to bring the ReadClipper class to the main repo. Read Clipper has tested functionality for soft and hard clipping reads. I will prepare thorough documentation for it as it will be very useful for the assembler and the GATK in general.
2011-08-22 23:03:03 -04:00
Guillermo del Angel af4db34407 Apparently git can't deal with renaming a file that only changes capitalization? 2011-08-22 22:00:04 -04:00
Guillermo del Angel ee68713267 Further bug fixes to CountVariants: stratifications were wrong when genotypes had no-calls. For example, if we stratified by sample and a sample had a no-call, this no-call was considered a true variant and counts were incorrectly increased 2011-08-22 20:42:47 -04:00
Guillermo del Angel c270384b2e Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable 2011-08-22 20:39:32 -04:00
Guillermo del Angel 8ae24912f4 a) Misc fixes in Phase1 indel VQSR script
b) More R-friendly VariantsToTable printing of AC in case of multiple alt alleles
c) Renamed FixPLOrderingWalker to FixGenotypesWalker and rewrote it: the older code is no longer needed and has been replaced with code that replaces genotypes having all-zero PLs with a no-call.
2011-08-22 20:39:06 -04:00
Mauricio Carneiro 136f0eb685 Creating sample-bam list instead of joining
This should save us at least one day in the trio decoy processing.
2011-08-22 18:03:39 -04:00
Mauricio Carneiro 04d8bcaf19 Fixed bai removal on picard tools
BAM index files were not being deleted because picard replaces the name of the file with bai instead of appending to it.
2011-08-22 18:03:39 -04:00
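
The naming difference behind the bai removal fix above can be sketched as follows. This is a minimal Python illustration, not the Queue code itself (which is Scala); the function names are hypothetical, and the only fact assumed is the one stated in the commit message: the index name is formed by replacing the file extension with bai rather than appending to the file name.

```python
from pathlib import Path
from typing import List


def replaced_style_index(bam: Path) -> Path:
    """Index name formed by replacing the extension: sample.bam -> sample.bai."""
    return bam.with_suffix(".bai")


def appended_style_index(bam: Path) -> Path:
    """Index name formed by appending to the full name: sample.bam -> sample.bam.bai."""
    return bam.with_name(bam.name + ".bai")


def index_candidates(bam: Path) -> List[Path]:
    """Both possible index names, so cleanup code can check for either one."""
    return [replaced_style_index(bam), appended_style_index(bam)]
```

Cleanup code that only looks for the appended form would miss an index written in the replaced form, which matches the symptom described in the commit.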
Mauricio Carneiro 8aed151a71 Created RevertSam queue class
Class for the picard tool RevertSam with all the options for queue scripts.
2011-08-22 18:03:39 -04:00
Mauricio Carneiro caebc88e9a Consensus mode and new RodBinding framework.
The DPP was not using the parameter correctly. It didn't matter for the default option (which is the only one we have been testing) but it would not work for knowns only or smith waterman. It is fixed now.

It now complies with the new rod binding framework.
2011-08-22 18:03:39 -04:00
Mark DePristo 85c5a6f890 Merge branch 'rodTesting'
Conflicts:
	private/java/src/org/broadinstitute/sting/gatk/walkers/performance/ProfileRodSystem.java
2011-08-22 17:43:47 -04:00
Mark DePristo e728e407d9 First pass Queue script to test progress on ROD refactoring towards Eric+Mark goals 2011-08-22 17:40:35 -04:00
Mark DePristo fe7d3ee236 A bit of refinement of the modes 2011-08-22 17:40:02 -04:00
Mark DePristo 649f134536 Added mode argument that lets you test separately tribble IO and GATK 2011-08-22 17:33:41 -04:00
Mark DePristo 1eab9be35d Now with accurate javadoc 2011-08-22 17:25:15 -04:00
Mark DePristo 3612a3501d info, not warn, about dynamic type determination 2011-08-22 17:24:51 -04:00
Roger Zurawicki 9fb208c8f0 First Working Implementation
Used ReadClipper to hardClip to VariableRegion
Runs 9,000,000-9,050,000
close() not tested
Does not use toSAMRecord() method
*Commented out debug info
2011-08-22 16:46:24 -04:00
Eric Banks dc42571dd9 Only create the genotype map when necessary 2011-08-22 15:40:36 -04:00