Mark DePristo
a5c65fc133
Debugging information to print out the Query tracks
2011-08-28 18:54:49 -04:00
Mark DePristo
b38de1fa35
Now captures the exechost in the job report
...
-- Works for in process, shell, and LSF runners
-- Cleanup of debugging output
2011-08-28 12:05:56 -04:00
Mark DePristo
7bf006278d
Moved ResolveHostname to general utils as a static function
2011-08-28 12:04:16 -04:00
Mark DePristo
ccec0b4d73
AnalyzeCovariates uses the general RScript system now
...
-- Convenience constructor for collection for testing
-- callRScript() now accepts Objects not Strings, for convenience
2011-08-27 12:54:13 -04:00
Mark DePristo
1ceb020fae
UnitTests for RScript
2011-08-27 10:50:05 -04:00
Mark DePristo
e37a638e09
Fix for disallowed characters in GATKReportTable
...
-- Illegal characters are automatically replaced with _
2011-08-26 13:24:06 -04:00
Mark DePristo
0cb1605df0
Clean documentation for JobRunInfo
2011-08-26 09:22:58 -04:00
Mark DePristo
415d5d5301
LSF long times are in seconds, convert to milliseconds to meet standard
2011-08-26 09:18:28 -04:00
Mark DePristo
c0503283df
Spelling fix requires md5 updates
2011-08-26 07:40:44 -04:00
Mark DePristo
eef1ac415a
Merge branch 'master' into rodTesting
...
Conflicts:
public/java/src/org/broadinstitute/sting/gatk/walkers/variantutils/VariantsToTable.java
2011-08-26 00:35:41 -04:00
Eric Banks
9b7512fd94
Just because there's a ref base doesn't mean the VC needs to be padded
2011-08-25 22:42:14 -04:00
Mark DePristo
e03dfdb0ab
Automatic iteration field addition works properly.
2011-08-25 16:59:02 -04:00
Mark DePristo
e01273ca7c
Queue now writes out queueJobReport.pdf
...
-- General-purpose RScript executor in Java (please use when invoking RScripts)
-- Removed groupName. This is now analysisName
-- Explicitly added capability to enable/disable individual QFunctions
2011-08-25 16:57:11 -04:00
Eric Banks
09a729da3a
Removing incorrect comment
2011-08-25 15:42:52 -04:00
Eric Banks
8bbef79fc2
Create clipped alleles during allele parsing instead of creating a full VC, clipping alleles, and regenerating the VC from scratch.
2011-08-25 15:37:26 -04:00
Mark DePristo
0f4be2c4a4
Argument to disable queueJobReport entirely
...
-- Minor improvements to RodPerformanceGoals
2011-08-25 13:32:03 -04:00
Mark DePristo
d65faf509c
Default output name for Queue JobReport is queue_jobreport.gatkreport.txt
2011-08-25 13:15:20 -04:00
Mark DePristo
a7d6946b22
Refactored QJobReport and QFunction, which is now automatically tracked
...
-- All QFunctions, including sg ones, are tracked
-- Removed memory information
2011-08-25 13:13:55 -04:00
Mauricio Carneiro
16caca0822
BLASR BAMs and new BWA parameters
...
*Added the functions to turn a BLASR generated BAM file into a usable BAM file.
*Modified the bwa parameters according to test results from NA12878 pb2k dataset.
2011-08-24 17:04:07 -04:00
Mauricio Carneiro
e3f5d7067a
Added ReorderSam queue binding
2011-08-24 17:03:11 -04:00
Mark DePristo
08fb21f127
Removing hostname
2011-08-24 16:45:50 -04:00
Mark DePristo
06e30a81d1
Fixes throughout for getting job information
...
-- no more hostname -- it's just not going to be important
2011-08-24 15:30:09 -04:00
Ryan Poplin
29c7b10f7b
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-24 15:18:58 -04:00
Ryan Poplin
e5008aba00
Output the top two haplotypes as a variant call by running Smith-Waterman alignment against the reference and calling any difference as variation. This is the first version that runs end-to-end by taking in reads as a BAM file and writing out variant calls in VCF.
2011-08-24 15:18:44 -04:00
Mark DePristo
4918519a58
No more NPE in getRuntime() when you ctrl-c out of Queue
2011-08-24 14:14:01 -04:00
Mark DePristo
16d8360592
QJobReport is now the official capability name
2011-08-24 13:59:14 -04:00
Mark DePristo
d047c19ad1
Writes output to file
2011-08-24 13:52:05 -04:00
Mark DePristo
3ae68e2397
JobLogging trait now writes out GATKReport log of jobs
2011-08-24 13:36:39 -04:00
Guillermo del Angel
e618cb1e79
a) Renamed/expanded SelectVariants arguments that choose particular kinds of variants and particular allelic types. Instead of -Indels or -SNPs, we can now specify, for example, -selectType [MIXED|INDEL|SNP|MNP|SYMBOLIC]. To select biallelic or multiallelic variants, use -restrictAllelesTo [BIALLELIC|MULTIALLELIC]. Corresponding gatkdocs changes.
...
b) More useful AC,AF logging in VariantsToTable with multiallelic sites: instead of logging comma-separated values, log max value by default. Hidden, experimental argument -logACSum to log sum of ACs instead. This is due to extreme slowness of R in parsing strings to tokens and computing max/sum itself (~100x slower than gatk).
c) Added integration test for new SelectVariants commands
2011-08-24 12:25:50 -04:00
Mauricio Carneiro
cd12f7f286
Fixed list dependency
...
Instead of creating a BAM list file, I dynamically create a Scala list and pass it as a parameter. This way the intermediate BAM files don't get deleted before they should.
2011-08-24 11:12:46 -04:00
Mauricio Carneiro
219252a566
Adapting to the new RodBinding framework
2011-08-24 11:12:46 -04:00
Mark DePristo
28ee6dac41
Fixed spelling mistake
2011-08-24 10:14:45 -04:00
Ryan Poplin
f37875600a
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-24 09:02:44 -04:00
Khalid Shakir
1ecbf05aae
Avoid segfaults due to out-of-date and possibly abandoned LSF DRMAA implementation when use'ing LSF instead of .combined_LSF_SGE
2011-08-23 23:49:36 -04:00
Mark DePristo
b8bc03bb42
JobRunInfo improvements
...
-- dry-run now adds some info, for testing
-- InProcessRunner adds some, but not all, of the information we want
2011-08-23 17:11:22 -04:00
Mark DePristo
569e1a1089
Walker.isDone() aborts execution early
...
-- Useful if you want a parameter like MAX_RECORDS that stops the walker after some number of map calls, without having to resort to calling System.exit() directly.
2011-08-23 16:53:06 -04:00
Mark DePristo
31ec6e316c
First implementation of JobRunInfo
...
-- onExecutionDone(Map(QFunction, JobRunInfo)) is the new signature, so that you can walk over your jobs and inspect their success/failure and runtime characteristics
2011-08-23 16:51:54 -04:00
Ryan Poplin
a1a1fac9e4
Likelihood engine now gives non-zero likelihoods. Uses an HMM function that can handle context-specific gap open and gap continuation penalties
2011-08-23 13:43:07 -04:00
Guillermo del Angel
6e2552a9ef
Merge fix
2011-08-23 12:40:43 -04:00
Guillermo del Angel
8b7a0b3b62
Two new arguments to SelectVariants to exclude either multiallelic or biallelic sites from input vcf
2011-08-23 12:40:01 -04:00
Mark DePristo
6d6feb5540
Better error message when you cannot determine a ROD type because the file doesn't exist or cannot be read
2011-08-23 10:56:37 -04:00
Mark DePristo
a9ba945595
onExecutionDone(jobs, successFlag) added to QScript.
...
-- This function is called when the QScript ends, so scripts can override this function if they want to run some code after all of the jobs have completed
2011-08-23 10:09:51 -04:00
Mauricio Carneiro
feeab6075f
Merging ReduceReads development with unstable repo
...
It is time to bring the ReadClipper class to the main repo. ReadClipper has tested functionality for soft and hard clipping of reads. I will prepare thorough documentation for it, as it will be very useful for the assembler and the GATK in general.
2011-08-22 23:03:03 -04:00
Guillermo del Angel
ee68713267
Further bug fixes to CountVariants: stratifications were wrong when genotypes had no-calls. For example, if we stratified by sample and a sample had a no-call, the no-call was considered a true variant and counts were incorrectly increased
2011-08-22 20:42:47 -04:00
Guillermo del Angel
c270384b2e
Merge branch 'master' of ssh://nickel.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable
2011-08-22 20:39:32 -04:00
Guillermo del Angel
8ae24912f4
a) Misc fixes in Phase1 indel VQSR script
...
b) More R-friendly VariantsToTable printing of AC in case of multiple alt alleles
c) Renamed FixPLOrderingWalker to FixGenotypesWalker and rewrote it: the older code is no longer needed and is replaced with code that replaces genotypes with all-zero PLs with a no-call.
2011-08-22 20:39:06 -04:00
Mauricio Carneiro
136f0eb685
Creating sample-bam list instead of joining
...
This should save us at least one day in the trio decoy processing.
2011-08-22 18:03:39 -04:00
Mauricio Carneiro
04d8bcaf19
Fixed bai removal on picard tools
...
BAM index files were not being deleted because Picard replaces the file extension with bai instead of appending bai to the file name.
2011-08-22 18:03:39 -04:00
Mauricio Carneiro
8aed151a71
Created RevertSam queue class
...
Class for the picard tool RevertSam with all the options for queue scripts.
2011-08-22 18:03:39 -04:00
Mauricio Carneiro
caebc88e9a
Consensus mode and new RodBinding framework.
...
The DPP was not using the parameter correctly. It didn't matter for the default option (which is the only one we have been testing), but it would not work for knowns only or Smith-Waterman. It is fixed now.
It now complies with the new rod binding framework.
2011-08-22 18:03:39 -04:00