gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Roger Zurawicki	63cf7ec7ec	Added more primitives to GATK Report Column Type - The Integer column type now accepts byte and shorts - Updated Unit Tests and added a new testParse() test Signed-off-by: Mauricio Carneiro <carneiro@broadinstitute.org>	2012-03-28 09:07:54 -04:00
Eric Banks	ed69f4ff7c	Merge branch 'master' of ssh://gsa1.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-03-13 09:28:16 -04:00
Eric Banks	9b9856ead5	quick todo for next time we make a bundle	2012-03-13 09:28:11 -04:00
Eric Banks	6e9b8559d8	Unfortunately need to bump up memory needed for liftover to get Omni file sorted	2012-03-12 23:20:00 -04:00
Eric Banks	359090c4b7	Updating dbsnp to v135	2012-03-12 13:17:58 -04:00
Eric Banks	7e9a535c4d	Updated the bundle to use the official filtered (final) indel calls	2012-03-12 12:12:24 -04:00
Christopher Hartl	2c1b14d35e	Mostly small changes to my own scala scripts: .vcf.gz compatibility for output files, smarter beagle generation, simple script to scatter-gather combine variants. Whole genome indel calling now uses the gold standard indel set.	2012-02-22 17:20:04 -05:00
Christopher Hartl	685bcaced2	Merge branch 'master' of ssh://ni.broadinstitute.org/humgen/gsa-scr1/chartl/dev/unstable	2012-02-21 13:53:37 -05:00
Khalid Shakir	cda1e1b207	Minor manual merge update for List class to Seq interface usage.	2012-02-08 02:24:54 -05:00
Khalid Shakir	ef74363b1b	Merged bug fix from Stable into Unstable	2012-02-08 02:14:26 -05:00
Khalid Shakir	23e7f1bed9	When an interval list specifies overlapping intervals merge them before scattering.	2012-02-08 02:12:16 -05:00
Christopher Hartl	974c2499cc	Bugfixed to script.	2012-02-02 12:55:54 -05:00
Christopher Hartl	27ea6426a4	Small script to chunk up a VCF into equal-sized chunks	2012-02-02 12:29:03 -05:00
Christopher Hartl	0c562756eb	Add a memory limit so this thing doesn't get killed on the farm	2012-02-02 10:30:09 -05:00
Christopher Hartl	45bf2562cc	.	2012-02-02 09:11:17 -05:00
Christopher Hartl	f8c5406084	Add the ability to extract samples	2012-02-02 09:06:39 -05:00
Christopher Hartl	b567ed8793	Use the right reference path :(	2012-02-01 12:35:18 -05:00
Christopher Hartl	87a63d54d6	fix the script!	2012-02-01 12:05:29 -05:00
Christopher Hartl	810996cfca	Introducing: VariantsToPed, the world's most annoying walker! And also a busted QScript to run it that I need Khalid's help debugging ( frownie face ). Note that VariantsToPed and PlinkSeq generate the same binary file (up to strand flips...thanks PlinkSeq), so I know it's working properly. Hooray!	2012-02-01 10:39:03 -05:00
Mauricio Carneiro	052a4bdb9c	Turning off PHONE HOME option in the MDCP * MDCP is for internal use and there is no need to report to the Amazon cloud. * Reporting to ASW_S3 is not allowing jobs to finish, this is probably a bug.	2012-01-27 11:13:30 -05:00
Mauricio Carneiro	97499529c7	another small bug with the file extension.	2012-01-24 16:14:35 -05:00
Mauricio Carneiro	7c7ca0d799	fixing bug with fastq extension * PPP only recognized .fasta and .fq, failing when the user provided a .fastq file. Fixed.	2012-01-24 11:02:15 -05:00
Mauricio Carneiro	945cf03889	IntelliJ ate my import!	2012-01-23 21:46:45 -05:00
Mauricio Carneiro	2bb9525e7f	Don't set base qualities if fastQ is provided * Pacbio Processing pipeline now works with the new fastQ files outputted by the Pacbio instrument	2012-01-23 17:57:29 -05:00
Khalid Shakir	c18beadbdb	Device files like /dev/null are now tracked as special by Queue and are not used to generate .out file paths, scattered into a temporary directory, gathered, deleted, etc. Attempted workaround for xdr_resourceInfoReq unsatisfied link during loading of libbat.so.	2012-01-23 16:17:04 -05:00
Christopher Hartl	39e6df5aa9	Fix edge case for very small VCFs	2012-01-19 00:51:28 -05:00
Christopher Hartl	1e037a0ecf	Ensure second-to-last line printed	2012-01-19 00:33:08 -05:00
Christopher Hartl	9946853039	Remove duplicated line	2012-01-19 00:25:22 -05:00
Christopher Hartl	cf9b1d350a	Some minor changes to in-process functions that nobody else uses. CGL now properly ignores no-calls for external VCFs.	2012-01-19 00:20:49 -05:00
David Roazen	b7c65cb089	Merged bug fix from Stable into Unstable	2012-01-18 09:52:47 -05:00
David Roazen	d5199db8ec	Be explicit about setting the snpEff -onlyCoding option in the pipeline When run without an explicit -onlyCoding option, as we've been doing up to now, snpEff automatically sets -onlyCoding to "true" provided that there is at least one transcript marked as "protein_coding", which will always be the case for us in practice (and indeed, all pipeline runs so far with snpEff 2.0.5 have run with -onlyCoding auto-set to "true"). However, given the disastrous effect on annotation quality setting "-onlyCoding false" has, we wish to be explicit with this option rather than relying on snpEff's auto-detection logic.	2012-01-17 20:04:27 -05:00
Ryan Poplin	75f87db468	Replacing Mills file with new gold standard indel set in the resource bundle for release with v1.5	2012-01-17 15:02:45 -05:00
Khalid Shakir	a9a6516527	Merged bug fix from Stable into Unstable	2012-01-10 16:16:10 -05:00
Khalid Shakir	ef50e77ee2	When running Queue jobs locally, merge the stderr to the stdout log if the error file is NOT specified. Updated VE strats in the HSP for plotting Ka/Ks by AC.	2012-01-10 16:10:25 -05:00
Mauricio Carneiro	5bf960deb8	adding dbsnp to indel VQSR	2012-01-10 12:38:49 -05:00
Mauricio Carneiro	6f2abd76df	Updating the MDCP with the new indel gold standard from Ryan.	2012-01-09 15:31:18 -05:00
Khalid Shakir	5793625592	No more "Q-<pid>@<host>". Generated log file names now use the first output + ".out" (ex. my.vcf.out) or the name of the first QScript plus the order the function was added (ex. MyScript-1.out). The same function added twice with the same outputs will now have the same default logs, meaning the 2nd instance of the function won't be added to the graph twice. QScript accessor to QSettings to specify a default runName and other default function settings. Because log files are no longer pseudo-random their presense can be used to tell if a job without other file outputs is "done". For now still using the log's .done file in addition to original outputs. Gathered log files concatenate all log files together into the stdout. InProcessFunctions now have PrintStreams for stdout and stderr. Updated ivy to use commons-io 2.1 for copying logs to the stdout PrintStream. Removed snakeyaml. During graph tracking of outputs the Index files, and now BAM MD5s, are tracked with the gathering of the original file. In Queue generated wrappers for the GATK the Index and MD5s used for tracking are switched to private scope. Added more detailed output when running with -l DEBUG. Simplified graphviz visualization for additional debugging. Switched usage of the scala class 'List' to the trait 'Seq' (think java.util.ArrayList vs. using the interface java.util.List) Minor cleanup to build including sending ant gsalib to R's default libloc.	2012-01-08 12:11:55 -05:00
Mauricio Carneiro	f6a18aea63	Updated MDCP with INDEL best practices * chose 90.0 indel cut target for most datasets (this is arbitrary).	2012-01-06 17:21:59 -05:00
Mauricio Carneiro	3358c132a8	Updating the MD5s Clipping adaptor boundaries changed the results of CountCovariates which affected the PPP output. a few more loci were visible to locus walkers.	2011-12-21 15:14:05 -05:00
Mark DePristo	0cc5c3d799	General improvements to Queue -- Support for collecting resources info from DRMAA runners -- Disabled the non-standard mem_free argument so that we can actually use our own SGE cluster gsa4 -- NCoresRequest is a testing queue script for this. -- Added two command line arguments: -- multiCoreJerk: don't request multiple cores for jobs with nt > 1. This was the old behavior but it's really not the best way to run parallel jobs. Now with queue if you run nt = 4 the system requests 4 cores on your host. If this flag is thrown, though, it will only request 1 and you'll just use 4, like a jerk -- job_parallel_env: parallel environment named used with SGE to request multicore jobs. Equivalent to -pe job_parallel_env NT for NT > 1 jobs	2011-12-20 14:05:09 -05:00
Khalid Shakir	6059ca76e8	Removing cruft that snuck in last commit.	2011-12-16 23:00:16 -05:00
Khalid Shakir	7486696c07	When using bam list mode in HSP deriving VCF name from bam list instead of requiring an additional parameter. Creating a single temporary directory per ant test run instead of a putting temp files across all runs in the same directory. Updated various tests for above items and other small fixes.	2011-12-16 18:09:25 -05:00
Mark DePristo	550fb498be	Support for NT testing (default up to 4) for CC and UG -- Added convenience function addJobReportBinding to just new binding to the map (x -> y) as well	2011-12-14 18:45:00 -05:00
Mauricio Carneiro	663184ee9d	Added test mode to PPP * in test mode, no @PG tags are output to the final bam file * updated pipeline test to use -test mode. * MD5s updated accordingly	2011-12-12 18:29:06 -05:00
Mauricio Carneiro	a3c3d72313	Added test mode to DPP * in test mode, no @PG tags are output to the final bam file * updated pipeline test to use -test mode. * MD5s are now dependent on BWA version	2011-12-12 18:29:06 -05:00
Mauricio Carneiro	52c64b971f	Updating MD5s -- really dont know why it didn't update before	2011-12-12 09:48:58 -05:00
Mauricio Carneiro	ed91461c49	Data Processing Pipeline Test * Added standard pipeline test for the DPP * Added a full BWA pipeline test for the DPP * Included the extra files for the reference needed by BWA (to be used by DPP and PPP tests)	2011-12-12 00:24:51 -05:00
Mauricio Carneiro	cca8a18608	PPP pipeline test * added a pipeline test to the Pacbio Processing Pipeline. * updated exampleBAM with more complete RG information so we can use it in a wider variety of pipeline tests * added exampleDBSNP.vcf file with only chromosome 1 in the range of the exampleFASTA.fasta reference for pipeline tests	2011-12-11 17:32:21 -05:00
Mauricio Carneiro	21ac3b59d7	Merged bug fix from Stable into Unstable	2011-12-09 16:51:46 -05:00
Mauricio Carneiro	13905c00b3	Updating PacbioProcessingPipeline to new Queue standards	2011-12-09 16:51:02 -05:00

215 Commits (f9f8589692fece0185a7e8e059b75ee4672d1c8d)