gatk-3.8

Commit Graph

Author	SHA1	Message	Date
Eric Banks	18728ec5bd	Updates to the bundle script: 1. Add the symbolic 'current' link for the new bundle dir 2. Don't gzip and copy .out files 3. Don't call chr20 SNPs on the example BAM because it's now just a few reads on chr1	2012-12-18 11:16:42 -05:00
Menachem Fromer	a8c7edca05	Fixed fragment handling in DepthOfCoverage	2012-11-21 16:01:10 -05:00
Menachem Fromer	c8be7c3102	Keep SNPs and indels separately for batch merging; Add options to DepthOfCoverage to count fragments (to not double-count overlapping reads of same fragment); DepthOfCoverage should now support ReducedReads; Replace recursion with loop in DoC/package.scala (for lists longer than 5000 elements)	2012-11-21 15:56:53 -05:00
Menachem Fromer	9111966261	Merge branch 'master' of github.com:broadinstitute/gsa-unstable	2012-11-20 12:19:58 -05:00
Eric Banks	843384e435	Rename hg19 files in bundle to b37 since that's what they are	2012-11-14 11:47:09 -05:00
Eric Banks	eccb76c304	Only run UG in the bundle for chr20	2012-10-30 15:09:46 -04:00
Eric Banks	8a402024c2	Updating bundle script to handle new naming convention of CEU trio best practices callset	2012-10-30 09:11:56 -04:00
Menachem Fromer	9af4b34fd8	Changed @Input to @Argument for non-File types	2012-10-26 01:21:05 -04:00
Menachem Fromer	0fe36b1c72	Merge branch 'master' of ssh://gsa3.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-10-25 16:18:57 -04:00
Menachem Fromer	cde4f037d3	Begin moving XHMM scripts to public	2012-10-25 16:18:25 -04:00
Ami Levy Moonshine	dde3060bb8	add the CEUtrio best practices results (UG + PBT) to the bundle	2012-10-25 15:36:17 -04:00
Khalid Shakir	2ef456d51a	Added explicit @ClassType annotations to @Argument for Option[Int] or Option[Double] since scala seems to change the reflected type to Option[Object] on some systems. Changed ReflectionUtils.getGenericTypes' order of looking for @ClassType since the primitive generic wasn't completely erased, only changed to Object which is incorrect. More fixes to @Arguments labeled as java.io.File via incorrect @Input annotation. Put in a default undocumented implementation of @Argument doc() to match the one added to @Input.	2012-10-19 13:20:29 -04:00
Khalid Shakir	403654d40a	Fixed null checks in ArgumentTypeDescriptor due to ArgumentMatchValue updates. Fixed @Arguments such as scatter count that were labeled as java.io.File via incorrect @Input annotation.	2012-10-18 16:57:15 -04:00
Khalid Shakir	f66284658d	RetryMemoryLimit now works with Scatter/Gather.	2012-10-09 21:51:03 -04:00
Eric Banks	277ba94c7b	Update from dbsnp135 to dbsnp137.	2012-08-31 14:06:29 -04:00
Eric Banks	5ea7cd6dcc	Updating resource bundle: no reason to include both genotype and sites files for Omni and HM3, sites are enough. Also, don't include duplicate entry for the Mills indels.	2012-08-31 14:01:54 -04:00
Khalid Shakir	22b4466cf5	Added setupRetry() to modify jobs when Queue is run with '-retry' and jobs are about to restart after an error. Implemented a mixin called "RetryMemoryLimit" which will by default double the memory. GridEngine memory request parameter can be selected on the command line via '-resMemReqParam mem_free' or '-resMemReqParam virtual_free'. Java optimizations now enabled by default: - Only 4 GC threads instead of each job using java's default O(number of cores) GC threads. Previously on a machine with N cores if you have N jobs running and java allocates N GC threads by default, then the machines are using up to N^2 threads if all jobs are in heavy GC (thanks elauzier). - Exit if GC spends more than 50% of time in GC (thanks ktibbett). - Exit if GC reclaims less than 10% of max heap (thanks ktibbett). Added a -noGCOpt command line option to disable new java optimizations.	2012-08-13 15:43:05 -04:00
Eric Banks	7cf4b63d76	Disabling indel quals in BaseRecalibrator as it should be, not PrintReads.	2012-08-01 09:23:04 -04:00
Eric Banks	675ccab2fa	Renaming BQSR to BaseRecalibrator	2012-07-23 10:17:17 -04:00
Eric Banks	863eb5b5c0	Use Context not Dinuc covariate	2012-07-17 15:18:11 -04:00
Eric Banks	17d627b86d	Update the DPP and PBPP to use the BQSRv2 walkers	2012-07-17 13:15:32 -04:00
Mauricio Carneiro	9346c5b37a	Merged bug fix from Stable into Unstable	2012-06-26 14:55:41 -04:00
Mauricio Carneiro	334d66f2b1	Updating validation parameter in the DPP users were very confused with the failing validation of their 'unpicarded' bam files. Changed the default to OFF and added an option to turn it on.	2012-06-26 14:54:37 -04:00
Ryan Poplin	c3fb321014	Minor updates to pacbio data processing script to make it work with the latest bwa version/settings.	2012-05-22 10:24:45 -04:00
Khalid Shakir	91cb654791	AggregateMetrics: - By porting from jython to java now accessible to Queue via automatic extension generation. - Better handling for problematic sample names by using PicardAggregationUtils. GATKReportTable looks up keys using arrays instead of dot-separated strings, which is useful when a sample has a period in the name. CombineVariants has option to suppress the header with the command line, which is now invoked during VCF gathering. Added SelectHeaders walker for filtering headers for dbGAP submission. Generated command line for read filters now correctly prefixes the argument name as --read_filter instead of -read_filter. Latest WholeGenomePipeline. Other minor cleanup to utility methods.	2012-04-17 11:45:32 -04:00
Eric Banks	ed69f4ff7c	Merge branch 'master' of ssh://gsa1.broadinstitute.org/humgen/gsa-scr1/gsa-engineering/git/unstable	2012-03-13 09:28:16 -04:00
Eric Banks	9b9856ead5	quick todo for next time we make a bundle	2012-03-13 09:28:11 -04:00
Eric Banks	6e9b8559d8	Unfortunately need to bump up memory needed for liftover to get Omni file sorted	2012-03-12 23:20:00 -04:00
Eric Banks	359090c4b7	Updating dbsnp to v135	2012-03-12 13:17:58 -04:00
Eric Banks	7e9a535c4d	Updated the bundle to use the official filtered (final) indel calls	2012-03-12 12:12:24 -04:00
Christopher Hartl	2c1b14d35e	Mostly small changes to my own scala scripts: .vcf.gz compatibility for output files, smarter beagle generation, simple script to scatter-gather combine variants. Whole genome indel calling now uses the gold standard indel set.	2012-02-22 17:20:04 -05:00
Christopher Hartl	974c2499cc	Bugfixed to script.	2012-02-02 12:55:54 -05:00
Christopher Hartl	27ea6426a4	Small script to chunk up a VCF into equal-sized chunks	2012-02-02 12:29:03 -05:00
Christopher Hartl	0c562756eb	Add a memory limit so this thing doesn't get killed on the farm	2012-02-02 10:30:09 -05:00
Christopher Hartl	45bf2562cc	.	2012-02-02 09:11:17 -05:00
Christopher Hartl	f8c5406084	Add the ability to extract samples	2012-02-02 09:06:39 -05:00
Christopher Hartl	b567ed8793	Use the right reference path :(	2012-02-01 12:35:18 -05:00
Christopher Hartl	87a63d54d6	fix the script!	2012-02-01 12:05:29 -05:00
Christopher Hartl	810996cfca	Introducing: VariantsToPed, the world's most annoying walker! And also a busted QScript to run it that I need Khalid's help debugging ( frownie face ). Note that VariantsToPed and PlinkSeq generate the same binary file (up to strand flips...thanks PlinkSeq), so I know it's working properly. Hooray!	2012-02-01 10:39:03 -05:00
Mauricio Carneiro	052a4bdb9c	Turning off PHONE HOME option in the MDCP * MDCP is for internal use and there is no need to report to the Amazon cloud. * Reporting to ASW_S3 is not allowing jobs to finish, this is probably a bug.	2012-01-27 11:13:30 -05:00
Mauricio Carneiro	97499529c7	another small bug with the file extension.	2012-01-24 16:14:35 -05:00
Mauricio Carneiro	945cf03889	IntelliJ ate my import!	2012-01-23 21:46:45 -05:00
Mauricio Carneiro	2bb9525e7f	Don't set base qualities if fastQ is provided * Pacbio Processing pipeline now works with the new fastQ files outputted by the Pacbio instrument	2012-01-23 17:57:29 -05:00
Khalid Shakir	c18beadbdb	Device files like /dev/null are now tracked as special by Queue and are not used to generate .out file paths, scattered into a temporary directory, gathered, deleted, etc. Attempted workaround for xdr_resourceInfoReq unsatisfied link during loading of libbat.so.	2012-01-23 16:17:04 -05:00
Ryan Poplin	75f87db468	Replacing Mills file with new gold standard indel set in the resource bundle for release with v1.5	2012-01-17 15:02:45 -05:00
Mauricio Carneiro	5bf960deb8	adding dbsnp to indel VQSR	2012-01-10 12:38:49 -05:00
Mauricio Carneiro	6f2abd76df	Updating the MDCP with the new indel gold standard from Ryan.	2012-01-09 15:31:18 -05:00
Khalid Shakir	5793625592	No more "Q-<pid>@<host>". Generated log file names now use the first output + ".out" (ex. my.vcf.out) or the name of the first QScript plus the order the function was added (ex. MyScript-1.out). The same function added twice with the same outputs will now have the same default logs, meaning the 2nd instance of the function won't be added to the graph twice. QScript accessor to QSettings to specify a default runName and other default function settings. Because log files are no longer pseudo-random their presence can be used to tell if a job without other file outputs is "done". For now still using the log's .done file in addition to original outputs. Gathered log files concatenate all log files together into the stdout. InProcessFunctions now have PrintStreams for stdout and stderr. Updated ivy to use commons-io 2.1 for copying logs to the stdout PrintStream. Removed snakeyaml. During graph tracking of outputs the Index files, and now BAM MD5s, are tracked with the gathering of the original file. In Queue generated wrappers for the GATK the Index and MD5s used for tracking are switched to private scope. Added more detailed output when running with -l DEBUG. Simplified graphviz visualization for additional debugging. Switched usage of the scala class 'List' to the trait 'Seq' (think java.util.ArrayList vs. using the interface java.util.List) Minor cleanup to build including sending ant gsalib to R's default libloc.	2012-01-08 12:11:55 -05:00
Mauricio Carneiro	f6a18aea63	Updated MDCP with INDEL best practices * chose 90.0 indel cut target for most datasets (this is arbitrary).	2012-01-06 17:21:59 -05:00
Khalid Shakir	7486696c07	When using bam list mode in HSP deriving VCF name from bam list instead of requiring an additional parameter. Creating a single temporary directory per ant test run instead of a putting temp files across all runs in the same directory. Updated various tests for above items and other small fixes.	2011-12-16 18:09:25 -05:00

135 Commits (ffbd4d85f2e0112b32df0bbba00330b00a0806cf)