gatk-3.8

The last classic release of GATK3, version 3.8

depristo 27c8fb1e4d Added support for a general GATK option --simplifyBAM to automatically remove and simplify kept reads in an output BAM file. Specifically, duplicate, non-PF, and unmapped reads are removed, and all extended tags in the retained SAM records are removed except the RG:Z tag. This option is very useful when creating temporary BAM files (merged per-population or multi-sample cleaned) for future calling (as in the 1000G processing pipeline). Results in a significant reduction in space of the resulting BAM, faster reading of the BAM, and surprisingly even faster UG performance: 1-10mb of chromosome one, from NA12878 HiSeq 64x data set on hg18: Full BAM Write time: 8.6 m Size: 866M CountReads time: 2.9 m UG time: 11.3 m Simplified BAM: Write time: 6.2 Size: 458M CountReads time: 85.7 s UG time: 10.1 m git-svn-id: file:///humgen/gsa-scr1/gsa-engineering/svn_contents/trunk@5517 348d0f76-0448-11de-a6fe-93d51630548a		2011-03-26 01:21:35 +00:00
R	Misc changes	2011-02-26 15:35:49 +00:00
analysis/depristo	Walkers can now specify a class extending from Gatherer to merge custom output formats. Add @Gather(MyGatherer.class) to the walker @Output.	2011-03-24 14:03:51 +00:00
archive	Moving GLF code to archive	2011-01-15 22:42:42 +00:00
c	Bug fixes for the bwa aligner and changes to support compiling against newer releases of the bwa code base.	2010-12-17 14:49:15 +00:00
doc	removing the custom reflections library from the libs, and adding a release version. Hopefully this will fix the problem Menachem has been seeing with random JVM crashes. Also	2010-08-19 00:42:37 +00:00
java	Added support for a general GATK option --simplifyBAM to automatically remove and simplify kept reads in an output BAM file. Specifically, duplicate, non-PF, and unmapped reads are removed, and all extended tags in the retained SAM records are removed except the RG:Z tag. This option is very useful when creating temporary BAM files (merged per-population or multi-sample cleaned) for future calling (as in the 1000G processing pipeline). Results in a significant reduction in space of the resulting BAM, faster reading of the BAM, and surprisingly even faster UG performance:	2011-03-26 01:21:35 +00:00
lua	forgot to remove a debug line.	2011-02-15 16:25:48 +00:00
matlab	Another matlab script -- this time for making power and coverage plots over a specific gene region. Lots of fun file reading, string manipulation, and exploration of the set() function	2009-11-30 20:02:25 +00:00
packages	Walkers can now specify a class extending from Gatherer to merge custom output formats. Add @Gather(MyGatherer.class) to the walker @Output.	2011-03-24 14:03:51 +00:00
perl	2 more scripts I found helpful in syncing (and cleaning up) the 1000G mirror	2011-02-22 04:17:36 +00:00
python	A helper script that will take a list of bams, a list of case sample IDs, and a list of control sample IDs, and generate a sample meta data yaml (which includes the bamfiles)	2011-03-21 16:11:55 +00:00
ruby	accidentally committed an old tool	2010-08-25 15:42:02 +00:00
scala	Enabled the parameterize option for debugging PipelineTest MD5s.	2011-03-26 00:41:47 +00:00
settings	Update Picard / sam-jdk at Tim's request.	2011-01-03 02:17:25 +00:00
shell	Fixing this so it gets the right 129 dbsnp for b37 samples	2011-03-22 17:43:20 +00:00
testdata	ReplaceReadGroups. Fixes BAM files without read group info. MissingReadGroup points people to this tool now. Please point users on the forum to this tool now. Will migrate to Picard.	2011-02-21 14:02:41 +00:00
LICENSE	Adding a license to the root directory in case BOSC checks for one. Has the	2010-04-20 16:04:29 +00:00
build.xml	Build.xml contained references to tools now in picard	2011-03-17 18:29:46 +00:00
ivy.xml	Added commons math, for Kristian.	2011-02-14 18:57:21 +00:00
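
The --simplifyBAM commit message above describes dropping duplicate, non-PF, and unmapped reads and stripping every extended tag except RG:Z. The sketch below illustrates that filtering rule on plain SAM text lines. It is not the GATK implementation (which is Java and works on BAM records); the function names are hypothetical, and the only facts assumed beyond the commit message are the standard SAM FLAG bits (0x4 unmapped, 0x200 failed QC, 0x400 duplicate) and the 11 mandatory SAM columns.

```python
# Illustrative sketch only: mimics the filtering described for --simplifyBAM
# on tab-separated SAM text, not on real BAM files.

# FLAG bits as defined by the SAM specification.
FLAG_UNMAPPED = 0x4
FLAG_FAILS_QC = 0x200    # "non-PF" reads
FLAG_DUPLICATE = 0x400

DROP_MASK = FLAG_UNMAPPED | FLAG_FAILS_QC | FLAG_DUPLICATE
MANDATORY_FIELDS = 11    # QNAME .. QUAL


def simplify_sam_line(line):
    """Return the simplified record, or None if the read should be dropped."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    if flag & DROP_MASK:
        return None
    # Keep the mandatory columns and only the read-group tag.
    kept_tags = [t for t in fields[MANDATORY_FIELDS:] if t.startswith("RG:Z:")]
    return "\t".join(fields[:MANDATORY_FIELDS] + kept_tags)


def simplify_sam(lines):
    """Yield header lines unchanged and simplified alignment records."""
    for line in lines:
        if line.startswith("@"):
            yield line.rstrip("\n")
            continue
        simplified = simplify_sam_line(line)
        if simplified is not None:
            yield simplified
```

Called on an iterable of SAM lines, `simplify_sam` passes header lines through, skips reads whose FLAG has any of the three bits set, and emits the rest with only the RG:Z tag retained.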