diff --git a/doc/GATK_Hello_World.pdf b/doc/GATK_Hello_World.pdf
deleted file mode 100755
index ae3a4d276..000000000
Binary files a/doc/GATK_Hello_World.pdf and /dev/null differ
diff --git a/doc/GATK_Hello_World.tex b/doc/GATK_Hello_World.tex
deleted file mode 100755
index 1e752db00..000000000
--- a/doc/GATK_Hello_World.tex
+++ /dev/null
@@ -1,110 +0,0 @@
-\documentclass[11pt,fullpage]{article}
-\usepackage[urlcolor=blue,colorlinks=true]{hyperref}
-
-\oddsidemargin 0.0in
-\textwidth 6.5in
-
-\begin{document}
-
-\title{Getting Started with the Genome Analysis Toolkit (GATK)}
-\author{Matt Hanna}
-\date{Created March 16, 2009\\ Updated \today}
-\maketitle
-
-\section{Build Prerequisites}
-GATK requires JDK 1.6 and Ant 1.7.1 to compile.
-
-\section{Getting and Building the Source}
-GATK is located in the Sting svn repository, and
-compiles using a build.xml in the root directory.
-
-Download and build the source as follows:
-\begin{verbatim}
-  svn co https://svnrepos/Sting/trunk Sting
-  cd Sting
-  ant
-\end{verbatim}
-
-\section{Getting Started}
-The core concept behind GATK is the walker, a class that implements the
-three core operations: filtering, mapping, and reducing.
-
-\begin{description}
-  \item [filter] reduces the size of the dataset by applying a predicate.
-  \item [map] Applies a function to each individual element in a dataset,
-    effectively 'mapping' it to a new element.
-  \item [reduce] Inductively combines the elements of a list. The base
-    case is supplied by the reduceInit() function, and the inductive step
-    is performed by the reduce() function.
-\end{description}
-Users of the GATK will provide a walker to run their analyses. The engine
-will produce a result by first filtering the dataset, running a map operation,
-and finally reducing the map operation to a single result.
-
-\section{Creating a Walker}
-To be loaded by GATK, the walker must satisfy the following properties:
-\begin{enumerate}
-  \item It must be a loose class, not packaged into a jar file.
-  \item It must be in the unnamed package (in other words, the source
-    should not start with a package declaration).
-  \item It must subclass one of the basic walkers in the
-    org.broadinstitute.sting.gatk.walkers package: ReadWalker or
-    LociWalker.
-  \item It must live in the directory \$STING\_HOME/dist/walkers.
-\end{enumerate}
-
-\section{Example}
-This walker will print output for each read it sees, eventually computing the
-total number of reads by mapping every read to 1 and summing all the 1s to
-realize the total number of reads.
-
-\begin{samepage}
-Copy the following text into the file \$STING\_HOME/dist/walkers/HelloWalker.java:
-
-\begin{verbatim}
-import net.sf.samtools.SAMRecord;
-
-import org.broadinstitute.sting.gatk.LocusContext;
-import org.broadinstitute.sting.gatk.walkers.ReadWalker;
-
-/**
- * Define a class extending from ReadWalker with types
- * <Integer,Long>.
- */
-public class HelloWalker extends ReadWalker<Integer,Long> {
-    private Long currentRead = 0L;
-
-    // Maps each read to the value 1.
-    public Integer map(LocusContext context, SAMRecord read) {
-        System.out.printf("Hello read %d%n", ++currentRead );
-        return 1;
-    }
-
-    // Provides an initial value for the reduce function.
-    public Long reduceInit() { return 0L; }
-
-    // Defines how to compute the reduction given a value in the list.
-    public Long reduce(Integer value, Long sum) {
-        return sum + value;
-    }
-}
-\end{verbatim}
-\end{samepage}
-To compile the walker:
-\begin{verbatim}
-setenv CLASSPATH $STING_HOME/dist/GenomeAnalysisTK.jar:$STING_HOME/dist/sam-1.0.jar
-javac HelloWalker.java
-\end{verbatim}
-To run the walker:
-\begin{verbatim}
-mkdir $STING_HOME/dist/walkers
-java -Xmx4096m -jar dist/GenomeAnalysisTK.jar \
-  -R/seq/references/Homo_sapiens_assembly18/v0/Homo_sapiens_assembly18.fasta \
-  -I /broad/1KG/legacy_data/trio/na12878.bam -T Hello \
-  -L chr1:10000000-10000100 -l WARN
-\end{verbatim}
-This command will run the walker across a subsection of chromosome 1, operating on
-reads which align to that subsection. If you'd like to see more information from the GATK
-on what it's doing, you can change the logging level (-l) to INFO.
-
-\end{document}