## Writing walkers http://gatkforums.broadinstitute.org/gatk/discussion/1302/writing-walkers

1. Introduction

The core concept behind GATK tools is the walker, a class that implements the three core operations: filtering, mapping, and reducing.

Users of the GATK will provide a walker to run their analyses. The engine will produce a result by first filtering the dataset, running a map operation, and finally reducing the map operation to a single result.

2. Creating a Walker

To be usable by the GATK, the walker must satisfy the following properties:

3. Examples

The best way to get started with the GATK is to explore the walkers we've written. Here are the best walkers to look at when getting started:

$STING_HOME/java/src/org/broadinstitute/sting/gatk/walkers/qc/CountLociWalker.java

$STING_HOME/java/src/org/broadinstitute/sting/gatk/walkers/qc/CountReadsWalker.java

$STING_HOME/java/src/org/broadinstitute/sting/gatk/examples/papergenotyper/GATKPaperGenotyper.java

Please note that the walker above is NOT the UnifiedGenotyper. While conceptually similar to the UnifiedGenotyper, the GATKPaperGenotyper uses a much simpler calling model for increased clarity and readability.

4. External walkers and the 'external' directory

The GATK can absorb external walkers placed in a directory of your choosing. By default, that directory is called 'external' and is relative to the Sting git root directory (for example, ~/src/Sting/external). However, you can choose to place that directory anywhere on the filesystem and specify its complete path using the ant external.dir property.

ant -Dexternal.dir=~/src/external

The GATK will check each directory under the external directory (but not the external directory itself!) for small build scripts. These build scripts must contain at least a compile target that compiles your walker and places the resulting class file into the GATK's class file output directory. The following is a sample compile target:

<target name="compile" depends="init">
    <javac srcdir="." destdir="${build.dir}" classpath="${gatk.classpath}" />
</target>

As a convenience, the build.dir ant property will be predefined to be the GATK's class file output directory and the gatk.classpath property will be predefined to be the GATK's core classpath. Once this structure is defined, any invocation of the ant build scripts will build the contents of the external directory as well as the GATK itself.