Merge pull request #1075 from broadinstitute/ldg_bamoutDocs

Add info about multiple input samples (as relevant for M2)
Geraldine Van der Auwera 2015-07-27 16:56:36 -04:00
commit 43a37fc746
1 changed file with 11 additions and 6 deletions

@@ -89,11 +89,12 @@ public class AssemblyBasedCallerArgumentCollection extends StandardCallerArgumen
} }
/** /**
* The assembled haplotypes will be written as BAM to this file if requested. Really for debugging purposes only. * The assembled haplotypes and locally realigned reads will be written as BAM to this file if requested. Really
* Note that the output here does not include uninformative reads so that not every input read is emitted to the bam. * for debugging purposes only. Note that the output here does not include uninformative reads so that not every
* input read is emitted to the bam.
* *
* Turning on this mode may result in serious performance cost for the HC. It's really only appropriate to * Turning on this mode may result in serious performance cost for the caller. It's really only appropriate to
* use in specific areas where you want to better understand why the HC is making specific calls. * use in specific areas where you want to better understand why the caller is making specific calls.
* *
* The reads are written out containing an "HC" tag (integer) that encodes which haplotype each read best matches * The reads are written out containing an "HC" tag (integer) that encodes which haplotype each read best matches
* according to the haplotype caller's likelihood calculation. The use of this tag is primarily intended * according to the haplotype caller's likelihood calculation. The use of this tag is primarily intended
@@ -101,14 +102,18 @@ public class AssemblyBasedCallerArgumentCollection extends StandardCallerArgumen
* easily see which reads go with these haplotype. * easily see which reads go with these haplotype.
* *
* Note that the haplotypes (called or all, depending on mode) are emitted as single reads covering the entire * Note that the haplotypes (called or all, depending on mode) are emitted as single reads covering the entire
* active region, coming from read HC and a special read group. * active region, coming from sample "HC" and a special read group called "ArtificialHaplotype". This will increase the
* pileup depth compared to what would be expected from the reads only, especially in complex regions.
* *
* Note also that only reads that are actually informative about the haplotypes are emitted. By informative we mean * Note also that only reads that are actually informative about the haplotypes are emitted. By informative we mean
* that there's a meaningful difference in the likelihood of the read coming from one haplotype compared to * that there's a meaningful difference in the likelihood of the read coming from one haplotype compared to
* its next best haplotype. * its next best haplotype.
* *
* If multiple BAMs are passed as input to the tool (as is common for M2), then they will be combined in the bamout
* output and tagged with the appropriate sample names.
*
* The best way to visualize the output of this mode is with IGV. Tell IGV to color the alignments by tag, * The best way to visualize the output of this mode is with IGV. Tell IGV to color the alignments by tag,
* and give it the HC tag, so you can see which reads support each haplotype. Finally, you can tell IGV * and give it the "HC" tag, so you can see which reads support each haplotype. Finally, you can tell IGV
* to group by sample, which will separate the potential haplotypes from the reads. All of this can be seen in * to group by sample, which will separate the potential haplotypes from the reads. All of this can be seen in
* <a href="https://www.dropbox.com/s/xvy7sbxpf13x5bp/haplotypecaller%20bamout%20for%20docs.png">this screenshot</a> * <a href="https://www.dropbox.com/s/xvy7sbxpf13x5bp/haplotypecaller%20bamout%20for%20docs.png">this screenshot</a>
* *