minimap2/misc
Heng Li a633a744b6 more doc in misc 2018-02-02 14:07:17 -05:00
README.md more doc in misc 2018-02-02 14:07:17 -05:00
cnt-feat.js allow to exclude regions 2018-01-07 22:35:56 -05:00
gff2bed.js Documented some k8 scripts; more coming 2018-02-02 13:57:08 -05:00
intron-eval.js when there are incorrect anno, warn but not abort 2017-11-24 12:48:49 -05:00
mapstat.js backup manuscript 2017-08-24 22:05:14 +08:00
ov-eval.js renamed paf2ovlp to ov-eval 2017-12-24 18:04:00 -05:00
paf2aln.js Documented some k8 scripts; more coming 2018-02-02 13:57:08 -05:00
paf2diff.js fixed typos 2018-01-30 10:11:50 -05:00
sam2paf.js Documented some k8 scripts; more coming 2018-02-02 13:57:08 -05:00
sim-eval.js Documented some k8 scripts; more coming 2018-02-02 13:57:08 -05:00
sim-mason2.js Documented some k8 scripts; more coming 2018-02-02 13:57:08 -05:00
sim-pbsim.js Documented some k8 scripts; more coming 2018-02-02 13:57:08 -05:00
splice2bed.js Documented some k8 scripts; more coming 2018-02-02 13:57:08 -05:00

README.md

Getting Started

curl -L https://github.com/attractivechaos/k8/releases/download/v0.2.4/k8-0.2.4.tar.bz2 | tar -jxf -
cp k8-0.2.4/k8-`uname -s` k8   # or better copy to a directory on PATH
minimap2 --cs test/MT-*.fa | paf2aln.js - | less   # pretty print base alignment
sam2paf.js aln.sam.gz | less -S                    # convert SAM to PAF
gff2bed.js anno.gtf | less -S                      # convert GTF/GFF3 to BED12
minimap2 -cx splice ref.fa rna-seq.fq | splice2bed.js -   # convert splice aln to BED12

Introduction

This directory contains auxiliary scripts for format conversion, mapping accuracy evaluation and miscellaneous purposes. These scripts require the k8 JavaScript shell to run. On Linux or Mac, you can download the precompiled k8 binary with:

curl -L https://github.com/attractivechaos/k8/releases/download/v0.2.4/k8-0.2.4.tar.bz2 | tar -jxf -
cp k8-0.2.4/k8-`uname -s` k8

It is highly recommended to copy the k8 executable to a directory on your PATH so that /usr/bin/env can find it.

Use Cases

paf2aln.js: convert PAF to other formats

Script paf2aln.js converts PAF with the cs tag to MAF or BLAST-like output. It only works with minimap2 output generated with the --cs option.
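
To make the role of the cs tag more concrete, here is a minimal sketch that splits a cs string into operations and counts how many reference and query bases it covers. It is not the code in paf2aln.js. The helper name parseCs is hypothetical, the grammar is assumed from the cs description in the minimap2 manual, and the snippet is written for Node.js rather than k8.

```javascript
// Hypothetical helper: tokenize a cs string (without the "cs:Z:" prefix).
// Assumed grammar: ":N" match, "=seq" match (long form), "*xy" substitution
// (ref x to query y), "+seq" insertion, "-seq" deletion, "~xxNyy" intron.
function parseCs(cs) {
  const re = /(=[A-Za-z]+|:[0-9]+|\*[a-z][a-z]|[+-][A-Za-z]+|~[a-z]{2}[0-9]+[a-z]{2})/g;
  const ops = [];
  let m, consumed = 0;
  while ((m = re.exec(cs)) !== null) {
    // Refuse to silently skip text the grammar does not cover.
    if (m.index !== consumed) throw new Error("unrecognized cs text at offset " + consumed);
    consumed = re.lastIndex;
    const type = m[0][0], body = m[0].slice(1);
    if (type === ":") { const n = parseInt(body, 10); ops.push({ op: "match", refLen: n, qryLen: n }); }
    else if (type === "=") ops.push({ op: "match", refLen: body.length, qryLen: body.length });
    else if (type === "*") ops.push({ op: "sub", ref: body[0], qry: body[1], refLen: 1, qryLen: 1 });
    else if (type === "+") ops.push({ op: "ins", seq: body, refLen: 0, qryLen: body.length });
    else if (type === "-") ops.push({ op: "del", seq: body, refLen: body.length, qryLen: 0 });
    else ops.push({ op: "intron", refLen: parseInt(body.slice(2, -2), 10), qryLen: 0 });
  }
  if (consumed !== cs.length) throw new Error("unrecognized cs text at offset " + consumed);
  return ops;
}

// Example: total reference and query bases covered by one cs string.
const csOps = parseCs(":6-ata:10+gtc:4*at:3");
const csRef = csOps.reduce((s, o) => s + o.refLen, 0);
const csQry = csOps.reduce((s, o) => s + o.qryLen, 0);
console.log(csOps.length, csRef, csQry);
```

A converter needs exactly this information (the operation type, the bases involved and the lengths) to lay out the two aligned sequences, which is why PAF without the cs tag cannot be converted.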

gff2bed.js: convert GTF/GFF3 to BED12 format

Script gff2bed.js converts gene annotation in GFF format to 12-column BED format. It works with both GTF and GFF3.
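
The core of such a conversion is a coordinate change: GTF/GFF3 exons are 1-based and inclusive, while BED12 stores a 0-based start plus block sizes and block starts relative to it. The sketch below illustrates this for one transcript. The function exonsToBed12 is hypothetical and is not taken from gff2bed.js. It assumes non-overlapping exons and, for simplicity, sets thickStart and thickEnd to the transcript bounds.

```javascript
// Hypothetical helper: build one BED12 line from a transcript's exons.
// exons: array of [start, end] pairs, 1-based inclusive as in GTF/GFF3.
function exonsToBed12(chrom, name, strand, exons) {
  const sorted = exons.slice().sort((a, b) => a[0] - b[0]);
  const start = sorted[0][0] - 1;               // BED start is 0-based
  const end = sorted[sorted.length - 1][1];     // BED end is exclusive
  const sizes = sorted.map(e => e[1] - e[0] + 1);
  const starts = sorted.map(e => e[0] - 1 - start);
  return [
    chrom, start, end, name, 0, strand,
    start, end,                                 // thickStart, thickEnd (assumption)
    "0,0,0", sorted.length,
    sizes.join(",") + ",", starts.join(",") + ","
  ].join("\t");
}

console.log(exonsToBed12("chr1", "tx1", "+", [[301, 400], [101, 200]]));
```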

splice2bed.js: convert spliced alignment to BED12

Script splice2bed.js converts spliced alignment in SAM or PAF to 12-column BED format.
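
As an illustration of the idea, and not the script's actual code, a spliced SAM record can be turned into BED blocks by walking the CIGAR and starting a new block at every N (intron) operation. The helper cigarToBlocks below is hypothetical. It takes the 1-based SAM POS and returns 0-based, half-open exon blocks on the reference.

```javascript
// Hypothetical helper: split a spliced alignment into exon blocks.
// pos: 1-based leftmost reference position (SAM POS); cigar: SAM CIGAR string.
function cigarToBlocks(pos, cigar) {
  const re = /([0-9]+)([MIDNSHP=X])/g;
  const blocks = [];
  let x = pos - 1, blockStart = x, m;
  while ((m = re.exec(cigar)) !== null) {
    const len = parseInt(m[1], 10), op = m[2];
    if (op === "N") {                 // intron: close the current block
      blocks.push([blockStart, x]);
      x += len;
      blockStart = x;
    } else if (op === "M" || op === "D" || op === "=" || op === "X") {
      x += len;                       // operations that consume the reference
    }                                 // I, S, H and P do not move on the reference
  }
  blocks.push([blockStart, x]);
  return blocks;
}

console.log(cigarToBlocks(1001, "50M200N30M2D20M"));
```

The block list maps directly onto the BED12 blockCount, blockSizes and blockStarts columns.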

sam2paf.js: convert SAM to PAF

Script sam2paf.js converts alignments in the SAM format to PAF.
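
Most of the work in such a conversion is deriving PAF's 0-based coordinates from SAM's POS and CIGAR. The following is a rough sketch under stated assumptions, not the implementation in sam2paf.js. The helper samToPafCoords is hypothetical, and it assumes that PAF query coordinates refer to the original read strand, so soft and hard clips are swapped for reverse-strand records.

```javascript
// Hypothetical helper: derive PAF-style coordinates from SAM POS and CIGAR.
function samToPafCoords(pos, cigar, isReverse) {
  const re = /([0-9]+)([MIDNSHP=X])/g;
  let refLen = 0, qryAligned = 0, clipLead = 0, clipTrail = 0, seenAligned = false, m;
  while ((m = re.exec(cigar)) !== null) {
    const len = parseInt(m[1], 10), op = m[2];
    if (op === "S" || op === "H") {
      if (seenAligned) clipTrail += len; else clipLead += len;
      continue;
    }
    seenAligned = true;
    if (op === "M" || op === "=" || op === "X") { refLen += len; qryAligned += len; }
    else if (op === "I") qryAligned += len;            // consumes query only
    else if (op === "D" || op === "N") refLen += len;  // consumes reference only
  }
  const qLen = clipLead + qryAligned + clipTrail;
  // CIGAR is written along the forward reference strand, so for a
  // reverse-strand record the trailing clip sits at the start of the read.
  const qStart = isReverse ? clipTrail : clipLead;
  return { qLen: qLen, qStart: qStart, qEnd: qStart + qryAligned,
           tStart: pos - 1, tEnd: pos - 1 + refLen };
}

console.log(samToPafCoords(101, "5S10M2D5M3I5M2S", false));
console.log(samToPafCoords(101, "5S10M2D5M3I5M2S", true));
```

The target sequence length, another PAF column, is not available from a single alignment line. It would have to come from the @SQ header lines.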

Evaluating mapping accuracy with simulated reads

Script sim-pbsim.js converts the MAF output of pbsim to FASTQ and encodes the true mapping position in the read name in a format like S1_33!chr1!225258409!225267761!-. Similarly, script sim-mason2.js converts mason2 simulated SAM to FASTQ.
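
For illustration, a read name in this format can be decoded by splitting on "!". The helper parseSimName is hypothetical and the interpretation of the five fields (read id, chromosome, start, end, strand) is an assumption based on the example name. The text above does not say whether the coordinates are 0-based or 1-based, so the sketch does not adjust them.

```javascript
// Hypothetical helper: decode the true position stored in a simulated read name.
function parseSimName(name) {
  const f = name.split("!");
  if (f.length !== 5) throw new Error("unexpected read name: " + name);
  return {
    id: f[0],
    chrom: f[1],
    start: parseInt(f[2], 10),
    end: parseInt(f[3], 10),
    strand: f[4]
  };
}

console.log(parseSimName("S1_33!chr1!225258409!225267761!-"));
```

An evaluation script can then compare these fields with the mapped position of each read to decide whether the mapping is correct.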

Script sim-eval.js evaluates mapped SAM/PAF. Here is example output:

Q       60      32478   0       0.000000000     32478
Q       22      16      1       0.000030775     32494
Q       21      43      1       0.000061468     32537
Q       19      73      1       0.000091996     32610
Q       14      66      1       0.000122414     32676
Q       10      27      3       0.000214048     32703
Q       8       14      1       0.000244521     32717
Q       7       13      2       0.000305530     32730
Q       6       46      1       0.000335611     32776
Q       3       10      1       0.000366010     32786
Q       2       20      2       0.000426751     32806
Q       1       248     94      0.003267381     33054
Q       0       31      17      0.003778147     33085
U       3

where each Q-line gives the quality threshold, the number of reads mapped with mapping quality equal to or greater than the threshold (and below the threshold on the previous line), the number of wrong mappings among them, the cumulative mapping error rate and the cumulative number of mapped reads. The U-line gives the number of unmapped reads if they are present in the SAM file.
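
The two cumulative columns can be reproduced from the per-line counts: the error rate on a line is the running total of wrong mappings divided by the running total of mapped reads. The short sketch below recomputes them for the first three Q-lines of the example. The function accumulate is a hypothetical name and the code only mirrors the arithmetic of the table, not the internals of sim-eval.js.

```javascript
// Hypothetical helper: rebuild the cumulative columns of the Q-lines.
// Each input row is [threshold, readsInBin, wrongInBin].
function accumulate(bins) {
  let mapped = 0, wrong = 0;
  return bins.map(function (b) {
    mapped += b[1];
    wrong += b[2];
    return { q: b[0], n: b[1], wrong: b[2], errRate: wrong / mapped, mapped: mapped };
  });
}

const qBins = [[60, 32478, 0], [22, 16, 1], [21, 43, 1]];
accumulate(qBins).forEach(function (r) {
  console.log(["Q", r.q, r.n, r.wrong, r.errRate.toFixed(9), r.mapped].join("\t"));
});
```

For the third line this gives 2 wrong mappings out of 32537 mapped reads, which matches the 0.000061468 shown in the example output.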