578 Commits (07d41efc2b4024eb8b49a726530cab4ab1aa6f88)

Author SHA1 Message Date
Heng Li 07d41efc2b explain secondary/supplementary aln for RNA-seq 2017-11-30 23:02:20 -05:00
Heng Li 984f7846c0 r601: bugfix - a similar issue to r600
This bug unsets the alignment score of suboptimal alignments.
2017-11-30 11:51:34 -05:00
Heng Li af1d6afba9 r600: bugfix - missing secondary alignments (#71)
This should happen very rarely with typical data, but is more likely with
artifactual data.
2017-11-30 11:34:10 -05:00
Heng Li cbdb6c069f when there are incorrect anno, warn but not abort 2017-11-24 12:48:49 -05:00
Heng Li 35b6d9f7d5 direct manpage to HTML 2017-11-24 11:05:20 -05:00
Heng Li 39a9666246 convert PAF to LAST's cigar output
also added the support of BLAST-like output for "--cs=short".
2017-11-18 20:30:44 -05:00
Heng Li 662d05dc02 fixed incorrect var coordinate 2017-11-12 20:25:33 -05:00
Heng Li 379457c18b use mapq threshold 2017-11-12 19:01:34 -05:00
Heng Li 03169d590b print cov-1 regions (not BED output any more) 2017-11-12 18:45:49 -05:00
Heng Li 8c8d446820 find regions covered by one contig 2017-11-12 18:41:01 -05:00
Heng Li 0ddb064f17 dnadiff-like script; improvement coming 2017-11-12 15:07:29 -05:00
Heng Li 131cfc6938 r574: build index without sequences 2017-11-11 21:38:38 -05:00
Heng Li 2f463b1db0 r573: prepare to generalize index 2017-11-11 19:54:06 -05:00
Heng Li 3b518271ee Release minimap2-2.5 (r572) 2017-11-11 11:29:28 -05:00
Heng Li 481d8239e9
Merge pull request #57 from cjw85/strand_typo
Fix typo in strand property
2017-11-10 20:36:19 -05:00
cwright 4f77b0c1ed Fix typo in strand property 2017-11-11 01:18:44 +00:00
Heng Li d7a31e40e6 r569: last commit is buggy 2017-11-09 23:20:41 -05:00
Heng Li dd18cd75de r568: revert - don't take max(dp_max, dp_score) 2017-11-09 23:12:48 -05:00
Heng Li 99a2709913 r567: minor change to #56 2017-11-09 19:17:45 -05:00
Heng Li 032068a747
Merge pull request #56 from mvdbeek/softclipping
Implement -Y for soft clipping of supp. alignments
2017-11-09 19:12:47 -05:00
mvdbeek 1cb0bf4bef Implement -Y for soft clipping of supp. alignments
I tried to base this on bwa-mem and it seems to work for SAM alignments.
2017-11-09 19:22:36 +01:00
Heng Li 422b43374e
Merge pull request #55 from martinghunt/fix_python3_strings
Bug fix with byte strings in Python3
2017-11-09 09:37:41 -05:00
martinghunt 29a26e3eea Bug fix with byte strings in Python3 2017-11-09 13:57:15 +00:00
Heng Li a7b38f6900 r562: fixed a severe bug: wrong query start 2017-11-08 22:31:05 -05:00
Heng Li e896c9ec05 r559: prefer a chain involving more segments 2017-11-08 13:22:16 -05:00
Heng Li 98ba8928c6 r558: dp_max no less than dp_score 2017-11-08 10:06:10 -05:00
Heng Li bcf8462d20
Merge pull request #52 from cvdelannoy/patch-1
Update README.md
2017-11-08 07:50:31 -05:00
Carlos de Lannoy c047c852ce
Update README.md
25: changed -x ava-one to -x ava-ont
2017-11-08 13:07:01 +01:00
Heng Li b24d68ae9f r557: fixed another mapq underestimate
When a chain is split during base-level alignment, its chaining score is
reduced. However, the chaining score of its suboptimal chain remains the same.
This leads to underestimated mapping quality.
2017-11-07 23:20:49 -05:00
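The effect described in r557 can be illustrated with a toy score. This is a minimal sketch, not minimap2's actual mapQ formula: the function `toy_mapq`, its `cap` parameter, and the linear scaling are all assumptions made for illustration. It only shows why lowering the best chain's score while leaving the runner-up's score unchanged pulls the estimate down. The commit message does not say how the fix was implemented, so no fix is sketched here.

```python
def toy_mapq(best, second, cap=60):
    """Toy mapping quality (hypothetical, NOT minimap2's formula).

    The value grows as the best chaining score pulls away from the
    second-best (suboptimal) chaining score.
    """
    if best <= 0:
        return 0
    second = min(max(second, 0), best)  # clamp runner-up into [0, best]
    return cap * (best - second) // best


# Before the split: best chain scores 200, suboptimal chain scores 100.
before = toy_mapq(200, 100)

# After the split, the best chain's score is reduced (say to 120),
# while the suboptimal chain's score is left at 100.
after = toy_mapq(120, 100)

print(before, after)
```

In this toy setting the quality falls even though the evidence for the alignment has not changed, which is the kind of underestimate the commit describes.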
Heng Li 65deedfa96 r556: bugfix - underestimate mapq for split aln 2017-11-07 22:37:12 -05:00
Heng Li 21a46ba652 Release minimap2-2.4 (r555) 2017-11-06 12:54:02 -05:00
Heng Li 1617b87ee1 this will become version 3 at arXiv 2017-11-06 10:57:12 -05:00
Heng Li 2191ac58ad two discussion paragraphs; need one more 2017-11-05 12:27:52 -05:00
Heng Li fa5a645ca5 r552: fixed a tiny typo on struct packing
The old packing wastes memory, though very little.
2017-11-05 08:27:26 -05:00
Heng Li c2b09356b8 moved conclusion to result; need new discussion 2017-11-04 22:19:46 -04:00
Heng Li a3f0aa1d5b r550: fixed -L issues with secondary and supp aln 2017-11-04 12:13:38 -04:00
Heng Li 52ffbc9e0c added bowtie2 and snap versions 2017-11-02 15:51:23 -04:00
Heng Li a9790c0f1d added two bowtie2 numbers for comparison 2017-11-02 15:44:48 -04:00
Heng Li d0ac78ac08 updated the tech note 2017-11-02 15:37:24 -04:00
Heng Li 22290db3e4 r546: minor mapQ tuning 2017-11-01 13:20:39 -04:00
Heng Li cd24dc8834 r545: removed option -i, not working well 2017-10-31 22:23:27 -04:00
Heng Li b8e758df0f r544: increased PE mapQ 2017-10-31 16:55:02 -04:00
Heng Li 311fa90030 r543: applied some sr mapq changes to long reads 2017-10-31 15:24:05 -04:00
Heng Li fb8a1b5536 r542: tuning mapQ calculation 2017-10-31 14:25:09 -04:00
Heng Li 7f11f4c4d4 Instructions on different long RNA-seq techs 2017-10-29 13:58:25 -04:00
Heng Li 285eb0da05 r540: removed a buggy debugging line 2017-10-29 00:02:41 -04:00
Heng Li 192217a10c r539: use --splice-flank=yes by default
In human/mouse, the GTr..yAG pattern occurs in 91%/92% of all GT-AG introns.
Modeling r..y clearly leads to higher accuracy. However, in SIRV, this
percentage is reduced to ~60%, and the default "--splice --splice-flank=yes"
leads to lower accuracy there. If someone benchmarks minimap2 on SIRV, this
would be bad, but minimap2 is developed for practical applications, not for
benchmarks. I will live with that.
2017-10-28 22:29:55 -04:00
Heng Li f22a94e868 r538: fixed a long existing bug in HPC k-mer (#47)
This bug may lead to a wrong minimizer when a homopolymer-compressed (HPC)
k-mer is longer than 256bp. When a seed match involves this wrong HPC k-mer,
the actual seed sequences do not in fact match. This violates the assumption
in align.c and subsequently causes a segfault, which is what #47 has caught.
This bug lurked in the earliest piece of code and affected all released
minimap2 versions so far. It is extremely rare and does not affect the
prebuilt GRCh37/38 indices.
2017-10-28 19:21:10 -04:00
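To show how an HPC k-mer can span far more than k bases of the original sequence, here is a minimal sketch of homopolymer compression. The helper names `hpc_runs` and `hpc_kmer_spans` are hypothetical and this is not the code in minimap2. The commit message does not state the root cause of the bug, so the sketch only demonstrates that long homopolymer runs produce spans above 256bp.

```python
from itertools import groupby


def hpc_runs(seq):
    """Collapse each homopolymer run to one base.

    Returns the compressed string and the length of each original run.
    """
    bases, lengths = [], []
    for base, run in groupby(seq):
        bases.append(base)
        lengths.append(sum(1 for _ in run))
    return "".join(bases), lengths


def hpc_kmer_spans(seq, k):
    """Yield (hpc_kmer, start, span) in original-sequence coordinates."""
    bases, lengths = hpc_runs(seq)
    starts = [0]  # prefix sums: original start offset of each run
    for n in lengths:
        starts.append(starts[-1] + n)
    for i in range(len(bases) - k + 1):
        yield bases[i:i + k], starts[i], starts[i + k] - starts[i]


# A short example: "AAACCGTT" compresses to "ACGT".
print(list(hpc_kmer_spans("AAACCGTT", 3)))

# A 300-base homopolymer makes a 3-mer span more than 256 bases.
print(list(hpc_kmer_spans("A" * 300 + "CG", 3)))
```

Any code that tracks the original span of an HPC k-mer has to cope with values much larger than k, which is consistent with the bug only appearing on very long homopolymer stretches.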
Heng Li 79b0caca95 r537: model the next base to GT/AG
[PMID:18688272] shows that the base following GT tends to be A or G (i.e. R) in
both human and yeast, and that the base preceding AG tends to be C or T (i.e.
Y). In the new model, we pay no cost to GTr..yAG, but we pay half of the cost
if there is no r or y. This improves the junction accuracy when mapping to
human and mouse and decreases the accuracy when mapping to SIRV. My guess is
that SIRV does not honor this trend. This needs further investigation.

Also in this commit, --cost-non-gt-ag is aliased to -C. The default is changed
to 9 instead of 5. I also added --splice-flank to enable the above model. This
may become the default once I confirm my hypothesis on SIRV.
2017-10-28 00:25:01 -04:00
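The r537 scoring rule can be sketched as follows. This is one reading of the commit message, not the implementation in minimap2: it assumes the donor (GT) and acceptor (AG) sides are charged independently, that "half of the cost" is an integer halving, and that a side lacking GT or AG pays the full non-GT-AG cost (default 9, per the message). The function names are hypothetical.

```python
PURINES = {"A", "G"}      # IUPAC code R
PYRIMIDINES = {"C", "T"}  # IUPAC code Y


def donor_cost(intron, noncanonical_cost=9):
    """Cost at the 5' end of an intron, which is expected to start with GT."""
    if intron[:2] != "GT":
        return noncanonical_cost
    # GT followed by a purine (GTr) is free; otherwise pay half (assumed).
    return 0 if intron[2:3] in PURINES else noncanonical_cost // 2


def acceptor_cost(intron, noncanonical_cost=9):
    """Cost at the 3' end of an intron, which is expected to end with AG."""
    if intron[-2:] != "AG":
        return noncanonical_cost
    # AG preceded by a pyrimidine (yAG) is free; otherwise pay half (assumed).
    return 0 if intron[-3:-2] in PYRIMIDINES else noncanonical_cost // 2


def splice_cost(intron, noncanonical_cost=9):
    """Total splice-signal cost of an intron on the transcript strand."""
    intron = intron.upper()
    return donor_cost(intron, noncanonical_cost) + acceptor_cost(intron, noncanonical_cost)


print(splice_cost("GTAAGTCCCTAG"))  # GTa...tAG: both flanks follow the trend
print(splice_cost("GTCAAAAAGAG"))   # GTc...gAG: GT-AG but neither flank does
print(splice_cost("CTAAGTCCCTAG"))  # donor is not GT
```

Under these assumptions an intron matching GTr..yAG costs nothing, a plain GT-AG intron without the flanking r or y pays a reduced cost on each side, and a non-GT-AG end pays the full cost.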
Heng Li afc2f2e84b r536: removed an unnecessary assert() 2017-10-24 21:08:54 -04:00