561 Commits (dd18cd75de3fe3f06e9f5d1e651d153e665dec3e)

Author SHA1 Message Date
Heng Li dd18cd75de r568: revert - don't take max(dp_max, dp_score) 2017-11-09 23:12:48 -05:00
Heng Li 99a2709913 r567: minor change to #56 2017-11-09 19:17:45 -05:00
Heng Li 032068a747
Merge pull request #56 from mvdbeek/softclipping
Implement -Y for soft clipping of supp. alignments
2017-11-09 19:12:47 -05:00
mvdbeek 1cb0bf4bef Implement -Y for soft clipping of supp. alignments
I tried to base this on bwa-mem and it seems to work for SAM alignments.
2017-11-09 19:22:36 +01:00
Heng Li 422b43374e
Merge pull request #55 from martinghunt/fix_python3_strings
Bug fix with byte strings in Python3
2017-11-09 09:37:41 -05:00
martinghunt 29a26e3eea Bug fix with byte strings in Python3 2017-11-09 13:57:15 +00:00
Heng Li a7b38f6900 r562: fixed a severe bug: wrong query start 2017-11-08 22:31:05 -05:00
Heng Li e896c9ec05 r559: prefer a chain involving more segments 2017-11-08 13:22:16 -05:00
Heng Li 98ba8928c6 r558: dp_max no less than dp_score 2017-11-08 10:06:10 -05:00
Heng Li bcf8462d20
Merge pull request #52 from cvdelannoy/patch-1
Update README.md
2017-11-08 07:50:31 -05:00
Carlos de Lannoy c047c852ce
Update README.md
25: changed -x ava-one to -x ava-ont
2017-11-08 13:07:01 +01:00
Heng Li b24d68ae9f r557: fixed another mapq underestimate
When a chain is split during base-level alignment, its chaining score is
reduced. However, the chaining score of its suboptimal chain remains the same.
This leads to underestimated mapping quality.
2017-11-07 23:20:49 -05:00
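The effect described in r557 can be illustrated with a toy calculation. This is only a sketch: `toy_mapq` is a hypothetical formula invented for illustration, not minimap2's mapQ computation, and the proportional rescaling at the end is one conceivable correction, not necessarily what r557 does.

```python
def toy_mapq(best_score, sub_score, max_mapq=60):
    """Toy mapping quality (hypothetical, NOT minimap2's formula).

    The larger the gap between the best chain and the suboptimal chain,
    the higher the quality. Integer arithmetic keeps the result exact.
    """
    if best_score <= 0 or sub_score >= best_score:
        return 0
    return max_mapq * (best_score - sub_score) // best_score


# Before base-level alignment: best chain scores 100, suboptimal chain 40.
before_split = toy_mapq(100, 40)

# The best chain is split and the retained piece scores only 60,
# but the suboptimal score is left at its old value of 40.
stale_sub = toy_mapq(60, 40)

# Hypothetical correction: shrink the suboptimal score by the same factor.
rescaled_sub = toy_mapq(60, 40 * 60 // 100)

print(before_split, stale_sub, rescaled_sub)
```

With the stale suboptimal score the toy quality drops even though the relative evidence for the best chain has not changed, which is the underestimate the commit message describes.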
Heng Li 65deedfa96 r556: bugfix - underestimate mapq for split aln 2017-11-07 22:37:12 -05:00
Heng Li 21a46ba652 Release minimap2-2.4 (r555) 2017-11-06 12:54:02 -05:00
Heng Li 1617b87ee1 this will become version 3 at arXiv 2017-11-06 10:57:12 -05:00
Heng Li 2191ac58ad two discussion paragraphs; need one more 2017-11-05 12:27:52 -05:00
Heng Li fa5a645ca5 r552: fixed a tiny typo on struct packing
The old packing wastes memory, though very little.
2017-11-05 08:27:26 -05:00
Heng Li c2b09356b8 moved conclusion to result; need new discussion 2017-11-04 22:19:46 -04:00
Heng Li a3f0aa1d5b r550: fixed -L issues with secondary and supp aln 2017-11-04 12:13:38 -04:00
Heng Li 52ffbc9e0c added bowtie2 and snap versions 2017-11-02 15:51:23 -04:00
Heng Li a9790c0f1d added two bowtie2 numbers for comparison 2017-11-02 15:44:48 -04:00
Heng Li d0ac78ac08 updated the tech note 2017-11-02 15:37:24 -04:00
Heng Li 22290db3e4 r546: minor mapQ tuning 2017-11-01 13:20:39 -04:00
Heng Li cd24dc8834 r545: removed option -i, not working well 2017-10-31 22:23:27 -04:00
Heng Li b8e758df0f r544: increased PE mapQ 2017-10-31 16:55:02 -04:00
Heng Li 311fa90030 r543: applied some sr mapq changes to long reads 2017-10-31 15:24:05 -04:00
Heng Li fb8a1b5536 r542: tuning mapQ calculation 2017-10-31 14:25:09 -04:00
Heng Li 7f11f4c4d4 Instructions on different long RNA-seq techs 2017-10-29 13:58:25 -04:00
Heng Li 285eb0da05 r540: removed a buggy debugging line 2017-10-29 00:02:41 -04:00
Heng Li 192217a10c r539: use --splice-flank=yes by default
In human/mouse, the GTr..yAG pattern occurs in 91/92% of all GT-AG introns.
Modeling r..y clearly leads to higher accuracy. However, in SIRV, this
percentage is reduced to ~60%. The default "--splice --splice-flank=yes"
leads to lower accuracy. If someone benchmarks minimap2 on SIRV, this would be
bad, but minimap2 is developed for practical applications, not for benchmarks.
I will live with that.
2017-10-28 22:29:55 -04:00
Heng Li f22a94e868 r538: fixed a long existing bug in HPC k-mer (#47)
This bug may lead to a wrong minimizer when a HPC k-mer is longer than 256bp.
When there is a seed match involving this wrong HPC k-mer, the correct seed
sequences do not match in fact. This violates the assumption in align.c and
subsequently causes a segfault, which is what #47 has caught. This bug lurked
in the earliest piece of code and affected all released minimap2 versions so
far. It is extremely rare and does not affect the prebuilt GRCh37/38 indices.
2017-10-28 19:21:10 -04:00
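For readers unfamiliar with homopolymer-compressed (HPC) k-mers, the following minimal sketch shows why the span of an HPC k-mer on the original sequence can greatly exceed k. The function names are hypothetical and this is not the code in minimap2; it only illustrates the idea of collapsing runs of identical bases.

```python
def hpc_compress(seq):
    """Collapse each run of identical bases into a single base."""
    out = []
    for base in seq:
        if not out or out[-1] != base:
            out.append(base)
    return "".join(out)


def hpc_kmer_span(seq, start, k):
    """Number of original bases covered by the HPC k-mer starting at seq[start].

    `start` is assumed to be the first base of a homopolymer run.
    Returns None if fewer than k runs remain in the sequence.
    """
    i, runs, n = start, 0, len(seq)
    while i < n and runs < k:
        j = i
        while j < n and seq[j] == seq[i]:
            j += 1  # skip to the end of the current run
        runs += 1
        i = j
    return i - start if runs == k else None


print(hpc_compress("GATTACAAAG"))      # GATACAG
print(hpc_kmer_span("GATTACAAAG", 0, 3))  # 4

# A 300 bp homopolymer inside a 4-mer gives a span well above 256 bp.
long_seq = "AC" + "G" * 300 + "TA"
print(hpc_kmer_span(long_seq, 0, 4))   # 303
```

Any code that stores or reasons about the span must therefore cope with values much larger than k.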
Heng Li 79b0caca95 r537: model the next base to GT/AG
[PMID:18688272] shows that the base following GT tends to be A or G (i.e. R) in
both human and yeast, and that the base preceding AG tends to be C or T (i.e.
Y). In the new model, we pay no cost to GTr..yAG, but we pay half of the cost
if there is no r or y. This improves the junction accuracy when mapping to
human and mouse and decreases the accuracy when mapping to SIRV. My guess is
that SIRV does not honor this trend. Need to investigate in future.

Also in this commit, --cost-non-gt-ag is aliased to -C. The default is changed
to 9 instead of 5. I also added --splice-flank to enable the above model. This
may become the default once I confirm my hypothesis on SIRV.
2017-10-28 00:25:01 -04:00
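The cost model described in r537 can be sketched roughly as below. This is a simplified reading of the commit message, not the implementation in minimap2: the function name is hypothetical, the intron is scored as a whole rather than per donor/acceptor site, only the forward strand is handled, and the integer halving of the cost is an assumption.

```python
def splice_signal_cost(intron, noncanonical_cost=9):
    """Toy splice-signal penalty for a forward-strand intron sequence.

    noncanonical_cost mirrors the default of 9 for --cost-non-gt-ag (-C)
    mentioned in the commit message. Rounding of the half cost is assumed.
    """
    intron = intron.upper()
    if len(intron) < 6:
        raise ValueError("intron too short to hold GTr..yAG")
    if not (intron.startswith("GT") and intron.endswith("AG")):
        return noncanonical_cost          # not GT..AG: full cost
    has_r = intron[2] in "AG"             # base following GT is a purine
    has_y = intron[-3] in "CT"            # base preceding AG is a pyrimidine
    if has_r and has_y:
        return 0                          # GTr..yAG: no cost
    return noncanonical_cost // 2         # GT..AG lacking r or y: half cost


print(splice_signal_cost("GTAAGTCCTTCAG"))  # 0
print(splice_signal_cost("GTCAGTCCTTCAG"))  # 4
print(splice_signal_cost("GCAAGTCCTTCAG"))  # 9
```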
Heng Li afc2f2e84b r536: removed an unnecessary assert() 2017-10-24 21:08:54 -04:00
Heng Li e6f66f2f3b disabled download counts
no longer seems to work
2017-10-24 14:40:08 -04:00
Heng Li 70735098e2 fixed a typo in README 2017-10-24 14:39:33 -04:00
Heng Li d4b5dfc297 r533: added --no-pairing
to prevent the use of any pairing information for paired-end reads.
2017-10-23 14:09:32 -04:00
Heng Li 5acd709524 updated the download link to v2.3 2017-10-23 13:43:31 -04:00
Heng Li 306e4541f8 Released minimap2-2.3 (r531) 2017-10-22 23:13:35 -04:00
Heng Li 1dd221ad82 a bit more on short read mapping
The tech note still needs improvement. Will do that after the release of v2.3.
2017-10-22 18:38:35 -04:00
Heng Li c6b6392b70 minor wording changes 2017-10-21 23:46:36 -04:00
Heng Li dc37aee881 minor wording changes 2017-10-21 23:38:05 -04:00
Heng Li 37e627aa98 note on long cigar in README 2017-10-21 22:28:06 -04:00
Heng Li beeb806829 r526: fixed a bug when HPC is in use
It happened when the query HPC minimizer is longer than the reference HPC
minimizer close to the beginning of a contig. We may get a negative coordinate,
which causes an assertion failure.
2017-10-21 19:54:04 -04:00
Heng Li be7f3c4ffe r525: fixed a bug in chaining; handle ovlp ends 2017-10-20 21:34:52 -04:00
Heng Li bd04372873 r524: reverted to bwa-mem end bonus
and reduced the cost of clipping when filtering by identity
2017-10-20 16:57:31 -04:00
Heng Li 15ed0712c2 r523: fixed a performance bug in ksw2_ll
Won't affect accuracy.
2017-10-20 13:00:10 -04:00
Heng Li 55dcbefe87 updated text (unfinished) 2017-10-20 12:44:54 -04:00
Heng Li 8abba332ad replaced mapQ plot with sr roc
figure legend and text to be updated later
2017-10-19 23:43:17 -04:00
Heng Li 4683da2455 r520: added option -L to write long cigar to CG 2017-10-17 17:32:44 -04:00
Heng Li ffd953029f r519: fixed a severe bug that misses long alns 2017-10-17 15:52:36 -04:00