From: | Paul Eggert |
Subject: | bug#17229: [PATCH 2/2] grep: speed-up by using memchr() in Boyer-Moore searching |
Date: | Sun, 27 Apr 2014 13:32:57 -0700 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0 |
Norihiro Tanaka wrote:
> Could you try above cases?
Thanks, you're observing a 2.7x performance speedup with macros on your platform and your benchmark. With the same patch, I observed only a 1.18x speedup on the same benchmark. As usual, I'm testing with AMD Phenom II X4 910e + GCC 4.9.0 + Fedora 20 + default (-O2) optimization. I'm curious about why you're observing a much bigger performance difference with macros. What platform are you using?
Anyway, an 18% speedup is still a speedup, so I looked into it. GCC 4.9.0 misses a non-obvious opportunity for function inlining. I installed a tweak (attached) that should make the inlining opportunity obvious to compilers nowadays. On my platform this gave a 28% speedup, i.e., a bit better than the macro-using patch would have.
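To illustrate the macro-versus-inlining trade-off discussed above, here is a minimal sketch. It is not the attached patch and not the kwset code: the names FIND_LAST_BYTE_MACRO, find_last_byte and search are hypothetical, and the search loop is a simplified memchr-driven scan rather than real Boyer-Moore. It only shows that a small `static inline` helper gives the compiler the same opportunity as a macro while keeping type checking.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical macro form: expanded textually at every use site.  */
#define FIND_LAST_BYTE_MACRO(p, c, n) ((const char *) memchr ((p), (c), (n)))

/* Hypothetical function form: small and 'static inline', so an
   optimizing compiler can easily substitute the body at the call site.  */
static inline const char *
find_last_byte (const char *p, int c, size_t n)
{
  return memchr (p, c, n);
}

/* Simplified search (an assumption for illustration, not Boyer-Moore):
   use memchr to jump to each occurrence of the pattern's last byte,
   then verify the bytes before it.  Returns the match start or NULL.  */
static const char *
search (const char *text, size_t tlen, const char *pat, size_t plen)
{
  if (plen == 0)
    return text;
  if (tlen < plen)
    return NULL;

  const char *end = text + tlen;
  /* First position where the pattern's last byte could line up.  */
  const char *p = text + plen - 1;

  while (p < end)
    {
      p = find_last_byte (p, (unsigned char) pat[plen - 1], end - p);
      if (!p)
        return NULL;
      if (memcmp (p - (plen - 1), pat, plen - 1) == 0)
        return p - (plen - 1);
      p++;
    }
  return NULL;
}
```

Whether the function form is actually inlined depends on the compiler and optimization level, so any speedup should be measured on the target platform rather than assumed.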
0001-kwset-improve-performance-by-inlining-more.patch
Description: Text document