[debbugs-tracker] bug#19997: closed (Performance differences between Git

emacs-bug-tracker

From:	GNU bug Tracking System
Subject:	[debbugs-tracker] bug#19997: closed (Performance differences between Git sources and release tarballs?)
Date:	Thu, 12 Mar 2015 03:04:01 +0000

Your message dated Wed, 11 Mar 2015 20:02:56 -0700
with message-id <address@hidden>
and subject line closing bug report
has caused the debbugs.gnu.org bug report #19997,
regarding Performance differences between Git sources and release tarballs?
to be marked as done.

(If you believe you have received this mail in error, please contact
address@hidden)


-- 
19997: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=19997
GNU Bug Tracking System
Contact address@hidden with problems

--- Begin Message ---
Subject: Performance differences between Git sources and release tarballs?
Date: Wed, 04 Mar 2015 13:51:32 +1030
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0

G'day,

I'm working on comparing the performance of my Boyer-Moore string search
code in the hstbm[1] project, versus the equivalent code in GNU Grep
(src/kwset.c).  My hope is to expand the very narrow corpus of patterns
and file data types that I've used to date in my testing.  The major risk
in varying a tuned algorithm such as B-M is that pathological data,
and/or important normal cases, may suffer a performance hit as a result
of the variations.  I'm trying to locate, or perhaps build, a corpus,
together with a detailed performance profile, for the limited set of
hardware that I have access to.  Generalising the testing to other
architectures, OSes and/or compiler toolchains is a further target.

(I note that, in the recent past, others have suggested Project
Gutenberg's text of the King James Version of the Bible as one possible
member of such a test corpus.)

I'm looking to use PAPI[2] in order to use hardware counters to help pick
apart where CPU time is spent; but more on that when I have some sentient
results to report.

I decided to start by looking at why GNU Grep was significantly slower than
hstbm when searching for a trivial pattern ("123") in /dev/null.  I knew
from one test (invoking the command 1000 times inside a timed shell script)
that GNU Grep was roughly 20% slower.  I found that "fgrep" (more
precisely, "grep -F") was a significant factor: the grep pattern
compilation was more expensive than the fgrep (src/kwsearch.c) compilation.
NLS is another difference; hstbm does not call "bindtextdomain" or
"textdomain"; but again, more on that another time.
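
A repeated-invocation timing of the kind described above could be sketched
roughly as follows.  This is only an illustrative Python sketch, not the
shell script used for the measurements in this report; the function name
time_invocations, the runs parameter and the default command are
assumptions made for the example.

```python
import subprocess
import sys
import time


def time_invocations(argv, runs=1000):
    """Return total wall-clock seconds for `runs` sequential runs of argv."""
    start = time.perf_counter()
    for _ in range(runs):
        # grep exits with status 1 when nothing matches, so the exit
        # status is deliberately not checked here.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical usage: give the command to time as arguments, e.g.
    #   python3 time_cmd.py grep -F 123 /dev/null
    command = sys.argv[1:] or ["grep", "-F", "123", "/dev/null"]
    runs = 1000
    total = time_invocations(command, runs)
    print(f"{' '.join(command)}: {total:.3f} s for {runs} runs")
```

Because each run starts a new process, a figure obtained this way is
dominated by start-up work (dynamic linking, NLS setup, pattern
compilation) rather than by the search loop itself, which is the cost
under discussion for an empty input such as /dev/null.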

When starting to add instrumentation to GNU Grep (grep.c's main), I found
that it was up to 300% slower, not the 20%ish that I'd measured previously.
After vanishing down a number of rabbit-holes, some to do with GCC's
architecture selection, I've found that the 2.21 tarball has high
performance, whereas the Git head is much, much slower.

(I use a source-based (Gentoo) Linux OS, so am able to dissect the stages
of Gentoo's build steps... a little.)

Looking at Grep's "configure --help" output, I see that
"--enable-gcc-warnings" is an option, and, with some experimentation with
invoking compilation in different environments, it seems that a slew of
warning options is enabled in the development tree that is not enabled in
the release tarball.  This is the case when I naively invoke ./configure
without looking closely at all the possible configuration options.

So, could you please give me some guidance as to why the release tarball
would build so differently to the development (Git head) set of sources?

Apologies in advance if there's some documentation that I overlooked.

thanks,

sur-behoffski (Brenton Hoff)
Programmer, Grouse Software

[1] http://savannah.nongnu.org/projects/hstbm
[2] http://icl.cs.utk.edu/papi/

--- End Message ---

--- Begin Message ---
Subject: closing bug report
Date: Wed, 11 Mar 2015 20:02:56 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0

As the -lpapi problem is now understood to be a side-track issue, I'm closing
the bug report.
--- End Message ---
