[PATCH 0/4] faster gnulib-tool

bug-gnulib

From:	Ralf Wildenhues
Subject:	[PATCH 0/4] faster gnulib-tool
Date:	Sun, 28 Dec 2008 11:15:32 +0100
User-agent:	Mutt/1.5.18 (2008-05-17)

Hello, and I hope you're all having or have had nice holidays,

here's a short patch series to speed up common gnulib-tool usage a bit:

1) cache module metainformation.

The first observation is that the bulk of the forks in a typical --update
are spent on 'sed' parsing the module metainformation files.  So let's
cache them: contents are parsed into shell variables.

The cache variable names consist of 'c_' plus the flattened module name.
For Bash, the function to flatten the name uses ${var//subst/repl} to
avoid forking for module names that contain non-alphanumeric characters
(such as '/').
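
To illustrate the naming scheme, here is a minimal sketch of such a
flattening function.  The names func_cache_var and $cachevar are
made up for this example and need not match the patch; the fallback
branch shows one way a shell without ${var//subst/repl} could get the
same result with 'sed'.

```shell
# Hypothetical helper: map a module name to a cache variable name.
# The result is left in $cachevar, so callers need no command substitution.
if (f=a/b-c; eval 'test "${f//[!a-zA-Z0-9_]/_}" = a_b_c') 2>/dev/null; then
  # Pattern substitution is available (e.g. Bash): no fork at all.
  # The definition is wrapped in eval so that other shells never parse it.
  eval 'func_cache_var ()
  {
    cachevar=c_${1//[!a-zA-Z0-9_]/_}
  }'
else
  # Portable fallback: fork sed only when the name needs flattening.
  func_cache_var ()
  {
    case $1 in
      *[!a-zA-Z0-9_]*)
        cachevar=c_`echo "$1" | LC_ALL=C sed -e 's/[^a-zA-Z0-9_]/_/g'` ;;
      *)
        cachevar=c_$1 ;;
    esac
  }
fi

func_cache_var sys/stat
echo "$cachevar"
```

Either branch should yield the same variable name; only the number of
forks differs.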

FWIW, the values of $lookedup_file and $lookedup_tmp are not cached;
doing so would, if --local-dir were used and module files patched,
require that the patched files be kept (and not overwritten) for the
duration of the script.  I have checked that no call site uses the
lookedup_{file,tmp} values for the module metainformation files, so
we don't have to worry about this.


By itself, this patch does not help; it even slows gnulib-tool down
(see timings below), because a lot of the module file reading happens in
subshells, which cannot populate the parent shell's cache.
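
The subshell effect can be reproduced in isolation.  This is only a
sketch: func_lookup, $c_demo and $result are hypothetical names, and
the "parsing" is a plain assignment standing in for the real work.

```shell
# Hypothetical cached lookup: fills $c_demo on first use, returns in $result.
c_demo=
func_lookup ()
{
  if test -z "$c_demo"; then
    c_demo="parsed contents"      # stands in for the expensive sed parse
  fi
  result=$c_demo
}

# Called through command substitution: the function runs in a subshell,
# so its assignment to c_demo is lost when the subshell exits.
value=`func_lookup; echo "$result"`
after_subshell=$c_demo
test -z "$after_subshell" && echo "cache still empty after subshell call"

# Called directly: the assignment happens in the current shell and sticks.
func_lookup
echo "cache now holds: $c_demo"
```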


2) avoid forks with func_get_* functions.

This patch turns (1) into a speed boost, by eliminating lots of forks
related to calls of the func_get_* functions, thus allowing the cache
to be used a few times in a typical --update or --test operation.
(Of course the additional fork elimination itself also helps.  :-)
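
As a rough sketch of the general idea (not the actual code from the
patch), a getter can hand back its result in a shell variable rather
than on stdout, so the caller does not need `...` and the cache stays
in the current shell.  All function and variable names below are
invented for the example, and the module names are assumed to be
valid identifiers already.

```shell
parse_count=0

# Hypothetical stand-in for the expensive sed-based parse.
func_parse_description ()
{
  parse_count=$((parse_count + 1))
  parsed="description of $1"
}

# Hypothetical getter: leaves its result in $description and caches it
# per module in a variable named c_<module>.
func_get_description_var ()
{
  eval "description=\${c_$1-}"
  if test -z "$description"; then
    func_parse_description "$1"
    description=$parsed
    eval "c_$1=\$description"
  fi
}

# Three requests for the same module, but only one parse.
for m in stdio stdio stdio; do
  func_get_description_var "$m"
done
echo "$description (parse count: $parse_count)"
```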


3) abort loops early where possible.

A couple of loops only test for the presence of some condition and have
no other side effects; they can be aborted as soon as we have a definite
answer.
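
The pattern looks roughly like the following sketch.  The predicate
(is there a header file among the arguments?) and all names are
hypothetical and chosen only to show the early exit.

```shell
# Hypothetical predicate: is any of the given files a header?
# Sets $have_header; $examined counts how many items were looked at.
func_has_header ()
{
  have_header=false
  examined=0
  for f in "$@"; do
    examined=$((examined + 1))
    case $f in
      *.h)
        have_header=true
        break ;;          # the answer is definite, skip the remaining items
    esac
  done
}

func_has_header lib/foo.c lib/foo.h m4/foo.m4 lib/bar.c
echo "$have_header after $examined of 4 items"
```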

4) faster string handling for Posix shells.

This introduces a shell function for splitting off literal prefixes and
suffixes from strings, avoiding 'sed' when the shell is Posixy enough
(idea copied from Libtool).
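
For readers unfamiliar with the technique, here is a sketch in the
same spirit.  It is not the function from the patch or from Libtool;
func_strip_affixes and $stripped are hypothetical names.  The idea is
to probe once for ${var#prefix} and ${var%suffix} and to fall back to
'sed' otherwise.

```shell
# Usage: func_strip_affixes PREFIX SUFFIX STRING
# Removes literal PREFIX and SUFFIX from STRING; result in $stripped.
if (f=lib/foo.c
    eval 'test "${f#lib/}" = foo.c && test "${f%.c}" = lib/foo') 2>/dev/null
then
  # Posixy shell: pure parameter expansion, no fork.  Quoting "$1" and
  # "$2" inside the expansion makes them match literally, not as globs.
  eval 'func_strip_affixes ()
  {
    stripped=${3#"$1"}
    stripped=${stripped%"$2"}
  }'
else
  # Fallback: escape regex metacharacters first, then let sed strip.
  func_strip_affixes ()
  {
    quote='s/[][\\.*^$%]/\\&/g'
    p=`echo "$1" | sed -e "$quote"`
    s=`echo "$2" | sed -e "$quote"`
    stripped=`echo "$3" | sed -e "s%^$p%%" -e "s%$s"'$%%'`
  }
fi

func_strip_affixes lib/ .c lib/foo.c
echo "$stripped"
```

An empty PREFIX or SUFFIX leaves that end of the string untouched in
both branches.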


I have tested the changes with M4, using 'gnulib-tool --test', and on a
couple of other packages using gnulib, and ensured that the only change
they cause is the harmless removal of some empty lines in generated files.
Testing was done on GNU/Linux using bash and pdksh, and on Solaris using ksh.

The patches are posted using 'git format-patch' so they can be fed
directly into 'git am', for those so inclined.

OK to apply?


The whole series gives me about 50% improvement for
  gnulib-tool --update

on the git M4 tree:
  before:  21.63 s
  after 1: 27.46 s
  after 2: 16.46 s
  after 3: 12.94 s
  after 4: 10.83 s

With
  gnulib-tool --with-tests --test

there is about 20% improvement (a couple of minutes), but note that
this also runs the other autotools, configure and make.  (1) and (2)
slightly slow down things like
  gnulib-tool --extract-description ...

but since these modes are typically faster than the other modes,
I consider that an acceptable trade-off.  Otherwise, one could also
reorganize gnulib-tool a bit so that it can use one script on all
modules in question, like
  sed "$sed_extract_license_only" $modules

Doing that throughout the code (i.e., also for --update) would need
more intrusive changes, though.

Thanks,
Ralf

Current Thread

[PATCH 0/4] faster gnulib-tool, Ralf Wildenhues <=
- [PATCH 1/4] gnulib-tool: cache module metainformation., Ralf Wildenhues, 2008/12/28
- [PATCH 2/4] gnulib-tool: avoid forks with func_get_* functions., Ralf Wildenhues, 2008/12/28
- [PATCH 3/4] gnulib-tool: abort loops early where possible., Ralf Wildenhues, 2008/12/28
  - [PATCH 3/4] gnulib-tool: abort loops early where possible., Ralf Wildenhues, 2008/12/29
    - sed, SIGPIPE, cmp -s [Re: [PATCH 3/4] gnulib-tool: abort loops early where possible., Jim Meyering, 2008/12/29
- [PATCH 4/4] gnulib-tool: faster string handling for Posix shells., Ralf Wildenhues, 2008/12/28
