bug-gnulib

[PATCH 0/4] faster gnulib-tool


From: Ralf Wildenhues
Subject: [PATCH 0/4] faster gnulib-tool
Date: Sun, 28 Dec 2008 11:15:32 +0100
User-agent: Mutt/1.5.18 (2008-05-17)

Hello, and I hope you're all having or have had nice holidays,

here's a short patch series to speed up common gnulib-tool usage a bit:

1) cache module metainformation.

The first observation is that the bulk of the forks in a typical --update
are spent on 'sed' parsing the module metainformation files.  So let's
cache them: the contents are parsed once into shell variables.

The cache variable names consist of 'c_' plus the flattened module name.
For Bash, the function to flatten the name uses ${var//subst/repl} to
avoid forking for module names that contain non-alphanumeric characters
(such as '/').
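
To illustrate the flattening, here is a minimal sketch; it is not the
code from the patch.  The function name, the result variable and the
sample module name are made up for this example.  The feature probe
runs inside 'eval' in a subshell so that shells without
${var//subst/repl} fall back to 'sed' instead of failing.

```shell
# Hypothetical helper; names are illustrative, not taken from gnulib-tool.
# Probe in a subshell whether ${var//pattern/repl} is supported.
if (eval 'm=a/b-c; test "${m//[!a-zA-Z0-9_]/_}" = a_b_c') 2>/dev/null; then
  # Bash-like shell: flatten the name without forking.
  eval 'flatten_module_name () {
    flat_name=${1//[!a-zA-Z0-9_]/_}
  }'
else
  # Fallback: one sed fork per call.
  flatten_module_name () {
    flat_name=`printf "%s\n" "$1" | sed -e "s/[^a-zA-Z0-9_]/_/g"`
  }
fi

# Made-up module name containing '/' and '-'.
flatten_module_name 'foo/bar-baz'
cache_var=c_$flat_name
echo "$cache_var"
```

The result is returned in a variable rather than on stdout, so the
caller does not need a command substitution either.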

FWIW, the values of $lookedup_file and $lookedup_tmp are not cached;
doing so would, if --local-dir were used and module files patched,
require that the patched files be kept (and not overwritten) for the
duration of the script.  I have checked that no caller site uses the
lookedup_{file,tmp} values for the module metainformation files, so
we don't have to worry about this.


By itself, this patch does not help; it even slows gnulib-tool down
(see timings below), because a lot of the module file reading happens in
subshells, which cannot populate the parent shell's cache.
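
The subshell effect can be reproduced in a few lines.  This is only a
sketch: 'c_demo' and 'fill_cache' are hypothetical names, and the
string assignment stands in for the expensive 'sed' parse.

```shell
# Hypothetical cache variable and lookup function, for illustration only.
c_demo=
fill_cache () {
  # Fill the cache on first use, then print the cached value.
  test -n "$c_demo" || c_demo='parsed contents'
  echo "$c_demo"
}

value=`fill_cache`            # command substitution runs in a subshell
after_subshell=${c_demo:-empty}

fill_cache >/dev/null         # direct call runs in the current shell
after_direct=${c_demo:-empty}

echo "after subshell:    $after_subshell"
echo "after direct call: $after_direct"
```

After the command substitution the parent's cache variable is still
unset, so every such call would parse again; only the direct call
leaves the cache filled.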


2) avoid forks with func_get_* functions.

This patch turns (1) into a speed boost, by eliminating lots of forks
related to calls of the func_get_* functions, thus allowing the cache
to be used a few times in a typical --update or --test operation.
(Of course the additional fork elimination itself also helps.  :-)
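
One general way to get this effect, sketched here under the assumption
that callers can read a result variable instead of capturing stdout, is
shown below.  The names 'get_description', 'result' and 'c_desc_*' are
hypothetical and do not correspond to the real func_get_* functions.

```shell
# Hypothetical accessor that returns its value in $result instead of
# printing it, so callers need no `...` and hence no subshell.
parse_count=0
get_description () {
  eval "result=\$c_desc_$1"            # look in the cache first
  if test -z "$result"; then
    result="description of $1"         # stands in for the real parse
    parse_count=$((parse_count + 1))
    eval "c_desc_$1=\$result"          # remember it in the current shell
  fi
}

get_description foo
get_description foo                    # second call hits the cache
echo "$result (parsed $parse_count time(s))"
```

Because both calls run in the current shell, the second one finds the
cached value and the parse happens only once.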


3) abort loops early where possible.

A couple of loops only test for the presence of some condition and have
no other side effects; they can be aborted as soon as we have a definite
answer.
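
As a sketch of the pattern (the function and the '-tests' suffix check
are made up for this example, not taken from gnulib-tool):

```shell
# Hypothetical presence test: is any argument a "*-tests" module?
has_tests_module () {
  found=false
  checked=0
  for m in "$@"; do
    checked=$((checked + 1))
    case $m in
      *-tests)
        found=true
        break ;;                 # answer is known; skip the rest
    esac
  done
  $found                         # exit status of 'true' or 'false'
}

if has_tests_module alpha beta-tests gamma delta; then
  echo "found after checking $checked of 4 names"
fi
```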

4) faster string handling for Posix shells.

This introduces a shell function for splitting off literal prefixes and
suffixes from strings, avoiding 'sed' when the shell is Posixy enough
(idea copied from Libtool).
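
A minimal sketch of the idea follows.  It is not the function from the
patch: 'strip_affixes' and 'stripped' are hypothetical names, and the
'sed' fallback is simplified in that it treats the prefix and suffix as
regular expressions rather than literal strings.

```shell
# Probe in a subshell whether ${var#prefix} and ${var%suffix} work.
if (eval 'x=libfoo.la; y=${x#lib}; test "${y%.la}" = foo') 2>/dev/null; then
  # Posix-style shell: strip literal prefix $1 and suffix $2 from $3
  # without forking.  Quoting the pattern makes it match literally.
  eval 'strip_affixes () {
    stripped=${3#"$1"}
    stripped=${stripped%"$2"}
  }'
else
  # Simplified fallback: one sed fork per call; affixes act as regexes.
  strip_affixes () {
    stripped=`printf "%s\n" "$3" | sed -e "s%^$1%%" -e "s%$2"'$'"%%"`
  }
fi

strip_affixes lib .la libfoo.la
echo "$stripped"
```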


I have tested the changes with M4, using 'gnulib-tool --test', and on a
couple of other packages using gnulib, and ensured that the only change
they cause is the harmless removal of some empty lines in generated files.
Testing was done on GNU/Linux using bash and pdksh, and on Solaris using ksh.

The patches are posted using 'git format-patch' so they can be fed
directly into 'git am', for those so inclined.

OK to apply?


The whole series gives me about 50% improvement for
  gnulib-tool --update

on the git M4 tree:
  before:  21.63 s
  after 1: 27.46 s
  after 2: 16.46 s
  after 3: 12.94 s
  after 4: 10.83 s

With
  gnulib-tool --with-tests --test

there is about a 20% improvement (a couple of minutes), but note that
this also runs the other autotools, configure and make.  (1) and (2)
slightly slow down things like
  gnulib-tool --extract-description ...

but since these modes are typically faster than the other modes,
I consider that an acceptable trade-off.  Otherwise, one could also
reorganize gnulib-tool a bit so that it can use one script on all
modules in question, like
  sed "$sed_extract_license_only" $modules

Doing that throughout the code (i.e., also for --update) would need
more intrusive changes, though.

Thanks,
Ralf



