bug-gnu-utils
Re: config files substitution with awk


From: Ralf Wildenhues
Subject: Re: config files substitution with awk
Date: Tue, 5 Dec 2006 21:31:28 +0100
User-agent: Mutt/1.5.13 (2006-08-11)

Hello Pascal,

* Pascal Bourguignon wrote on Tue, Dec 05, 2006 at 07:30:15AM CET:
> Ralf Wildenhues <address@hidden> writes:
> > Roughly speaking, a sed script like
[...]
> > has an overhead scaling quadratically in the number of variables.
> 
> This is not necessarily the case.  It depends on the sed
> implementation.

Please point out a sed implementation that is more efficient.
Autoconf cares about real-world systems, not theoretical ones,
so in a way, it would not even suffice unless most implementations
of interest supported this.

> Note that it must be done carefully, since we could have:
> 
> s/@var1@/@var2@/g
> s/@var2@/text2/g
> 
> and @var1@ in the input would substitute to text2.

No.  Autoconf specifically forbids this.  The old pre-Autoconf 2.60
sed-based implementation was not reliable here.  In 2.60 the
implementation was fixed to not allow this, by doing essentially:
  s/@var1@/@|#_!!_#|var2@/g
  s/@var2@/text2/g
  ...
  s/|#_!!_#|//g

The documentation was adjusted to reflect this.
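
To illustrate the effect of the guard string, here is a minimal sketch in
Python rather than sed.  It only simulates a sequence of s///g commands;
the function name and the token pattern are assumptions made for this
illustration, not code taken from Autoconf.

```python
import re

GUARD = "|#_!!_#|"  # guard string from the sed snippet above


def substitute_sequentially(text, rules, guard=None):
    """Apply each (name, value) rule in order, like one s///g per variable."""
    for name, value in rules:
        if guard is not None:
            # Break up any @name@ token inside the value so that later
            # rules cannot match it (assumed token pattern: identifier).
            value = re.sub(r"@(?=[A-Za-z_][A-Za-z_0-9]*@)",
                           lambda m: "@" + guard, value)
        text = text.replace("@" + name + "@", value)
    if guard is not None:
        # Final cleanup, like the last s/|#_!!_#|//g command.
        text = text.replace(guard, "")
    return text


rules = [("var1", "@var2@"), ("var2", "text2")]
print(substitute_sequentially("@var1@", rules))         # text2  (recursive)
print(substitute_sequentially("@var1@", rules, GUARD))  # @var2@ (guarded)
```

Without the guard, the second rule rewrites the output of the first one;
with it, the value of var1 reaches the output unchanged.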

The new awk-based implementation keeps the semantics of not allowing
recursive substitution.  (It would also allow reliable recursive
substitution in a future implementation, as I already outlined in the
thread on autoconf-patches.)

> So we must still have as many DFAs as s commands, but with some luck,
> we can skip directly to the right action. (For this reason, I'd
> suggest to use a more specific tool rather than sed which is too
> general for this job).

Before embarking upon the awk implementation, I actually timed using a
sorted sed script and doing some jumping in order to simulate a binary
tree.  There was a speedup, but it was hardly worth the effort.  I
tested with a recent GNU sed.

My experience with this code tells me: if possible, don't even invoke a
regex engine at all.  It's just so much slower than just using hashing.
FYI, the patch that was finally applied differs quite a bit from the
first proposed one:
http://lists.gnu.org/archive/html/autoconf-patches/2006-11/msg00053.html
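
The hashing idea can be sketched roughly as follows.  This is Python, not
the awk code from the patch, and the function name is hypothetical: each
line is split at "@" and every candidate name is looked up in a table, so
no regex is involved and substituted text is never rescanned.

```python
def substitute_once(line, values):
    """Replace @name@ tokens in one left-to-right pass using dict lookups."""
    fields = line.split("@")
    out = [fields[0]]
    i = 1
    while i < len(fields):
        # fields[i] is a candidate name only if another "@" closes it.
        if i + 1 < len(fields) and fields[i] in values:
            out.append(values[fields[i]])  # inserted text is not rescanned
            out.append(fields[i + 1])
            i += 2
        else:
            out.append("@" + fields[i])    # not a variable: keep "@" as is
            i += 1
    return "".join(out)


values = {"var1": "@var2@", "var2": "text2"}
print(substitute_once("a @var1@ b @var2@ c", values))  # a @var2@ b text2 c
print(substitute_once("user@host", values))            # user@host
```

The cost per line depends on the number of "@" characters in it, not on
the number of variables in the table.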

Hope that helps.

Cheers,
Ralf



