bug-gnu-utils

Re: major gawk bug


From: Stanislav Ievlev
Subject: Re: major gawk bug
Date: Thu, 10 Jun 2004 12:14:58 +0400

On Wed, Jun 09, 2004 at 03:08:49PM +0300, Aharon Robbins wrote:
> I'm glad my patches work.  I may send you some further patches
> for testing.
Yes, I can test them. Thank you.
> 
> Code using tolower() is marginally slower for things like
> 
>       BEGIN {
>               IGNORECASE = 1
>               for (i = 1; i < 10000000; i++)
>                       val += ("ONE STRING" == "one string")
>               print val
>       }
> 
> I have a fast machine, making it hard for me to judge whether the difference
> is worth keeping the current code.  I need to think about it some more.
> 
> I do believe that just using RE_ICASE will work, and I will probably make
> that the main solution for re.c.
> 
> I am also concerned about portability issues; while GLIBC tolower() is
> highly functional etc, GLIBC and Linux are not my entire customer base. :-)
> 
> Arnold
> 
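The comparison that the awk benchmark above exercises can be sketched in C as follows. This is a minimal illustration of a tolower()-based case-insensitive comparison, not gawk's actual code; the name `ci_equal` and its signature are assumptions made for this example.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not from gawk): compare two byte strings of the
 * given lengths, ignoring case via tolower(). Returns 1 if equal, else 0. */
int ci_equal(const char *a, size_t alen, const char *b, size_t blen)
{
    if (alen != blen)
        return 0;
    for (size_t i = 0; i < alen; i++) {
        /* Cast to unsigned char first: passing a negative char value
         * to tolower() is undefined behavior. */
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    }
    return 1;
}
```

Each character costs two library calls here, which is where any slowdown relative to a private lookup table would come from. How large that cost is depends on the C library, so it is worth measuring on more than one platform.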
> > Date: Wed, 9 Jun 2004 15:20:54 +0400
> > From: Stanislav Ievlev <address@hidden>
> > To: Aharon Robbins <address@hidden>
> > Cc: Stepan Kasal <address@hidden>, address@hidden
> > Subject: Re: major gawk bug
> >
> > Hello,
> >
> > On Tue, Jun 08, 2004 at 06:59:48PM +0300, Aharon Robbins wrote:
> > > > I believe the right fix for regexes is to use the RE_ICASE flag
> > > > instead of the translate table.
> > > > The hard-coded table is also used in gawk for various case-insensitive
> > > > comparisons; these should be replaced by a call to tolower().
> > > > The hard-coded table should be then removed.
> > > 
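To illustrate letting the regex engine fold case instead of supplying a translate table, here is a minimal sketch. It uses the POSIX regcomp() interface with REG_ICASE as a portable stand-in; gawk itself uses the GNU regex interface, where the corresponding syntax bit is RE_ICASE. The function name `icase_match` is an assumption made for this example.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if text matches pattern ignoring case,
 * 0 if it does not, and -1 if the pattern fails to compile. */
int icase_match(const char *pattern, const char *text)
{
    regex_t re;

    /* REG_ICASE asks the matcher to ignore case; no table is passed in. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

With this approach the library decides how case folding works for the current locale, so the caller no longer maintains its own mapping.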
> > > I have some tentative changes in place that work this way.  It passes
> > > `make check'.  I am still concerned about performance, especially
> > > the use of tolower().
> > > 
> > > If you or Mr. Ievlev can test them and give me some feedback, let
> > > me know and I'll send them to you.
> > Arnold, your patch works well.
> > (one small improvement:
> > -       if (strcmp(cp, "C") == 0 || strcmp(cp, "POSIX") == 0)
> > +       if (!cp || strcmp(cp, "C") == 0 || strcmp(cp, "POSIX") == 0)
> > )
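The point of the added `!cp` test can be shown in isolation. In this sketch, `is_c_locale` is a hypothetical wrapper around the condition from the patch, and `cp` is assumed to be a locale name that may be NULL; the thread does not show where `cp` comes from.

```c
#include <string.h>

/* Hypothetical wrapper around the patched condition. Returns nonzero when
 * cp is NULL or names the "C" or "POSIX" locale. */
int is_c_locale(const char *cp)
{
    /* The NULL test must come first: strcmp(NULL, ...) is undefined
     * behavior and typically crashes. Short-circuit evaluation of ||
     * guarantees strcmp() is never reached with a NULL pointer. */
    return !cp || strcmp(cp, "C") == 0 || strcmp(cp, "POSIX") == 0;
}
```

Treating a NULL name the same as "C" is the behavior the patched line implies; whether that is the right fallback for every caller is a separate question.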
> >
> > As I understand, we also have a solution with toupper()/tolower() functions.
> >
> > I agree with Stepan that these functions are already well optimized in
> > glibc. The toupper()/tolower() solution is better because we currently
> > have two translation tables (one in glibc and a second in gawk) and copy
> > one to the other during initialization (load_ignorecase), which looks
> > strange.
> >
> > If the contents of these two tables are interpreted identically in gawk's
> > algorithms, it is easy to replace one with the other.
> >
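The duplication described above can be sketched as follows. This is an illustration under assumptions, not gawk's load_ignorecase: `fold_table`, `fold_table_init`, `fold_char`, and `fold_table_matches_tolower` are hypothetical names, and the table is assumed to hold one lowercase mapping per byte value.

```c
#include <ctype.h>
#include <limits.h>

/* Hypothetical private case-folding table, one entry per byte value. */
static unsigned char fold_table[UCHAR_MAX + 1];

/* Fill the private table from the C library's own mapping. This is the
 * "copy one table to the other" step performed once at startup. */
void fold_table_init(void)
{
    for (int c = 0; c <= UCHAR_MAX; c++)
        fold_table[c] = (unsigned char)tolower(c);
}

/* Table-based lookup, as code using the private table would do it. */
unsigned char fold_char(unsigned char c)
{
    return fold_table[c];
}

/* Returns 1 if the table and direct tolower() calls agree for every byte. */
int fold_table_matches_tolower(void)
{
    for (int c = 0; c <= UCHAR_MAX; c++)
        if (fold_table[c] != (unsigned char)tolower(c))
            return 0;
    return 1;
}
```

If the table is filled this way, a lookup and a direct tolower() call give the same answer until the locale changes. The table is a snapshot taken at initialization, whereas a direct call always reflects the current locale.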
> > --
> > With best regards
> > Stanislav Ievlev
> >
> > ALT Linux Team.
> >
> >



