
Re: gawk regex stuff you may want


From: Aharon Robbins
Subject: Re: gawk regex stuff you may want
Date: Sun, 24 Jan 2016 06:01:50 +0200
User-agent: Heirloom mailx 12.5 6/20/10

Hi Paul.

> As far as 'grep' is concerned, it'll trust what regcomp does here, so we 
> do have some freedom to change the code in this area. However, it looks 
> to me like your patch would do the wrong thing for unibyte locales where 
> btowc (b) returns a value that is neither b nor WEOF. Also, the rest of the 
> code assumes that if btowc returns WEOF in a multibyte locale then there 
> won't be a match (see the setup code in init_dfa, and I have the nagging 
> feeling that this assumption is embedded elsewhere). So, how about the 
> attached more-conservative patch instead?
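
The three btowc outcomes distinguished in the quoted paragraph can be sketched roughly as follows. This is only an illustration of the case analysis, not code from the patch or from regcomp.c; classify_byte and the byte_class names are hypothetical, and which bytes land in which case depends on the current locale and the C library.

```c
#include <stdio.h>   /* EOF */
#include <wchar.h>   /* btowc, wint_t, WEOF */

/* Hypothetical names, for illustration only.  */
enum byte_class
{
  BYTE_SAME,            /* btowc (b) == b */
  BYTE_REMAPPED,        /* btowc (b) is neither b nor WEOF */
  BYTE_ENCODING_ERROR   /* btowc (b) == WEOF */
};

/* Classify the single byte B according to what btowc reports for it
   in the current locale.  */
static enum byte_class
classify_byte (int b)
{
  wint_t wc = btowc (b);

  if (wc == WEOF)
    return BYTE_ENCODING_ERROR;

  /* Compare against the byte's unsigned value, since B may have come
     from a plain (possibly signed) char.  */
  return wc == (wint_t) (unsigned char) b ? BYTE_SAME : BYTE_REMAPPED;
}
```

The BYTE_REMAPPED case is the one the quoted text says a simpler patch would mishandle in unibyte locales. The BYTE_ENCODING_ERROR case is the one the ChangeLog entry below addresses by treating [x] as x.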

I applied that patch and gawk passes its tests. I will probably
keep it.  See one comment, below.

> Again, it'd be helpful to know what the problem actually was.

I don't have detailed enough records to be able to tell when all these
small changes were added and why. I will keep them, since the hassle of
removing them, finding out which systems want them, and putting them
back is more than I care to deal with.

I may, one day, just drop in GNULIB's versions.  But not yet.

> diff --git a/ChangeLog b/ChangeLog
> index 181f709..a870e86 100644
> --- a/ChangeLog
> +++ b/ChangeLog
> @@ -1,3 +1,11 @@
> +2016-01-21  Paul Eggert  <address@hidden>
> +
> +     regex: treat [x] as x if x is a unibyte encoding error
> +     Problem reported by Aharon Robbins in:
> +     http://lists.gnu.org/archive/html/bug-gnulib/2016-01/msg00091.html
> +     * lib/regcomp.c (parse_byte) [_LIBC && RE_ENABLE_I18N]: New function.
> +     (build_range_exp) [_LIBC && RE_ENABLE_I18N]: Use it.

I think you mean ! _LIBC && RE_ENABLE_I18N.

Thanks,

Arnold


