bug-grep

Re: [PATCH] fall back to glibc matcher if a MBCSET is found


From: Paolo Bonzini
Subject: Re: [PATCH] fall back to glibc matcher if a MBCSET is found
Date: Wed, 08 Sep 2010 11:52:35 +0200

On 09/08/2010 11:05 AM, Jim Meyering wrote:
> Thank you for the patch.
>
> If this change really does fix a correctness bug,
> then it deserves a NEWS entry with enough detail to confirm that,
> and, if at all possible, a test suite addition.

It fixes equivalence classes (e.g. matching [[=a=]] against à), but only when grep is configured with --without-included-regex. See attached patches.
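For readers who want to see what the system matcher does with an equivalence class, here is a minimal sketch that uses only the POSIX regcomp/regexec interface. The function names matches and a_grave_in_class_of_a are hypothetical and are not part of grep or of the attached patches. The second helper assumes a UTF-8 locale in the environment, and whether à is accepted depends on the libc and its locale data.

```c
#include <locale.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if text matches pattern, 0 if it does not, and -1 if the
   pattern fails to compile (for example, on a libc whose regcomp
   rejects equivalence classes).  */
int
matches (const char *pattern, const char *text)
{
  regex_t re;
  int rc;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}

/* Hypothetical helper: try the case from this message in the locale
   taken from the environment (LANG, LC_ALL, ...).  The string below
   is the UTF-8 encoding of a-grave, so a UTF-8 locale is assumed.  */
int
a_grave_in_class_of_a (void)
{
  setlocale (LC_ALL, "");
  return matches ("[[=a=]]", "\303\240");
}
```

Calling a_grave_in_class_of_a () from a small main and printing the result under different LC_ALL settings is one way to compare matchers. The result is not stated here because it varies between systems.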

The presence of this check in regex.m4

             if (sizeof (regoff_t) < sizeof (ptrdiff_t)
                 || sizeof (regoff_t) < sizeof (ssize_t))

unfortunately means that all existing systems will use the inferior gnulib regex rather than the glibc regex. In turn, grep will not support equivalence classes out of the box on any system.
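To see how a particular system fares against that condition, the same comparison can be wrapped in a function. This is only a sketch: regoff_t_is_narrow is a made-up name, and the complete test program in regex.m4 checks more than this one expression.

```c
#include <regex.h>      /* regoff_t */
#include <stddef.h>     /* ptrdiff_t */
#include <sys/types.h>  /* ssize_t */

/* Returns 1 when regoff_t is narrower than ptrdiff_t or ssize_t,
   i.e. when the condition quoted from regex.m4 is true, and 0
   otherwise.  */
int
regoff_t_is_narrow (void)
{
  return (sizeof (regoff_t) < sizeof (ptrdiff_t)
          || sizeof (regoff_t) < sizeof (ssize_t));
}
```

A return value of 1 corresponds to the situation described above, where the system regex is passed over in favor of the included one.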

> Similarly, if it works around a performance problem,
> it would help me evaluate it if you were to provide evidence.

yes 1234567890123456789012345678901234567890123456789012567890 | \
  sed 100000q | time ./grep '[a-z]'

shows 0.91s with the patch and 1.21s without. Since this is not an asymptotic improvement, it is hard to test reliably, and it is secondary anyway to the correctness problem above.

Paolo

Attachment: fallback.patch
Description: Text document

