bug-grep

Re: [PATCH] fall back to glibc matcher if a MBCSET is found


From: Jim Meyering
Subject: Re: [PATCH] fall back to glibc matcher if a MBCSET is found
Date: Wed, 08 Sep 2010 11:05:48 +0200

Paolo Bonzini wrote:
> This patch works around some of the performance problems of multibyte grep.
> The patch has been in RHEL-6 for a few months.  I think it is also a
> correctness patch, since grep has no way to support multi-character
> collation elements.
>
> For UTF-8 it should trigger only in the presence of an MBCSET, e.g. [a-z].
> For other character sets, all bracket expressions, as well as `.`, will
> trigger it.
>
> * src/dfa.c (dfaexec): Fall back to glibc for multibyte matches,
> if possible.

Hi Paolo,

Thank you for the patch.

If this change really does fix a correctness bug,
then it deserves a NEWS entry with enough detail to confirm that,
and, if at all possible, a test suite addition.

Similarly, if it works around a performance problem,
please provide evidence so that I can evaluate it.

Maybe this has already been done in some RHEL-6 bugzilla,
and you just forgot to include that?

Finally, please include some of the above in a comment in the code.

> ---
>  src/dfa.c |    9 +++++++++
>  1 files changed, 9 insertions(+), 0 deletions(-)
>
> diff --git a/src/dfa.c b/src/dfa.c
> index 91124b6..3708be7 100644
> --- a/src/dfa.c
> +++ b/src/dfa.c
> @@ -3237,6 +3237,15 @@ dfaexec (struct dfa *d, char const *begin, char *end,
>                  continue;
>                }
>
> +            if (backref)
> +              {
> +                *backref = 1;
> +                free(mblen_buf);
> +                free(inputwcs);
> +                *end = saved_end;
> +                return (char *) p;
> +              }
> +
>              /* Can match with a multibyte character (and multi character
>                 collating element).  Transition table might be updated.  */
>              s = transit_state(d, s, &p);
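
For readers unfamiliar with the convention the hunk above relies on, here
is a minimal, self-contained sketch of the "give up and let the caller
fall back" pattern. It is not grep's code: fast_lower_search and
line_matches are hypothetical names, the trigger (any byte >= 0x80) is a
simplification of the MBCSET condition, and POSIX regcomp/regexec stands
in for the glibc matcher. The only part taken from the patch is the idea
of setting *backref and returning early so that a slower, more general
matcher decides.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical stand-in for the fast matcher.  It only understands
   ASCII.  On the first byte >= 0x80 it gives up: it sets *backref and
   returns the position reached, loosely mirroring the early return in
   the patch.  Otherwise it returns a pointer to the first byte in
   'a'..'z', or NULL if there is none.  */
static const char *
fast_lower_search (const char *line, int *backref)
{
  *backref = 0;
  for (const char *p = line; *p != '\0'; p++)
    {
      unsigned char c = (unsigned char) *p;
      if (c >= 0x80)
        {
          *backref = 1;   /* "I cannot decide; ask the slow matcher."  */
          return p;
        }
      if (c >= 'a' && c <= 'z')
        return p;
    }
  return NULL;
}

/* Hypothetical caller.  Returns 1 if LINE matches [a-z], 0 otherwise,
   and records in *USED_FALLBACK whether the slow path was taken.  */
static int
line_matches (const char *line, int *used_fallback)
{
  int backref;
  const char *hit;

  assert (line != NULL && used_fallback != NULL);
  hit = fast_lower_search (line, &backref);
  *used_fallback = backref;
  if (!backref)
    return hit != NULL;

  /* Slow path: POSIX regex, assumed here as the general matcher.  */
  regex_t re;
  if (regcomp (&re, "[a-z]", REG_NOSUB) != 0)
    return 0;
  int rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

In this sketch a pure-ASCII line never reaches regexec, while a line
containing any non-ASCII byte always does, so the fast path never has to
reason about multibyte input.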


