bug-grep

bug#18266: handling bytes not part of the charset, and other garbage


From: Jim Meyering
Subject: bug#18266: handling bytes not part of the charset, and other garbage
Date: Fri, 12 Sep 2014 15:23:08 -0700

On Fri, Sep 12, 2014 at 2:39 PM, Paul Eggert <address@hidden> wrote:
> On 09/12/2014 02:29 PM, Vincent Lefevre wrote:
>
>> an option to control what happens on encoding errors would be better and
>> sufficient.
>
>
> It might suffice for your use cases, but it's more complicated and less
> flexible than being able to match bytes within the regular expression.
> (Plus, someone would have to implement it, which is perhaps the biggest
> objection to either approach ....)  But I take your point that \C is best
> avoided.  This whole area is pretty hairy, I'm afraid.
>
> Speaking of hairy, why doesn't grep use PCRE_MULTILINE?  Using
> PCRE_MULTILINE shouldn't be that hard, and should boost performance quite a
> bit in typical usage.  Or am I being too optimistic here?

When I first saw that implementation, I assumed it was just a first-cut one.
I see no reason not to use PCRE_MULTILINE, but haven't tried it, either.
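
To illustrate the idea under discussion: with a multiline flag, ^ and $
match at every line boundary inside a buffer, so one search call can scan
many lines instead of one call per line. The sketch below is not grep's
code. It uses Python's re.MULTILINE as a stand-in for PCRE_MULTILINE, and
the function names and sample text are hypothetical. It assumes the
pattern cannot match across a newline; a pattern that can (for example
one using \s or a negated class) would need extra care.

```python
import re


def match_lines_individually(pattern, text):
    """Run one search per line (the per-line approach)."""
    rx = re.compile(pattern)
    return [line for line in text.split("\n") if rx.search(line)]


def match_lines_multiline(pattern, text):
    """Search the whole buffer with ^ and $ anchored at line boundaries."""
    rx = re.compile(pattern, re.MULTILINE)
    hits = []
    pos = 0
    while pos <= len(text):
        m = rx.search(text, pos)
        if m is None:
            break
        # Recover the full line that contains the start of the match.
        start = text.rfind("\n", 0, m.start()) + 1
        end = text.find("\n", m.start())
        if end == -1:
            end = len(text)
        hits.append(text[start:end])
        # Resume at the next line so each line is reported at most once.
        pos = end + 1
    return hits


sample = "foo 1 bar\nnope\nfoo 2 bar\nbar foo"
print(match_lines_individually(r"^foo.*bar$", sample))
print(match_lines_multiline(r"^foo.*bar$", sample))
```

Both functions should report the same lines for patterns that stay within
a line; the second one leaves the line-boundary work to the regex engine.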




