bug-grep

bug#16912: [PATCH] no longer use CSET for non-UTF8 locale in DFA engine


From: Paolo Bonzini
Subject: bug#16912: [PATCH] no longer use CSET for non-UTF8 locale in DFA engine
Date: Tue, 04 Mar 2014 16:49:59 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0

On 03/03/2014 07:13, Paul Eggert wrote:
> Norihiro Tanaka wrote:
>> However I don't understand why the optimization isn't completed on
>> non-UTF8 locale only.  Can you explain it?
>
> Sorry, no; there's a lot about that code I don't yet understand.

IIRC it's because a CSET matches a given byte wherever it occurs, while the corresponding MBCSET matches that byte only when it is a complete single-byte character. For example, suppose "\x83A" is a two-byte character. The CSET "A" will match its second byte, but the corresponding MBCSET will not.

This can happen in the Shift-JIS encoding.
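
To make the difference concrete, here is a rough sketch in Python. It is not the dfa.c code; the helper names `byte_level_match` and `char_level_match` are made up for illustration, and they only mimic the byte-wise (CSET-like) and character-wise (MBCSET-like) behaviour described above, assuming Python's built-in "shift_jis" codec.

```python
# Illustrative only: hypothetical helpers, not grep's DFA implementation.

# In Shift-JIS, the bytes 0x83 0x41 ("\x83A") encode one two-byte character.
data = b"\x83A"


def byte_level_match(haystack: bytes, needle: bytes) -> bool:
    """CSET-like check: look for the byte anywhere, ignoring character boundaries."""
    return needle in haystack


def char_level_match(haystack: bytes, needle: str, encoding: str = "shift_jis") -> bool:
    """MBCSET-like check: decode first, then compare whole characters."""
    return any(ch == needle for ch in haystack.decode(encoding))


decoded = data.decode("shift_jis")      # a single character
cset_hit = byte_level_match(data, b"A")   # the trailing byte 0x41 equals "A"
mbcset_hit = char_level_match(data, "A")  # no decoded character equals "A"

print(len(decoded), cset_hit, mbcset_hit)
```

The byte-wise check reports a hit because the trailing byte of the two-byte character happens to equal "A", while the character-wise check does not, since the decoded text contains no "A" at all.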

Paolo





