bug-grep

bug#18777: [PATCH] dfa: improvement for checking of multibyte character


From: Norihiro Tanaka
Subject: bug#18777: [PATCH] dfa: improvement for checking of multibyte character boundary
Date: Mon, 15 Dec 2014 23:59:32 +0900

On Mon, 20 Oct 2014 10:07:20 -0600
Eric Blake <address@hidden> wrote:

> POSIX requires that NUL, slash, dot, newline, and carriage return all be
> single bytes that cannot occur inside a multibyte character (because
> they have special meaning to file name resolution and/or terminal
> interaction); it added this requirement fairly recently, but only after
> confirming that common existing locales satisfy this constraint.  (The
> same is not true for most any other character; even though POSIX
> requires that a-z, A-Z, and 0-9 be single bytes, it does not forbid
> those characters from also being bytes embedded within multibyte
> characters).  Is it worth extending your optimization to all five of the
> POSIX-guaranteed single byte characters?

I rewrote the patch so that NUL, slash, dot, and carriage return, as well
as newline, are also treated as special characters.
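
For readers without the attachment, here is a minimal sketch of the idea,
not the code from the patch. It assumes, as described in the quoted text,
that these five bytes never occur inside a multibyte character, so a scan
for character boundaries can restart just after the nearest one instead of
from the start of the buffer. The names is_safe_boundary_byte and
last_safe_boundary are hypothetical and do not appear in dfa.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the patch: true if C is one of the
   five bytes discussed above (NUL, slash, dot, newline, carriage return).  */
static bool
is_safe_boundary_byte (unsigned char c)
{
  return c == '\0' || c == '/' || c == '.' || c == '\n' || c == '\r';
}

/* Return the offset just past the last safe byte in BUF[0 .. POS-1],
   or 0 if there is none.  Under the assumption stated above, that offset
   is always a character boundary, so a forward scan may begin there.  */
static size_t
last_safe_boundary (const char *buf, size_t pos)
{
  assert (buf != NULL || pos == 0);
  while (pos > 0)
    {
      if (is_safe_boundary_byte ((unsigned char) buf[pos - 1]))
        return pos;
      pos--;
    }
  return 0;
}
```

For example, given the buffer "ab/cd" and pos 5, last_safe_boundary
returns 3, the offset just after the slash.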

Attachment: 0001-dfa-improvement-for-checking-of-multibyte-character-.patch
Description: Text document

