
bug#16912: [PATCH] no longer use CSET for non-UTF8 locale in DFA engine


From: Norihiro Tanaka
Subject: bug#16912: [PATCH] no longer use CSET for non-UTF8 locale in DFA engine
Date: Sun, 02 Mar 2014 10:23:28 +0900

Hi Paul

Thank you for checking the patch.

> First, why does the first patch add those four using_utf8 calls to
> parse_bracket_exp?  Isn't that optimization valid regardless of
> whether the multibyte encoding is UTF-8?

The optimization in addtok that changes an MBCSET into a CSET is completed
only in UTF-8 locales. In non-UTF8 locales, even if work_mbc->cset is
defined, the set is treated as an MBCSET, not as a CSET.  So if the set is
not replaced with a CSET combined by OR, dfa keeps the MBCSET until the end
and returns backref.  I want to avoid that.
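
To make the behaviour described above concrete, here is a minimal toy
model. It is not the dfa.c code: the names toy_token, toy_bracket,
toy_emit_bracket and toy_dfa_returns_backref are hypothetical and only
mirror the description in this message (a defined cset still ends up as
MBCSET outside UTF-8, and a remaining MBCSET makes dfa return backref).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token kinds; the real dfa.c token set is much larger.  */
enum toy_token { TOY_CSET, TOY_MBCSET };

/* Hypothetical stand-in for a parsed bracket expression.  */
struct toy_bracket
{
  bool has_cset;   /* a single-byte set was built (cf. work_mbc->cset) */
};

/* Model of the behaviour described in this message: the set is emitted
   as a plain CSET only in a UTF-8 locale.  In any other locale it stays
   an MBCSET even when a cset is defined.  */
static enum toy_token
toy_emit_bracket (const struct toy_bracket *b, bool utf8_locale)
{
  assert (b != NULL);
  if (utf8_locale && b->has_cset)
    return TOY_CSET;
  return TOY_MBCSET;
}

/* Model of the consequence: if any MBCSET survives in the token list,
   the DFA cannot decide the match by itself and reports "backref".  */
static bool
toy_dfa_returns_backref (const enum toy_token *toks, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (toks[i] == TOY_MBCSET)
      return true;
  return false;
}
```

In this model, calling toy_emit_bracket with utf8_locale set to false
always yields TOY_MBCSET, so toy_dfa_returns_backref is true for that
pattern. That is the case the patch tries to avoid.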

However, I don't understand why the optimization isn't completed in
non-UTF8 locales.  Can you explain it?

Norihiro





