From: Paolo Bonzini
Subject: Re: [PATCH v2 0/5] Speed up uNN_chr and uNN_strchr with Boyer-Moore algorithm
Date: Tue, 27 Jul 2010 20:39:09 +0200
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.10) Gecko/20100621 Fedora/3.0.5-1.fc13 Lightning/1.0b2pre Thunderbird/3.0.5

On 07/27/2010 06:28 PM, Pádraig Brady wrote:
> On 27/07/10 19:14, Paolo Bonzini wrote:
>> On 07/27/2010 06:06 PM, Pádraig Brady wrote:
>>> I would suggest a new function due to the way I see this function
>>> called most often, i.e. repeatedly with the same character.
>>
>> Is this really a bottleneck? I.e., what does u8_uctomb_aux look like
>> in the profile when you do a million u8_strchr calls on an empty string?
>
> Well it would be a bit faster, but mainly a bit easier to use.
> I.e. one could do stuff like:
>
>   while ((f=u8_str_u8_chr (s, "–", 3)));
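
For reference, the pattern under discussion (encode the character once, then search for the same bytes repeatedly) might be sketched roughly as follows. This does not use libunistring: find_encoded_char and count_encoded_char are hypothetical stand-ins for the proposed u8_str_u8_chr, which exists only as a suggestion in this thread. The byte-wise comparison assumes both the string and the character are valid UTF-8, so that a match can only start on a character boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the proposed u8_str_u8_chr (not a libunistring
   function): return a pointer to the first occurrence of the already
   encoded character C (SIZE bytes) in the NUL-terminated UTF-8 string S,
   or NULL if there is none.  */
static const unsigned char *
find_encoded_char (const unsigned char *s, const unsigned char *c, size_t size)
{
  assert (size >= 1);
  for (; *s != '\0'; s++)
    /* strncmp stops at the terminating NUL of S, so it never reads past
       the end of the string.  */
    if (*s == c[0]
        && strncmp ((const char *) s, (const char *) c, size) == 0)
      return s;
  return NULL;
}

/* Count the occurrences of the encoded character C in S.  The character is
   encoded once by the caller and reused for every search.  */
static size_t
count_encoded_char (const unsigned char *s, const unsigned char *c, size_t size)
{
  size_t n = 0;
  const unsigned char *f;

  while ((f = find_encoded_char (s, c, size)) != NULL)
    {
      n++;
      s = f + size;   /* resume just after the match */
    }
  return n;
}
```

Unlike the one-line loop quoted above, this version advances s past each match so that the loop terminates.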

OK, that's a different use case that makes more sense. I thought you referred to something like

  char c[6];
  size_t size = u8_uctomb_aux (c, uc, sizeof c);
  ...
  while ((f=u8_str_u8_chr (s, c, size)));

This one instead is less likely to be useful. However, note that in C1X you could do

  while ((f=u8_strchr (s, u'–')));

BTW, there's an interesting difference between char32_t and ucs4_t, in that the former has "the same size, signedness, and alignment as uint_least32_t", while libunistring uses uint32_t to define the latter. I wonder if libunistring should be changed to:

1) detect _Char32_t (or uchar.h and char32_t) and use it if available,
2) use uint_least32_t if not available.

It would be a no-op everywhere except possibly for some C++ programs, and it wouldn't affect binary compatibility.

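A minimal sketch of what the two steps could look like in a header, under stated assumptions: HAVE_UCHAR_H is a hypothetical configure-time macro, and sketch_ucs4_t is an illustrative name rather than libunistring's actual ucs4_t definition. The size check at the end expresses the binary-compatibility claim for platforms where uint32_t exists.

```c
#include <stdint.h>

/* HAVE_UCHAR_H is a hypothetical macro that a configure test would define
   when <uchar.h> and char32_t are available.  */
#ifdef HAVE_UCHAR_H
# include <uchar.h>
typedef char32_t sketch_ucs4_t;        /* step 1: use char32_t if present */
#else
typedef uint_least32_t sketch_ucs4_t;  /* step 2: fall back to uint_least32_t */
#endif

/* Where uint32_t exists, uint_least32_t has the same size, so switching the
   typedef should not change the layout of existing binaries.  The array has
   a negative size, and compilation fails, if that assumption is wrong.  */
#ifdef UINT32_MAX
typedef char sketch_ucs4_size_check
  [sizeof (sketch_ucs4_t) == sizeof (uint32_t) ? 1 : -1];
#endif
```

In C, char32_t is itself a typedef for uint_least32_t, so both branches name the same type. In C++ char32_t is a distinct type, which is where overload resolution or templates could start to behave differently.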
Paolo