[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters
From: |
Pádraig Brady |
Subject: |
Re: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters |
Date: |
Thu, 08 Jul 2010 11:04:07 +0100 |
User-agent: |
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 |
On 07/07/10 23:49, Pádraig Brady wrote:
> On 07/07/10 17:07, Simon Josefsson wrote:
>> Pádraig Brady <address@hidden> writes:
>>
>>> + /* The following is equivalent to:
>>> + return memmem (s, strlen(s), c, csize);
>>> + but faster for long S with matching UC near the start,
>>> + and also memmem is sometimes buggy and inefficient. */
>>> switch (u8_uctomb_aux (c, uc, 6))
>>
>> Don't we have an efficient memmem in gnulib that this code could use?
>
> Well the current replacement isn't great for small needles.
> Also since we don't know the length of the string up front
> we would also waste a bit of time on the strlen().
Well actually testing it, shows that memmem() isn't that bad at all,
though it is slower than the current u8_strchr() for the interesting
2 and 3 character cases
function hay_len needle_len iterations time
-------------------------------------------------------------
gl_memmem long 1 1 2.17
pb_memmem¹ long 1 1 3.45
u8_strchr long 1 1 3.28
gl_memmem 60 1 10000 2.18
pb_memmem 60 1 10000 1.99
u8_strchr 60 1 10000 2.17
gl_memmem long 2 1 3.88
pb_memmem long 2 1 4.67
u8_strchr long 2 1 3.47
gl_memmem 60 2 10000 1.97
pb_memmem 60 2 10000 1.97
u8_strchr 60 2 10000 1.96
gl_memmem long 3 1 5.86
pb_memmem long 3 1 4.02
u8_strchr long 3 1 4.28
gl_memmem 60 3 10000 1.97
pb_memmem 60 3 10000 1.97
u8_strchr 60 3 10000 1.98
The above tests were compiled with:
gcc -fno-builtin-strlen -march=pentium-m -O3 test-memmem.c
¹ My very simple implementation of memmem() just for comparison
char* pb_memmem (const char* hay, size_t hl, const char* needle, size_t nl)
{
const char* nohay = hay + hl - nl;
const char* noneedle = needle + nl;
if (nl > hl) return NULL;
for (;;)
{
const char* p1 = hay;
const char* p2 = needle;
do
if (p2 == noneedle) return (char*) hay;
while (*p1++ == *p2++);
hay++;
if (hay > nohay) return NULL;
}
}
cheers,
Pádraig.