bug-gnulib

Re: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters


From: Pádraig Brady
Subject: Re: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters
Date: Thu, 08 Jul 2010 10:50:08 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3

On 08/07/10 04:24, Ralf Wildenhues wrote:
> Hi Pádraig,
> 
> * Pádraig Brady wrote on Wed, Jul 07, 2010 at 03:44:29PM CEST:
>> Subject: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters
> 
>> --- a/lib/unistr/u8-strchr.c
>> +++ b/lib/unistr/u8-strchr.c
> 
>>  uint8_t *
>>  u8_strchr (const uint8_t *s, ucs4_t uc)

>> +    return strchr (s, uc);
> 
> Don't you still have to cast the result to uint8_t *?  char may be
> signed, leading at least to warnings with -Wall.

You're right. Silly warnings lead to ugly code.
Updated patch below.

cheers,
Pádraig
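
For readers following the cast discussion, here is a minimal sketch of why both casts are needed. strchr() takes a const char * and returns a char *, while uint8_t is normally unsigned char, so the pointer types differ in both directions. The helper name find_ascii is hypothetical and is not part of gnulib. The restriction to values below 0x80 matters because in UTF-8 such bytes never occur inside a multibyte sequence, so a plain byte search cannot produce a false match.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not gnulib code): find the first occurrence of
   the ASCII value C in the NUL-terminated UTF-8 string S.  */
static uint8_t *
find_ascii (const uint8_t *s, unsigned int c)
{
  /* Only single-byte (ASCII) values are handled here.  */
  assert (c < 0x80);

  /* Cast the argument to const char * for strchr, and cast the char *
     result back to uint8_t *, to avoid pointer-type warnings.  */
  return (uint8_t *) strchr ((const char *) s, (int) c);
}
```

As with strchr() itself, searching for 0 returns a pointer to the terminating NUL rather than NULL.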

From ffadc06ac77f9e2b691d47f219014104cab05a34 Mon Sep 17 00:00:00 2001
From: Pádraig Brady <address@hidden>
Date: Wed, 7 Jul 2010 14:14:23 +0100
Subject: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters

* lib/unistr/u8-strchr.c (u8_strchr): Use strchr() for
the single-byte case, as it was measured to be 50% faster
than the existing code on x86 Linux.  Also add a comment
explaining why memmem() is not used for the multibyte case for now.
---
 ChangeLog              |    4 ++++
 lib/unistr/u8-strchr.c |   19 +++++++------------
 2 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index afcae28..8ca0bd7 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,7 @@
+2010-07-07  Pádraig Brady  <address@hidden>
+
+       * lib/unistr/u8-strchr.c (u8_strchr): Use strchr() as it's faster.
+
 2010-07-04  Bruno Haible  <address@hidden>

        fsusage: Clarify which code applies to which platforms.
diff --git a/lib/unistr/u8-strchr.c b/lib/unistr/u8-strchr.c
index 3be14c7..8023a2e 100644
--- a/lib/unistr/u8-strchr.c
+++ b/lib/unistr/u8-strchr.c
@@ -21,25 +21,20 @@
 /* Specification.  */
 #include "unistr.h"

+#include <string.h>
+
 uint8_t *
 u8_strchr (const uint8_t *s, ucs4_t uc)
 {
   uint8_t c[6];

   if (uc < 0x80)
-    {
-      uint8_t c0 = uc;
-
-      for (;; s++)
-        {
-          if (*s == c0)
-            break;
-          if (*s == 0)
-            goto notfound;
-        }
-      return (uint8_t *) s;
-    }
+    return (uint8_t *) strchr ((const char *) s, uc);
   else
+    /* The following is equivalent to:
+         return memmem (s, strlen(s), c, csize);
+       but faster for long S with matching UC near the start,
+       and also memmem is sometimes buggy and inefficient.  */
     switch (u8_uctomb_aux (c, uc, 6))
       {
       case 2:
-- 
1.6.2.5
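
The comment added by the patch describes the multibyte branch as equivalent to a substring search for the UTF-8 encoding of UC. The sketch below illustrates that equivalence under stated assumptions: encode_utf8 is a hypothetical stand-in for gnulib's u8_uctomb_aux, find_char_sketch is a hypothetical name, and strstr() is swapped in for memmem() because memmem() is a non-standard extension. Using strstr() also avoids the up-front strlen() pass that the memmem() form would need. This is not the code in u8-strchr.c, which uses hand-written loops per encoding length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for u8_uctomb_aux: encode UC (>= 0x80) as UTF-8
   into BUF, which must hold at least 5 bytes, and NUL-terminate it.
   Return the number of encoded bytes, or 0 if UC is a surrogate or is
   beyond U+10FFFF.  */
static size_t
encode_utf8 (uint8_t *buf, uint32_t uc)
{
  size_t n;

  assert (uc >= 0x80);
  if (uc < 0x800)
    {
      buf[0] = 0xC0 | (uc >> 6);
      buf[1] = 0x80 | (uc & 0x3F);
      n = 2;
    }
  else if (uc < 0x10000)
    {
      if (uc >= 0xD800 && uc <= 0xDFFF)
        return 0;
      buf[0] = 0xE0 | (uc >> 12);
      buf[1] = 0x80 | ((uc >> 6) & 0x3F);
      buf[2] = 0x80 | (uc & 0x3F);
      n = 3;
    }
  else if (uc < 0x110000)
    {
      buf[0] = 0xF0 | (uc >> 18);
      buf[1] = 0x80 | ((uc >> 12) & 0x3F);
      buf[2] = 0x80 | ((uc >> 6) & 0x3F);
      buf[3] = 0x80 | (uc & 0x3F);
      n = 4;
    }
  else
    return 0;

  buf[n] = 0;
  return n;
}

/* Hypothetical reference version of the search: ASCII goes through
   strchr, everything else through a substring search for the encoded
   bytes.  */
static uint8_t *
find_char_sketch (const uint8_t *s, uint32_t uc)
{
  uint8_t needle[5];

  if (uc < 0x80)
    return (uint8_t *) strchr ((const char *) s, (int) uc);
  if (encode_utf8 (needle, uc) == 0)
    return NULL;
  return (uint8_t *) strstr ((const char *) s, (const char *) needle);
}
```

A version like this is mainly useful as a reference to test a faster implementation against; for well-formed UTF-8 input both should return the same pointer.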



