Re: bug#10953: Potential logical bug in readtokens.c

bug-gnulib

From:	Paul Eggert
Subject:	Re: bug#10953: Potential logical bug in readtokens.c
Date:	Tue, 06 Mar 2012 21:33:02 -0800
User-agent:	Mozilla/5.0 (X11; Linux i686; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2

On 03/06/2012 03:32 PM, Eric Blake wrote:
> Why not just strchr instead of building up an isdelim bitmap?

strchr would not be right, since '\0' is valid in data and
as a delimiter.
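
The difference can be seen in a small sketch. The helper names
below are hypothetical and are not part of readtokens.c; they only
contrast a strchr-based delimiter test with a memchr-based one that
takes an explicit length, as readtoken's delimiter argument does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: delimiter test via strchr.  strchr stops at
   the first '\0' in DELIM, and it always "finds" '\0' itself because
   the terminator counts as part of the string.  */
bool
is_delim_strchr (const char *delim, int c)
{
  return strchr (delim, c) != NULL;
}

/* Hypothetical helper: delimiter test via memchr.  The explicit
   length N_DELIM lets '\0' be an ordinary byte.  */
bool
is_delim_memchr (const char *delim, size_t n_delim, int c)
{
  return memchr (delim, c, n_delim) != NULL;
}

/* Illustrative checks only.  */
void
strchr_vs_memchr_demo (void)
{
  /* Delimiters are '\0' and '\n' (2 bytes).  strchr sees an empty
     string and never reaches '\n'.  */
  assert (!is_delim_strchr ("\0\n", '\n'));
  assert (is_delim_memchr ("\0\n", 2, '\n'));

  /* Delimiters are ' ' and '\t' only.  strchr still reports '\0'
     as a delimiter; memchr does not.  */
  assert (is_delim_strchr (" \t", '\0'));
  assert (!is_delim_memchr (" \t", 2, '\0'));
}
```

So a strchr-based test both misses delimiters that follow an
embedded '\0' and treats '\0' as a delimiter when it was not
requested.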

No doubt you meant 'memchr'; but using 'memchr' would slow
down readtoken by about a factor of two.  I got this result by
timing the following benchmark on gcc-4.6.1.tar (uncompressed)
on Fedora 15 x86-64 with GCC 4.6.2:

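#include <stdio.h>
#include <readtokens.h>

/* File-scope object, so it starts out zero-initialized.  */
struct tokenbuffer t;

int main (void)
{
  /* Split stdin on space, tab and newline until readtoken
     reports end of input by returning (size_t) -1.  */
  for (;;)
    {
      size_t s = readtoken (stdin, " \t\n", 3, &t);
      if (s == (size_t) -1)
        return 0;
    }
}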

On this benchmark, the relative speeds (user+sys CPU time ratios,
bigger numbers are better) are:

 0.54  readtoken with memchr
 1.00  current readtoken (with non-thread-safe byte array)
 1.13  proposed readtoken (with thread-safe bitset)

So the proposed patch is a performance win even in non-thread-safe use.
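
For readers who have not seen the patch, here is a minimal sketch of
what a per-call delimiter bitset can look like. It is not the code
from the proposed patch: the type delim_set and the functions
delim_set_init and delim_set_contains are names invented for this
illustration. The point is that the set lives in the caller's
automatic storage, so no state is shared between calls or threads.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types for this sketch.  */
typedef size_t word;
enum { BITS_PER_WORD = sizeof (word) * CHAR_BIT };
enum { N_WORDS = (UCHAR_MAX + BITS_PER_WORD) / BITS_PER_WORD };

/* One bit per possible unsigned char value.  */
typedef struct { word bits[N_WORDS]; } delim_set;

/* Build SET from the N_DELIM bytes at DELIM.  '\0' is allowed.  */
void
delim_set_init (delim_set *set, const char *delim, size_t n_delim)
{
  memset (set, 0, sizeof *set);
  for (size_t i = 0; i < n_delim; i++)
    {
      unsigned char c = delim[i];
      set->bits[c / BITS_PER_WORD] |= (word) 1 << (c % BITS_PER_WORD);
    }
}

/* Return true if C is a member of SET.  */
bool
delim_set_contains (const delim_set *set, unsigned char c)
{
  return (set->bits[c / BITS_PER_WORD] >> (c % BITS_PER_WORD)) & 1;
}

/* Illustrative checks only.  */
void
delim_set_demo (void)
{
  delim_set set;  /* automatic storage: private to this call */
  delim_set_init (&set, " \t\n", 3);
  assert (delim_set_contains (&set, ' '));
  assert (delim_set_contains (&set, '\n'));
  assert (!delim_set_contains (&set, 'a'));
  assert (!delim_set_contains (&set, '\0'));
}
```

With 8-bit bytes the set in this sketch covers 256 bits, so it is
much smaller than a 256-byte lookup array and is cheap to rebuild on
every call. Whether that accounts for the measured speedup is not
stated in the message.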

> And why
> are we calling getc() one character at a time, instead of using tricks
> like freadahead() to operate on a larger buffer?
> 
> Also, is readtoken() intended to be a more powerful interface than
> strtok, in which case we _do_ want to be non-threadsafe, and to have a
> readtoken_r interface that is the underlying threadsafe variant that can
> benefit from caching?

I haven't thought about these issues, but surely they are
independent of the proposed patch.
