Re: new module c-strstr
From: Paul Eggert
Subject: Re: new module c-strstr
Date: Fri, 18 Aug 2006 10:44:40 -0700
User-agent: Gnus/5.1008 (Gnus v5.10.8) Emacs/21.4 (gnu/linux)
Thanks, that looks nice, but some quibbles about the comments:
> /* The functions defined in this file assume the "C" locale and a character
> set without diacritics (ASCII-US or EBCDIC-US or something like that).
> Even if the "C" locale on a particular system is an extension of the ASCII
> character set (like on BeOS, where it is UTF-8, or on AmigaOS, where it
> is ISO-8859-1), the functions in this file recognize only the ASCII
> characters. More precisely, one of the string arguments must be an ASCII
> string with additional restrictions. */
The intent here is to act like the "C" locale, where all single bytes count
as characters, even when some other locale is in effect, right? So
the comment is misleading, since the code doesn't assume the "C"
locale. How about the following comment instead?
/* c_strstr behaves like strstr would behave in the "C" locale, where
every single byte counts as a distinct character. */
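To make "every single byte counts as a distinct character" concrete, here is a minimal sketch of a search that compares bytes and never consults the locale. It is not the code under review: the name bytewise_strstr and the simple quadratic loop are assumptions made only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration, not the gnulib c_strstr: find NEEDLE in
   HAYSTACK by comparing single bytes, without consulting the locale.  */
char *
bytewise_strstr (const char *haystack, const char *needle)
{
  size_t needle_len = strlen (needle);

  /* An empty needle matches at the start, as with strstr.  */
  if (needle_len == 0)
    return (char *) haystack;

  /* Try each starting byte; strncmp compares bytes and stops at the
     terminating NUL, so a short haystack tail never matches.  */
  for (; *haystack != '\0'; haystack++)
    if (*haystack == *needle
        && strncmp (haystack, needle, needle_len) == 0)
      return (char *) haystack;

  return NULL;
}
```

With this sketch, bytewise_strstr ("hello world", "world") returns a pointer to the "w", whatever locale is in effect.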
> This function is safe to be called, even in a multibyte locale, if NEEDLE
> ...
I think this claim isn't true for some weird non-ASCII encoding
schemes like DBCS-Host. Also, it wouldn't be true if someone
introduced a new encoding that varies from ASCII in some other way.
How about changing the wording to be:
In all practical encodings that we know of that are extensions or
near-extensions of ASCII, this function is safe to be called, even
in a multibyte locale, if NEEDLE ...
Another possibility would be to remove the claim entirely, since it's not
that relevant to the intended use of c_strstr.
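As a rough illustration of why a byte-wise search needs some restriction on NEEDLE in a multibyte encoding, consider Shift_JIS, where the second byte of a two-byte character can fall in the ASCII range. The sketch below uses the standard strstr, which compares bytes; the names sjis_so and has_false_match are made up for this example, and Shift_JIS is my choice of encoding, not one named in the quoted comment.

```c
#include <string.h>

/* Shift_JIS encoding of the katakana character SO (U+30BD): 0x83 0x5C.
   The second byte has the same value as ASCII backslash.  */
static const char sjis_so[] = "\x83\x5c";

/* Return nonzero if a byte-wise search for a backslash reports a match
   on the trailing byte of the two-byte character, i.e. a false match.  */
int
has_false_match (void)
{
  return strstr (sjis_so, "\\") == sjis_so + 1;
}
```

Here the search "finds" a backslash that is really the tail of a two-byte character, so whether the function is safe depends on which bytes NEEDLE may contain.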
> foundneedle:
> return (char*) haystack;
The usual GNU style puts a space before the "*".