Re: grep 2.5.1: NUL byte doesn't match a complemented character class
From: Joe Wells
Subject: Re: grep 2.5.1: NUL byte doesn't match a complemented character class
Date: Thu, 23 Aug 2007 13:13:28 +0100
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.1 (gnu/linux)
Jim Meyering <address@hidden> writes:
> Joe Wells <address@hidden> wrote:
>> In grep 2.5.1, a NUL character doesn't match a complemented character
>> class.
> ...
>
> Works for me, using Debian's 2.5.1.ds2-6:
>
> $ print '\0x'|grep '[^x]x'
> Binary file (standard input) matches
> $ grep '[^x]x'
> address@hidden
> Binary file (standard input) matches
>
> I get the same results with grep-2.5.1-57.fc7.
Hi, Jim,
Thanks very much for following my description of how to reproduce the
bug! Your negative report is very helpful because it has inspired me
to investigate further.
I now can see what the difference between your environment and mine
must be. I'm also using this environment variable setting:
LC_CTYPE=en_US.UTF-8
When I change this (just for the “grep” process) to
LC_CTYPE=C
the problem goes away.
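
To make the comparison concrete, here is a minimal sketch that feeds the same NUL-containing input to grep under both settings. The helper name try_locale is made up for illustration, and the sketch assumes the en_US.UTF-8 locale is installed (check with "locale -a"); if it is not, grep may silently fall back to the C locale. Note also that LC_ALL, if set, overrides LC_CTYPE. No particular result is claimed here, since the outcome depends on the grep version.

```shell
# Run the same NUL-containing input through grep under a given LC_CTYPE.
# Assumption: en_US.UTF-8 is installed; try_locale is a hypothetical helper.
try_locale() {
    # $1 is the LC_CTYPE value; it is set for the grep process only.
    if printf '\0x\n' | LC_CTYPE="$1" grep -q '[^x]x'; then
        echo "$1: match"
    else
        # Reached when grep finds nothing (status 1) or fails (status 2).
        echo "$1: no match"
    fi
}

try_locale C
try_locale en_US.UTF-8
```

If the two lines disagree, the locale is what makes the difference.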
I am confused by this. My understanding of UTF-8 is that a single
octet with all bits off gets interpreted as character U+0000 (“NUL”
a.k.a. “NULL”), which is a perfectly valid Unicode character.
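
One way to check that reading of UTF-8 outside of grep is to hand a single zero octet to iconv, which exits with a non-zero status on invalid input. This is only a sketch: the helper name nul_as_utf16 is made up for illustration, and it assumes iconv and od are available.

```shell
# Decode one zero octet as UTF-8 and re-encode it as UTF-16BE.
# iconv fails on invalid input, so success means the octet is valid UTF-8.
# nul_as_utf16 is a hypothetical helper name.
nul_as_utf16() {
    printf '\0' | iconv -f UTF-8 -t UTF-16BE | od -An -tx1
}

nul_as_utf16
```

If the octet is accepted as U+0000, the dump should show two zero bytes, the UTF-16BE encoding of that character.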
By the way, I have verified with “od -b” (your program!) exactly what
octets grep is seeing.
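
For anyone repeating that check, a minimal sketch of the od step follows. The helper name show_octets is made up for illustration, and printf is used to generate the same two-character input plus a newline.

```shell
# Dump the exact octets that would reach grep, one octal value per byte.
# show_octets is a hypothetical helper name.
show_octets() {
    od -b
}

printf '\0x\n' | show_octets
```

The first output line should list 000 (NUL), 170 (x) and 012 (newline) after the offset column.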
I suppose the problem might be in glibc? Or perhaps there is a bug in
the locale data files?
(Incidentally, if any character in the input is being discarded for
some reason (e.g., invalid UTF-8), could grep please emit an error
message when that happens? Otherwise such problems will be too
difficult to track down.)
By the way, I am using Ubuntu 6.06 LTS (“Dapper Drake”) with all
libraries and utilities up to date. Therefore, you can check at
<URL:http://packages.ubuntu.com/dapper/> to see what version of any
package I am running.
> [resending just to you, because your mail server blocked my first reply
>
> <address@hidden>: host izanami.macs.hw.ac.uk[137.195.13.6] said: 550
> 82.230.74.64 is listed in rbl-plus.mail-abuse.ja.net (in reply to RCPT TO
> ]
Sorry about that! I don't know why that RBL lists that IP address
(mx.meyering.net). I'm glad I got your second e-mail.
--
Joe