emacs-bug-tracker

From: GNU bug Tracking System
Subject: [debbugs-tracker] bug#23763: closed (Bug report: Grep stops, if a text file contains a null character after 32768 bytes)
Date: Mon, 13 Jun 2016 20:02:02 +0000

Your message dated Mon, 13 Jun 2016 14:01:28 -0600
with message-id <address@hidden>
and subject line Re: bug#23763: Bug report: Grep stops, if a text file contains 
a null character after 32768 bytes
has caused the debbugs.gnu.org bug report #23763,
regarding Bug report: Grep stops, if a text file contains a null character 
after 32768 bytes
to be marked as done.

(If you believe you have received this mail in error, please contact
address@hidden)


-- 
23763: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=23763
GNU Bug Tracking System
Contact address@hidden with problems
--- Begin Message ---
Subject: Bug report: Grep stops, if a text file contains a null character after 32768 bytes
Date: Mon, 13 Jun 2016 21:45:30 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:43.0) Gecko/20100101 Firefox/43.0 SeaMonkey/2.40

Grep shows a bug, if it processes a text file with at least one embedded
0 (ASCII zero) character after byte 32768. Grep stops with the error
message "Binary file testfile.txt matches" and exit code 0. The error
message is written to standard output. Any line after the 0 character is
silently ignored in output.

Environment:
- grep-2.25
- no patches, no "configure" options
- openSUSE Tumbleweed 20160611 x86_64; glibc 2.23; libpcre 8.38

I first saw this bug when I tried to filter a line out of the output of
the MySQL backup utility "mysqldump". Because grep stopped at the 0
character, the backups were incomplete.

# mysqldump --all-databases | grep -v '^-- Dump completed on'
[... around 240 lines of SQL output ...]
LOCK TABLES `PartTable` WRITE;
/*!40000 ALTER TABLE `PartTable` DISABLE KEYS */;
Binary file (standard input) matches
mysqldump: Got errno 32 on write

I found that the mysqldump output contains 0 characters in table PartTable.
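
One way to confirm that a stream contains NUL bytes is to delete every
other byte and count what remains. The following is only a minimal
sketch using standard tools; the temporary sample file and the variable
names are hypothetical and stand in for the real mysqldump output:

```shell
#!/bin/bash

# Hypothetical sample: two text lines with a single NUL byte between them.
sample=$(mktemp)
printf 'line one\n\0\nline two\n' > "$sample"

# Delete every byte except NUL (-c complements the set, -d deletes),
# then count the bytes that are left.
nul_count=$(tr -cd '\000' < "$sample" | wc -c)
nul_count=$((nul_count))   # drop padding that some wc versions add

echo "NUL bytes: $nul_count"

rm -f "$sample"
```

A count greater than zero means grep will classify the input as binary
unless told otherwise.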

I wrote the following test script, which shows the bug without a
dependency on MySQL:
--------------------------------------------------------
#!/bin/bash

testfile="testfile.txt"

# write a text file large enough (16384 lines of 2 bytes each
# = 32768 bytes, the minimum for this test case)
for ((i = 1; i <= 16384; i++)); do echo "A"; done > "$testfile"

# write a zero byte (echo appends a newline after it)
echo -e '\0' >> "$testfile"

# write an end line
echo 'A ... the end' >> "$testfile"

# verify the file contents
ls -l "$testfile"
tail -n 10 "$testfile"

# use 'grep' to find all lines with the string "A"
grep "A" "$testfile"

# the last line is missing, the output ends with
# "Binary file testfile.txt matches"

# check the exit code
echo "Exit code of grep: $?"
--------------------------------------------------------

The last line "A ... the end" is missing from grep's output. The exit
code is 0:

# ./null-bug-testcase.txt
[...]
A
A
A
Binary file testfile.txt matches
Exit code of grep: 0

I also found this bug in older grep versions (e.g. Ubuntu 14.04; grep 2.16).

FreeBSD's version of grep (tested with 2.5.1-FreeBSD under FreeBSD
10.3-RELEASE-p4) does not show the bug:

# ./null-bug-testcase.txt
[...]
A
A
A
A ... the end
Exit code of grep: 0

Regards,
Björn



--- End Message ---
--- Begin Message ---
Subject: Re: bug#23763: Bug report: Grep stops, if a text file contains a null character after 32768 bytes
Date: Mon, 13 Jun 2016 14:01:28 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

tag 23763 notabug
thanks

On 06/13/2016 01:45 PM, Bjoern Voigt wrote:
> Grep shows a bug, if it processes a text file with at least one embedded
> 0 (ASCII zero) character after byte 32768.

Thanks for the report.  However, this is not a bug in grep, but
documented behavior.  By definition, a text file CANNOT contain NUL
bytes; any file with NUL characters is a binary file.  You can still
make grep process it as a text file, but only with the '-a' flag.

> Grep stops with the error
> message "Binary file testfile.txt matches" and exit code 0. The error
> message is written to standard output. Any line after the 0 character is
> silently ignored in output.

POSIX allows this behavior, in that it says that grep's behavior is
undefined on non-text files (which you have by virtue of your NUL byte).

Since this is documented behavior of GNU grep when -a is not used, I'm
closing this as not a bug. But feel free to add further comments to this
thread.
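
As an illustration of the '-a' behavior described above, here is a
minimal sketch. It assumes GNU grep, where '-a' is equivalent to
'--binary-files=text'. The small temporary file and the variable names
are hypothetical; the file is a stand-in for the reporter's test file,
not a reproduction of its exact size:

```shell
#!/bin/bash

# Hypothetical stand-in for the test file: a text line, a line holding
# only a NUL byte, and a final text line.
testfile=$(mktemp)
{
  printf 'A\n'
  printf '\0\n'
  printf 'A ... the end\n'
} > "$testfile"

# With -a, grep treats the input as text even though it contains NUL,
# so lines after the NUL byte are still searched and printed.
matches=$(grep -a -c 'A' "$testfile")
last=$(grep -a 'A' "$testfile" | tail -n 1)

echo "Matching lines with -a: $matches"
echo "Last matching line: $last"

rm -f "$testfile"
```

Under the same assumption, the pipeline from the original report would
become `mysqldump --all-databases | grep -a -v '^-- Dump completed on'`;
that variant was not tested here.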

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


--- End Message ---
