
bug#23234: unexpected results with charset handling in GNU grep 2.23


From: Bjoern Jacke
Subject: bug#23234: unexpected results with charset handling in GNU grep 2.23
Date: Thu, 7 Apr 2016 01:04:04 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0

On 07.04.2016 00:33, Eric Blake wrote:
> That behavior complies with POSIX requirements.

Can you give a quote here? One thing that is not POSIX compliant is
that the diagnostic message is written to stdout.
http://pubs.opengroup.org/onlinepubs/9699919799/ says:

--snip--
LC_MESSAGES
    Determine the locale that should be used to affect the format and
contents of diagnostic messages written to standard error.
--snap--

which implies that diagnostic messages should be written to standard
error.

> Again, a script SHOULD
> NOT be grepping binary files (POSIX only defines grep on text files)
> without knowing the ramifications.  Meanwhile, 'grep -a' guarantees you
> won't get the "Binary file" message.

If you consider grepping text files with mixed encodings an invalid use
of grep, then you should not return 0 and/or print "Binary file
(standard input) matches" on stdout. This makes the output of GNU grep
look like a valid match.

You say "grep -a" is the friend of all users who want to grep log
files (because they tend to contain mixed encodings). Sure, -a is a
workaround that makes GNU grep work as before again. Realistically,
99.99% of users will not know that though, because this is, I guess,
the first grep version ever that requires it. Also, -a is a GNU-only
option, so portable scripts will not be able to use it.
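
To make the "mixed encodings" case concrete, here is a minimal Python
sketch of how one might find the lines in a log that are not valid in
the expected encoding. The function name undecodable_lines and the
sample data are hypothetical, made up for illustration; this is not how
grep itself decides that input is binary.

```python
def undecodable_lines(data: bytes, encoding: str = "utf-8"):
    """Return the 1-based numbers of lines that are not valid in `encoding`."""
    bad = []
    for number, line in enumerate(data.splitlines(), start=1):
        try:
            line.decode(encoding)
        except UnicodeDecodeError:
            bad.append(number)
    return bad


# Hypothetical log: line 2 encodes "é" as Latin-1 (byte 0xE9),
# while line 1 encodes it as UTF-8 (bytes 0xC3 0xA9).
sample = b"caf\xc3\xa9 ok\ncaf\xe9 ok\nplain ascii\n"
print(undecodable_lines(sample))  # [2]
```

A single such line is enough to make the whole input "mixed" in the
sense discussed here.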

I guess you are aware that treating mixed-encoding input files as
binary, the way GNU grep >= 2.23 does now, will break a lot of existing
scripts?

Björn




