bug#23234: unexpected results with charset handling in GNU grep 2.23

bug-grep

From:	Bjoern Jacke
Subject:	bug#23234: unexpected results with charset handling in GNU grep 2.23
Date:	Thu, 7 Apr 2016 01:04:04 +0200
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0

On 07.04.2016 00:33, Eric Blake wrote:
> That behavior complies with POSIX requirements.

Can you give a quote here? One thing that is not POSIX compliant is
that the diagnostic message is written to stdout.
http://pubs.opengroup.org/onlinepubs/9699919799/ says:

--snip--
LC_MESSAGES
    Determine the locale that should be used to affect the format and
contents of diagnostic messages written to standard error.
--snap--

which implies that diagnostic messages should be written to standard
error.

> Again, a script SHOULD
> NOT be grepping binary files (POSIX only defines grep on text files)
> without knowing the ramifications.  Meanwhile, 'grep -a' guarantees you
> won't get the "Binary file" message.

If you consider grepping text files with mixed encodings an invalid use
of grep, then you should not return 0 and/or output "Binary file
(standard input) matches" on stdout. This makes the output of GNU grep
look like a valid match.
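
To make "mixed encodings" concrete, here is a minimal Python sketch that reports which lines of a byte string fail to decode as UTF-8. It is only an illustration of the kind of input under discussion, not GNU grep's actual detection logic; the function name and the sample log are hypothetical.

```python
def lines_with_encoding_errors(data, encoding="utf-8"):
    """Return the 1-based numbers of lines that do not decode in `encoding`."""
    bad = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode(encoding)
        except UnicodeDecodeError:
            bad.append(number)
    return bad


# Hypothetical log: line 1 is UTF-8, line 2 is ISO-8859-1.
sample = ("Björn logged in\n".encode("utf-8")
          + "Björn logged out\n".encode("iso-8859-1"))

print(lines_with_encoding_errors(sample))
```

A log like this is readable line by line, yet as a whole it is not valid text in a UTF-8 locale, which is the situation the thread is about.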

You say "grep -a" is the friend of all users who want to grep log
files (because they tend to contain mixed encodings). Sure, -a is a
workaround to make GNU grep work as before again. Realistically, 99.99% of
users will not know that, though, because I guess this is the first grep
version ever that requires it. Also, -a is a GNU-only option,
so portable scripts will not be able to use it.
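
For scripts that cannot rely on -a, one possible alternative is to search the raw bytes directly, so that the encoding of each line never matters. The following Python sketch assumes a fixed substring rather than a regular expression and is not a drop-in replacement for grep; the function name and the sample data are hypothetical.

```python
def matching_lines(pattern, data):
    """Return every line of `data` (bytes) that contains `pattern` (bytes)."""
    return [line for line in data.splitlines() if pattern in line]


# Hypothetical log mixing a UTF-8 line and an ISO-8859-1 line.
log = b"caf\xc3\xa9 opened\n" + b"caf\xe9 closed\n" + b"other entry\n"

for line in matching_lines(b"caf", log):
    print(line)
```

Because the comparison is done on bytes, both the UTF-8 line and the ISO-8859-1 line are returned, and no "Binary file" style message can appear in the result.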

I guess you are aware that you will break a lot of existing scripts
by treating mixed-encoding input files as binary, the way
GNU grep >= 2.23 does now?

Björn
