bug#23763: Bug report: Grep stops, if a text file contains a null charac
From: Bjoern Voigt
Subject: bug#23763: Bug report: Grep stops, if a text file contains a null character after 32768 bytes
Date: Tue, 14 Jun 2016 22:10:27 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:43.0) Gecko/20100101 Firefox/43.0 SeaMonkey/2.40
Paul Eggert wrote:
> Bjoern Voigt wrote:
>> This is clearly a bug in my eyes.
>
> The behavior conforms to grep's spec, so it's not a bug in that sense.
> I don't offhand see a behavior change that wouldn't cause worse
> problems elsewhere. Unless you were thinking of adding an option?
The current manual page patched with
"0001-doc-remove-obsolete-MS-DOS-mention-2.patch" says:
    --binary-files=TYPE
        If the first few bytes of a file indicate that the file
        contains binary data, assume that the file is of type TYPE. By
        default, TYPE is binary, and grep normally outputs either a
        one-line message saying that a binary file matches, or no
        message if there is no match. If TYPE is without-match, grep
        assumes that a binary file does not match; this is equivalent
        to the -I option. If TYPE is text, grep processes a binary
        file as if it were text; this is equivalent to the -a option.
        When processing binary data, grep may treat non-text bytes as
        line terminators; for example, the pattern '.' (period) might
        not match a null byte, as the null byte might be treated as a
        line terminator. Warning: grep --binary-files=text might
        output binary garbage, which can have nasty side effects if
        the output is a terminal and if the terminal driver interprets
        some of it as commands.
My test case, in which a file starts with more than 32 KB of text data and
continues with text data containing at least one embedded null character
(which makes it binary data), is undocumented.
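To make the test case concrete, here is a minimal sketch that builds such input in memory and shows why a check limited to the first bytes misses the null character. The 32768 figure is taken from the bug title; it is an assumption for illustration, not a value read from grep's source, and the helper names are hypothetical.

```python
# Hypothetical probe size, taken from the bug title (not from grep's source).
PROBE_SIZE = 32768


def make_test_data(text_bytes: int = PROBE_SIZE + 100) -> bytes:
    """Return plain text longer than text_bytes, followed by a line with a NUL."""
    line = b"some text line\n"
    repetitions = text_bytes // len(line) + 1
    head = line * repetitions           # pure text, longer than text_bytes
    return head + b"before\x00after\n" + b"last line\n"


def has_nul(data: bytes) -> bool:
    """Return True if the data contains a null byte."""
    return b"\x00" in data


data = make_test_data()
# A check on the first PROBE_SIZE bytes sees only text ...
print("NUL in first bytes:", has_nul(data[:PROBE_SIZE]))
# ... while the file as a whole does contain a null byte.
print("NUL anywhere:", has_nul(data))
```

Writing `data` to a file and running grep on it should reproduce the situation described in this report, though the exact output depends on the grep version.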
Consequently, I am probably looking for a new option, "--binary-files=auto",
which should also become the default at some later point.
For files it should work as follows:
    --binary-files=auto
        If the first few bytes of a file indicate that the file
        contains binary data, assume that the file is of type binary.
        Otherwise assume that the file is of type text.
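The proposed rule could be sketched roughly as follows. This is only an illustration of the idea, not grep's implementation: the function name, the probe size, and the use of a null byte as the sole binary indicator are all assumptions.

```python
def classify_auto(data: bytes, probe_size: int = 32768) -> str:
    """Classify input as 'binary' or 'text' from its first bytes only.

    Hypothetical sketch of the proposed --binary-files=auto rule:
    a null byte inside the probe window means binary; anything
    after the window never changes the decision.
    """
    probe = data[:probe_size]
    return "binary" if b"\x00" in probe else "text"


# A null byte near the start makes the input binary.
print(classify_auto(b"\x7fELF\x00\x01\x02"))
# A null byte after the probe window leaves the input classified as text.
print(classify_auto(b"a" * 40000 + b"\x00tail\n"))
```

The design choice here is that the decision is made once and then kept, so a late null byte no longer changes how the rest of the file is handled.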
Since the behavior of --binary-files=binary for my test case is
undocumented, and since the output is more or less useless apart from the
fact that some non-printable characters are suppressed on the terminal,
another option would be to change the --binary-files=binary mode in the
code and in the manual page.
For files as input data this is easy to implement. But I haven't
checked how --binary-files should work with standard input. There, the
decision between binary and text should be made before the first match is
printed.
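For standard input, one way to decide before the first match is printed is to buffer an initial chunk, classify it, and only then start emitting output. The sketch below assumes a hypothetical probe size and a null-byte test; the function name and the wording of the one-line message are illustrative and not taken from grep's source.

```python
import io
from typing import BinaryIO, Iterator

PROBE_SIZE = 32768  # hypothetical; the proposal only says "the first few bytes"


def grep_stream(stream: BinaryIO, needle: bytes,
                probe_size: int = PROBE_SIZE) -> Iterator[bytes]:
    """Yield output for a stream, deciding binary vs. text before any output."""
    probe = stream.read(probe_size)          # buffer the start of the input
    if b"\x00" in probe:
        # Binary: report at most a one-line message, never raw lines.
        if needle in probe + stream.read():
            yield b"Binary file (standard input) matches\n"
        return
    # Text: the decision is final, even if a null byte shows up later.
    head, sep, tail = probe.rpartition(b"\n")
    lines = [part + b"\n" for part in head.split(b"\n")] if sep else []
    # The probe may end mid-line; join the remainder with the rest of that line.
    lines.append(tail + stream.readline())
    for line in lines:
        if line and needle in line:
            yield line
    for line in stream:                      # remaining lines, read lazily
        if needle in line:
            yield line


# Example with an in-memory stream standing in for standard input.
for out in grep_stream(io.BytesIO(b"alpha\nbe\x00ta\n"), b"ta", probe_size=8):
    print(out)
```

With a real pipe, `sys.stdin.buffer` could be passed in place of the `io.BytesIO` object; the cost of this approach is that no output appears until the probe has been read or the input ends.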
My MySQL mysqldump problem can be solved with --text or
--binary-files=text, so I am no longer looking for a quick solution.
Regards,
Björn