bug-grep


From: Hans Pelleboer
Subject: bug#19230: Help! grepV2.21 treats ISO-8859 text files as if they are binary
Date: Mon, 01 Dec 2014 08:57:52 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0

I think you nailed it, Paul:

OS: Arch Linux / kernel 3.17.4 / x86_64, locale is set to UTF-8
grep came straight from the Arch repository.

As grep 2.20 still showed the `old', more forgiving behaviour,
I was wondering whether grep can be compiled in such a way
that it processes all text files, no matter how they are encoded.
After all, sed, vi, emacs, the works, do just that.

Yours,

hansp

On 11/30/2014 11:02 PM, Paul Eggert wrote:
> Hans Pelleboer wrote:
>
>> Binary file <NAME_FILE> matches
>>
>> Further tests showed that grep only behaved this way with text
>> files that were encoded according to ISO-8859 (there may be more!).
>
> What operating system are you running on, and how did you build or import grep?
>
> Also, what's your locale? What is the output of the shell command 'locale'?
>
> I can see this happening if you are using a UTF-8 locale, as in general ISO-8859 text is not valid UTF-8. Older versions of 'grep' were less picky in this area, and that might explain the symptoms you observed. With newer versions it's more important for the locale to be compatible with the text file's encoding.
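
To illustrate why ISO-8859 text is generally not valid UTF-8, here is a minimal Python sketch. It does not reproduce grep's own detection logic; the helper name is_valid_utf8 and the sample word are assumptions made only for this illustration. It shows that a byte sequence which is perfectly good ISO-8859-1 can fail a strict UTF-8 decode.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The same word encoded two ways. ISO-8859-1 stores U+00E9 as the single
# byte 0xE9; UTF-8 stores it as the two-byte sequence 0xC3 0xA9.
word = "caf\u00e9"
latin1_bytes = word.encode("iso-8859-1")
utf8_bytes = word.encode("utf-8")

print(is_valid_utf8(latin1_bytes))  # False: 0xE9 starts a multi-byte sequence that never completes
print(is_valid_utf8(utf8_bytes))    # True
```

Pure ASCII text passes either way, which is presumably why only files containing accented or other non-ASCII characters would trigger the "Binary file ... matches" message.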





