bug#19242: latest grep considers text files as binary
From: Jim Meyering
Subject: bug#19242: latest grep considers text files as binary
Date: Fri, 5 Dec 2014 07:00:21 -0800
On Fri, Dec 5, 2014 at 1:58 AM, Thomas Wolff <address@hidden> wrote:
> Paul Eggert wrote:
>>>
>>> the mentioned patches are apparently intended to fix issues in non-UTF-8
>>> locales.
>>
>> No, they're also needed for UTF-8 locales I'm afraid. There are some
>> security issues, not only having to do with grep's internals, but also for
>> the behavior of downstream programs that may be expecting UTF-8 text.
>>
>> You can work around the problem with 'grep -a'.
>
> I was aware of this workaround but I claim it should not be needed because
> the files affected are in fact not binary files but text files. The manual
> clearly says about -a: "Process a binary file as if it were text" but
> partial content in a different text encoding does not make a file binary.
>
> Jim Meyering wrote:
>>
>> this is due to documented and desirable behavior.
>
> I deny this is desirable behavior and I doubt there is a security issue as
> described. If any other, independent software has a security issue with
> non-UTF-8 input, it should decide itself to filter it and use accordingly
> stable decoding functions. It cannot be the task of any tool (grep in this
> case) to filter output to work around possible security issues in other
> programs in a pipe. This would be completely against the concept of pipes in
> the Unix tradition.

This is another side effect of using a multibyte locale.
As long as there are no NUL bytes in your input, you can work
around the issue by running grep in the C locale:

    LC_ALL=C grep ...
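
The two workarounds mentioned in this thread ('grep -a' and 'LC_ALL=C grep') can be tried on a small sample. The following is only a sketch: the temporary file, its contents, and the pattern "needle" are made up for illustration, and the exact message printed by a plain grep depends on the grep version and on the locale in effect.

```shell
#!/bin/sh
# Hypothetical sample: two text lines, one containing a lone Latin-1
# byte (octal 351, i.e. 0xE9), which is not valid UTF-8 by itself.
# The file contains no NUL bytes.
sample=$(mktemp)
printf 'caf\351 needle\nplain needle\n' > "$sample"

# In a UTF-8 locale, an affected grep may report the file as binary
# here instead of printing the matching lines (not checked below,
# since it depends on the grep version and locale).
grep needle "$sample" || true

# Workaround 1: -a processes the file as text.
a_lines=$(grep -a needle "$sample" | wc -l | tr -d ' ')

# Workaround 2: run grep in the C locale, as suggested above.
c_lines=$(LC_ALL=C grep needle "$sample" | wc -l | tr -d ' ')

echo "grep -a printed $a_lines lines; LC_ALL=C grep printed $c_lines lines"
rm -f "$sample"
```

Both workarounds print the raw bytes of the matching lines, so whatever reads the output still has to cope with the non-UTF-8 byte.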