bug#20526: BUG: text file is detected as binary
From: Eric Blake
Subject: bug#20526: BUG: text file is detected as binary
Date: Tue, 12 May 2015 06:06:13 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0

On 05/12/2015 02:41 AM, Kamil Dudka wrote:
> On Monday 11 May 2015 21:27:35 Paul Eggert wrote:
>> Perhaps we can improve the behavior of grep by changing its heuristic
>> slightly. Currently grep reports "Binary file FOO matches" if it finds
>> binary data in FOO before it finds the first match. Instead, perhaps we
>> could change grep to report "Binary file FOO matches" when it sees that
>> it's about to generate binary *output* copied from FOO, regardless of
>> whether this output represents the first match. That is, when grep sees
>> that it's about to output binary data, grep instead outputs "Binary file
>> FOO matches" and then stops output for FOO (even if it already output some
>> lines for ordinary matches in FOO).
>>
>> This approach would fix the problem of grep trashing the output stream, and
>> it should be less drastic than grep's current approach, in that it would
>> make grep more likely to do what Kamil Dudka is asking for (assuming grep
>> is given mostly valid input interspersed with small amounts of binary
>> data).
>
> Thanks for the suggestion! I believe that such an approach would work for me.
> Do you want me to write a patch implementing it?
>
> Eric, what do you think about the change proposed above?
I'm still a bit worried that encoding errors encountered on input, even
though they don't match for output, may still cause issues for some
patterns (we've had cases of encoding errors causing 'grep -P' to go
into an infinite loop, for example); but yes, as the behavior is
undefined, we are still justified in adopting those heuristics, if
someone is willing to contribute a patch along those lines.
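
For anyone picking this up, here is a rough sketch of the heuristic Paul
describes, as I read it: print ordinary matching lines as usual, and only
when a line that is about to be output contains binary data, print "Binary
file FOO matches" and stop output for that file. This is not grep's actual
code. The names looks_binary and proposed_grep are hypothetical, and treating
"a NUL byte or invalid UTF-8" as binary is an assumption made for the
example, not a statement of how grep classifies input.

```python
import re
from typing import Iterable, List


def looks_binary(line: bytes) -> bool:
    """Assumed test for this sketch: a NUL byte or invalid UTF-8 is binary."""
    if b"\0" in line:
        return True
    try:
        line.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


def proposed_grep(name: str, lines: Iterable[bytes], pattern: bytes) -> List[str]:
    """Return the lines that would be printed for one file (hypothetical helper)."""
    regex = re.compile(pattern)
    out: List[str] = []
    for line in lines:
        if not regex.search(line):
            # Binary data in a non-matching line is never output,
            # so under the proposal it does not change the result.
            continue
        if looks_binary(line):
            # About to emit binary output: report and stop for this file,
            # even though earlier matches were already printed.
            out.append(f"Binary file {name} matches")
            break
        out.append(line.decode("utf-8").rstrip("\n"))
    return out
```

With input lines b"foo 1\n", b"\xff\xfe junk\n", b"foo 2\n", b"foo \x00 3\n",
b"foo 4\n" and the pattern b"foo", this sketch yields "foo 1", "foo 2", then
"Binary file FOO matches", and nothing after that. Note that the regex still
runs over the invalid bytes in the second line, which is the part that
worries me for some matchers.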
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org