bug#20526: BUG: text file is detected as binary

bug-grep

From:	Paul Eggert
Subject:	bug#20526: BUG: text file is detected as binary
Date:	Tue, 12 May 2015 17:08:42 -0700
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0

Eric Blake wrote:

> I'm still a bit worried that encoding errors encountered on input, even
> though they don't match for output, may still cause issues for some
> patterns (we've had cases of encoding errors causing 'grep -P' to go
> into an infinite loop, for example);

Yes, that's right. We can't go back to the old way of doing things. Encoding errors in the data must not be matched by any regular expression (not even "."). 'grep -P' won't loop if we never pass encoding errors to the PCRE matcher, so that's what we gotta do.

> but yes, as the behavior is
> undefined, we are still justified in adopting those heuristics, if
> someone is willing to contribute a patch along those lines.

The hard part about it (and the reason I haven't written up a patch yet) is making sure the above property holds, while continuing to have good performance in the typical case where the input is validly encoded. I suppose it's OK, though, if the change hurts performance only for the -P case, since -P is so slow anyway.
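
To illustrate the property described above, here is a minimal sketch in Python. It is not grep's code (grep is written in C and no patch exists in this message); the function names valid_utf8_runs and line_matches are hypothetical, and UTF-8 is assumed as the encoding. The idea is to try the whole line first, which is the cheap path when the input is validly encoded, and only on a decoding error to split the line into validly encoded runs so that the matcher never sees the bad bytes.

```python
import re


def valid_utf8_runs(data: bytes):
    """Yield (offset, text) for each maximal validly encoded UTF-8 run.

    Hypothetical helper: bytes that form an encoding error are skipped
    and are never handed to the regex matcher.
    """
    pos = 0
    while pos < len(data):
        chunk = data[pos:]
        try:
            # Fast path: the rest of the line decodes cleanly.
            yield pos, chunk.decode("utf-8")
            return
        except UnicodeDecodeError as err:
            # err.start and err.end are offsets into chunk.
            if err.start > 0:
                yield pos, chunk[:err.start].decode("utf-8")
            # Step over the offending bytes (always advance at least one).
            pos += max(err.end, err.start + 1)


def line_matches(pattern: str, line: bytes) -> bool:
    """Return True if pattern matches inside some validly encoded run."""
    regex = re.compile(pattern)
    return any(regex.search(text) for _, text in valid_utf8_runs(line))


if __name__ == "__main__":
    line = b"ab\xffcd"                 # \xff is an encoding error in UTF-8
    print(line_matches("cd", line))    # match lies inside a valid run
    print(line_matches("b.c", line))   # "." must not match the bad byte
```

One limitation of this sketch: because each run is searched separately, anchors such as ^ and $ would also match at the edges of an encoding error, so a real implementation would need to handle line boundaries more carefully.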

Current Thread

bug#20526: BUG: text file is detected as binary, Sebastian Poehn, 2015/05/07
- bug#20526: BUG: text file is detected as binary, Paul Eggert, 2015/05/07
  - bug#20526: BUG: text file is detected as binary, Sebastian Pöhn, 2015/05/07
    - bug#20526: BUG: text file is detected as binary, Eric Blake, 2015/05/07
    - bug#20526: BUG: text file is detected as binary, Kamil Dudka, 2015/05/11
    - bug#20526: BUG: text file is detected as binary, Paul Eggert, 2015/05/12
    - bug#20526: BUG: text file is detected as binary, Kamil Dudka, 2015/05/12
    - bug#20526: BUG: text file is detected as binary, Eric Blake, 2015/05/12
    - bug#20526: BUG: text file is detected as binary, Paul Eggert <=
    - bug#20526: BUG: text file is detected as binary, Ángel González, 2015/05/20
    - bug#20526: BUG: text file is detected as binary, Paul Eggert, 2015/05/07
    - bug#20526: BUG: text file is detected as binary, Sebastian Poehn, 2015/05/08
    - bug#20526: BUG: text file is detected as binary, Paul Eggert, 2015/05/08
  - bug#20526: BUG: text file is detected as binary, Johannes Meixner, 2015/05/08
