From: | Paul Eggert |
Subject: | bug#28255: grep erroneously skips Microsoft UTF-8 text files as being binary |
Date: | Sun, 27 Aug 2017 14:47:28 -0700 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 |
Simon wrote:
Windows text files can start with a byte order mark of U+FEFF and then be encoded in UTF-8. These are skipped as being binary files.
I can't reproduce this problem on Fedora 26 x86-64. Here's how I tried:

  $ printf '\357\273\277x\n' >t
  $ LC_ALL=C grep x t | od -c
  0000000 357 273 277   x  \n
  0000005

To help us diagnose the problem, please send a simple, self-contained example, and mention your platform.