bug#18266: handling bytes not part of the charset, and other garbage
From: Vincent Lefevre
Subject: bug#18266: handling bytes not part of the charset, and other garbage
Date: Sat, 13 Sep 2014 03:17:41 +0200
User-agent: Mutt/1.5.23-6361-vl-r59709 (2014-07-25)
On 2014-09-12 17:57:39 -0700, Paul Eggert wrote:
> Currently, for example, the tz package <http://www.iana.org/time-zones> has
> a Make rule 'check_character_set' that verifies that the source files are
> all properly encoded. It executes this shell command:
>
> ! grep -nv '^.*$' file names
>
> This relies on GNU grep's behavior that "." does not match an encoding
> error. But it's a command that is not obvious. It'd be simpler and clearer
> to write this:
>
> ! grep -n '[[:error:]]' file names
>
> if such a feature were available.
But both of these solutions have the drawback of working only in
UTF-8 locales. One may wonder whether grep is the right tool, as
"iconv -f UTF-8 -t UTF-8" can do such a check in any locale.
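
For illustration, here is a minimal sketch of such an iconv-based check. It assumes a POSIX shell and an iconv implementation that exits with a non-zero status on an invalid input sequence (glibc's and libiconv's do). The function name check_utf8 is hypothetical and is not part of the tz package.

```shell
# check_utf8 FILE...
# Succeed only if every FILE decodes as valid UTF-8.
# Both encodings are given explicitly, so the result does not
# depend on the current locale.
check_utf8() {
    status=0
    for f in "$@"; do
        # Discard the converted output; only the exit status matters.
        if ! iconv -f UTF-8 -t UTF-8 < "$f" > /dev/null 2>&1; then
            printf '%s: invalid UTF-8\n' "$f" >&2
            status=1
        fi
    done
    return "$status"
}
```

Unlike the grep commands, this reports only which files are invalid, not the offending line numbers.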
--
Vincent Lefèvre <address@hidden> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)