Re: Unicode support
From: Bruno Haible
Subject: Re: Unicode support
Date: Tue, 25 Jul 2006 20:42:11 +0200
User-agent: KMail/1.9.1
Jarl Friis wrote:
> I didn't know that the default
> "to-encoding" on iconv is UTF-8, but a small test reveals this fact.
iconv's default "to-encoding" (as well as its default "from-encoding")
is the locale encoding. It can be specified at system installation
time (for most Linux distributions) or later, ad-hoc, through the
environment variables LANG or LC_ALL.
If you found out that for you, the default "to-encoding" is UTF-8, it
means you are already in a UTF-8 locale.
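One way to check which encoding that is, as a minimal sketch: the POSIX 'locale charmap' command prints the character encoding of the active locale. The function name locale_encoding is only a hypothetical helper for illustration, and the exact name printed for a given encoding can differ between systems.

```shell
#!/bin/bash
# Print the character encoding of the active locale. This is the encoding
# iconv falls back to when -f or -t is omitted.
# (locale_encoding is a hypothetical helper name, not part of any tool.)
locale_encoding() {
    locale charmap
}

# Encoding of the current environment.
locale_encoding

# The same query with the locale overridden ad hoc through LC_ALL.
LC_ALL=C locale charmap
```

If the first command prints UTF-8, iconv's default "to-encoding" is UTF-8 as well.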
> So I assume with these very good arguments that the diff utils support
> UTF-8, right?
The input and output encoding of 'diff' is also the locale encoding.
So, for you, it's UTF-8. Other people, who don't usually work in a UTF-8
locale, can convert the UTF-8 files to their locale encoding before
running 'diff':
#!/bin/bash
# Convert both UTF-8 files to the locale encoding (iconv's default), then diff.
inputfile1=$1
inputfile2=$2
diff <(iconv -f UTF-8 < "$inputfile1") <(iconv -f UTF-8 < "$inputfile2")
Or they can run 'diff' on the UTF-8 files directly and then convert the
output to their locale encoding:
#!/bin/bash
# Run diff in a UTF-8 locale, then convert its output to the locale encoding.
LC_ALL=en_US.UTF-8 diff "$@" | iconv -f UTF-8
The result should be essentially the same.
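The two cases (already in a UTF-8 locale, or not) could be folded into one helper. This is only a rough sketch, not something diffutils provides: utf8_diff is a hypothetical name, it assumes both inputs are valid UTF-8, and it uses temporary files instead of process substitution so that it also runs in a plain POSIX shell.

```shell
#!/bin/bash
# Hypothetical helper: diff two UTF-8 files, with output in the locale encoding.
# Exit status follows diff (0 = same, 1 = different); 2 signals a failure here.
utf8_diff() {
    # In a UTF-8 locale no conversion is needed.
    if [ "$(locale charmap)" = "UTF-8" ]; then
        diff "$1" "$2"
        return
    fi

    tmp1=$(mktemp) || return 2
    tmp2=$(mktemp) || { rm -f "$tmp1"; return 2; }

    # Convert each input from UTF-8 to the locale encoding (iconv's default
    # "to-encoding"), then compare the converted copies.
    if iconv -f UTF-8 < "$1" > "$tmp1" && iconv -f UTF-8 < "$2" > "$tmp2"; then
        diff "$tmp1" "$tmp2"
        status=$?
    else
        status=2    # a file could not be converted to the locale encoding
    fi

    rm -f "$tmp1" "$tmp2"
    return "$status"
}
```

It would be called as: utf8_diff old.txt new.txt (the file names are placeholders). Characters that do not exist in the locale encoding make iconv fail, in which case the helper returns 2 rather than a misleading diff.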
Bruno