Re: quote characters in stds
From: James Youngman
Subject: Re: quote characters in stds
Date: Tue, 7 Jun 2005 14:36:54 +0100
User-agent: Mutt/1.3.28i
Karl writes:
> Unicode contains the unambiguous quote characters required, and its
> common encoding address@hidden is upward compatible with address@hidden
It might be worth pointing out that all valid ASCII files are valid
UTF-8 files, but not all valid Latin-1 files are valid UTF-8 files.
Specifically, some Latin-1 character codes are used in UTF-8 as the
leading bytes of multibyte sequences (for example 0xE8, which is an e
with a grave accent in Latin-1). Unicode's character repertoire is a
superset of Latin-1's, but that doesn't mean that you can load a
Latin-1 file as if it were UTF-8.
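To make the difference concrete, here is a minimal Python sketch. The helper name is_valid_utf8 and the sample strings are purely illustrative assumptions, not part of any proposed wording; the check relies only on Python's built-in strict UTF-8 decoder.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Pure ASCII bytes are also valid UTF-8.
ascii_bytes = "plain 'quoted' text".encode("ascii")

# The same word in two encodings: in Latin-1 the e-grave is the single
# byte 0xE8; in UTF-8 it is the two-byte sequence 0xC3 0xA8.
latin1_bytes = "tr\u00e8s".encode("latin-1")
utf8_bytes = "tr\u00e8s".encode("utf-8")

print(is_valid_utf8(ascii_bytes))
print(is_valid_utf8(latin1_bytes))
print(is_valid_utf8(utf8_bytes))
```

In the Latin-1 bytes, 0xE8 looks to a UTF-8 decoder like the start of a three-byte sequence, but the next byte (the ASCII "s") is not a continuation byte, so strict decoding fails.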
It might be worth considering this wording change...
> Unicode contains the unambiguous quote characters required, and its
> common encoding address@hidden is upward compatible with address@hidden
> However, you can't process a Latin-1 encoded file as if it were
> address@hidden, because some Latin-1 character codes are used to begin
> multibyte character sequences in address@hidden
... though this is sort of drifting away from the main point of a
section on quote characters and into guidance on handling character
encoding systems.
James.