Re: [Lynx-dev] Unicode-marking, &c
From: Thomas Dickey
Subject: Re: [Lynx-dev] Unicode-marking, &c
Date: Thu, 26 Feb 2009 15:53:18 -0500
User-agent: Mutt/1.5.18 (2008-05-17)

On Thu, Feb 26, 2009 at 06:49:02PM +0000, Thorsten Glaser wrote:
> Thomas Dickey dixit:
>
> >> Here under Windows there are constant references to the character that
> >> begins a 16-bit-wide-character file (FF FE) or UTF-8 file (EF BB BF).
>
> Note that this is not about Windows® though - the Byte Order Mark,
> Unicode FEFF, UCS-2BE 0xFE 0xFF, UCS-2LE 0xFF 0xFE, UTF-8 0xEF 0xBB 0xBF,
> is a standardised thing.
>
> > Lynx handles _some_ cases - but a url would help, so we can see.
>
> Attached.
>
> Lynx handles all three poorly: the UTF-8 BOM isn't stripped, the UCS-2
> files end with an ampersand instead of the … (ellipsis).
Lynx assumes the document charset is ISO-8859-1 if it's not given.
(That was the rule for some time - for HTML - perhaps we're not
discussing HTML anymore).
Setting the assumed document charset to UTF-8 makes it display properly.
0xFE is a valid ISO-8859-1 code, as your terminal emulator shows...
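
To make the point above concrete, here is a minimal Python sketch (not Lynx code; the names `BOMS`, `sniff_bom` and `decode_with_bom` are made up for illustration). It assumes only the three signatures mentioned in this thread and falls back to ISO-8859-1 when none is present, which mirrors the default described above. It shows why the BOM bytes turn into visible characters when a file is read as ISO-8859-1, and how they disappear once the signature is recognised.

```python
import codecs

# BOM signatures mentioned in the thread, paired with a codec name.
# UTF-32 is deliberately ignored in this sketch.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]


def sniff_bom(data, default="iso-8859-1"):
    """Return (encoding, payload); payload has any recognised BOM removed."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, data[len(bom):]
    # No signature found: fall back to the assumed charset.
    return default, data


def decode_with_bom(data):
    """Decode bytes, honouring a leading BOM if one is present."""
    encoding, payload = sniff_bom(data)
    return payload.decode(encoding)


# Every byte value is a valid ISO-8859-1 code, so a BOM decodes without
# error and shows up as ordinary characters instead of being dropped.
utf8_sample = codecs.BOM_UTF8 + "caf\u00e9\u2026".encode("utf-8")
naive = utf8_sample.decode("iso-8859-1")      # starts with U+00EF U+00BB U+00BF
fixed = decode_with_bom(utf8_sample)          # BOM stripped, text intact

utf16_sample = codecs.BOM_UTF16_LE + "hi\u2026".encode("utf-16-le")
fixed16 = decode_with_bom(utf16_sample)
```

Checking for a signature before applying the default is one possible ordering; whether a BOM should override a declared or assumed charset is a policy decision that this sketch does not settle.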
--
Thomas E. Dickey <address@hidden>
http://invisible-island.net
ftp://invisible-island.net