Re: [Lynx-dev] Unicode-marking, &c
From: David Woolley
Subject: Re: [Lynx-dev] Unicode-marking, &c
Date: Sun, 01 Mar 2009 11:27:58 +0000
User-agent: Thunderbird 2.0.0.19 (X11/20081209)
Thomas Dickey wrote:
> yes - it has the meta tags after the title for UTF-8, but has a BOM
> right up front. Lynx isn't seeing the charset tag when it gets the page.
One possible factor here is that browsers aren't required to refetch the
page when they get a charset clash in a meta element, so that meta
element is supposed to be very near the front. I suspect the intent was
that it should be immediately after <head>, and that browsers should
perform a limited lookahead if there was no charset in the real HTTP
headers, or HTTP wasn't used. Unfortunately, authoring tool writers just
know they have to create a lot of boilerplate, including boilerplate
for the benefit of the authoring tool, and don't think that some of it
may need to have priority.
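
To make the limited-lookahead idea concrete, here is a minimal Python sketch. It is not how Lynx or any other browser implements it: the name sniff_charset, the 1024-byte limit and the regular expression are all assumptions made for illustration. It only shows the priority described above: a charset from the real HTTP headers wins, otherwise only the first part of the document is searched for a meta charset.

```python
import re

# Hypothetical limit; the text above only says the lookahead should be "limited".
PRESCAN_LIMIT = 1024

# Loose pattern for a charset declared inside a <meta ...> tag (illustrative only).
_META_CHARSET = re.compile(
    rb"<meta[^>]*?charset\s*=\s*[\"']?\s*([A-Za-z0-9._:-]+)", re.IGNORECASE
)


def sniff_charset(body, http_charset=None, limit=PRESCAN_LIMIT):
    """Return a lower-cased charset name, or None if none is found in time."""
    # A charset from the real HTTP headers takes priority over any meta element.
    if http_charset:
        return http_charset.lower()
    # Otherwise look only at the first `limit` bytes, without refetching anything.
    match = _META_CHARSET.search(body[:limit])
    return match.group(1).decode("ascii").lower() if match else None
```

With this sketch, a meta element placed immediately after <head> is found, while one pushed far down the document by boilerplate is missed, which is the failure mode described above.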
This one looks like a manual template, adapted from authoring tool
output, e.g. look at the unconfigured description and keywords, but also
note the generator.
However, my version of Lynx does correctly identify this as UTF-8, so
the real problem may be in inappropriate error recovery, i.e. Lynx is
assuming that non-ASCII characters, before the meta charset, were
real content that had been sent without at least a preceding title element.
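
To illustrate why a leading BOM can look like stray content, here is a small Python sketch. It is only an assumption about the kind of check involved, not Lynx's actual code, and split_bom is a hypothetical helper name. The three UTF-8 BOM bytes are non-ASCII, so a parser that has not yet settled on UTF-8 may treat them as text.

```python
import codecs


def split_bom(body):
    """Return ('utf-8', body without the BOM) if a UTF-8 BOM leads, else (None, body)."""
    if body.startswith(codecs.BOM_UTF8):
        return "utf-8", body[len(codecs.BOM_UTF8):]
    return None, body


# Read as Latin-1, the BOM bytes EF BB BF become three visible characters.
stray = codecs.BOM_UTF8.decode("latin-1")
```

Checking for the BOM before any other parsing means those bytes are consumed as an encoding signal rather than emitted as content ahead of the title.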
--
David Woolley
Emails are not formal business letters, whatever businesses may want.
RFC1855 says there should be an address here, but, in a world of spam,
that is no longer good advice, as archive address hiding may not work.