lynx-dev

Re: [Lynx-dev] 2-9-0: -assume_charset= does not override in-document cha


From: Mouse
Subject: Re: [Lynx-dev] 2-9-0: -assume_charset= does not override in-document charset=
Date: Wed, 17 Jan 2024 15:35:19 -0500 (EST)

>> -override-charset maybe?
> Or, heck, that!

:-)

> But editing a document to get it -dump'ed out correctly, that is no
> good.

Well...the real offender here is whatever led to serving 8859-* data
but mislabeling it as UTF-8.  I have mixed feelings about making it
easy to work around brokenness of that order; for all that it is
defensible for lynx to be a useful tool to investigate such
catastrophes, I really think a web service that broken should be
_blatantly_ broken.

But:

For what it's worth, for me (Canada), fetching https://www.google.com/
gets me a document with headers including

content-type: text/html; charset=ISO-8859-1

and a <head> including

<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

so it might be worth looking to see whether the UTF-8 you're seeing
comes from the Content-Type: or a <meta> - the <meta> does look wrong
to me, but I don't know whether it or the Content-Type: is supposed to
take precedence when they disagree.  (The actual content I get is
mostly ASCII, but it does include "Français" - in 8859-1.)
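
One way to check which label the UTF-8 comes from is to extract the charset from each source and compare them. The following is only a minimal sketch: the helper names (header_charset, meta_charset, compare, decodes_as) are made up for illustration, and the <meta> regex is a rough approximation, not the prescan that browsers or lynx actually perform.

```python
import re
from email.message import Message

# Rough pattern for a charset declared inside a <meta> tag. This is an
# approximation for illustration, not a real HTML parser.
META_RE = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)',
    re.IGNORECASE,
)


def header_charset(content_type):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def meta_charset(body, limit=1024):
    """Return the lowercased charset named in a <meta> near the start of body, or None."""
    match = META_RE.search(body[:limit])
    return match.group(1).decode("ascii").lower() if match else None


def compare(content_type, body):
    """Report both declared charsets and whether they disagree."""
    header = header_charset(content_type)
    meta = meta_charset(body)
    disagree = header is not None and meta is not None and header != meta
    return {"header": header, "meta": meta, "disagree": disagree}


def decodes_as(data, charset):
    """Return True if data is a valid byte sequence in the given charset."""
    try:
        data.decode(charset)
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    # Values taken from the header and <meta> quoted above.
    page = (b'<html><head><meta content="text/html; charset=UTF-8" '
            b'http-equiv="Content-Type"></head><body>Fran\xe7ais</body></html>')
    print(compare("text/html; charset=ISO-8859-1", page))
    print("valid as utf-8:", decodes_as(page, "utf-8"))
    print("valid as iso-8859-1:", decodes_as(page, "iso-8859-1"))
```

On the precedence question: as far as I understand the HTML Standard's encoding sniffing rules, a charset in the HTTP Content-Type: header is consulted before any <meta> in the document, but that is worth verifying against the spec, and what lynx itself does is a separate matter.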

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


