Re: why not use unicode if html file has charset=utf-8?
From: Kevin Rodgers
Subject: Re: why not use unicode if html file has charset=utf-8?
Date: Tue, 27 Jul 2004 09:50:56 -0600
User-agent: Mozilla/5.0 (X11; U; SunOS i86pc; en-US; rv:0.9.4.1) Gecko/20020406 Netscape6/6.2.2
Dan Jacobson wrote:
> One would think that if some file.html had
> <META http-equiv=Content-Type content="text/html; charset=utf-8">
> near the top, emacs would show it with the unicode charset.
> Browsers get that right.
I think the first step would be to go from the (MIME) charset attribute
value to an Emacs coding system. But for this particular example (utf-8),
the following search returns 8 alternatives on Emacs 21.3:
(let ((mime-charset 'utf-8) ; more generally: (intern (downcase "UTF-8"))
      (coding-systems '()))
  (mapatoms (lambda (symbol)
              (if (and symbol
                       (coding-system-p symbol)
                       (eq (coding-system-get symbol 'mime-charset)
                           mime-charset))
                  (setq coding-systems (cons symbol coding-systems)))))
  (sort coding-systems 'string-lessp))
=> (mule-utf-8 mule-utf-8-dos mule-utf-8-mac mule-utf-8-unix utf-8 utf-8-dos
    utf-8-mac utf-8-unix)
What's the right way to choose among them? Ah, gnus/mm-util.el has
this:
(mm-charset-to-coding-system "UTF-8") => utf-8
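To make the first step concrete, here is a rough sketch of pulling the charset out of the META element and turning it into a coding system. The function name my-html-meta-coding-system and the 1024-character search limit are my own assumptions, not anything in Emacs or Gnus. It just interns the lowercased name and checks it with coding-system-p rather than depending on mm-util.el.

```elisp
;; Hypothetical helper, not part of Emacs or Gnus.
(defun my-html-meta-coding-system ()
  "Return the coding system named by a META charset attribute, or nil.
Only the first 1024 characters of the current buffer are searched."
  (save-excursion
    (goto-char (point-min))
    (let ((case-fold-search t)          ; match CHARSET=UTF-8 as well
          (limit (min (point-max) (+ (point-min) 1024))))
      (when (re-search-forward "charset=[\"']?\\([-_a-z0-9]+\\)" limit t)
        (let ((symbol (intern (downcase (match-string 1)))))
          ;; Accept the name only if Emacs knows it as a coding system.
          (and (coding-system-p symbol) symbol))))))
```

Names that Emacs does not recognize as coding systems simply yield nil here; a fuller version would presumably need an alias table like the one mm-util.el keeps.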
The next step would be to call set-buffer-file-coding-system; should
that be done via html-mode-hook, or is that too late? What about using
after-insert-file-functions/after-insert-file-set-buffer-file-coding-system?
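For what it's worth, a minimal sketch of the after-insert-file-functions variant might look like the following. The function name is made up, it does its own small charset search so that it stands alone, and it is untested against real files. Functions on that hook are called with the number of characters inserted, with point at the start of the inserted text, and should return the character count.

```elisp
;; Hypothetical hook function; the name is made up for illustration.
(defun my-html-set-file-coding-system (inserted)
  "Set `buffer-file-coding-system' from a META charset attribute.
INSERTED is the number of characters just inserted; it is returned
unchanged, as `after-insert-file-functions' expects."
  (save-excursion
    (let ((case-fold-search t)
          (modified (buffer-modified-p))
          (limit (min (point-max) (+ (point) inserted))))
      (when (re-search-forward "charset=[\"']?\\([-_a-z0-9]+\\)" limit t)
        (let ((coding (intern (downcase (match-string 1)))))
          (when (coding-system-p coding)
            ;; FORCE is t so the coding system is used exactly as given.
            (set-buffer-file-coding-system coding t)
            ;; Visiting a file should not leave the buffer modified.
            (set-buffer-modified-p modified))))))
  inserted)

;; Runs for every inserted file, not only HTML; a real version
;; would want to check the file name or major mode first.
(add-hook 'after-insert-file-functions 'my-html-set-file-coding-system)
```

One caveat: set-buffer-file-coding-system only records how the buffer will be encoded when it is saved. By the time this hook runs the text has already been decoded, so it does not by itself change how the existing contents are displayed.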
--
Kevin Rodgers