Re: html2text
From: Reiner Steib
Subject: Re: html2text
Date: Tue, 09 Nov 2004 23:44:24 +0100
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/21.3.50 (gnu/linux)
On Mon, Nov 08 2004, Reiner Steib wrote:
> [ The suggested patch from Jari's original message was: ]
>
> --8<---------------cut here---------------start------------->8---
> --- html2text.el.7.10 2004-11-06 17:20:46.000000000 +0200
> +++ html2text.el 2004-11-06 17:41:12.000000000 +0200
> @@ -42,8 +42,42 @@
> (defvar html2text-format-single-element-list '(("hr" . html2text-clean-hr)))
>
> (defvar html2text-replace-list
> - '(("&nbsp;" . " ") ("&gt;" . ">") ("&lt;" . "<") ("&quot;" . "\"")
> - ("&amp;" . "&") ("&apos;" . "'"))
> + '(("&acute;" . "`")
This should be "´".
> + ("&amp;" . "&")
> + ("&apos;" . "'")
> + ("&brvbar;" . "|")
> + ("&cent;" . "c")
> + ("&circ;" . "^")
> + ("&copy;" . "(C)")
> + ("&curren;" . "¤")
> + ("&deg;" . "degree")
> + ("&divide;" . "/")
> + ("&euro;" . "e")
> + ("&frac12;" . "½")
[...]
It seems strange to use Latin-1 characters for some entities, but not
for all entities that Latin-1 can encode.
On second thought, it looks like there are already more or less
complete lists[1] e.g. in `mm-url-html-entities' (from Gnus),
`sgml-char-names', `sgml-char-names-table', `iso-iso2sgml-trans-tab'
(Emacs) or `w3m-entity-alist' (emacs-w3m).
Probably one of these could be used. Hm, maybe the function
`iso-sgml2iso' could be used in `html2text.el'?
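To illustrate the table-driven approach, here is a minimal sketch of a
single-pass replacement over the buffer. The names
`my-html2text-entity-alist' and `my-html2text-replace-entities' are
hypothetical, and the short alist is only a stand-in for one of the
complete lists mentioned above; I have not checked the format of those
lists, so adapting them is left open.

```elisp
;; Hypothetical names throughout.  This table is a tiny stand-in for a
;; complete entity list; a real one would cover all of HTML 4.
(defvar my-html2text-entity-alist
  '(("nbsp" . " ") ("gt" . ">") ("lt" . "<") ("quot" . "\"")
    ("apos" . "'") ("acute" . "´") ("curren" . "¤") ("frac12" . "½")
    ("amp" . "&"))
  "Alist mapping entity names (without `&' and `;') to replacement strings.")

(defun my-html2text-replace-entities ()
  "Replace known named entities in the current buffer.
Unknown entities are left untouched."
  (save-excursion
    (goto-char (point-min))
    ;; One pass over the buffer: each match is looked up in the alist.
    (while (re-search-forward "&\\([A-Za-z][A-Za-z0-9]*\\);" nil t)
      (let ((replacement (cdr (assoc (match-string 1)
                                     my-html2text-entity-alist))))
        (when replacement
          ;; FIXEDCASE and LITERAL are both t: insert the string verbatim.
          ;; Point ends up after the replacement, so the inserted text is
          ;; not scanned again and "&amp;lt;" becomes "&lt;", not "<".
          (replace-match replacement t t))))))
```

For example, evaluating the following should return
"a < b && c > d &unknown;":

```elisp
(with-temp-buffer
  (insert "a &lt; b &amp;&amp; c &gt; d &unknown;")
  (my-html2text-replace-entities)
  (buffer-string))
```

Searching once with a generic regexp, instead of once per entity, keeps
the cost independent of the table size and avoids decoding an entity
twice.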
Bye, Reiner.
[1] Might be checked with
http://www.w3.org/TR/REC-html40/sgml/entities.html or other
tables.
--
,,,
(o o)
---ooO-(_)-Ooo--- | PGP key available | http://rsteib.home.pages.de/