Re: html2text
From: Reiner Steib
Subject: Re: html2text
Date: Mon, 08 Nov 2004 16:51:34 +0100
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/21.3.50 (gnu/linux)
On Sat, Nov 06 2004, Jari Aalto+mail.emacs wrote:
> This is your copy. Article has been posted to the newsgroup(s).
I didn't see your message on emacs-devel, see
<URL:http://thread.gmane.org/address@hidden>.
> * Sun 2004-10-31 Alfred Szmidt <ams AT kemisten.nu> gmane.emacs.devel
> * Message-Id: 1099247139.071920.12084.nullmailer AT Update.UU.SE
> | html2text is quite nice, but it doesn't strip all HTML files into
> | something that is readable. The following patch makes it strip some
> | "newer" tags that have cropped up.
>
> There are more entities. This patch is against the Gnus CVS, but I
> assume it will work for Emacs as well. The entities are in
> alphabetical order.
>
> 2004-11-06 Sat Jari Aalto <jari dot aalto A T cante dot net>
>
> * html2text.el (html2text-replace-list): Added more HTML 4.0
> entities.
It seems you have signed papers for Emacs as you are listed in the
AUTHORS file. But I can't check it myself. Could you please confirm?
[ The suggested patch from Jari's original message was: ]
--8<---------------cut here---------------start------------->8---
--- html2text.el.7.10 2004-11-06 17:20:46.000000000 +0200
+++ html2text.el 2004-11-06 17:41:12.000000000 +0200
@@ -42,8 +42,42 @@
(defvar html2text-format-single-element-list '(("hr" . html2text-clean-hr)))
(defvar html2text-replace-list
-  '(("&nbsp;" . " ") ("&gt;" . ">") ("&lt;" . "<") ("&quot;" . "\"")
-    ("&amp;" . "&") ("&apos;" . "'"))
+  '(("&acute;" . "`")
+    ("&amp;" . "&")
+    ("&apos;" . "'")
+    ("&brvbar;" . "|")
+    ("&cent;" . "c")
+    ("&circ;" . "^")
+    ("&copy;" . "(C)")
+    ("&curren;" . "¤")
+    ("&deg;" . "degree")
+    ("&divide;" . "/")
+    ("&euro;" . "e")
+    ("&frac12;" . "½")
+    ("&gt;" . ">")
+    ("&iquest;" . "?")
+    ("&laquo;" . "<<")
+    ("&ldquo" . "\"")
+    ("&lsaquo;" . "(")
+    ("&lsquo;" . "`")
+    ("&lt;" . "<")
+    ("&mdash;" . "--")
+    ("&nbsp;" . " ")
+    ("&ndash;" . "-")
+    ("&permil;" . "%%")
+    ("&plusmn;" . "+-")
+    ("&pound;" . "£")
+    ("&quot;" . "\"")
+    ("&raquo;" . ">>")
+    ("&rdquo" . "\"")
+    ("&reg;" . "(R)")
+    ("&rsaquo;" . ")")
+    ("&rsquo;" . "'")
+    ("&sect;" . "§")
+    ("&sup1;" . "^1")
+    ("&sup2;" . "^2")
+    ("&sup3;" . "^3")
+    ("&tilde;" . "~"))
"The map of entity to text.
--8<---------------cut here---------------end--------------->8---
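[ For illustration only: a minimal sketch of how an entity-to-text alist
  of this shape could be applied to a string. The names
  `sketch-entity-alist' and `sketch-replace-entities' are hypothetical
  and are not part of html2text.el; the sketch makes a single pass over
  the text, which is not necessarily how html2text.el itself performs
  the replacement. ]

```elisp
;; Hypothetical example, not taken from html2text.el.
(defvar sketch-entity-alist
  '(("&amp;" . "&")
    ("&gt;" . ">")
    ("&lt;" . "<")
    ("&mdash;" . "--")
    ("&nbsp;" . " ")
    ("&quot;" . "\""))
  "Illustrative subset of an entity-to-text map.")

(defun sketch-replace-entities (string alist)
  "Return STRING with each entity listed in ALIST replaced by its text.
Entities that ALIST does not list are left unchanged."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    ;; Scan the text once.  Point ends up after each replacement, so
    ;; replacement text is never scanned again and "&amp;lt;" becomes
    ;; "&lt;" rather than "<".
    (while (re-search-forward "&[A-Za-z0-9]+;" nil t)
      (let ((pair (assoc (match-string 0) alist)))
        (when pair
          ;; FIXEDCASE and LITERAL are both t: insert the text verbatim.
          (replace-match (cdr pair) t t))))
    (buffer-string)))
```

[ With these definitions,
  (sketch-replace-entities "1 &lt; 2 &amp;&amp; 3 &gt; 2" sketch-entity-alist)
  evaluates to "1 < 2 && 3 > 2". Note that the regexp above only matches
  entities that end in a semicolon. ]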
Bye, Reiner.
--
,,,
(o o)
---ooO-(_)-Ooo--- | PGP key available | http://rsteib.home.pages.de/