emacs-devel
Re: One example of code I can't understand


From: Stefan Monnier
Subject: Re: One example of code I can't understand
Date: Mon, 20 Jul 2009 14:13:59 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.94 (gnu/linux)

> Here's some code from mm-util.el that I don't understand.
> Well, I can understand the first 6 lines, but after that
> I am stumped.  The doc string gives no details of what the
> value should look like or what it means.


> (defvar mm-iso-8859-x-to-15-table
>   (and (fboundp 'coding-system-p)
>        (mm-coding-system-p 'iso-8859-15)
>        (mapcar
>         (lambda (cs)
>           (if (mm-coding-system-p (car cs))
>               (let ((c (string-to-char
>                         (decode-coding-string "\341" (car cs)))))
>                 (cons (char-charset c)
>                       (cons
>                        (- (string-to-char
>                            (decode-coding-string "\341" 'iso-8859-15)) c)
>                        (string-to-list (decode-coding-string (car (cdr cs))
>                                                              (car cs))))))
>             '(gnus-charset 0)))
>         mm-iso-8859-15-compatible))
>   "A table of the difference character between ISO-8859-X and ISO-8859-15.")

Entries in this list have the form (CHARSET OFFSET CHARS...), meaning
that characters in CHARSET (except for those in CHARS) can be converted
to iso-8859-15 by adding OFFSET.
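
To make the entry format concrete, here is a minimal sketch of how a
single entry could be applied to one character.  The helper
demo-x-to-15-convert and the sample entry are made up for illustration;
they are not part of mm-util.el, and the real conversion in
mm-iso-8859-x-to-15-region works on a buffer region instead.

```elisp
;; Hypothetical helper, not part of mm-util.el.
(defun demo-x-to-15-convert (char entry)
  "Return CHAR converted via ENTRY, or nil if ENTRY does not apply.
ENTRY has the form (CHARSET OFFSET CHARS...)."
  (let ((charset (car entry))
        (offset (cadr entry))
        (exceptions (cddr entry)))
    (and (eq (char-charset char) charset)  ; CHAR must belong to CHARSET
         (not (memq char exceptions))      ; CHARS cannot be converted
         (+ char offset))))                ; everything else shifts by OFFSET

;; Made-up entry: ASCII chars, offset 0, with ?b listed as an exception.
(demo-x-to-15-convert ?a (list 'ascii 0 ?b))
(demo-x-to-15-convert ?b (list 'ascii 0 ?b))
```

With this made-up entry the first call should return ?a unchanged
(OFFSET is 0) and the second should return nil, because ?b is among
the excluded CHARS.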

In Emacs-23 it doesn't make much sense (because of unification, OFFSET
is always 0).  It's used in mm-find-mime-charset-region (via
mm-iso-8859-x-to-15-region) to provide a "poor man's unification":

    (if (and (> (length charsets) 1)
             (memq 'iso-8859-15 charsets)
             (memq 'iso-8859-15 hack-charsets)
             (save-excursion (mm-iso-8859-x-to-15-region b e)))
        (dolist (x mm-iso-8859-15-compatible)
          (setq charsets (delq (car x) charsets))))

i.e. if we need more than 1 coding-system to encode the region and
iso-8859-15 is among them, then use the above table to turn some of the
other chars into iso-8859-15 in the hope of reducing the number of
coding-systems to use (and hence the number of chunks into which the
text needs to be split).
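
The pruning step can be tried in isolation with something like the
sketch below.  demo-prune-charsets is a hypothetical stand-in, and the
COMPATIBLE argument only imitates the shape of
mm-iso-8859-15-compatible (a list whose entries start with a charset
name); its contents here are invented.

```elisp
;; Hypothetical helper, not part of mm-util.el.
(defun demo-prune-charsets (charsets compatible)
  "Return CHARSETS without the charsets named by the cars of COMPATIBLE."
  ;; delq is destructive, so work on a copy of the caller's list.
  (let ((result (copy-sequence charsets)))
    (dolist (x compatible)
      (setq result (delq (car x) result)))
    result))

;; Invented data: only the car of each COMPATIBLE entry matters here.
(demo-prune-charsets '(iso-8859-1 iso-8859-15 ascii)
                     '((iso-8859-1 "x") (iso-8859-9 "y")))
```

In this example iso-8859-1 is dropped, so the region would need one
charset fewer.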

Now that we have utf-8, this is unnecessary since we can always encode
the whole text with just a single coding-system, without having to break
it down into chunks.
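
A quick way to convince yourself of this is a round-trip check along
these lines.  demo-utf-8-round-trips-p is a made-up name, and the
sample string (a Latin letter, the euro sign and a Greek letter) is
just an example of text that no single iso-8859-X coding-system
covers.

```elisp
;; Hypothetical helper for illustration only.
(defun demo-utf-8-round-trips-p (text)
  "Return non-nil if TEXT survives encoding to and decoding from utf-8."
  (string= text
           (decode-coding-string (encode-coding-string text 'utf-8)
                                 'utf-8)))

;; e-acute, euro sign, Greek alpha, written with escapes to stay ASCII.
(demo-utf-8-round-trips-p "\u00e9\u20ac\u03b1")
```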


        Stefan
