emacs-devel

Re: [PATCH] Unicode Lisp reader escapes


From: Oliver Scholz
Subject: Re: [PATCH] Unicode Lisp reader escapes
Date: Mon, 15 May 2006 04:49:04 +0200
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/23.0.0 (gnu/linux)

Stefan Monnier <address@hidden> writes:

>>> Handa says that telling people "don't use utf-8" solves the problem.
>>     Additionally to "don't use unify-8859-on-decoding" which causes
>>     similar problems (which we already bumped into a few years ago when we
>>     included unify-8859-on-decoding) with iso8859 chars and coding systems
>>     like iso-2022.
>
>> There is a way for a Lisp file to specify a coding system which isn't
>> utf-8.  Is there a way for a Lisp file to specify that
>> unify-8859-on-decoding should not be used when reading it?
>
>> If not, maybe we should make one.
>
>> Here's one idea: if the -*- line specifies `coding' and specifies
>> the mode `emacs-lisp' then force unify-8859-on-decoding to nil
>> for that file.

Besides the work already mentioned, this would also require turning
unify-8859-on-decoding-mode into a buffer-local minor mode, which in
turn would require making the necessary translation tables somehow (!)
buffer-local.

> Forcing it to nil for a particular file is maybe too much work to implement
> compared to the benefit.
> Maybe an easier solution is to add a file-local variable
> `no-8859-unification' such that if that file is loaded in an Emacs which
> is configured to use unify-8859-on-decoding it signals an error.
>
> It could then be added to files like ucs-tables.el.

[Nitpick: ucs-tables.el is encoded in ISO 2022. Most of Emacs' files
containing m17n characters are, AFAIK. I don't know the reason. Maybe
because it's 7-bit, but still ASCII compatible.]
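
For reference, a Lisp file usually declares its coding system with a
cookie on its first line. A minimal sketch, assuming a hypothetical file
example.el saved as iso-2022-7bit (the file name and its description are
made up for illustration):

```elisp
;;; example.el --- hypothetical library  -*- coding: iso-2022-7bit -*-
```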

How about just issuing a warning with the warning message containing a
description of the effects and of what to do to change the settings?

e.g.:

(when (and (memq (coding-system-base buffer-file-coding-system)
                 '(mule-utf-8 utf-7 mule-utf-16
                         ; ...
                         mule-utf-16be-with-signature))
           utf-fragment-on-decoding ; default is nil
           (let ((charsets (find-charset-region (point-min) (point-max))))
             (or (memq 'greek-iso8859-7 charsets)
                 (memq 'cyrillic-iso8859-5 charsets))))
  (warn "You have enabled ... but this source file contains
characters from ... Emacs has ... This might or might not be what
you want ... To restore the defaults do ... bla bla ...
... you might want to use `emacs-mule' as coding system for Emacs Lisp
source files ..."))

And similar for the other cases.

[FWIW, I think that `emacs-mule', as Handa suggested, is a perfectly
valid file encoding for Emacs Lisp source files. Since it is, by
definition, unambiguous w.r.t. the specified charsets, emacs-mule has
none of the problems we are discussing. Of course, Emacs is probably
the only text editor that can deal with emacs-mule, but that would
hardly matter for Elisp sources. I can think of only two drawbacks: 1.
You can't simply insert or attach such files to mail or usenet
postings. You have to zip, tar, base64 etc. them first. 2. Specifying
particular charsets might be exactly what an author does *not* want. --

Though, the only way to deal with the latter would be to modify the
Lisp printer for writing *.elc files so that it escapes non-ASCII
characters wherever possible with the new \u syntax. This would be
another solution to the problem we are discussing.]
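
To illustrate the escaping idea, here is a rough sketch, not a patch to
the actual Lisp printer. The function name `my-escape-non-ascii' is
hypothetical. It assumes an Emacs in which characters are Unicode code
points and the reader accepts both \uXXXX and \U00XXXXXX, and it only
rewrites non-ASCII characters; backslashes and double quotes, which a
real printer would also have to escape, are left alone.

```elisp
(defun my-escape-non-ascii (string)
  "Return STRING with every non-ASCII character written as a \\u escape.
Hypothetical helper: assumes characters are Unicode code points."
  (mapconcat
   (lambda (ch)
     (cond ((< ch 128) (string ch))                  ; ASCII: keep as is
           ((<= ch #xFFFF) (format "\\u%04X" ch))    ; BMP: \uXXXX
           (t (format "\\U%08X" ch))))               ; beyond BMP: \U00XXXXXX
   string ""))
```

Under those assumptions, calling it on a string that contains a Greek
lambda yields the six ASCII characters \u03BB in place of that
character, so the result no longer depends on how the file is decoded.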


    Oliver
-- 
Oliver Scholz               26 Floréal an 214 de la Révolution
Ostendstr. 61               Liberté, Egalité, Fraternité!
60314 Frankfurt a. M.       
