emacs-devel

Re: [PATCH] Unicode Lisp reader escapes


From: Oliver Scholz
Subject: Re: [PATCH] Unicode Lisp reader escapes
Date: Wed, 17 May 2006 14:37:02 +0200
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/23.0.0 (gnu/linux)

Growing a bit tired of this discussion, I hacked a kludge that might
do what you want. It introduces a variable
`byte-compile-no-char-translation' that is meant to be put into the
Local Variables section of an Emacs Lisp source file in order to
inhibit the effects of `utf-fragment-on-decoding' and
`unify-8859-on-decoding-mode'. In other words: This patch deals only with
the issues that *I* can understand. I seem to recall that Handa also
mentioned some effects of certain CJK language environments.

It is *absolutely vital* that Kenichi Handa reviews this patch. I am
not entirely sure whether this breaks something or not.

With my patch, decode_coding_iso2022 skips the lookup in
Vstandard_translation_table_for_decode entirely if
`byte-compile-no-char-translation' is non-nil. This might be wrong:
Vstandard_translation_table_for_decode is not empty by default. I
guess instead of inhibiting its use one could just temporarily set its
parent at about the same place. But maybe this is unnecessary.
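
To make the changed condition easier to follow, here is a minimal
stand-alone sketch of the fallback logic. It is only a toy model, not
the real code: `translation_fn', `standard_translate', `select_table'
and `decode_char' are hypothetical names, and a plain function pointer
stands in for the Lisp translation table.

```c
#include <stddef.h>

/* Hypothetical stand-in for an Emacs translation table: maps one
   character code to another.  Not the real Lisp_Object char-table.  */
typedef int (*translation_fn) (int c);

/* Toy "standard" table: pretend it maps code 0x100 to 0x200 and
   leaves everything else alone.  The values are made up.  */
static int
standard_translate (int c)
{
  return c == 0x100 ? 0x200 : c;
}

/* Mirrors the patched condition: fall back to the standard table only
   when the coding system has no table of its own AND the
   no-translation flag is off.  */
static translation_fn
select_table (translation_fn coding_table, int no_char_translation)
{
  if (coding_table == NULL && !no_char_translation)
    return standard_translate;
  return coding_table;
}

/* Decode one character through whichever table was selected; with no
   table at all the character passes through unchanged.  */
static int
decode_char (int c, translation_fn coding_table, int no_char_translation)
{
  translation_fn table = select_table (coding_table, no_char_translation);
  return table ? table (c) : c;
}
```

In this model, setting the flag drops *everything* the standard table
does, not just unification and fragmentation, which is exactly why I
am unsure whether the real change is correct.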

decode_coding_sjis_big5 refers to
Vstandard_translation_table_for_decode, too. I did not modify it,
though, thus introducing a possible inconsistency. The reason is that
I don't understand CJK issues and I don't understand this encoding.

Note: Even with the remaining issues weeded out, IMNSHO this patch is
worse than the two other solutions: (1) tell users to use emacs-mule,
or (2) remove `unify-8859-on-decoding-mode' and
`utf-fragment-on-decoding'. The reasoning goes as follows:

    Check: Are `unify-8859-on-decoding-mode' and
    `utf-fragment-on-decoding' useful options?

    If no: Remove them, since they cause only trouble.

    If yes: then a user who sets them will want them for all affected
            characters. The choice for unification/fragmentation should
            not be the choice of the programmer of the Lisp package;
            it should be the choice of the user.

            (To quote a future user, complaining on gnu-emacs-help:
            "The heck! Why do I have only hollow boxes for my Greek
            characters after byte compilation??? It's all fine in the
            source file!!!")

    Exception: In the event that the particular choice of charsets is
    important for a Lisp Package: Use `emacs-mule'!

    
    Oliver

Index: lisp/files.el
===================================================================
RCS file: /cvsroot/emacs/emacs/lisp/files.el,v
retrieving revision 1.836
diff -u -r1.836 files.el
--- lisp/files.el       16 May 2006 18:33:31 -0000      1.836
+++ lisp/files.el       17 May 2006 12:08:43 -0000
@@ -2361,6 +2361,7 @@
        (left-margin                     . integerp) ;; C source code
        (no-update-autoloads             . booleanp)
        (tab-width                       . integerp) ;; C source code
+        (byte-compile-no-char-translation . booleanp) ;; C source code
        (truncate-lines                  . booleanp))) ;; C source code
 
 (put 'c-set-style 'safe-local-eval-function t)
Index: lisp/emacs-lisp/bytecomp.el
===================================================================
RCS file: /cvsroot/emacs/emacs/lisp/emacs-lisp/bytecomp.el,v
retrieving revision 2.185
diff -u -r2.185 bytecomp.el
--- lisp/emacs-lisp/bytecomp.el 16 May 2006 10:05:09 -0000      2.185
+++ lisp/emacs-lisp/bytecomp.el 17 May 2006 12:08:45 -0000
@@ -1673,6 +1673,14 @@
            (enable-local-eval nil))
        ;; Arg of t means don't alter enable-local-variables.
         (normal-mode t)
+
+        ;; KLUDGE: `byte-compile-no-char-translation' should affect
+        ;; how characters are decoded. But at this point decoding
+        ;; already happened. So we insert the file contents again.
+        (when byte-compile-no-char-translation
+          (erase-buffer)
+          (insert-file-contents filename))
+        
         (setq filename buffer-file-name))
       ;; Set the default directory, in case an eval-when-compile uses it.
       (setq default-directory (file-name-directory filename)))
Index: src/coding.c
===================================================================
RCS file: /cvsroot/emacs/emacs/src/coding.c,v
retrieving revision 1.336
diff -u -r1.336 coding.c
--- src/coding.c        8 May 2006 05:25:02 -0000       1.336
+++ src/coding.c        17 May 2006 12:08:50 -0000
@@ -405,6 +405,15 @@
 
 Lisp_Object Qcoding_system_p, Qcoding_system_error;
 
+/* This variable is meant to turn off character translation during byte
+   compilation. */
+
+Lisp_Object Vbyte_compile_no_char_translation;
+
+Lisp_Object empty_translation_table;
+Lisp_Object Qucs_translation_table_for_decode, Qutf_translation_table_for_decode;
+Lisp_Object Qunify_8859_on_decoding_mode, Qutf_fragment_on_decoding;
+
 /* Coding system emacs-mule and raw-text are for converting only
    end-of-line format.  */
 Lisp_Object Qemacs_mule, Qraw_text;
@@ -1849,7 +1858,7 @@
   else
     {
       translation_table = coding->translation_table_for_decode;
-      if (NILP (translation_table))
+      if (NILP (translation_table) && NILP (Vbyte_compile_no_char_translation))
        translation_table = Vstandard_translation_table_for_decode;
     }
 
@@ -4938,8 +4947,48 @@
          dst_bytes--;
          extra = coding->spec.ccl.cr_carryover;
        }
-      ccl_coding_driver (coding, source, destination + extra,
-                        src_bytes, dst_bytes, 0);
+
+      /*KLUDGE: Inhibit unification and or fragmentation. This is
+        meant for byte compiling Emacs Lisp source files. For CCL
+        based coding systems it has to be done here, because we want
+        it only for decoding. We temporarily swap the affected
+        translation tables in Vtranslation_table_vector with an empty
+        translation table.*/
+      if (! NILP (Vbyte_compile_no_char_translation)
+          && (! NILP (SYMBOL_VALUE (Qunify_8859_on_decoding_mode))
+              || ! NILP (SYMBOL_VALUE (Qutf_fragment_on_decoding))))
+        {
+          if (NILP (empty_translation_table))
+            {
+              empty_translation_table =
+                call0 (intern ("make-translation-table"));
+            }
+
+          Lisp_Object ucs_tt = Fget (Qucs_translation_table_for_decode, Qtranslation_table);
+          Lisp_Object ucs_id = Fget (Qucs_translation_table_for_decode, Qtranslation_table_id);
+
+          Lisp_Object utf_tt = Fget (Qutf_translation_table_for_decode, Qtranslation_table);
+          Lisp_Object utf_id = Fget (Qutf_translation_table_for_decode, Qtranslation_table_id);
+
+          /* Should this be `unwind-protect'ed? */
+
+          Faset (Vtranslation_table_vector, ucs_id, Fcons (Qucs_translation_table_for_decode,
+                                                           empty_translation_table));
+          Faset (Vtranslation_table_vector, utf_id, Fcons (Qutf_translation_table_for_decode,
+                                                           empty_translation_table));
+
+          ccl_coding_driver (coding, source, destination + extra,
+                             src_bytes, dst_bytes, 0);
+
+          Faset (Vtranslation_table_vector, ucs_id, Fcons (Qucs_translation_table_for_decode,
+                                                           ucs_tt));
+          Faset (Vtranslation_table_vector, utf_id, Fcons (Qutf_translation_table_for_decode,
+                                                           utf_tt));
+
+        }
+      else ccl_coding_driver (coding, source, destination + extra,
+                              src_bytes, dst_bytes, 0);
+      
       if (coding->eol_type != CODING_EOL_LF)
        {
          coding->produced += extra;
@@ -7852,6 +7901,34 @@
   defsubr (&Sset_coding_priority_internal);
   defsubr (&Sdefine_coding_system_internal);
 
+  DEFVAR_LISP ("byte-compile-no-char-translation", &Vbyte_compile_no_char_translation,
+               doc: /* Don't translate characters during byte compilation.
+
+Options like `utf-fragment-on-decoding' or the minor mode
+`unify-8859-on-decoding-mode' modify the way Emacs maps file encodings
+to mule charsets.  Since *.elc files are encoded in emacs-mule, such
+settings are preserved in the compiled file.  If this variable is
+non-nil, Emacs uses the default mule charsets.
+
+You can set this variable in the local variables section of a file. */);
+  Vbyte_compile_no_char_translation = Qnil;
+
+  empty_translation_table = Qnil;
+  staticpro (&empty_translation_table);
+  
+  Qucs_translation_table_for_decode = intern ("ucs-translation-table-for-decode");
+  staticpro (&Qucs_translation_table_for_decode);
+
+  Qutf_translation_table_for_decode = intern ("utf-translation-table-for-decode");
+  staticpro (&Qutf_translation_table_for_decode);
+  staticpro (&Qutf_translation_table_for_decode);
+
+  Qunify_8859_on_decoding_mode = intern ("unify-8859-on-decoding-mode");
+  staticpro (&Qunify_8859_on_decoding_mode);
+
+  Qutf_fragment_on_decoding = intern ("utf-fragment-on-decoding");
+  staticpro (&Qutf_fragment_on_decoding);
+  
+  
   DEFVAR_LISP ("coding-system-list", &Vcoding_system_list,
               doc: /* List of coding systems.
 
    
-- 
Oliver Scholz               28 Floréal an 214 de la Révolution
Ostendstr. 61               Liberté, Egalité, Fraternité!
60314 Frankfurt a. M.       



