Re: Cyrillic vs UTF-8
From: Stefan Monnier
Subject: Re: Cyrillic vs UTF-8
Date: Mon, 19 May 2003 09:49:05 -0400

> > > Maybe it is. In my situation, I'd like utf-8 to be at the top
> > > of the preferences w.r.t decoding because it virtually never
> > > guesses wrong.
> > > OTOH, I'm still using a mostly-latin-1 environment, so I'd
> > > still rather avoid utf-8 when I can. I.e. latin-1 should be at
> > > the top of my preferences w.r.t encoding.
> >
> > In that case, I think the source of the problem is that the
> > command prefer-coding-system doesn't satisfy this request of
> > yours:
> > Prefer utf-8 only in automatic detection on reading a
> > file, not for the other situations.
> >
> > (defun prefer-coding-system (coding-system)
> > "Add CODING-SYSTEM at the front of the priority list for automatic
> > detection.
> > This also sets the following coding systems:
> > o coding system of a newly created buffer
> > o default coding system for subprocess I/O
> > This also sets the following values:
> > o default value used as `file-name-coding-system' for converting file
> > names.
> > o default value for the command `set-terminal-coding-system' (not on
> > MSDOS)
> > o default value for the command `set-keyboard-coding-system'
> >
> > How about changing it to skip "This also ..." parts if
> > called with a prefix argument?
> >
> > Then, on writing, if buffer-file-coding-system is not
> > locally bound, default-buffer-file-coding-system is tried
> > automatically.
> >
> > And, for the case that buffer-file-coding-system is locally
> > bound differently from default-buffer-file-coding-system,
> > but it can't encode the current buffer, we can change
> > select-safe-coding-system to try
> > default-buffer-file-coding-system before trying the most
> > preferred coding system.
> >
> > That way, I think we can satisfy your request completely.
>
> That seems like a cheap way to get what I want indeed.

Actually, I don't currently use prefer-coding-system (specifically
because I didn't want to set all those other coding systems);
instead I use

   (when (boundp 'coding-category-utf-8)
     (set-coding-priority '(coding-category-utf-8)))

so I guess the only change that I care about is the part that uses
default-buffer-file-coding-system in preference to the most preferred
coding system (although it does sound paradoxical ;-)
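
To make the intended order concrete, here is a minimal sketch of the default candidate list: the buffer's own coding system first, then the default for new buffers, then the most preferred coding system (only if it has a mime-charset). This is an illustration only, not code from mule-cmds.el. `my-default-coding-candidates` is a hypothetical helper, and the three values are passed in explicitly as an assumption so the sketch does not depend on any global variable.

```elisp
;; Hypothetical helper for illustration; not part of mule-cmds.el.
(defun my-default-coding-candidates (buffer-cs default-cs preferred-cs)
  "Return the coding systems to try silently, in order of preference.
BUFFER-CS comes first, then DEFAULT-CS, then PREFERRED-CS.  PREFERRED-CS
is used only if it has a `mime-charset' property.  Nil entries,
`undecided', and entries whose base coding system is already in the
list are skipped."
  (let (result)
    (dolist (cs (list buffer-cs
                      default-cs
                      (and preferred-cs
                           (coding-system-get preferred-cs 'mime-charset)
                           preferred-cs)))
      (when cs
        (let ((base (coding-system-base cs)))
          (unless (or (eq base 'undecided)
                      (rassq base result))
            ;; Collect (CODING-SYSTEM . BASE) pairs in reverse order.
            (push (cons cs base) result)))))
    ;; Restore the preference order and drop the base names.
    (mapcar #'car (nreverse result))))

;; With an undecided buffer, a latin-1 default, and utf-8 most preferred:
(my-default-coding-candidates 'undecided 'latin-1 'utf-8)
;; => (latin-1 utf-8)
```

In a mostly-latin-1 setup this means latin-1 is tried for encoding before utf-8, even when utf-8 is at the top of the detection priorities.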
The patch below would work for me; any comments or objections?

Stefan
Index: mule-cmds.el
===================================================================
RCS file: /cvsroot/emacs/emacs/lisp/international/mule-cmds.el,v
retrieving revision 1.231
diff -u -u -b -r1.231 mule-cmds.el
--- mule-cmds.el 16 May 2003 04:15:20 -0000 1.231
+++ mule-cmds.el 19 May 2003 13:45:16 -0000
@@ -1,5 +1,5 @@
;;; mule-cmds.el --- commands for mulitilingual environment
-;; Copyright (C) 1995 Electrotechnical Laboratory, JAPAN.
+;; Copyright (C) 1995, 2003 Electrotechnical Laboratory, JAPAN.
;; Licensed to the Free Software Foundation.
;; Copyright (C) 2000, 2001, 2002, 2003 Free Software Foundation, Inc.
@@ -631,7 +631,8 @@
between FROM and TO are shown in a popup window. Among them, the most
proper one is suggested as the default.
-The list of `buffer-file-coding-system' of the current buffer and the
+The list of `buffer-file-coding-system' of the current buffer,
+the `default-buffer-file-coding-system', and the
most preferred coding system (if it corresponds to a MIME charset) is
treated as the default coding system list. Among them, the first one
that safely encodes the text is normally selected silently and
@@ -648,8 +649,8 @@
list of coding systems to be prepended to the default coding system
list. However, if DEFAULT-CODING-SYSTEM is a list and the first
element is t, the cdr part is used as the defualt coding system list,
-i.e. `buffer-file-coding-system' and the most prepended coding system
-is not used.
+i.e. `buffer-file-coding-system', `default-buffer-file-coding-system',
+and the most preferred coding system are not used.
Optional 4th arg ACCEPT-DEFAULT-P, if non-nil, is a function to
determine the acceptability of the silently selected coding system.
@@ -679,6 +680,9 @@
(mapcar (function (lambda (x) (cons x (coding-system-base x))))
default-coding-system))
+ ;; From now on, the list of defaults is reversed.
+ (setq default-coding-system (nreverse default-coding-system))
+
(unless no-other-defaults
;; If buffer-file-coding-system is not nil nor undecided, append it
;; to the defaults.
@@ -686,24 +690,30 @@
(let ((base (coding-system-base buffer-file-coding-system)))
(or (eq base 'undecided)
(rassq base default-coding-system)
- (setq default-coding-system
- (append default-coding-system
- (list (cons buffer-file-coding-system base)))))))
+ (push (cons buffer-file-coding-system base)
+ default-coding-system))))
+
+ ;; If default-buffer-file-coding-system is not nil nor undecided,
+ ;; append it to the defaults.
+ (if default-buffer-file-coding-system
+ (let ((base (coding-system-base default-buffer-file-coding-system)))
+ (or (eq base 'undecided)
+ (rassq base default-coding-system)
+ (push (cons default-buffer-file-coding-system base)
+ default-coding-system))))
;; If the most preferred coding system has the property mime-charset,
;; append it to the defaults.
(let ((tail coding-category-list)
preferred base)
- (while (and tail
- (not (setq preferred (symbol-value (car tail)))))
+ (while (and tail (not (setq preferred (symbol-value (car tail)))))
(setq tail (cdr tail)))
(and (coding-system-p preferred)
(setq base (coding-system-base preferred))
(coding-system-get preferred 'mime-charset)
(not (rassq base default-coding-system))
- (setq default-coding-system
- (append default-coding-system
- (list (cons preferred base))))))))
+ (push (cons preferred base)
+ default-coding-system)))))
(if select-safe-coding-system-accept-default-p
(setq accept-default-p select-safe-coding-system-accept-default-p))
@@ -724,7 +734,7 @@
(push (car elt) safe))
(push (car elt) unsafe)))
(if safe
- (setq coding-system (car (last safe)))))
+ (setq coding-system (car safe))))
;; If all the defaults failed, ask a user.
(when (not coding-system)