emacs-devel

Default setting of `mm-coding-system-priorities'


From: David Kastrup
Subject: Default setting of `mm-coding-system-priorities'
Date: Mon, 12 Feb 2007 09:14:45 +0100
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.50 (gnu/linux)

Hi,

nowadays a lot of GNU/Linux systems have a utf-8 locale set up by
default.  The default value of mm-coding-system-priorities, however,
is

(defcustom mm-coding-system-priorities
  (if (boundp 'current-language-environment)
      (let ((lang (symbol-value 'current-language-environment)))
        (cond ((string= lang "Japanese")
               ;; Japanese users prefer iso-2022-jp to euc-japan or
               ;; shift_jis, however iso-8859-1 should be used when
               ;; there are only ASCII text and Latin-1 characters.
               '(iso-8859-1 iso-2022-jp iso-2022-jp-2 shift_jis utf-8)))))
  "Preferred coding systems for encoding outgoing messages.

More than one suitable coding system may be found for some text.
By default, the coding system with the highest priority is used
to encode outgoing messages (see `sort-coding-systems').  If this
variable is set, it overrides the default priority."
  :version "21.2"
  :type '(repeat (symbol :tag "Coding system"))
  :group 'mime)

Now the problem is that a _lot_ of mail and news readers in frequent
use (often people's old favorites) don't grok utf-8, but pretty much
every one of them gets along fine with Latin-1.  So I think that we
should prefer Latin-1 for outgoing messages, at least in a standard
language environment.  Mine is described as follows:

English language environment

Sample text:
  Hello!, Hi!, How are you?

Input methods:
  english-dvorak ("DV@" in mode line)

Character sets:
  ascii: ASCII (ISO646 IRV)

Coding systems:
  nothing specific to English


And I have

Coding system for saving this buffer:
  = -- emacs-mule

Default coding system (for new files):
  u -- mule-utf-8 (alias: utf-8)

Coding system for keyboard input:
  nil
Coding system for terminal output:
  u -- utf-8 (alias of mule-utf-8)

Defaults for subprocess I/O:
  decoding: u -- mule-utf-8 (alias: utf-8)

  encoding: u -- mule-utf-8 (alias: utf-8)


Priority order for recognizing coding systems when reading files:
  1. mule-utf-8 (alias: utf-8)
  2. iso-latin-1 (alias: iso-8859-1 latin-1)
  3. mule-utf-16be-with-signature (alias: utf-16be-with-signature mule-utf-16-be utf-16-be)
  4. mule-utf-16le-with-signature (alias: utf-16le-with-signature mule-utf-16-le utf-16-le)
  5. iso-2022-jp (alias: junet)
  6. iso-2022-7bit 
  7. iso-2022-7bit-lock (alias: iso-2022-int-1)
  8. iso-2022-8bit-ss2 
  9. emacs-mule 
  10. raw-text 
  11. japanese-shift-jis (alias: shift_jis sjis cp932)
  12. chinese-big5 (alias: big5 cn-big5 cp950)
  13. no-conversion 

  Other coding systems cannot be distinguished automatically
  from these, and therefore cannot be recognized automatically
  with the present coding system priorities.

  The following are decoded correctly but recognized as iso-2022-7bit-lock:
    iso-2022-7bit-ss2 iso-2022-7bit-lock-ss2 iso-2022-cn iso-2022-cn-ext
    iso-2022-jp-2 iso-2022-kr

Particular coding systems specified for certain file names:

  OPERATION     TARGET PATTERN          CODING SYSTEM(s)
  ---------     --------------          ----------------
  File I/O      "\\.dz\\'"              (no-conversion . no-conversion)
                "\\.g?z\\(~\\|\\.~[0-9]+~\\)?\\'"
                                        (no-conversion . no-conversion)
                "\\.tgz\\'"             (no-conversion . no-conversion)
                "\\.tbz\\'"             (no-conversion . no-conversion)
                "\\.bz2\\'"             (no-conversion . no-conversion)
                "\\.Z\\(~\\|\\.~[0-9]+~\\)?\\'"
                                        (no-conversion . no-conversion)
                "\\.elc\\'"             (emacs-mule . emacs-mule)
                "\\.utf\\(-8\\)?\\'"    utf-8
                "\\(\\`\\|/\\)loaddefs.el\\'"
                                        (raw-text . raw-text-unix)
                "\\.tar\\'"             (no-conversion . no-conversion)
                "\\.po[tx]?\\'\\|\\.po\\."
                                        po-find-file-coding-system
                "\\.\\(tex\\|ltx\\|dtx\\|drv\\)\\'"
                                        latexenc-find-file-coding-system
                ""                      (undecided)
  Process I/O   nothing specified
  Network I/O   nothing specified



It would now seem appropriate to make mm-coding-system-priorities
effectively default to '(iso-8859-1 utf-8), that is, to try
iso-8859-1 first and go over to utf-8 only when that does not suffice.
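
Independent of any change to the default, an individual user can
already get this behavior.  A minimal sketch, assuming the setting
goes into the user's init file (for instance ~/.emacs or ~/.gnus):

```elisp
;; User-level setting, not a change to Gnus itself:
;; prefer Latin-1 for outgoing mail and news, and use
;; utf-8 only for text that Latin-1 cannot represent.
(setq mm-coding-system-priorities '(iso-8859-1 utf-8))
```

This touches only the encoding of outgoing messages; the coding
system priorities for reading and writing files stay as shown above.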

I am not sure whether this change is something that should be done at
mm-coding-system-priorities level or in the English language
environment, though.  But since it is reasonable in a utf-8 locale
that files that are read and written are primarily considered utf-8,
it really might be appropriate to confine the Latin-1 preference to
mail and news interchange.

But there it is _definitely_ preferable.
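
To make the mm-level option a bit more concrete, here is a rough
sketch of how such a default could be computed.  The function name
`my-mm-default-coding-system-priorities' is hypothetical and not part
of Gnus; the Japanese clause is copied from the current default, and
the fallback clause is the assumed change.

```elisp
;; Hypothetical helper for illustration only; not part of Gnus.
(defun my-mm-default-coding-system-priorities ()
  "Return a candidate default for `mm-coding-system-priorities'."
  (let ((lang (and (boundp 'current-language-environment)
                   (symbol-value 'current-language-environment))))
    (cond ((equal lang "Japanese")
           ;; Unchanged from the existing default.
           '(iso-8859-1 iso-2022-jp iso-2022-jp-2 shift_jis utf-8))
          ;; Assumed new fallback: try Latin-1 first, then utf-8.
          (t '(iso-8859-1 utf-8)))))
```

A call to this function could then serve as the default form of the
defcustom.  Whether that is the right place for the change, rather
than the language environment, is the open question above.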

-- 
David Kastrup



