emacs-devel



From: Karl Eichwalder
Subject: Language environments (Re: How to insert cyrillic characters in utf-8 buffers?)
Date: Mon, 07 Jan 2002 17:30:50 +0100
User-agent: Gnus/5.090004 (Oort Gnus v0.04) Emacs/21.1.50 (i686-pc-linux-gnu)

Eli Zaretskii <address@hidden> writes:

[Since this isn't a bug, I'll switch to the devel mailing list.]

>> Okay, I know, that's Emacs talk.  But IMO it's confusing to call
>> something like iso-8859-1 or UTF-8 a language environment.  Language
>> environments are "English (US)" or "Russian".
>
> It is not enough to say "Russian".  There are users who prefer KOI8-R 
> with Russian, others who prefer ISO8859-5, etc.  So in Emacs we have 
> already "Cyrillic-ISO", "Cyrillic-KOI8", and "Cyrillic-ALT".  A UTF 
> ``language environment'' does fit this scheme to some extent.

Yes, but the name should be "Russian", and then the user should be able
to choose the list of default coding systems as an option.

Can I change the coding system for the language environment "German"?

> Of course, it's possible that a better name could be found for this 
> language environment, but I appreciate the difficulty of finding such a 
> name, since Unicode is by definition multilingual.

I thought Unicode was language-neutral, but that you can add language
tags (there's only one 'ä'--it might appear in Swedish, Danish, German,
etc. texts).  Certainly, German users want to use quotes according to
»German rules« and French users want to use them according to «French
rules» (I know, Unicode offers the other single and double quotation
marks, too).

> In any case, we do need a name for such an environment, because the
> language environment is the main machinery we have in Emacs for setting 
> priorities and preferred encodings, without which things like guessing 
> the text encoding and handling of unibyte text won't work correctly.

I'm not an expert in this area; I don't know how hard it is to detect
UTF-8 encoded text correctly (when in doubt, offer UTF-8 to the user
and tell them how to try another coding system while visiting the file).
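
To illustrate the "try UTF-8 first, then let the user pick something
else" idea, here is a minimal Python sketch.  It is not how Emacs
detects coding systems (Emacs uses per-language-environment priority
lists); the function name guess_decode and the fallbacks parameter are
made up for this example.  It relies on the fact that strict UTF-8
decoding rejects most text written in legacy 8-bit encodings.

```python
def guess_decode(data: bytes, fallbacks=("iso-8859-1",)):
    """Return (text, encoding_name), trying strict UTF-8 first.

    `fallbacks` is a hypothetical, caller-supplied priority list of
    legacy encodings to try when the bytes are not valid UTF-8.
    """
    try:
        # Strict decoding raises on byte sequences that are not UTF-8.
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        pass
    for name in fallbacks:
        try:
            return data.decode(name), name
        except UnicodeDecodeError:
            continue
    # Last resort: decode as UTF-8 and mark undecodable bytes.
    return data.decode("utf-8", errors="replace"), "utf-8 (lossy)"


if __name__ == "__main__":
    print(guess_decode("ä".encode("utf-8")))
    print(guess_decode("ä".encode("iso-8859-1")))
    print(guess_decode("привет".encode("koi8-r"), fallbacks=("koi8-r",)))
```

Note that the order of the fallback list matters: most 8-bit encodings
accept any byte sequence, so the first one tried always "succeeds".
That is the part a language environment (or the user) has to decide.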

>> There should be a way to
>> transliterate characters that lack an iso-8859-2 equivalent;
>> e.g. transliterate € to the ASCII string "EUR" or to the TeX version
>> "\euro" (the first could be the default using glibc features, the second
>> could be a user option).
>
> Well, this is (AFAIK) a non-existing feature, so it should be coded
> first ;-)

Yes :)
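
Just to make the idea concrete, a rough Python sketch of what such a
transliteration step might look like when saving to iso-8859-2.  This
is only an assumption about the desired behaviour, not an existing
Emacs or glibc interface; the table names and encode_with_fallbacks are
invented for the example, and the tables contain only the "EUR" and
"\euro" replacements for the euro sign (U+20AC) mentioned above.

```python
# Hypothetical replacement tables; names and contents are illustrative.
ASCII_FALLBACKS = {"\u20ac": "EUR"}      # euro sign -> ASCII string
TEX_FALLBACKS = {"\u20ac": r"\euro"}     # euro sign -> TeX macro


def encode_with_fallbacks(text, encoding="iso-8859-2", table=ASCII_FALLBACKS):
    """Encode `text`, replacing characters the target encoding lacks.

    Characters that cannot be encoded and have no entry in `table`
    still raise UnicodeEncodeError, so nothing is dropped silently.
    """
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(encoding)
        except UnicodeEncodeError:
            if ch not in table:
                raise
            out += table[ch].encode(encoding)
    return bytes(out)


if __name__ == "__main__":
    print(encode_with_fallbacks("10 \u20ac"))
    print(encode_with_fallbacks("10 \u20ac", table=TEX_FALLBACKS))
```

Passing the table as an argument corresponds to the "default plus user
option" split suggested in the quoted text.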

-- 
address@hidden (work) / address@hidden (home):              |
http://www.suse.de/~ke/                                  |      ,__o
Free Translation Project:                                |    _-\_<,
http://www.iro.umontreal.ca/contrib/po/HTML/             |   (*)/'(*)


