Re: Language environments

emacs-devel

From:	Paul Eggert
Subject:	Re: Language environments
Date:	Thu, 29 Nov 2001 17:49:26 -0800 (PST)

> Date: Thu, 29 Nov 2001 04:20:06 -0700 (MST)
> From: Richard Stallman <address@hidden>
> 
> The only difficulty comes in places where we have to choose
> one name to display in a certain piece of text.

In that case "Serbocroatian" is probably the way to go.
Unfortunately this name will cause some controversy.
(I now see that Emacs spells it "Serbo-Croatian"; alas, even the
distinction between that and "Serbocroatian" is controversial.  :-)

> Date: Thu, 29 Nov 2001 10:18:05 +0200 (IST)
> From: Eli Zaretskii <address@hidden>
>
> How can you have an Emacs language environment that supports both
> Cyrillic and Latin scripts?  I think you can't.

> The problem is that ISO-8859 charsets use conflicting 8-bit codepoints
> if you mix more than one in the same document.  Emacs currently cannot
> save such mixed buffers in anything but emacs-mule

I was assuming (perhaps wishfully) that Emacs could convert between
Cyrillic and Latinic Serbocroatian.  It couldn't do a perfect job but
I think it would be adequate in practice and that it might be what
8-bit Serbocroatian Cyrillic users would prefer.  That way, Emacs
wouldn't need to output a mixed buffer.
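To make the conversion idea concrete, here is a minimal sketch of the Cyrillic-to-Latin direction, written in Python purely for illustration (it is not Emacs code, and the names `CYR_TO_LAT` and `cyrillic_to_latin` are made up for this example).  It assumes the usual one-to-one correspondence between the 30 letters of the two alphabets, and it simplifies capitals by always writing digraphs in title case ("Lj", not "LJ").  The reverse direction is where the imperfection comes in: a Latin "nj" may stand for one Cyrillic letter or for two, and a table lookup cannot tell which.

```python
# Lowercase Cyrillic letter -> Latin equivalent (three letters map to digraphs).
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ",
    "е": "e", "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k",
    "л": "l", "љ": "lj", "м": "m", "н": "n", "њ": "nj", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "ћ": "ć", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "џ": "dž", "ш": "š",
}


def cyrillic_to_latin(text):
    """Transliterate Cyrillic letters; pass everything else through."""
    out = []
    for ch in text:
        low = ch.lower()
        lat = CYR_TO_LAT.get(low)
        if lat is None:
            out.append(ch)                 # not a letter we know: keep as is
        elif ch != low:
            out.append(lat.capitalize())   # uppercase input: "Љ" -> "Lj"
        else:
            out.append(lat)
    return "".join(out)
```

For example, `cyrillic_to_latin("Београд")` gives `"Beograd"`.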


> without knowing in advance whether to favor Latin or Cyrillic, Emacs
> will err with a high probability.

That's true in general, but Serbocroatian is a special case.  ISO 8859
Latinic Serbocroatian normally uses only a few of the bytes in the
range 0x80 through 0xFF.  If you see bytes in that range that are not
among those few characters, you can infer that the text is not Latinic.

Admittedly this isn't perfect.  Also, if it's Cyrillic your problem is
then to infer which Cyrillic coding system is being used.
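The byte-range check described above can be sketched as follows, again in Python for illustration only.  The assumptions are mine: that the "few" characters are the ten accented letters Š š Ž ž Č č Ć ć Đ đ, that the Latin encoding is ISO 8859-2, and the name `looks_like_latin2` is hypothetical.  Real text may also contain other high bytes (a no-break space, say), so a practical version would need a more forgiving rule.

```python
# The high-bit bytes that Latin-script text is assumed to use,
# obtained by encoding the accented letters in ISO 8859-2.
LATIN2_LETTERS = frozenset("ŠšŽžČčĆćĐđ".encode("iso8859_2"))


def looks_like_latin2(data):
    """Return True if every byte >= 0x80 in DATA is one of the expected letters.

    Pure ASCII input has no high bytes at all, so it also returns True.
    """
    high_bytes = {b for b in data if b >= 0x80}
    return high_bytes <= LATIN2_LETTERS
```

With this, `looks_like_latin2("šećer".encode("iso8859_2"))` is true, while the ISO 8859-5 encoding of "Београд" is rejected because most of its bytes fall outside the small set.  A false return only says "not Latin"; it does not say which Cyrillic coding system was used.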

So perhaps I was being optimistic about a single language environment,
and Emacs would need multiple environments, e.g.:

  Serbocroatian-Cyrillic-ISO
  Serbocroatian-Cyrillic-KOI8
  Serbocroatian-Latin-2
  ...

Ick.


> Am I missing something?

Not really.  It's not a trivial problem, which is partly why it hasn't
been done.

I should emphasize that I don't pretend to understand all the issues
here, as I don't speak or write Serbocroatian.  I'm mostly writing on
and on about this because I want to try to document the implementation
issues as impartially as I can, as I suspect that any actual
implementation will be contentious.
