emacs-devel

Re: a few MULE criticisms


From: Hin-Tak Leung
Subject: Re: a few MULE criticisms
Date: Thu, 15 May 2003 04:29:16 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030312

Kenichi Handa wrote:
At first, as far as I know, cemacs.el is a small program
written by <address@hidden> that does just these
things:
  o make emacs 19 use standard-display-8bit
  o rebind forward-char, delete-char, etc to functions that
    pay attention to 8-bit chars.
And, it works only in tty mode under a chinese terminal,
e.g. cxterm.

That's correct.

<a lot of C-x RET, etc snipped>

Is there a central organized repository for these tips/etc
associated with leim? e.g. how to create a new leim file?
They aren't in the obvious places (i.e. the info files
that came with emacs).

(1) Associations: the ability to let the user choose the next possible
associated characters.

[...]

for many years. This would most certainly require extending MULE with
the ability to load dictionaries of commonly used phrases in
various languages, and it would make the leim package a lot bigger.


I think this should be implemented by extending abbrev-mode
so that it can associate an abbreviation with multiple
words/phrases (is it possible already?)

At this point it is probably good to switch to Japanese examples...
What I meant is the ability to anticipate something like:

(a) if I input "Handa" (two characters in Japanese Kanji),
and because it is a common surname, the system somehow asks
whether I want to follow it with "san" (two characters in Japanese
kana - the honorific suffix, similar to the "Mr" prefix in English),
or the 2nd half of the name of famous historical figures
with that surname.

(b) Many Japanese verb forms consist of one
or two Kanji characters, e.g. "carry", "bring", followed
by verb modifiers meaning "to", "did not", "forbidden to",
"please do", "please do not", etc., which could be 5-6
characters long. The list of characters commonly used for
verbs is quite well defined (a few hundred? still small by
comparison with the whole character set of a few thousand),
and the list of commonly used verb modifiers is even shorter
(maybe about 10?). Any of those in the 1st list is likely
to be followed by one of those in the 2nd list.

(c) "Ken" (one Japanese character) is often followed by
one specific character to form the phrase for "health"
(in addition to "Ken-ichi", a rather common first name),
and "ichi" (one character) is often followed by "ban"
(one character) to form "ichi-ban" (meaning "the best").

These might sound very demanding and critical - but association
can dramatically improve the speed of typing,
by a factor of two or three... and association has been available
in some CJK input mechanisms on Unix and other platforms
for years.

And it is not just about the speed of typing - sometimes
one just can't remember the precise keystrokes corresponding
to a certain lesser-used character, so one would rely on
association from more frequently-used ones (and delete
the more frequently-used one afterwards).
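
To make the idea concrete, here is a minimal sketch in Python (not Emacs
Lisp, and not anything that exists in MULE or leim today). It assumes a
phrase dictionary is available as a list of phrases, each already split
into input units; the romanized strings stand in for the actual Kanji and
kana, and the names build_associations and suggest are purely
illustrative.

```python
from collections import Counter, defaultdict

# Hypothetical phrase dictionary. Each phrase is a list of units
# (characters or short words); romanized placeholders are used here
# instead of real Kanji/kana.
PHRASES = [
    ["Handa", "san"],
    ["Handa", "san"],
    ["Ken", "ichi"],
    ["ichi", "ban"],
]


def build_associations(phrases):
    """Count, for every unit, which units follow it in the phrase list."""
    followers = defaultdict(Counter)
    for phrase in phrases:
        for current, following in zip(phrase, phrase[1:]):
            followers[current][following] += 1
    return followers


def suggest(followers, last_unit, limit=5):
    """Return up to `limit` likely next units, most frequent first."""
    counts = followers.get(last_unit)
    if counts is None:
        return []
    return [unit for unit, _ in counts.most_common(limit)]


followers = build_associations(PHRASES)
print(suggest(followers, "Handa"))  # ['san']
print(suggest(followers, "ichi"))   # ['ban']
```

After each committed character the input method would call something
like suggest() and offer the result as a menu. A real implementation
would also need per-language dictionaries and some way to rank
candidates, which is where the size concern about leim comes in.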

When you type TAB while you are using an input method, Emacs
shows the full list.  But the method used in cxterm is not
implemented; it's not easy.

I have figured that out. However, a match by beginning and ending ("a*b")
or by ending ("*b") is quite important for Chinese input. TAB
(match by the beginning portion "a*") probably works all right
for Japanese, because a native Japanese speaker most probably
knows how the character is pronounced (or at least knows the
first one or two syllables of the phrase). But many of the
Chinese input methods (other than the Pinyin method,
"by pronunciation") work by character shape, and the
distinctive/memorable part is often the right-hand side of
the character. In other words, it is the ending of the keystroke
sequence (or the 2nd half of the sequence), because the keystroke
sequence is usually coded according to how a native writer
writes the different portions of a character (top-left,
bottom-left, top-right, bottom-right).
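
The three kinds of match can be sketched as follows, again in Python and
only as an illustration: the table below is a made-up stand-in for a
shape-based input method's key-sequence table, and lookup is a
hypothetical helper, not how cxterm or Emacs actually does it. The
standard fnmatch module supplies the "*" wildcard.

```python
from fnmatch import fnmatchcase

# Hypothetical table mapping keystroke sequences to characters.
# Both the sequences and the "char-N" labels are placeholders.
TABLE = {
    "ab": "char-1",
    "acb": "char-2",
    "adc": "char-3",
    "xyb": "char-4",
}


def lookup(pattern, table=TABLE):
    """Return (key sequence, character) pairs whose keys match `pattern`.

    "a*"  matches by beginning (what TAB completion gives today),
    "*b"  matches by ending,
    "a*b" matches by beginning and ending.
    """
    return sorted(
        (keys, char) for keys, char in table.items()
        if fnmatchcase(keys, pattern)
    )


print([keys for keys, _ in lookup("a*")])   # ['ab', 'acb', 'adc']
print([keys for keys, _ in lookup("*b")])   # ['ab', 'acb', 'xyb']
print([keys for keys, _ in lookup("a*b")])  # ['ab', 'acb']
```

Note that fnmatch also treats "?" and "[" specially, so this only works
as written if key sequences are plain letters. A linear scan like this is
fine for a sketch; an ending match over a large table would want an index
on reversed key sequences.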




