emacs-devel
Re: find-composition still depends on the composition property


From: Kenichi Handa
Subject: Re: find-composition still depends on the composition property
Date: Thu, 23 Oct 2008 10:18:22 +0900

In article <address@hidden>, Eli Zaretskii <address@hidden> writes:

> Thanks, but Emacs still does not get this quite right.  For example,
> in the following line:

>   אבגדה12345

> Which mixes Hebrew letters with digits, M-f stops at the first digit,
> whereas in this line:

>   abcde12345

> it does not.  The latter behavior is correct, the former is not.  (I'm
> ashamed to admit that even MS Word gets it right.)

> I understand that the way for fixing this would be to install more
> entries in word-combining-categories, but more infrastructure seems to
> be missing, since right now no characters have the "Hebrew" category,
> for example (at least judging by the output of describe-categories).

Then what we need to do is:

(1-1) assign the category "6" (digit) to "0123456789".
(1-2) define a category, say "D", and assign it to all
characters that should have no word boundary next to digits.
(1-3) add (?D . ?6) and (?6 . ?D) to word-combining-categories.
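
A minimal sketch of steps (1-1) to (1-3) in Emacs Lisp. It assumes
that category ?6 is already defined as "digit", that the mnemonic ?D
is free to use, and that the Hebrew letters U+05D0..U+05EA are a
suitable sample of the characters meant in (1-2); the docstring and
the character range are illustrative choices, not part of the
proposal.

```elisp
;; Define category ?D unless some category already uses that mnemonic.
(unless (category-docstring ?D)
  (define-category ?D "no word boundary next to digits"))

;; (1-1) Give the ASCII digits the category ?6 (digit).
(modify-category-entry '(?0 . ?9) ?6)

;; (1-2) Give the Hebrew letters alef..tav the category ?D.
;; The range is an assumption made for this example.
(modify-category-entry '(#x05D0 . #x05EA) ?D)

;; (1-3) Declare that ?D followed by ?6, or ?6 followed by ?D,
;; does not form a word boundary.
(add-to-list 'word-combining-categories '(?D . ?6))
(add-to-list 'word-combining-categories '(?6 . ?D))
```

These forms modify the current category table, which is normally the
standard one, so the change is not limited to a single buffer.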

Another way is:

(2-1) modify word_boundary_p to handle a negated category mnemonic in
word-*-categories, so that an entry can match a character that does
not have the specified category.
(2-2) assign the category "6" (digit) to "0123456789".
(2-3) define a category, say "X", and assign it to all
characters that should have a word boundary next to digits.
(2-4) add ((- ?X) . ?6) and (?6 . (- ?X)) to
word-combining-categories.
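
The matching rule that (2-1) asks of word_boundary_p can be modelled
in Emacs Lisp as follows. This is only a sketch of the proposed
semantics: the (- MNEMONIC) form does not exist in Emacs, the real
change would be made in C, and both function names are hypothetical.

```elisp
(defun my-category-spec-match-p (spec ch)
  "Return non-nil if character CH satisfies category SPEC.
SPEC is either a category mnemonic character, which matches when CH
has that category, or a list (- MNEMONIC), which matches when CH does
not have it."
  (let ((set (char-category-set ch)))
    (if (and (consp spec) (eq (car spec) '-))
        (not (aref set (cadr spec)))
      (aref set spec))))

(defun my-category-pair-match-p (pair c1 c2)
  "Return non-nil if PAIR, a cons (SPEC1 . SPEC2), matches C1 then C2."
  (and (my-category-spec-match-p (car pair) c1)
       (my-category-spec-match-p (cdr pair) c2)))
```

Note that the reader turns (?6 . (- ?X)) into the three-element list
(54 - 88), so its cdr is still the list (- 88), and the second
function handles both orderings from (2-4) in the same way.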

Or,

(3-1) define a `common' script and classify digits, etc. as
belonging to it.
(3-2) modify word_boundary_p not to distinguish `common' from
any other script.
(3-3) define a category, say "X", and assign it to all
characters that should have a word boundary next to digits.
(3-4) add (?X . ?6) and (?6 . ?X) to
word-separating-categories.

> By the way, I'd suggest to move the legend generated by
> describe-categories to the beginning of the buffer, because the buffer
> is huge and it does not say anywhere at the beginning that there's a
> legend at the end.  Without the legend, the buffer looks like a large
> pile of gibberish.

The legend is longer than 40 lines.  If we put it at the
head, it will occupy the whole first page, which I don't think
is good.  Instead, the first line could say something like
"See the end of the buffer for the legend.", with "legend"
clickable.  What do you think?

> And another wish: can we have word-combining-categories and
> word-separating-categories display their elements with human-readable
> letters, not as their ASCII codes?  (Quick: what letter is code 94?)

How about modifying word_boundary_p to accept a mnemonic
string (instead of a mnemonic character) in those variables?
Then we can specify multiple categories in one string to
match a character that has any one of them.
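
As a rough illustration of that rule, the check "the character has
at least one of the categories named in the string" could look like
this in Emacs Lisp. The function name is hypothetical, and the real
change would again be in the C code of word_boundary_p.

```elisp
(require 'seq)

(defun my-char-in-categories-p (ch mnemonics)
  "Return non-nil if CH has at least one category named in MNEMONICS.
MNEMONICS is a string of category mnemonic characters, such as \"6D\".
Each mnemonic must be an ASCII character, because a category set has
only 128 slots."
  (let ((set (char-category-set ch)))
    (seq-some (lambda (cat) (aref set cat)) mnemonics)))
```

With this, an entry such as ("al" . "6") would read much like the
output of (category-set-mnemonics (char-category-set ?a)), which
already shows a character's categories as a string of mnemonics.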

---
Kenichi Handa
address@hidden
 



