emacs-devel

Column numbering in bidirectional display


From: Eli Zaretskii
Subject: Column numbering in bidirectional display
Date: Fri, 21 May 2010 12:08:48 +0300

With most of the basic features needed for displaying bidirectional text
out of my way (the notable omission so far is reordering display
strings), development now enters the application level, albeit on a
very basic level for now.

One of the major issues on this level is the semantics of the column
numbering.  In the unidirectional case, this is trivial: column
numbers start at zero at the left margin and increase linearly as we
move to the right.

In the bidirectional case, we have two complications.  First, there
are right-to-left (R2L) lines made entirely of R2L characters.  They
are displayed starting at the right margin of the window, like this:

                                      ZYX WVU TSRQ PONMLKJIH GFEDCBA

What should current-column return when point is before A, i.e. at the
first character of the line in the reading order, which is at the
right margin of the window on display?

The other complication is mixed L2R and R2L text.  Here is how we
display an L2R line that includes some R2L characters:

  EDCBA abcde fghij

Here A is the first character of the line in the buffer's logical order.
What should current-column return when point is before A?

A similar example, this time an R2L line that includes some L2R
text:

                                                 JIHGF EDCBA abcde

and we have the same dilemma regarding the value of current-column
when point is before a.

Currently, current-column (and move-to-column, and other primitives in
indent.c) work in the buffer's logical order, disregarding the reordering
of characters for display.  That is why current-column returns zero
for all the situations I described above.  It also counts columns in
strict logical order.  For example, here are the column numbers for
each character of the last example (numbers that need more than one
digit are written vertically):

                                                 JIHGF EDCBA abcde
                                                 11111111987612345
                                                 76543210

This might be surprising at first, and might even look terribly wrong, but
it turns out that users expect that in bidirectional text.  At least
MS Word behaves _exactly_ like this, AFAICS.
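
To make the logical-order numbering concrete, here is a small Python
sketch.  It is a toy model, not Emacs code and not the full Unicode
Bidirectional Algorithm: it assumes the convention of the examples
above (uppercase = R2L, lowercase = L2R, anything else neutral), that
every character is one column wide, and that a neutral between two
strong characters of the same direction joins them.  The function
names are made up for the illustration.  The columns it prints are
zero-based (the value with point before each character), so they are
one less than the numbers in the diagram above.

```python
def toy_levels(text, base_rtl):
    """Assign a toy embedding level to each character of TEXT."""
    base = 1 if base_rtl else 0
    l2r_level = 2 if base_rtl else 0   # L2R text nested in an R2L line

    def strong(ch):
        if ch.isupper():
            return 1                   # R2L character
        if ch.islower():
            return l2r_level           # L2R character
        return None                    # neutral (e.g. a space)

    levels = [strong(ch) for ch in text]
    for i, lvl in enumerate(levels):
        if lvl is None:
            # Nearest strong character on each side, if any.
            before = next((strong(c) for c in reversed(text[:i])
                           if strong(c) is not None), None)
            after = next((strong(c) for c in text[i + 1:]
                          if strong(c) is not None), None)
            same = before is not None and before == after
            levels[i] = before if same else base
    return levels


def visual_order(levels):
    """Return logical indices in left-to-right screen order."""
    order = list(range(len(levels)))
    if not levels:
        return order
    highest = max(levels)
    lowest_odd = min((l for l in levels if l % 2), default=highest + 1)
    # Reverse every maximal run at or above each level, highest first.
    for level in range(highest, lowest_odd - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = order[i:j][::-1]
                i = j
            else:
                i += 1
    return order


def logical_columns(text, base_rtl):
    """Return (string as displayed, logical column of each displayed char)."""
    order = visual_order(toy_levels(text, base_rtl))
    return "".join(text[i] for i in order), order


if __name__ == "__main__":
    shown, cols = logical_columns("abcde ABCDE FGHIJ", base_rtl=True)
    print(shown)   # JIHGF EDCBA abcde
    print(cols)    # [16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 0, 1, 2, 3, 4]
```

In this model the column of a character is simply its index in the
buffer text, whatever its place on the screen.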

Moreover, this makes a surprising number of basic Emacs features work
correctly even though the underlying Lisp code is entirely oblivious
to bidi reordering.  One example is Dired, when file names include R2L
characters: I was pleasantly surprised to see that it puts the cursor
on the correct place within the file name.  Another example is the
various features that manipulate indentation.

If we decide that columns should be numbered in their screen order,
from left to right, then we will need to:

  . Rewrite primitives in indent.c to be bidi-aware, i.e. advance by
    calling functions from bidi.c rather than just incrementing
    character positions.  This would complicate the parts that move
    backwards, because there's no code in bidi.c that can do that, and
    it's not trivial to write such code.

  . Fix all the Lisp code that uses these primitives to not assume
    that column zero is necessarily the first character of the line
    that follows a newline.
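
The second item can be illustrated with a short Python sketch of what
screen-order numbering would mean for the L2R example shown earlier.
The permutation is written out by hand from that example rather than
computed, every character is assumed to be one column wide, and the
helper names are hypothetical; nothing here reflects how indent.c or
bidi.c actually work.

```python
# Buffer (logical) text of the mixed L2R example; uppercase stands for R2L.
logical = "ABCDE abcde fghij"

# Logical index of the character shown at each screen position, left to
# right.  Hand-written from the example: only the R2L run is reversed.
visual_order = [4, 3, 2, 1, 0] + list(range(5, 17))


def visual_column(logical_index):
    """Screen-order column of the character at LOGICAL_INDEX."""
    return visual_order.index(logical_index)


def logical_index_at_visual_column(column):
    """Buffer offset (from line start) of the character at screen COLUMN."""
    return visual_order[column]


if __name__ == "__main__":
    print("".join(logical[i] for i in visual_order))  # EDCBA abcde fghij
    # The first character after the newline would be at column 4 ...
    print(visual_column(0))                           # 4
    # ... and "column zero" would be 4 characters into the line.
    print(logical_index_at_visual_column(0))          # 4
```

Under such numbering, code that moves to column zero to reach the
start of the line would land on E, not on A.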

Admittedly, there are some features which need to be fixed even if we
keep the current semantics of column numbering.  C-e (just fixed 2
days ago) is one example.  But I think the number of such features is
much smaller than if we number columns in visual screen order.

So on balance, I think we should keep the current semantics of the
column numbering, whereby columns are numbered in strict logical order.

If we decide to go that way, we will need to provide primitives or
subroutines to get to the visually first and last characters of a
visual line.  That's because some features need that; see the thread
"Re: Hl-line and visual-line" for one example.  beginning-of-visual-line
and end-of-visual-line sound like a good starting point.
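
As a rough Python sketch of why such primitives are needed, take the
R2L example from earlier in this message.  The permutation is again
written out by hand, the helper name is hypothetical, and the sketch
ignores line wrapping and character widths; it only shows that the
characters at the screen edges need not be the logically first and
last ones.

```python
# Buffer (logical) text of the R2L example; uppercase stands for R2L.
logical = "abcde ABCDE FGHIJ"

# Logical index of the character at each screen position, left to right,
# hand-written from the displayed form "JIHGF EDCBA abcde".
visual_order = list(range(16, 4, -1)) + list(range(0, 5))


def visual_line_edges(order):
    """Return the logical indices of the leftmost and rightmost characters."""
    return order[0], order[-1]


if __name__ == "__main__":
    left, right = visual_line_edges(visual_order)
    print(left, logical[left])     # 16 J
    print(right, logical[right])   # 4 e
    # The logically first character sits 12 positions from the left edge.
    print(visual_order.index(0))   # 12
```

With logical-order columns, neither screen edge can be reached by
moving to column zero or to the end of the line in this example; the
rightmost character, e, is in the middle of the buffer text.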

Comments are welcome.


