emacs-devel

Re: [emacs-bidi] Mixed L2R and R2L paragraphs and horizontal scroll


From: Eli Zaretskii
Subject: Re: [emacs-bidi] Mixed L2R and R2L paragraphs and horizontal scroll
Date: Fri, 12 Feb 2010 13:03:52 +0200

> From: Beni Cherniavsky <address@hidden>
> Date: Thu, 11 Feb 2010 23:40:03 +0200
> Cc: Stefan Monnier <address@hidden>, address@hidden, address@hidden
> 
> [Sorry, long mail.  In the first half I'm whining about why I don't
> like Eli's solution; but I also reply with technical ideas below...]

Thank you for your comments.

> - Truncation in logical order might(?) be OK if coupled with
>   logical-order "mirrored" scrolling.

What is ``logical-order "mirrored" scrolling''?

>   Worse yet, if I now proceed to *edit* the buffer, I'll modify it in
>   completely wrong places, and even when I realize that, fixing it will
>   be even harder!  I'll need to *simultaneously* reverse-engineer your
>   deviant bidi algorithm and figure out the real logical order, and
>   then very carefully fix my edits, all the time getting strangely
>   permuted feedback for my actions.

I don't think fixing such problems is anywhere near that hard.  Just
display the text in its logical order (a flip of a buffer-local
variable) and fix it.  Case closed.

That doesn't mean that we should proliferate problems that need
fixing, of course.

>   This is the *real* reason we hate broken bidi support.

This is not really fair.  I'm not developing a ``broken bidi
support''.  Everything developed so far is first-class bidirectional
operation, as much as I know what that means, at least on the low
level on which I'm working.  Line continuation is the first issue
where I gave up on making it 100% perfect, because it's just too damn
hard for someone who has only weekends to work on that, and doesn't
have enough knowledge and experience in Emacs display-related
features.

The truth is that this is not a job for a bidi expert, it is a job for
an Emacs display engine expert who can ask for bidi advice from time
to time.

But I've waited for 8 years for such a display engine expert to come
and integrate the bidi reordering code I wrote with Emacs redisplay.
It never happened.  So now I'm doing it as best I can, in the hope
that my best will be good enough.

People who want the result to be better can help by suggesting very
specific implementation ideas, based on specific details of the Emacs
redisplay code.  General ideas are generally not helpful, because my
problem is not with principles, it is with the details.  So if you
want to help, please make yourself familiar with xdisp.c, which is the
bulk of the display engine.  Then if you have specific ideas about how
to implement continuation lines that read in the correct order, I'll
be all ears.

>   No bidi at all
>   is frequently better - ain't pretty but at least has 1:1 mental model.

Maybe I should just quit, then.  I don't want my name scribbled on
code that is considered worse than what Emacs was before that.  I
could find better uses for my scarce free time.

> To be fair, we're talking about rare situations where embedded text is
> broken across lines.  But note that a wrong base direction can inflict
> this on whole paragraphs (more on that below).

Unlike with other programs, in Emacs, wrong base direction is very
easy to fix, even with what I have now.  There's a per-buffer variable
that forces one of the two directions on all the paragraphs in the
buffer, and if that's not fine-grained enough, you can insert the
corresponding directional mark at the beginning of the paragraph.
(Eventually, there should be commands to do this, without asking the
user to remember the Unicode codepoints of these characters.)
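To illustrate why a directional mark at the start of a paragraph is enough to fix its base direction, here is a minimal Python sketch of the "first strong character" heuristic from UAX#9 (rules P2 and P3).  It is a simplification: it ignores directional isolates, and the helper names base_direction and force_direction are made up for this example and are not Emacs functions.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK, bidi class L
RLM = "\u200f"  # RIGHT-TO-LEFT MARK, bidi class R

def base_direction(paragraph, default="L2R"):
    """Guess the base direction from the first strong character.

    Simplified version of UAX#9 rules P2-P3 (isolates are not handled).
    """
    for ch in paragraph:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "L2R"
        if bidi_class in ("R", "AL"):
            return "R2L"
    return default  # no strong character found

def force_direction(paragraph, direction):
    """Hypothetical helper: prepend the mark that forces DIRECTION."""
    return (RLM if direction == "R2L" else LRM) + paragraph

mixed = "abc \u05d0\u05d1\u05d2"           # Latin first, then Hebrew
print(base_direction(mixed))                # guessed from the Latin "a"
print(base_direction(force_direction(mixed, "R2L")))
```

The mark is invisible on display, so prepending it changes the guessed direction without changing what the reader sees.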

> >  . I saw no other editor that supports truncation and behaves
> >    otherwise.  (I don't know about any editors that support
> >    continuation lines like Emacs does.)  See below.
> >
> Truncation is OK, but the issue is continuation.
> 
> Not following your claim about editors that support continuation -
> all these do and behave otherwise (i.e. as Ehud wants):
> Notepad, gedit, firefox/webkit, OpenOffice.

No, they don't have continuation lines.  They reflow the lines,
i.e. divide text differently between the lines.

> Indeed, embedded text tends to be short.
> 
> But I'm afraid it's bigger than you think, because if the base direction
> of a paragraph is incorrect, *the whole paragraph* will wrap in this
> broken bottom-up manner.

See above: this is easy to fix in Emacs.  Again, that's not a good
reason to have incorrect display, of course.

> This also means that forcing all paragraphs to R2L or L2R base direction
> (which would be a handy way to momentarily work around imperfect
> guessing) would break line order in half the paragraphs in a mixed buffer!

Let's not get carried away: the problem will only be with continued
lines, not with every line.

> So if only the line breaking points were static, you'd have no
> performance problem!

If the line breaking points were at known buffer positions, yes.  (I
don't quite know what you mean by ``static''.)

> => Could you maybe cache this information and recompute it only when
> the line is edited?

This is unlikely to help, with the current design of the display
engine.  It works very hard to avoid redisplaying lines that did not
change.  So when it finally decides to redisplay a line, there are
very good chances that the cached value is invalid anyway.  Note that
even placing a text property on some text of a line could make it
overflow the display margin at a different point: for example, giving
the text bold or italics face is all you need for the continuation
point to move.
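A toy illustration of that last point, in Python.  The pixel widths are invented for the example (8 for a plain glyph, 9 for a bold one), and continuation_point is a made-up name, not anything in xdisp.c; the sketch only shows that a face change alone can move the break.

```python
def continuation_point(widths, window_width):
    """Return the index of the first character that does not fit."""
    used = 0
    for i, w in enumerate(widths):
        if used + w > window_width:
            return i
        used += w
    return len(widths)  # the whole line fits

plain = [8] * 10               # ten characters in the default face
bold = [9] * 4 + [8] * 6       # same text, first four characters bold

print(continuation_point(plain, 64))
print(continuation_point(bold, 64))
```

With these assumed widths the line breaks one character earlier once the first four characters are bold, so a break position cached before the face change would be stale even though the buffer text did not change.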

> [XEmacs already has a "Line Start Cache" according to its Internals Manual.

So does Emacs.  But the cache is invalidated when/where the text
changes.

> I didn't find a similar overview for Emacs.  Is there anything I can read
> to understand Emacs redisplay before I attempt to approach the source?]

Only the comments at the beginning of xdisp.c.  After that, read the
code, guided by the high-level structure described in those comments.
That, asking questions here, and related discussions on this list are
all you have, sorry.

> > until we discover where we should stop.  (We could do a binary search,
> > of course, but that's details.)  I don't think that's reasonable, and
> > I have no idea what this will do to the redisplay speed.
> >
> Binary search is a big improvement!  In 10 attempts you can handle lines
> of 1K chars, in 20 - 1M.  On my computer Emacs presently handles 100k
> smoothly, 1M already feels sluggish.  By crude (and probably wrong)
> computation, binary search would still be fast enough up to 10K...

The problem is not with implementing binary search, the problem is to
plug it into the display engine, which needs to be aware of all the
possible side effects of each thing it does while preparing display of
a line.
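For reference, the search itself is indeed the easy part.  A minimal Python sketch, assuming a hypothetical predicate fits(n) that says whether the first n characters fit in the window, and assuming that predicate is monotone (if n characters fit, so do fewer):

```python
def longest_fitting_prefix(line_length, fits):
    """Return (n, probes): the largest n in [0, line_length] with fits(n).

    Assumes fits(0) is true and fits is monotone.  `fits` stands in for
    whatever the display engine would have to do to lay out n characters.
    """
    lo, hi = 0, line_length
    probes = 0
    while lo < hi:
        mid = (lo + hi + 1) // 2   # round up so the range always shrinks
        probes += 1
        if fits(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo, probes

# Toy measurement: every character is 8 pixels wide, window is 640 pixels.
fits_in_window = lambda n: n * 8 <= 640

print(longest_fitting_prefix(1000, fits_in_window))
print(longest_fitting_prefix(1000000, fits_in_window))
```

Each probe halves the candidate range, which is where the figures of about 10 probes for 1K characters and about 20 for 1M come from.  What the sketch leaves out is exactly the hard part described above: every call to fits stands for a full trial layout, with all of its side effects.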

> Also, I presume that the heavy part of a redisplay is normally the actual
> output to screen (if not, why do such a complex job minimizing it?).

I don't think so.  I think the heavy part is computing and merging
faces.  But doing measurements to find out the hot spots would be in
itself a useful project, which may help down the line.  Volunteers are
welcome.

> To top this, I think you can do several times better if you allow some
> imprecision in line breaking of mixed-direction paragraphs.  Naturally,
> you must not overshoot the screen, but some undershooting is OK.  So it
> seems to me that you could reasonably do it with a greedy approach:
> 
> (1) Add characters in *logical order* as long as they fit.
> (2) Try it in visual order to account for precise typographic stuff.
> (3) As long as it doesn't fit, strip one char and retry (2).
> (4) When OK, repeat with actual output display to the screen.

I think this will only work if there are no embeddings.  In text with
embeddings, doing it in logical order will have you scan wrong
characters anyway, and you are back at square one.
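For concreteness, steps (1) to (3) of the quoted proposal might look like the Python sketch below.  Both measurement functions are hypothetical placeholders: cheap_width is a per-character estimate, and precise_width stands for a full measurement of the reordered text.  The sketch does nothing about the embedding problem just described; it only shows the control flow for the simple case.

```python
def greedy_break(text, cheap_width, precise_width, limit):
    """Return how many characters of TEXT (logical order) go on the line."""
    # Step (1): add characters in logical order while the estimate fits.
    n = 0
    used = 0
    while n < len(text) and used + cheap_width(text[n]) <= limit:
        used += cheap_width(text[n])
        n += 1
    # Steps (2)-(3): measure precisely; strip one character and retry
    # until the candidate fits (undershooting is allowed, overshooting not).
    while n > 0 and precise_width(text[:n]) > limit:
        n -= 1
    return n

# Toy metrics: the estimate says 8 pixels per character, while the
# precise measurement charges 4 extra pixels for each "W".
estimate = lambda ch: 8
measure = lambda s: 8 * len(s) + 4 * s.count("W")

print(greedy_break("abWcdWefgh", estimate, measure, 64))
```

With these toy metrics the estimate admits 8 characters, the precise measurement rejects that candidate, and the loop settles on 7.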

> Finally, I want to propose a feature that I think will be handy,
> and also happens to support efficient wrapping.  The truth is that any
> way to wrap an embedding across lines is ugly!  I'd like a mode where
> any embedding either fits completely on a line or starts and ends on
> lines of its own:
> 
> +----------------------------------------+
> |some latin text followed by            \|
> |\          ROF TXET GNOL TAHWEMOS WERBEH|
> |\                     SIHT GNITARTSNOMED|
> |followed by latin tail                  |
> +----------------------------------------+

Again, what about embeddings?  I want to have full UAX#9 support, not
just some simplified bidi that breaks as soon as the user wants to use
the full power of the Unicode algorithm.




