Re: [emacs-bidi] Embedding levels of formatting codes


From: Eli Zaretskii
Subject: Re: [emacs-bidi] Embedding levels of formatting codes
Date: Wed, 17 Oct 2001 13:02:43 +0200 (IST)

On Wed, 17 Oct 2001, Behdad Esfahbod wrote:

> Got the idea, but a small question: as I remember, UAX#9 needs some
> look-aheads, say in rule W5, where an ET can be followed by ENs.  What do
> you do with this?

I have no choice but to look ahead.  To recover some of the costs of
this look-ahead, I cache all the resolved levels and other important
associated information computed during the look-ahead, so that
characters between the first ET and the EN will be delivered from the
cache, bypassing all the steps to compute that info.
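The look-ahead-plus-cache idea above can be sketched roughly as follows.  This is only an illustration in Python, not the actual Emacs code: the class name `W5Scanner`, its `lookaheads` counter, and the restriction to rule W5 alone are assumptions made for the example.  It also assumes the earlier weak-type rules have already run, and it leaves an unresolved ET as ET (rule W6 would later turn it into ON).

```python
class W5Scanner:
    """Resolve bidi types position by position, handling rule W5 only (hypothetical sketch)."""

    def __init__(self, types):
        self.types = list(types)
        self.cache = {}       # position -> resolved type found during a look-ahead
        self.lookaheads = 0   # number of forward scans actually performed

    def resolve_all(self):
        prev = None           # resolved type of the previous position
        out = []
        for i, t in enumerate(self.types):
            if i in self.cache:
                # Already resolved by an earlier look-ahead: no recomputation.
                r = self.cache[i]
            elif t == "ET":
                # Look ahead across the whole run of ETs.
                j = i
                while j < len(self.types) and self.types[j] == "ET":
                    j += 1
                self.lookaheads += 1
                en_follows = j < len(self.types) and self.types[j] == "EN"
                # W5: ETs adjacent to an EN (before or after) become EN.
                r = "EN" if (en_follows or prev == "EN") else "ET"
                for k in range(i, j):
                    self.cache[k] = r
            else:
                r = t
            prev = r
            out.append(r)
        return out
```

With the input `["ET", "ET", "ET", "EN", "ON", "ET"]`, only the first ET of each run triggers a forward scan; the second and third ETs come straight from the cache.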

I need the cache anyway, because I cannot rearrange characters after
level resolution.  Instead of rearranging characters, I switch
direction of the buffer scan and jump between level runs, based on the
resolved levels.  This is impossible (or, more accurately, very
inefficient) without a cache.
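To illustrate scanning by level runs instead of rearranging, here is a minimal Python sketch.  It is an assumption-laden illustration, not the Emacs implementation: the function name `visual_order` is made up, the levels are taken as an already-resolved list (standing in for the cache), and the paragraph base level is assumed to be 0.  The text itself is never moved; the function only yields buffer positions in display order, scanning right to left inside odd levels and descending into deeper runs as it meets them.

```python
def visual_order(levels, lo=0, hi=None, level=0):
    """Yield positions of [lo, hi) in visual order; every levels[i] there must be >= level."""
    if hi is None:
        hi = len(levels)
    # Split the range into segments: single positions sitting exactly at
    # `level`, and maximal runs of positions at deeper levels.
    segments = []
    i = lo
    while i < hi:
        j = i + 1
        if levels[i] > level:
            while j < hi and levels[j] > level:
                j += 1
        segments.append((i, j))
        i = j
    # Odd levels are right-to-left: visit the segments in reverse.
    if level % 2 == 1:
        segments.reverse()
    for a, b in segments:
        if levels[a] > level:
            # Jump into the deeper run; its direction is decided one level down.
            yield from visual_order(levels, a, b, level + 1)
        else:
            yield a


# Levels of the first quoted example below, with the formatting codes removed.
text = "AN ARABIC  123-456"
levels = [1] * 10 + [1, 2, 2, 2, 1, 2, 2, 2]
display = "".join(text[i] for i in visual_order(levels))
```

For those levels, `display` comes out as "456-123  CIBARA NA", the same result as in the first quoted test.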

> input string:    "AN ARABIC {LRE}{PDF} 123-456"
> runs bidi types:  (AL)(WS)(AL)(WS)(LRE)(PDF)(WS)(EN)(ON)(EN)
> explicit marks resolved,
> embedding levels: 1111111111??????????11111111
> implicit levels resolved,
> embedding levels: 1111111111??????????12221222
> and obviously reordering this, results in: "456-123  CIBARA NA"
> 
> and this one:
> 
> input string:    "AN ARABIC {LRE} {PDF} 123-456"
> runs bidi types:  (AL)(WS)(AL)(WS)(LRE)(WS)(PDF)(WS)(EN)(ON)(EN)
> explicit marks resolved,
> embedding levels: 1111111111?????2?????11111111
> implicit levels resolved,
> embedding levels: 1111111111?????2?????22222222
> and obviously reordering this, results in: " 123-456 CIBARA NA"
> 
> And the point is that if you assign level 2 to {LRE} and {PDF} in this
> test, then you will get the first test wrong.

Empty embeddings are indeed a problem, but I don't consider them a
serious one: these are borderline, almost nonsensical cases (if you
don't put anything inside the embedding, why do you need the embedding
in the first place?).  If we find that it's important to handle this
with 100% adherence to what happens when the formatting codes are
actually removed, we can always design something special for this
situation.


