Re: [emacs-bidi] Embedding levels of formatting codes


From: Behdad Esfahbod
Subject: Re: [emacs-bidi] Embedding levels of formatting codes
Date: Wed, 17 Oct 2001 04:13:43 +0330 (IRT)

On Tue, 16 Oct 2001, Eli Zaretskii wrote:
> ...
> Yes, I've looked into FriBidi's code some time ago (only after the
> design and most of the implementation of my code was done ;-), and saw
> this removal followed by insertion.
> 
> Unfortunately, this cannot be done in the implementation that I'm
> working on, which processes characters one by one, and is called anew
> for each character.  There's no ``before'' and ``after'' in this code;
> everything is done in-place, and of course I cannot modify the buffer
> by removing characters from it (I could copy and then work on the
> copy, but that would probably significantly slow down the code).  I
> cannot even rely on being called again, since the higher levels
> can decide they don't need any more characters (e.g., because the
> glyph row is long enough to reach the window margin).  I must do
> everything in one go and then repeat it for the next character when
> I'm called again.  The only ``memory'' I'm allowed to keep is what is
> stored in a special iterator structure used to traverse the buffer in
> the visual order; that structure is passed by the caller when my code
> is called.

Got the idea, but a small question: as I remember, UAX#9 needs some
look-ahead, e.g. in rule W5, where an ET can be followed by ENs. How do
you handle this?
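
To make the look-ahead concrete: rule W5 of UAX#9 changes a sequence of European terminators (ET) adjacent to European numbers (EN) into ENs, so an ET that precedes a number cannot be resolved until the end of its ET run has been seen. Below is a minimal Python sketch working on a list of bidi type names. The function name `apply_w5` is hypothetical and is taken from neither fribidi nor the Emacs code.

```python
def apply_w5(types):
    """Apply UAX#9 rule W5 to a list of bidi type names (illustrative only)."""
    out = list(types)
    n = len(out)
    i = 0
    while i < n:
        if out[i] != "ET":
            i += 1
            continue
        # Look ahead to the end of this run of ETs.
        j = i
        while j < n and out[j] == "ET":
            j += 1
        en_before = i > 0 and out[i - 1] == "EN"
        en_after = j < n and out[j] == "EN"
        if en_before or en_after:
            out[i:j] = ["EN"] * (j - i)
        i = j
    return out


print(apply_w5(["AL", "ET", "ET", "EN"]))  # ['AL', 'EN', 'EN', 'EN']
print(apply_w5(["AL", "ET", "ET", "ON"]))  # ['AL', 'ET', 'ET', 'ON']
```

The first ET in each call gets its final type only after the character two positions ahead has been inspected, which is the kind of look-ahead the question is about.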

> That's why it is so much more convenient for me to keep the formatting
> codes in the buffer, and let the caller decide what to do with them
> (normally, the caller is expected to throw them away and immediately
> call the bidi code again, to deliver the next character, since the
> normal display mode should be to hide the formatting codes).
> 
> > It is a simple and nice way to handle them, fribidi currently does 
> > something like this, but it can cause problems if you implement it 
> > without enough care, consider this test:
> > 
> >   AN ARABIC {LRE}{PDF} 123-456
> > 
> > The correct answer should be this:
> > 
> >   456-123  CIBARA NA
> > 
> > And for this one:
> > 
> >   AN ARABIC {LRE} {PDF} 123-456
> > 
> > The correct answer is this:
> > 
> >   123-456 CIBARA NA
> 
> I don't see the difference between these two test cases.  Why should
> the single blank between LRE and PDF make any difference here?  Did
> you perhaps mean LRO instead of LRE?

No, I mean exactly what I wrote. I designed the case to show you the
difference, tested it with both fribidi and the reference
implementation, and just added it to fribidi's test cases. Let's get
into them (I hope you can parse this :-) ):

input string:    "AN ARABIC {LRE}{PDF} 123-456"
runs bidi types:  (AL)(WS)(AL)(WS)(LRE)(PDF)(WS)(EN)(ON)(EN)
explicit marks resolved,
embedding levels: 1111111111??????????11111111
implicit levels resolved,
embedding levels: 1111111111??????????12221222
and obviously reordering this, results in: "456-123  CIBARA NA"
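
The reordering step used here is rule L2 of UAX#9: from the highest level down to the lowest odd level, reverse every contiguous run of characters at that level or higher. A small Python sketch of that rule follows, assuming the formatting codes (the `?` positions) have already been dropped from both the text and the level list. The name `reorder` is hypothetical.

```python
def reorder(text, levels):
    """Reorder logical text to visual order using UAX#9 rule L2 (sketch)."""
    chars = list(text)
    odd_levels = [lv for lv in levels if lv % 2 == 1]
    if not odd_levels:
        return text  # nothing is right-to-left, so nothing moves
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(chars):
            if levels[i] < level:
                i += 1
                continue
            # Find the end of the run at this level or higher and reverse it.
            j = i
            while j < len(chars) and levels[j] >= level:
                j += 1
            chars[i:j] = chars[i:j][::-1]
            i = j
    return "".join(chars)


text = "AN ARABIC " + " 123-456"
levels = [1] * 10 + [1, 2, 2, 2, 1, 2, 2, 2]
print(reorder(text, levels))  # 456-123  CIBARA NA
```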

and this one:

input string:    "AN ARABIC {LRE} {PDF} 123-456"
runs bidi types:  (AL)(WS)(AL)(WS)(LRE)(WS)(PDF)(WS)(EN)(ON)(EN)
explicit marks resolved,
embedding levels: 1111111111?????2?????11111111
implicit levels resolved,
embedding levels: 1111111111?????2?????22222222
and obviously reordering this, results in: " 123-456 CIBARA NA"

The point is that if you assign level 2 to {LRE} and {PDF} in this
test, you will get the first test wrong.
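
One way to see why the two tests differ, offered as a sketch based on rules X9, X10, W2 and W7 of UAX#9 rather than on fribidi's actual code: in the first test nothing is left at level 2 once the formatting codes are removed, so the digits sit in the same level-1 run as the Arabic letters and the last strong type before them is AL (W2 turns EN into AN). In the second test the blank between {LRE} and {PDF} forms a level-2 run, so the level-1 run that follows starts with sor = L (the higher neighbouring level, 2, is even), and W7 turns EN into L. The helper name `resolve_en` is hypothetical, and it models only W2 and W7 for a single level run.

```python
def resolve_en(types, sor):
    """Apply UAX#9 rules W2 and W7 to one level run (illustrative only).

    types: bidi type names of the run, formatting codes already removed.
    sor:   "L" or "R", the start-of-run type from rule X10.
    """
    out = []
    last_strong = sor
    for t in types:
        if t in ("L", "R", "AL"):
            last_strong = t
        if t == "EN":
            if last_strong == "AL":
                t = "AN"  # W2: EN after an Arabic letter becomes AN
            elif last_strong == "L":
                t = "L"   # W7: EN after L (or sor L) becomes L
        out.append(t)
    return out


# First test: one level-1 run, sor is R (paragraph level 1).
print(resolve_en(["AL", "WS", "AL", "WS", "WS", "EN", "ON", "EN"], "R"))
# ['AL', 'WS', 'AL', 'WS', 'WS', 'AN', 'ON', 'AN']

# Second test: the run after {PDF} follows a level-2 run, so sor is L.
print(resolve_en(["WS", "EN", "ON", "EN"], "L"))
# ['WS', 'L', 'ON', 'L']
```

With AN at an odd level the digits go up one level while the separator between them stays at level 1, which matches 12221222 above. With L the neutrals between sor and the Ls follow them, which matches 22222222.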

> (Btw, you don't say, but I assume you meant that all the upper-case
> letters in this example are Arabic letters, not Hebrew letters,
> right?)

Well, do you expect the text 'AN ARABIC ...' to be written with Hebrew
letters? I assume you didn't get it because you cannot read Arabic ;-).

[snip]

> > and if you do not display with 
> > different background colors, then this one is the best you can do:
> > 
> >   a {LRE}simple {RLO}TSet{PDF} which{PDF} see
> 
>
> Alas, this is impossible to produce without lots of risky messing with
> the resolved levels generated by UTR#9.  Given the considerations I
> described above, it doesn't seem to me like it's worth the hassle,
> certainly not at first approximation.  If there's a public outcry
> about this when bidi-capable Emacs is released, we can always do
> something later.

You are right, I couldn't make fribidi output this either.

> > Also there is a collection of very good test data for explicit marks
> > in the fribidi CVS version; have a look at them.
> 
> Thanks, I already know about that, and used all of your tests to test
> my code.  It passed with flying colors ;-)

Congratulations!!! They are really good tests :D.

> Last, but not least, thanks to everyone who gave feedback in this
> thread.

Well, it made me have another look at UAX#9 after a few months, too.

Yours,
-- 
Behdad
25 Mehr 1380, 2001 Oct 17

[Finger for Geek Code]



