Re: [emacs-bidi] Embedding levels of formatting codes
From: Behdad Esfahbod
Subject: Re: [emacs-bidi] Embedding levels of formatting codes
Date: Wed, 17 Oct 2001 04:13:43 +0330 (IRT)
On Tue, 16 Oct 2001, Eli Zaretskii wrote:
> ...
> Yes, I've looked into FriBidi's code some time ago (only after the
> design and most of the implementation of my code was done ;-), and saw
> this removal followed by insertion.
>
> Unfortunately, this cannot be done in the implementation that I'm
> working on, which processes characters one by one, and is called anew
> for each character. There's no ``before'' and ``after'' in this code;
> everything is done in-place, and of course I cannot modify the buffer
> by removing characters from it (I could copy and then work on the
> copy, but that would probably significantly slow down the code). I
> cannot even rely that I'll be called again, since the higher levels
> can decide they don't need any more characters (e.g., because the
> glyph row is long enough to reach the window margin). I must do
> everything in one go and then repeat it for the next character when
> I'm called again. The only ``memory'' I'm allowed to keep is what is
> stored in a special iterator structure used to traverse the buffer in
> the visual order; that structure is passed by the caller when my code
> is called.
Got the idea, but a small question: as I remember, UAX#9 needs some
look-ahead, for example in rule W5, where an ET can be followed by ENs.
What do you do about this?
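A minimal sketch of why W5 needs look-ahead, assuming the bidi types of one level run are already available as a list of strings. The function name `apply_w5` is hypothetical, and the sketch ignores everything else in UAX#9 (the earlier W rules, boundary-neutral characters, run limits). It only shows that an ET cannot be resolved until the end of its ET run has been seen.

```python
def apply_w5(types):
    """Sketch of rule W5: a run of ETs adjacent to an EN becomes all ENs."""
    out = list(types)
    i = 0
    while i < len(out):
        if out[i] != "ET":
            i += 1
            continue
        # Find the end of this maximal run of ETs.
        j = i
        while j < len(out) and out[j] == "ET":
            j += 1
        before_en = i > 0 and out[i - 1] == "EN"
        after_en = j < len(out) and out[j] == "EN"  # this is the look-ahead
        if before_en or after_en:
            out[i:j] = ["EN"] * (j - i)
        i = j
    return out


print(apply_w5(["ET", "ET", "EN"]))
print(apply_w5(["ET", "ON", "EN"]))
```

In the first call the leading ETs can only be resolved after the EN that follows them has been examined, which is exactly the difficulty for a strictly one-character-at-a-time design.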
> That's why it is so much more convenient for me to keep the formatting
> codes in the buffer, and let the caller decide what to do with them
> (normally, the caller is expected to throw them away and immediately
> call the bidi code again, to deliver the next character, since the
> normal display mode should be to hide the formatting codes).
>
> > It is a simple and nice way to handle them; fribidi currently does
> > something like this. But it can cause problems if you implement it
> > without enough care. Consider this test:
> >
> > AN ARABIC {LRE}{PDF} 123-456
> >
> > The correct answer should be this:
> >
> > 456-123 CIBARA NA
> >
> > And for this one:
> >
> > AN ARABIC {LRE} {PDF} 123-456
> >
> > The correct answer is this:
> >
> > 123-456 CIBARA NA
>
> I don't see the difference between these two test cases. Why should
> the single blank between LRE and PDF make any difference here? Did
> you perhaps mean LRO instead of LRE?
No, I meant exactly what I wrote. I designed the case to show you the
difference, tested it with both fribidi and the reference
implementation, and just added it to fribidi's test cases. Let's go
through them (I hope you can parse this :-) ):
input string:     "AN ARABIC {LRE}{PDF} 123-456"
runs bidi types:  (AL)(WS)(AL)(WS)(LRE)(PDF)(WS)(EN)(ON)(EN)
explicit marks resolved,
embedding levels:  1111111111??????????11111111
implicit levels resolved,
embedding levels:  1111111111??????????12221222
and obviously reordering this results in: "456-123 CIBARA NA"
and this one:
input string:     "AN ARABIC {LRE} {PDF} 123-456"
runs bidi types:  (AL)(WS)(AL)(WS)(LRE)(WS)(PDF)(WS)(EN)(ON)(EN)
explicit marks resolved,
embedding levels:  1111111111?????2?????11111111
implicit levels resolved,
embedding levels:  1111111111?????2?????22222222
and obviously reordering this results in: " 123-456 CIBARA NA"
The point is that if you assign level 2 to {LRE} and {PDF} in this
test, then you will get the first test wrong.
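The reordering step for the two test cases above can be sketched as follows. This is only an illustration, not fribidi's code: the function name `reorder_l2` and the helper `digits` are hypothetical, the levels are copied from the listings above, the formatting codes are dropped before reordering (they are the `?` positions), and upper-case ASCII letters stand in for Arabic letters.

```python
def reorder_l2(text, levels):
    """Reorder one line from its resolved levels, in the spirit of rule L2:
    from the highest level down to the lowest odd level, reverse every
    contiguous run of characters at that level or higher."""
    assert len(text) == len(levels)
    chars = list(text)
    odd_levels = [lv for lv in levels if lv % 2 == 1]
    if not odd_levels:
        return text  # nothing is right-to-left, nothing to reverse
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(chars):
            if levels[i] < level:
                i += 1
                continue
            j = i
            while j < len(chars) and levels[j] >= level:
                j += 1
            chars[i:j] = chars[i:j][::-1]
            i = j
    return "".join(chars)


def digits(s):
    """Turn a string such as '1222' into the list [1, 2, 2, 2]."""
    return [int(c) for c in s]


# First test: {LRE}{PDF} removed, the numbers end up at level 2 around a level-1 '-'.
print(repr(reorder_l2("AN ARABIC " + " 123-456",
                      digits("1111111111" + "12221222"))))
# Second test: the blank between {LRE} and {PDF} stays at level 2.
print(repr(reorder_l2("AN ARABIC " + " " + " 123-456",
                      digits("1111111111" + "2" + "22222222"))))
```

The sketch keeps every blank of the input, so its results are "456-123  CIBARA NA" and "  123-456 CIBARA NA", each with one more blank than the strings quoted above.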
> (Btw, you don't say, but I assume you meant that all the upper-case
> letters in this example are Arabic letters, not Hebrew letters,
> right?)
Well, would you expect the text 'AN ARABIC ...' to be written with Hebrew
letters? I assume you didn't get it because you cannot read Arabic ;-).
[snip]
> > and if you do not display with
> > different background colors, then this one is the best you can do:
> >
> > a {LRE}simple {RLO}TSet{PDF} which{PDF} see
>
>
> Alas, this is impossible to produce without lots of risky messing with
> the resolved levels generated by UTR#9. Given the considerations I
> described above, it doesn't seem to me like it's worth the hassle,
> certainly not at first approximation. If there's a public outcry
> about this when bidi-capable Emacs is released, we can always do
> something later.
You are right, I couldn't make fribidi output this either.
> > Also, there is a collection of very good test data for explicit marks
> > in the fribidi CVS version; have a look at it.
>
> Thanks, I already know about that, and used all of your tests to test
> my code. It passed with flying colors ;-)
Congratulations!!! They are really good tests :D.
> Last, but not least, thanks to everyone who gave feedback in this
> thread.
Well, it made me have a look at UAX#9 again after a few months, too.
Yours,
--
Behdad
25 Mehr 1380, 2001 Oct 17
[Finger for Geek Code]