Re: [emacs-bidi] Embedding levels of formatting codes


From: Behdad Esfahbod
Subject: Re: [emacs-bidi] Embedding levels of formatting codes
Date: Wed, 17 Oct 2001 21:13:52 +0330 (IRT)

On Wed, 17 Oct 2001, Eli Zaretskii wrote:
> On Wed, 17 Oct 2001, Behdad Esfahbod wrote:
> 
> > >    ``instead of removing the format codes, assign the embedding 
> > >      level to each embedding character''
> > > 
> > > What is ``the embedding level'' which I should assign to those codes?
> > 
> > It means:
> > 
> >   ``X9'. With each RLE, LRE, RLO, LRO, PDF, and BN character, set its
> >     level to the current embedding level, then turn its type to BN.''
> 
> What is ``current embedding level''?  Is this the level _before_ or
> _after_ increasing/decreasing it due to these codes?

It's after that, because it is rule X9, and the rules that increase or 
decrease the levels are X2, X3, X4, X5 and X7, and in an algorithm the 
steps run from ....
 
> I currently implemented that as _after_ the level update, so RLE, LRE,
> RLO, and LRO get the higher level, while PDF gets the lower level.
> This needs an artificial correction at the final stage, to prevent a
> buffer like this:
> 
>        abcd{RLO}foo{PDF}xyz
> 
> from being displayed like this:
> 
>        abcdoof{RLO}{PDF}xyz

You are right, they don't care about it, because their goal was not to 
display these marks.  But their implementation (C++) also assigns the 
higher level to PDF.  In fribidi too, I assign the level before PDF to 
it, and the level after RLE/RLO/LRE/LRO to them.  But if the embedding 
is empty, you should not do this, because it can break one run into 
three (1111{LRE}{PDF}1111 is equivalent to 1111111111, not 
1111221111); you can easily prove that both have the same output.
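
A minimal Python sketch of this level assignment, not fribidi's actual 
code.  It assumes a simplified model: the buffer is a list of tokens, 
the formatting codes are the strings "LRE", "RLE", "LRO", "RLO" and 
"PDF", the depth limit and the effect of overrides on character types 
are ignored, and only the simplest empty embedding (an opening code 
immediately followed by PDF) is detected.  The names assign_levels and 
next_level are hypothetical.

```python
# Parity of the level that each opening code pushes (0 = even, 1 = odd).
OPENERS = {"LRE": 0, "LRO": 0, "RLE": 1, "RLO": 1}


def next_level(cur, parity):
    """Return the smallest level greater than cur with the given parity."""
    lvl = cur + 1
    return lvl if lvl % 2 == parity else lvl + 1


def assign_levels(tokens, base=0):
    """Assign an embedding level to every token, formatting codes included.

    Opening codes get the level after the increase, PDF gets the level
    before the decrease, and an empty embedding keeps the outer level.
    """
    levels = []
    stack = [base]
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in OPENERS:
            if i + 1 < len(tokens) and tokens[i + 1] == "PDF":
                # Empty embedding: leave both codes at the outer level so
                # that the surrounding run is not broken into three.
                levels += [stack[-1], stack[-1]]
                i += 2
                continue
            stack.append(next_level(stack[-1], OPENERS[tok]))
            levels.append(stack[-1])      # level after the increase
        elif tok == "PDF":
            levels.append(stack[-1])      # level before the decrease
            if len(stack) > 1:            # an unmatched PDF pops nothing
                stack.pop()
        else:
            levels.append(stack[-1])
        i += 1
    return levels


# abcd{RLO}foo{PDF}xyz: RLO and PDF both share the level of "foo".
print(assign_levels(list("abcd") + ["RLO"] + list("foo") + ["PDF"] + list("xyz")))
# Level-1 text around {LRE}{PDF}: the run stays in one piece.
print(assign_levels(list("xxxx") + ["LRE", "PDF"] + list("xxxx"), base=1))
```

With these assumptions the second call gives level 1 for all ten 
tokens, which is the 1111111111 case rather than 1111221111.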

> > >    ``In rule X10, assign L or R to the last of a sequence of adjacent BNs 
> > >      according to the eor / sor, and set the level to the higher of the
> > >      two levels.''
> > > 
> > > Do you even understand what are they trying to tell here?  What does
> > > ``according to the eor / sor'' mean in practical terms?  What ``two
> > > levels'' do they mean in the last part of this sentence?
> > 
> > For each run, the spec has defined sor and eor levels, the sor is the 
> > level of previous run, and eor is the level of next run, now:
> > 
> >   ``X10'. With each maximal sequence of adjacent BNs, set its level
> >     to the higher of sor and eor, name this level x, then if x is  
> >     even, change the bidi type of the last character in sequence, to 
> >     ltr, and otherwise, change it to rtl.''
> 
> The problem here is that formatting codes in most cases (with the
> exception of RLM and LRM) start or end the run.  Since X10 is applied
> to a single level run only, what does it mean, practically, ``maximal
> sequence of adjacent BNs''?  For example, if we have a buffer like
> this:
> 
>    abcd{LRE}{RLE}{RLO}{LRE}{LRO}xyz{PDF}{PDF}{PDF}{PDF}{PDF}
> 
> I don't really have any ``adjacent BNs'' here, since each BN in this
> example is in another level run, right?
> 
> So, with the exception of LRM/RLM, when would we see a ``maximal
> sequence of BNs''?

1. When real BN characters exist, I mean BNs other than the explicit 
marks.

2. When there are many adjacent empty explicit embeddings:

  abcd{RLE}{PDF}{LRE}{PDF}{LRO}{PDF}{RLO}{PDF}efgh

This has the runs: {LTR^4}{BN^8}{LTR^4}
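
A small Python sketch of how such a maximal BN sequence shows up, and 
of how rule X10' as worded earlier in this thread would resolve it.  
This is an illustration only, not fribidi's code.  It assumes the 
buffer is a list of tokens, that the explicit codes have already been 
turned into BN, and that every other token in this ASCII example is 
LTR.  The names bidi_type, type_runs and resolve_bn_sequence are 
hypothetical.

```python
from itertools import groupby

FORMAT_CODES = {"LRE", "RLE", "LRO", "RLO", "PDF"}


def bidi_type(tok):
    """Simplified typing: explicit codes are BN, anything else is LTR."""
    return "BN" if tok in FORMAT_CODES else "LTR"


def type_runs(tokens):
    """Collapse the buffer into (type, length) pairs of maximal sequences."""
    return [(t, len(list(group)))
            for t, group in groupby(bidi_type(tok) for tok in tokens)]


def resolve_bn_sequence(sor, eor):
    """Apply X10' to one maximal BN sequence.

    The sequence takes the higher of the sor and eor levels; the type of
    its last character becomes LTR if that level is even, RTL otherwise.
    """
    x = max(sor, eor)
    return x, ("LTR" if x % 2 == 0 else "RTL")


buffer = (list("abcd")
          + ["RLE", "PDF", "LRE", "PDF", "LRO", "PDF", "RLO", "PDF"]
          + list("efgh"))
print(type_runs(buffer))          # [('LTR', 4), ('BN', 8), ('LTR', 4)]
print(resolve_bn_sequence(0, 0))  # (0, 'LTR')
```

The grouping here is by type only; that the eight codes also share one 
level depends on the empty embeddings not changing the level.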

> > > > I won't recommend implementing something based on this section.
> > > 
> > > Actually, what I wrote is based on that section, and it seems to work
> > > fairly well.  Most of what they say there is not very important
> > > anyway, since the algorithm mostly works on each level run separately,
> > > and formatting codes almost always (with the exception of LRM and RLM)
> > > change the level, i.e. end the current level run and start another.
> > 
> > Ok, but lots of bugs arise in the run boundaries, are you trying to be 
> > strictly UAX#9 conformant, or some very small exceptions are not too 
> > important to you?
> 
> I'm trying to be compliant, but UTR#9 doesn't have any test suite to
> test the code against (and their reference implementation could be
> buggy, so is not a very good tool for verifying other
> implementations).  What I need is a test suite which was hand-verified
> against the algorithm definition.  I didn't find such a suite.

The algorithm is almost exact; I didn't have your problems when I was 
implementing fribidi.  When there are ambiguities, you can read the 
Java reference code to find out what they mean; also, you can assume 
that it has no bugs.

Yours,
-- 
Behdad
25 Mehr 1380, 2001 Oct 17

[Finger for Geek Code]




