bug#11860: 24.1; Arabic - Harakat (diacritics, short vowels) don't appear


From: Eli Zaretskii
Subject: bug#11860: 24.1; Arabic - Harakat (diacritics, short vowels) don't appear
Date: Sat, 18 Aug 2012 18:33:21 +0300

> From: Kenichi Handa <handa@gnu.org>
> Cc: 11860@debbugs.gnu.org, smias@yandex.ru, handa@gnu.org
> Date: Sat, 18 Aug 2012 18:19:19 +0900
> 
> > If this is the case, how come we display the diacriticals correctly on
> > Windows in other cases, e.g. with Hebrew?
> 
> For Hebrew too, on Windows, I see the same problem as what
> Steffan <smias@yandex.ru> reported:
> 
> In article <349641344144469@web8d.yandex.ru>, Steffan <smias@yandex.ru> 
> writes:
> >>> I choose "hebrew-full" as input-method.
> >>> 
> >>> - After typing 'f' I get KAF
> >>> - then by typing 'd' I get GIMEL
> >>> - and after typing 'D' I get "the three point sign" (HEBREW POINT QUBUTS) 
> >>> not below the GIMEL but below the KAF!
> 
> If you don't see that problem, perhaps we are using
> different fonts.  C-u C-x = says that "courier new" is
> used for Hebrew too in my case.

"Courier New" is the font that is used, and I still don't see the
problem.  The HEBREW POINT QUBUTS is displayed below GIMEL, as I'd
expect.

> I've just read the function uniscribe_shape in
> w32uniscribe.c.  It seems that these are the key APIs for
> Uniscribe:
> 
> * ScriptItemize -- no idea what this is

It breaks the string to be displayed into individually shapeable
chunks, called "items".  We then pass each chunk to Uniscribe
separately for shaping.
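
As a rough illustration of what "items" give us (a portable sketch, not
the Emacs code; struct item_sketch and item_run_length are hypothetical
names standing in for SCRIPT_ITEM and the inline subtraction), the
length of each chunk falls out of the starting positions of adjacent
items.  This assumes the items array ends with one extra terminating
entry whose iCharPos is the length of the whole string:

```c
#include <stddef.h>

/* Hypothetical stand-in for the one SCRIPT_ITEM field used here.  */
struct item_sketch
{
  int iCharPos;   /* index of the item's first character */
};

/* Return the number of characters in item I.  ITEMS is assumed to
   hold the real items followed by one terminating entry whose
   iCharPos is the total length of the string.  */
static int
item_run_length (const struct item_sketch *items, size_t i)
{
  return items[i + 1].iCharPos - items[i].iCharPos;
}
```

For example, with items starting at characters 0, 3 and 5 and a
terminating entry at 9, the three chunks are 3, 2 and 4 characters
long.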

> * ScriptShape -- perhaps for glyph substitution (GSUB features of OpenType)

http://msdn.microsoft.com/en-us/library/windows/desktop/dd368564%28v=vs.85%29.aspx
says that this function "Generates glyphs and visual attributes for a
Unicode run".

> * ScriptPlace -- perhaps for glyph positioning (GPOS features of OpenType)
> 
> So first, please check the documentation of ScriptShape
> and figure out how it works for bidi scripts; i.e. what order
> it expects for input, and what order it produces.

From the above page:

  If fLogicalOrder is set to TRUE in the SCRIPT_ANALYSIS structure, the
  function always generates glyphs in the same order as the original
  Unicode characters. If fLogicalOrder is set to FALSE, the function
  generates right-to-left items in reverse order so that ScriptTextOut
  does not have to reverse them before calling ExtTextOut.

And w32uniscribe.c sets that flag to TRUE a few lines before it calls
ScriptShape, because Emacs itself reorders characters:

  for (i = 0; i < nitems; i++)
    {
      int nglyphs, nchars_in_run;
      nchars_in_run = items[i+1].iCharPos - items[i].iCharPos;
      /* Force ScriptShape to generate glyphs in the same order as
         they are in the input LGSTRING, which is in the logical
         order.  */
      items[i].a.fLogicalOrder = 1;  <<<<<<<<<<<<<<<<<<<<<<<<

      /* Context may be NULL here, in which case the cache should be
         used without needing to select the font.  */
      result = ScriptShape (context, &(uniscribe_font->cache),
                            chars + items[i].iCharPos, nchars_in_run,
                            max_glyphs - done_glyphs, &(items[i].a),
                            glyphs, clusters, attributes, &nglyphs);
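
To make the effect of that flag concrete, here is a deliberately
simplified model (not Uniscribe itself; glyph_order_sketch is a
hypothetical name, and the sketch assumes exactly one glyph per
character, which real shaping does not guarantee).  It records, for
each output slot, which character of the run the glyph there would
come from:

```c
#include <stddef.h>

/* For a run of N characters, store in OUT[k] the index of the
   character whose glyph lands in output slot k.  Simplifying
   assumption: one glyph per character.  With LOGICAL_ORDER nonzero
   the output follows the input order; otherwise a right-to-left
   run (RTL nonzero) comes out reversed.  */
static void
glyph_order_sketch (size_t n, int rtl, int logical_order, size_t *out)
{
  for (size_t k = 0; k < n; k++)
    out[k] = (rtl && !logical_order) ? n - 1 - k : k;
}
```

Under this model a three-character right-to-left run comes out as
0, 1, 2 with the flag set and as 2, 1, 0 without it, which is why
code that indexes glyphs by character position needs to know which
mode was used.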

> Next please find the meaning of this code fragment:
> 
>                 /* Detect clusters, for linking codes back to
>                    characters.  */
>                 if (attributes[j].fClusterStart)
>                   {
>                     while (from < nchars_in_run && clusters[from] < j)
>                       from++;
>                     if (from >= nchars_in_run)
>                       from = to = nchars_in_run - 1;
>                     else
>                       {
>                         int k;
>                         to = nchars_in_run - 1;
>                         for (k = from + 1; k < nchars_in_run; k++)
>                           {
>                             if (clusters[k] > j)
>                               {
>                                 to = k - 1;
>                                 break;
>                               }
>                           }
>                       }
>                   }
> 
> The comment refers to "clusters".  I don't know what it
> exactly means in Uniscribe, but I guess it relates to
> grapheme clusters, and if so, this part seems to relate to
> the ordering of glyphs in this kind of grapheme cluster:
> 
>   [0 1 1593 969 8 1 8 12 4 nil]
>   [0 1 1593 760 0 3 6 12 4 [1 -2 0]]

No, they are character clusters, not grapheme clusters.  They could be
similar (or even identical) to grapheme clusters, but I'm not sure,
because I have only a vague idea of both.  You can find some
details here:

   http://msdn.microsoft.com/en-us/library/windows/desktop/dd317792%28v=vs.85%29.aspx

I hope this will allow you to understand the meaning of the above
code, by looking at how the results are used in the calls to
LGLYPH_SET_* macros right below the above snippet.
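
For experimenting outside Emacs, the quoted loop can be lifted into a
small portable function.  This is only a sketch: cluster_char_range is
a hypothetical name, and the cluster values in the example are made up
for illustration, not taken from real Uniscribe output.  It assumes
that clusters[c] holds the index of the first glyph of the cluster
that character c belongs to, and that the values do not decrease,
which is what one would expect for a run shaped in logical order:

```c
/* Find the characters *FROM..*TO that belong to the cluster whose
   first glyph is J.  CLUSTERS[c] is assumed to be the index of the
   first glyph of character c's cluster, in nondecreasing order.
   *FROM is both the place to resume scanning and part of the
   result, as in the quoted loop.  */
static void
cluster_char_range (const unsigned short *clusters, int nchars,
                    int j, int *from, int *to)
{
  /* Skip characters that belong to earlier clusters.  */
  while (*from < nchars && clusters[*from] < j)
    (*from)++;
  if (*from >= nchars)
    *from = *to = nchars - 1;
  else
    {
      /* The cluster extends up to the character just before the
         first one that maps to a later glyph.  */
      *to = nchars - 1;
      for (int k = *from + 1; k < nchars; k++)
        if (clusters[k] > j)
          {
            *to = k - 1;
            break;
          }
    }
}
```

With three characters (say KAF, GIMEL, QUBUTS) and the made-up map
{0, 1, 1}, the cluster starting at glyph 0 covers only character 0,
while the cluster starting at glyph 1 covers characters 1 to 2, i.e.
the point stays with GIMEL.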

Thanks.