bug-gnu-emacs

bug#19993: 25.0.50; Unicode fonts defective on Windows


From: Ilya Zakharevich
Subject: bug#19993: 25.0.50; Unicode fonts defective on Windows
Date: Sat, 7 Mar 2015 23:41:58 -0800
User-agent: Mutt/1.5.21 (2010-09-15)

On Sat, Mar 07, 2015 at 10:14:16AM +0200, Eli Zaretskii wrote:
> >   What can it mean that a font “supports a script”?
> > 
> > Theoretically, it may mean that
> >   • it “knows” all the characters in the script, and
> >   • has enough extra infrastructure to shape these characters
> >     into a correct glyphic representation.
> > 
> > I can see how the second part might be described by one bit per
> > script.  But what about the first one?  The repertoire of a script
> > changes every year (sometimes several times per year).  How can this
> > be encapsulated into a bit?
> 
> All I know about this is what the MSDN documentation says:
> 
>   FONTSIGNATURE structure
> 
>   Contains information identifying the code pages and Unicode subranges
>   for which a given font provides glyphs.
>   [...]
>   Members
> 
>   fsUsb
> 
>       A 128-bit Unicode subset bitfield (USB) identifying up to 126
>       Unicode subranges. Each bit, except the two most significant bits,
>       represents a single subrange. The most significant bit is always 1
>       and identifies the bitfield as a font signature; the second most
>       significant bit is reserved and must be 0. Unicode subranges are
>       numbered in accordance with the ISO 10646 standard. For more
>       information, see Unicode Subset Bitfields.

So these bits “identify” a subrange.  Of course, nothing is said about
what this actually MEANS.  So I did an experiment with Cour.ttf.

The following subrange is “identified”:

  9       0400 - 04FF          Cyrillic
          0500 - 052F          Cyrillic Supplement
          2DE0 - 2DFF          Cyrillic Extended-A
          A640 - A69F          Cyrillic Extended-B
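
For reference, checking such a bit could be sketched as follows.  This is
only an illustration, not code from Emacs or from any Windows API binding:
it assumes fsUsb arrives as four 32-bit words with bits 0..31 in the first
word, and the helper name and the sample value are made up.

```python
# Sketch: test one bit of a FONTSIGNATURE.fsUsb-style bitfield.
# Assumption: fs_usb holds four 32-bit words, bits 0..31 in the first word.

CYRILLIC_BIT = 9  # bit number taken from the table above


def usb_bit_is_set(fs_usb, bit):
    """Return True if the given Unicode-subrange bit is set."""
    if len(fs_usb) != 4 or not 0 <= bit < 128:
        raise ValueError("expected four 32-bit words and a bit in 0..127")
    word, offset = divmod(bit, 32)
    return bool((fs_usb[word] >> offset) & 1)


# Hypothetical signature with only bit 9 set (NOT read from Cour.ttf).
example_usb = [1 << CYRILLIC_BIT, 0, 0, 0]

if __name__ == "__main__":
    print(usb_bit_is_set(example_usb, CYRILLIC_BIT))
```

The bit only says that the font claims the subrange; it carries no
information about which code points inside it actually have glyphs.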

What is actually supported:

  0400 - 04FF    Everything but 04D8, 04D9 (Schwa, used in Cyrillic Azeri,
                 but contemporary Azeri is written in Latin)
  0500 - 052F    Only 0500 - 0513, 051a - 051d supported
  2DE0 - 2DFF    None supported (5.1)
  A640 - A69F    None supported (5.1 and later)

Looking in DerivedAge.txt:

   04D0..04EB    ; 1.1 #  [28] CYRILLIC CAPITAL LETTER A WITH BREVE..CYRILLIC SMALL LETTER BARRED O WITH DIAERESIS

   0500..050F    ; 3.2 #  [16] CYRILLIC CAPITAL LETTER KOMI DE..CYRILLIC SMALL LETTER KOMI TJE
   0510..0513    ; 5.0 #   [4] CYRILLIC CAPITAL LETTER REVERSED ZE..CYRILLIC SMALL LETTER EL WITH HOOK
   0514..0523    ; 5.1 #  [16] CYRILLIC CAPITAL LETTER LHA..CYRILLIC SMALL LETTER EN WITH MIDDLE HOOK

So two characters of 1.1 are not supported, all characters of 3.2 and 5.0 are
supported, and part of 5.1 is supported.
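
The comparison above can be reproduced with a few lines of Python.  This is
a rough sketch: the set of supported code points is typed in by hand from
the list above rather than read from the font, the names are invented for
the example, and it counts every code point in a block, assigned or not.

```python
# Blocks covered by bit 9, as listed above (inclusive ranges).
BIT9_BLOCKS = {
    "Cyrillic": (0x0400, 0x04FF),
    "Cyrillic Supplement": (0x0500, 0x052F),
    "Cyrillic Extended-A": (0x2DE0, 0x2DFF),
    "Cyrillic Extended-B": (0xA640, 0xA69F),
}


def coverage(supported, blocks=BIT9_BLOCKS):
    """Map block name -> (code points present, code points in block)."""
    result = {}
    for name, (lo, hi) in blocks.items():
        present = sum(1 for cp in range(lo, hi + 1) if cp in supported)
        result[name] = (present, hi - lo + 1)
    return result


# Code points reported above for Cour.ttf, entered by hand.
cour = set(range(0x0400, 0x0500)) - {0x04D8, 0x04D9}
cour |= set(range(0x0500, 0x0514)) | set(range(0x051A, 0x051E))

if __name__ == "__main__":
    for name, (present, total) in coverage(cour).items():
        print(f"{name:22} {present:3}/{total}")
```

With these inputs the four blocks come out as 254 of 256, 24 of 48, 0 of 32
and 0 of 96 code points, all behind a single bit.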

Does it look like a good indication of anything?  I would say no…  Do
you know of any other tool that looks at this bitfield when choosing
which font to pick for a particular character?

Ilya




