freetype-devel

Re: [Devel] Fw: Freetype, fontconfig,Xft, Mozilla and Non-BMP char. supp


From: Antoine Leca
Subject: Re: [Devel] Fw: Freetype, fontconfig,Xft, Mozilla and Non-BMP char. support
Date: Mon, 02 Dec 2002 00:10:38 +0100
User-agent: Mozilla/5.0 (Windows; U; Win98; en-US; rv:1.0.0) Gecko/20020530

Werner LEMBERG wrote:
> I currently lack time to take care of this...

I believe I can handle it.

> On Thu, 28 Nov 2002, Jungshik Shin wrote:
>
>   Why does FcCharSetHasChar fail for non-BMP characters?  It's because
> fontconfig calls FT_Select_CharMap() instead of FT_Set_CharMap().

Well, FT_Select_CharMap() is a valid (non-deprecated) interface, so it
should work OK, even if there is "another way to do the same job".


>  When FT_Select_CharMap() is called with 'FT_ENCODING_UNICODE'(or
> deprecated ft_encoding_unicode), freetype activates the first cmap with
> Unicode encoding for subsequent operations on a font until another cmap
> is activated. It's not a problem for fonts covering BMP only. However,
> fonts like Code2001 has multiple Cmaps all with the identical symbolic
> FT encoding 'FT_ENCODING_UNICODE' but with different char. coverage.
> Code2001 has 4 cmaps, pid=0,eid=0(Unicode), pid=1,eid=1(AppleRoman),
> pid=3(MS),eid=1(Unicode) and  pid=3(MS),eid=10(Unicode).  Only the last
> cmap has non-BMP characters although the first and the third are also
> Unicode cmap. They're actually UCS-2 cmap.

You are correct.

> As mentioned above, Freetype2
> makes the first cmap matching 'symbolic encoding name' active and
> unfortunately that happens to be the one not covering non-BMP characters.

Well, in fact only FT_Select_CharMap() does that. The initialisation code
(in open_face, around line 770) does the right thing, and scans *all* the
available charmaps before returning. Since the 3,10 charmap will always be
the last one, it is selected by default. It is only when the application
issues an (unneeded) FT_Select_CharMap( ..., FT_ENCODING_UNICODE ) call that
the behaviour you are noticing is triggered.

This is clearly a bug.
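
To make the difference concrete, here is a minimal sketch of the two
selection strategies. It does not use FreeType itself: the CharMap struct,
the Encoding enum and both function names are hypothetical stand-ins, and
the table simply copies the four cmaps listed for Code2001 in the quoted
message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are not FreeType types. */
typedef enum { ENC_NONE, ENC_UNICODE, ENC_APPLE_ROMAN } Encoding;

typedef struct {
    int      platform_id;
    int      encoding_id;
    Encoding encoding;      /* symbolic encoding, in the spirit of FT_ENCODING_UNICODE */
} CharMap;

/* First match wins: the behaviour described above for FT_Select_CharMap(). */
static const CharMap *select_first(const CharMap *maps, size_t n, Encoding enc)
{
    assert(maps != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (maps[i].encoding == enc)
            return &maps[i];
    return NULL;
}

/* Scan every entry, last match wins: the behaviour described for open_face. */
static const CharMap *select_last(const CharMap *maps, size_t n, Encoding enc)
{
    const CharMap *found = NULL;

    assert(maps != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (maps[i].encoding == enc)
            found = &maps[i];
    return found;
}

/* The four cmaps of Code2001, in the order given in the quoted message. */
static const CharMap code2001[] = {
    { 0,  0, ENC_UNICODE     },
    { 1,  1, ENC_APPLE_ROMAN },
    { 3,  1, ENC_UNICODE     },
    { 3, 10, ENC_UNICODE     },
};

#define CODE2001_COUNT (sizeof code2001 / sizeof code2001[0])
```

With this table, select_first() stops at the 0,0 charmap, which only covers
the BMP, while select_last() ends up on the 3,10 charmap.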


>   One may say that the font (CODE2001) is to blame and pid=0/eid=0
> and pid=3/eid=3 cmaps should have non-BMP characters covered as
> well. However, it's not very clear that it has to according to the MS
> document at <http://www.microsoft.com/typography/otspec/cmap.htm>.

Well, the MS doc pretty clearly states that *both* tables have to be
included, and the vast majority of readers will understand that the
3,1 table has to be encoded with format 4. I agree that the specs can be
interpreted _as_if_ it were possible to have a 3,1 table in format 12,
however:
   1) this will break compatibility with any software that does not know
    how to handle format 12 tables, including Windows 9X, NT <= 4, etc.,
    so it is IMHO an unacceptable break in backward compatibility;
   2) I do not presently remember the 2nd argument, but the 1st is strong
    enough, I believe!

>   One possible solution is to return not the first
> cmap table matching the symbolic encoding name of 'FT_ENCODING_UNICODE'
> but to keep on looking to see if pid=3/eid=10 cmap is also present.
> If it is, it has to be activated instead of the first Unicode cmap found.

And this is what open_face already does! So we have to correct
FT_Select_CharMap() to do the same.
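
For comparison, the variant proposed in the quoted paragraph (look
explicitly for the 3,10 charmap and fall back to the first Unicode charmap
otherwise) could be sketched as follows. This is only an illustration under
assumptions: the CharMap struct, the Encoding enum, select_unicode() and the
sample tables are hypothetical and are not FreeType code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are not FreeType types. */
typedef enum { ENC_NONE, ENC_UNICODE, ENC_APPLE_ROMAN } Encoding;

typedef struct {
    int      platform_id;
    int      encoding_id;
    Encoding encoding;
} CharMap;

/* Prefer the Microsoft UCS-4 charmap (pid=3, eid=10) wherever it sits in  */
/* the table; otherwise return the first Unicode charmap, or NULL if none. */
static const CharMap *select_unicode(const CharMap *maps, size_t n)
{
    const CharMap *fallback = NULL;

    assert(maps != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (maps[i].encoding != ENC_UNICODE)
            continue;
        if (maps[i].platform_id == 3 && maps[i].encoding_id == 10)
            return &maps[i];
        if (fallback == NULL)
            fallback = &maps[i];
    }
    return fallback;
}

/* Made-up sample tables: one with 3,10 listed first, one without it,      */
/* and one with no Unicode charmap at all.                                 */
static const CharMap ucs4_first[] = {
    { 3, 10, ENC_UNICODE }, { 3, 1, ENC_UNICODE }, { 0, 0, ENC_UNICODE },
};
static const CharMap bmp_only[] = {
    { 0, 0, ENC_UNICODE }, { 3, 1, ENC_UNICODE },
};
static const CharMap no_unicode[] = {
    { 1, 1, ENC_APPLE_ROMAN },
};
```

Unlike a plain "last match wins" scan, this does not depend on the 3,10
charmap being the last entry in the font.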


>   Alternative is to introduce a new symbolic encoding name,
> 'FT_ENCODING_UCS4'

For God's sake, NOOOOOOOOO! The API is 32-bit clean, we have NO reason to
allow people in any way to use the (now obsolete) UCS-2 format.
And of course, this would require a number of programs (which expect
access to UCS-4 to be the default, as it is nowadays) to be modified,
something we cannot accept.

>   Below are links to FT2 patch (against 2.1.3)
>
>  http://bugzilla.mozilla.org/attachment.cgi?id=107852 : FT2 patch

Thanks. However, I have written something slightly different,
which mimics the behaviour of open_face. Can you please test it?


>  I believe Werner is on this list so that I won't write to him
> separately for a while. Werner, if you find that my patch makes sense,
> it'd be nice to apply it to Freetype2. BTW, it just occurred to me that
> the routine setting the default Cmap for a newly opened FT_Face has to
> be modified in a similar manner. (currently, it sets the first-found
> Unicode Cmap as the default, but the first-matched Unicode Cmap may not
> be the most extensive one as I explained above.)

No, it does not, as I explained above.


Regards,
Antoine
--- ../freetype2/src/base/ftobjs.c      Fri Nov  8 21:40:14 2002
+++ src/base/ftobjs.c   Sun Dec  1 22:52:30 2002
@@ -1436,6 +1436,7 @@
   {
     FT_CharMap*  cur;
     FT_CharMap*  limit;
+    FT_Bool      found;
 
 
     if ( !face )
@@ -1447,14 +1448,26 @@
 
     limit = cur + face->num_charmaps;
 
+    /* We have to scan *all* charmaps. This is because, in UCS-4-enabled  */
+    /* TrueType/OpenType/sfnt fonts, the UCS-4 charmap, with encoding=10, */
+    /* will appear *after* the ones with only UCS-2 (limited to 65,536    */
+    /* positions) tables.                                                 */
+    /* A more elaborate way may be to seek first for 3,10 table, then for */
+    /* the other possibilities. If you implement that, please change the  */
+    /* logic in open_face (earlier in this file) too, which selects the   */
+    /* Unicode charmap by default.                                        */
+    found = 0;
     for ( ; cur < limit; cur++ )
     {
       if ( cur[0]->encoding == encoding )
       {
+        found = 1;
         face->charmap = cur[0];
-        return 0;
       }
     }
+
+    if ( found )
+        return 0;
 
     return FT_Err_Invalid_Argument;
   }
