Re: [ft] Fwd: FW: Handling an embedded CID font using Freetype
From: Derek B. Noonburg
Subject: Re: [ft] Fwd: FW: Handling an embedded CID font using Freetype
Date: Wed, 22 Mar 2017 09:37:00 -0700 (PDT)
As Chris said, the CFF font doesn't contain the encoding information, so
decompressing the font won't help you.
CID CFF fonts are indexed using "CIDs" (character IDs), which usually
come from a particular "character collection". (This is all Adobe
terminology.)
In your case, the character collection, as specified in the
CIDSystemInfo dictionary, is Adobe-Japan1. You can get the relevant
specs here:
https://www.adobe.com/devnet/font.html
For PDF, you'll also need to look at the font's encoding, which may be
"Identity", meaning that the character codes are the CIDs; or it may be
a CMap, which maps (possibly variable length) character codes to CIDs.
The CMap entry can be either the name of a predefined CMap (those CMap
files are available at the Adobe link above) or a CMap stream embedded in
the PDF. Either way, you'll need to parse the CMap -- the CMap format
spec is also on that web page.
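
As a rough illustration of what "parsing the CMap" involves, here is a
minimal Python sketch that reads only the cidchar and cidrange sections.
The sample CMap text and the helper names (parse_cmap, code_to_cid) are
made up for this example; a real CMap also has codespace ranges, usecmap
references, and variable-length codes, all of which this ignores.

```python
import re

# Hypothetical, heavily simplified CMap fragment (not a real Adobe CMap).
SAMPLE_CMAP = """
1 begincidchar
<07f4> 2036
endcidchar
1 begincidrange
<0020> <007e> 1
endcidrange
"""

def parse_cmap(text):
    """Return (chars, ranges) from the cidchar/cidrange sections of a CMap."""
    chars = {}   # single code -> CID
    ranges = []  # (first code, last code, CID of first code)
    for body in re.findall(r"begincidchar(.*?)endcidchar", text, re.S):
        for code, cid in re.findall(r"<([0-9A-Fa-f]+)>\s+(\d+)", body):
            chars[int(code, 16)] = int(cid)
    for body in re.findall(r"begincidrange(.*?)endcidrange", text, re.S):
        pattern = r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>\s+(\d+)"
        for lo, hi, cid in re.findall(pattern, body):
            ranges.append((int(lo, 16), int(hi, 16), int(cid)))
    return chars, ranges

def code_to_cid(code, chars, ranges):
    """Map a character code to a CID; unmapped codes fall back to CID 0."""
    if code in chars:
        return chars[code]
    for lo, hi, first_cid in ranges:
        if lo <= code <= hi:
            # Codes inside a range map to consecutive CIDs.
            return first_cid + (code - lo)
    return 0  # CID 0 is .notdef

chars, ranges = parse_cmap(SAMPLE_CMAP)
print(code_to_cid(0x07F4, chars, ranges))
```

This ignores code length, so it would confuse <20> with <0020>; a real
parser has to honour the codespace ranges to split the string correctly.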
To summarize:
* get the character code from the PDF content stream
* map the character code to CID, using a CMap
* use the CID to get a glyph from the font
If you want to map to Unicode (rather than drawing a character from the
font), replace that last step with: map the CID to Unicode, using the
character collection data.
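
The steps above can be sketched in Python as follows. This assumes the
simplest case, an "Identity" encoding with two-byte codes, so the code is
the CID. The function names are invented for the example, and the
CID-to-Unicode table holds just the one pair quoted in this thread
(CID 0x07F4 to U+9AD8, as reported from the Fontforge data); real
Adobe-Japan1 mapping data would have to be loaded from the character
collection files.

```python
def split_codes(data, width=2):
    """Split a PDF string into fixed-width, big-endian character codes."""
    if len(data) % width:
        raise ValueError("string length is not a multiple of the code width")
    return [int.from_bytes(data[i:i + width], "big")
            for i in range(0, len(data), width)]

def identity_cmap(code):
    """'Identity' encoding: the character code is the CID."""
    return code

# Single illustrative entry taken from the example in this thread;
# not a substitute for the real character collection data.
CID_TO_UNICODE = {2036: 0x9AD8}

def decode(data, cmap=identity_cmap, cid_to_unicode=CID_TO_UNICODE):
    """Character codes -> CIDs -> Unicode text (U+FFFD when unmapped)."""
    out = []
    for code in split_codes(data):
        cid = cmap(code)
        out.append(chr(cid_to_unicode.get(cid, 0xFFFD)))
    return "".join(out)

# The PDF string <07F4> from the example file.
text = decode(bytes.fromhex("07F4"))
print(text)
```

To draw the glyph instead, the CID from the middle step would be handed
to the font rather than looked up in the Unicode table.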
- Derek
On 2017 Mar 22, tony smith wrote:
> Thanks for the quick reply.
>
> I've attached an example pdf file, which displays a single Chinese
> character.
> The font details and the string in the pdf data are both readable.
>
> The pdf string contains the single character <07F4>.
> Also the relevant font has the following entry
> /CIDSystemInfo << /Registry (Adobe)/Ordering (Japan1)/Supplement 4 >>
>
> Fontforge has a file which maps these CID values to the equivalent Unicode
> values.
> So in this case the Chinese character is \u9AD8
> But Freetype has no way to work out what the Unicode mapping is for a CFF.
>
> So if my understanding is correct I need to uncompress the CFF file and
> work out the relevant mapping from the CMap information.
>
> Thanks for your time, Tony Smith.
>
>
> On 22/03/17 14:08, tony smith wrote:
>> Hello
>>
>> I'm using Freetype to display fonts embedded in pdf data.
>>
>> My problem seems to be with an embedded CFF file, where the CID
>> mapping is only defined in the pdf data.
>> I load a Compact Font Format (CFF) stream into memory.
>> But when I try to select a character map, using FT_Select_Charmap, I
>> get error 0x26, Invalid_CharMap_Handle.
>> I've attached the CFF, font.cff, and a simple C++ program, cff.c,
>> which highlights the problem.
>>
>> From the pdf file I know that the font is a CID one.
>> So you can work out the mapping from the PDF strings to Unicode strings.
>> But this information doesn't seem to be available in the CFF stream.
>> So my question is how do I handle this situation?
>> I can't see any way in Freetype to add the missing character encoding.
>> Do I have to decompress the CFF stream, which should be a type1 font
>> and then explicitly add the missing character encoding information?
>
> CFF != compressed type 1. Although the imaging models and many concepts
> are shared, they are not the same.
>
> CFF CIDFonts don't contain any encoding information, and there is no set
> glyph ordering, and there is no built-in character ID to glyph index
> mapping (like, for example, a TTF/OTF cmap table).
>
> To actually use a CIDFont you need to use the CMap (NOTE: not "cmap", but
> "CMap") specified by the PDF - that can either be custom, embedded in the
> PDF, or a named reference to a standard CMap (available from Adobe, but I
> can't remember right now where).
>
> Parsing the CMap will give you a mapping from character code to CID,
> which you can then use to pull the glyph from the font. (In PostScript
> terms you "compose" the CIDFont with the CMap to create a usable instance
> of the CIDFont).
>
> CMaps are defined in a highly restricted variant of PostScript, so should
> be fairly easy to parse.
>
> Chris