GhostScript writes badly encoded PDF Metadata for Latin-1 input

bug-ghostscript

From:	David Kastrup
Subject:	GhostScript writes badly encoded PDF Metadata for Latin-1 input
Date:	Thu, 29 Nov 2012 14:49:14 +0100

The following file

showpage
[ /Title (Document title)
  /Author (\241 \242)
  /DOCINFO pdfmark

gets encoded as

sss.pdf (attachment: Adobe PDF document)

by ps2pdf 9.06.  The document information strings, originally encoded in
PDFDocEncoding (which coincides with Latin-1 for these characters), do
not get converted to UTF-8 as required in the XML Metadata, leading to
problems when PDF processing programs hit the badly encoded Metadata.
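
To illustrate the mismatch, here is a minimal Python sketch (not
Ghostscript's actual code).  It assumes that the bytes of the /Author
string above are copied into the XML unchanged, and it uses Python's
latin-1 codec as a stand-in for PDFDocEncoding, which agrees with it
for the two characters in question.  The helper names are made up for
this example.

```python
# The /Author string from the PostScript file: octal escapes \241 and \242,
# separated by a space.
raw = bytes([0o241, 0x20, 0o242])


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data can be decoded as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_xml_utf8(data: bytes) -> bytes:
    """Re-encode the string as UTF-8 for use in the XML Metadata.

    Assumption: latin-1 stands in for PDFDocEncoding here; the two
    encodings are not identical in general.
    """
    return data.decode("latin-1").encode("utf-8")


print(is_valid_utf8(raw))       # False: a lone 0xA1 byte is not valid UTF-8
print(to_xml_utf8(raw).hex())   # c2a120c2a2
```

Copied verbatim, the bytes A1 20 A2 are not well-formed UTF-8, which is
presumably what trips up strict XML parsers; the converted form is
C2 A1 20 C2 A2.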

This bug surfaced in LilyPond (which uses ps2pdf internally) and had
been reported to Evince after a first, incomplete analysis, so it is
conceivable that duplicate reports may originate from there.  I can't
find anything recent related to it in the Ghostscript bug database:
the last encoding problem I can find is related to the use of UTF16BE
with BOM, which happens to be converted correctly by GhostScript 9.06.
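
Since UTF16BE strings with BOM are converted correctly, a producer
could in principle emit the same information in that form.  The
following Python sketch builds such a pdfmark using a PostScript hex
string; the helper name is hypothetical, and whether this avoids the
problem in any particular setup is an untested assumption.

```python
import codecs


def pdfmark_hex_string(text: str) -> str:
    """Return text as a PostScript hex string in UTF-16BE with a leading BOM."""
    data = codecs.BOM_UTF16_BE + text.encode("utf-16-be")
    return "<" + data.hex().upper() + ">"


# The same two characters as \241 and \242 in the example file.
author = "\u00a1 \u00a2"

print("[ /Title (Document title)")
print("  /Author " + pdfmark_hex_string(author))
print("  /DOCINFO pdfmark")
```

For the example author string this yields <FEFF00A1002000A2>.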

-- 
David Kastrup
