pdf-devel

Re: [pdf-devel] Object layer API


From: Michael Gold
Subject: Re: [pdf-devel] Object layer API
Date: Wed, 18 Feb 2009 01:38:57 -0500
User-agent: Mutt/1.5.18 (2008-05-17)

On Tue, Feb 17, 2009 at 19:56:40 -0800, address@hidden wrote:
>  > Date: Tue, 17 Feb 2009 16:32:39 -0500
>  > From: Michael Gold <address@hidden>
>  > 
>  > 
>  > >     - What's the point of header_string?
>  > >        - The client probably doesn't care about the header -- they just
>  > >          want the library to open a valid PDF.
>  > >
>  > > The user may want to open a non-PDF file such as an FDF file, which uses
>  > > the "%FDF-" header. Also, we may want to introduce new headers such as
>  > > "%GNUPDF-".
>  > 
>  > Why would we introduce a new header?
> 
> Who is the client here ?
> 
> BTW, don't forget there is an upper 'document' layer which could provide a
> wrapper for all these 'low-level' details (e.g. if client==PDFviewer)

Right, the client can be internal (the document layer) or external.
It seems unlikely that we'd want a nonstandard header in either case.


>  > For the standard headers, the client could check using the
>  > pdf_obj_doc_get_header you proposed, but that still seems like a
>  > low-level detail they shouldn't have to deal with.
>  > 
>  > We could have separate functions for opening PDF or FDF, or a function
>  > that returns the type (PDF or FDF).
> 
> Here again, 'client' is ambiguous to me. Though the procedure you mention,
> IMHO, is in the correct level of abstraction.
> 
> Moreover, the phrase 'low-level' depends on where you're looking at it from,
> and the fact that a procedure is defined in some API is not reason enough
> for the user to call it.

True, but I'd still prefer to hide details when it's not useful to
expose them, and to keep them in the proper layer.

I don't think passing a header string on open/save provides a real
benefit.  It may seem like a generic way to handle different types of
files, but in reality the header won't be the only difference:
 - the encoding of names varies depending on the PDF version
 - the xref table is optional for FDF files
 - FDF files can't use indirect /Length values for streams

So the object layer needs to know the file type anyway (not just the
header string); and given this, it can choose an appropriate header when
saving, or detect an appropriate header when opening.
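
To illustrate, here is a rough sketch of what I mean, not a proposal for
the actual API.  All names (doc_type_t, doc_type_detect, and so on) are
made up for this example.  The object layer detects the type from the
header when opening, and derives the header and the other type-dependent
rules from the type when saving:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; none of this is existing library API. */
typedef enum
{
  DOC_TYPE_UNKNOWN,
  DOC_TYPE_PDF,
  DOC_TYPE_FDF
} doc_type_t;

/* Detect the file type from the first bytes read when opening. */
doc_type_t
doc_type_detect (const char *buf, size_t len)
{
  assert (buf != NULL);

  if (len >= 5 && memcmp (buf, "%PDF-", 5) == 0)
    return DOC_TYPE_PDF;
  if (len >= 5 && memcmp (buf, "%FDF-", 5) == 0)
    return DOC_TYPE_FDF;
  return DOC_TYPE_UNKNOWN;
}

/* Choose the header prefix to write when saving. */
const char *
doc_type_header_prefix (doc_type_t type)
{
  switch (type)
    {
    case DOC_TYPE_PDF: return "%PDF-";
    case DOC_TYPE_FDF: return "%FDF-";
    default:           return NULL;
    }
}

/* The xref table is optional for FDF files. */
int
doc_type_xref_optional (doc_type_t type)
{
  return type == DOC_TYPE_FDF;
}

/* FDF files can't use indirect /Length values for streams. */
int
doc_type_allows_indirect_length (doc_type_t type)
{
  return type != DOC_TYPE_FDF;
}
```

With something like this, the client never passes a header string; at
most it asks which type was detected.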


>  > I don't really like the idea of the library creating temporary files on
>  > its own.  Opening a file in a library can cause security issues, for
>  > example:
>  >   http://udrepper.livejournal.com/20407.html
>  > (Linux 2.6.27 is needed to protect against this, and I'm sure there are
>  > operating systems without this feature.)
> 
> Interesting security risk, but if we make all memory-based, how much memory
> will we need, on average, to edit a document ?
> Maybe we could provide this as an optional feature for the poor user with
> few MB of RAM.

I definitely wouldn't want to require all data to be stored in RAM.
The callback idea I suggested in my first mail would be OK for low-
memory systems, and would work as follows:
 - when creating a stream object (pdf_obj_stream_new), the client would
   provide a callback function instead of a pdf_stm_t
 - when saving a file, the object layer would execute this callback when
   it needed to write a stream
     - the callback would return an open pdf_stm_t
     - the object layer would read this until EOF, writing the filtered
       data to the output file; then it would close the stream

The client could provide a callback that reads from a file (possibly a
temp file) if it wanted to.
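
A minimal sketch of the callback flow described above, using a stdio
FILE * as a stand-in for pdf_stm_t and leaving out the filtering step.
The names stream_open_fn, stream_obj_t, stream_obj_write and
open_file_cb are invented for this example and are not part of any
existing API:

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for a callback returning an open pdf_stm_t; here it returns
   an open, readable FILE *.  All names are hypothetical. */
typedef FILE *(*stream_open_fn) (void *user_data);

typedef struct
{
  stream_open_fn open;  /* provided by the client at creation time */
  void *user_data;      /* passed back to the callback unchanged */
} stream_obj_t;

/* What the object layer would do when saving reaches a stream object:
   run the callback, copy until EOF, then close the stream.  A real
   implementation would apply the stream's filters while copying.
   Returns the number of bytes copied, or -1 on error. */
long
stream_obj_write (const stream_obj_t *obj, FILE *out)
{
  char buf[4096];
  size_t n;
  long total = 0;
  FILE *in;

  assert (obj != NULL && obj->open != NULL && out != NULL);

  in = obj->open (obj->user_data);
  if (in == NULL)
    return -1;

  while ((n = fread (buf, 1, sizeof buf, in)) > 0)
    {
      if (fwrite (buf, 1, n, out) != n)
        {
          fclose (in);
          return -1;
        }
      total += (long) n;
    }

  if (ferror (in))
    {
      fclose (in);
      return -1;
    }

  fclose (in);
  return total;
}

/* Example client callback: open a file (possibly a temp file) whose
   path was given as user_data. */
FILE *
open_file_cb (void *user_data)
{
  return fopen ((const char *) user_data, "rb");
}
```

The point is that the library itself never creates or opens a file; it
only reads from whatever the client's callback hands back, and only at
save time.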

-- Michael
