pdf-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [pdf-devel] Updated tokeniser/parser patch


From: Michael Gold
Subject: Re: [pdf-devel] Updated tokeniser/parser patch
Date: Thu, 22 Jan 2009 18:28:47 -0500
User-agent: Mutt/1.5.18 (2008-05-17)

On Thu, Jan 22, 2009 at 14:03:42 -0800, address@hidden wrote:
>  >    One annoying thing I noticed was that pdf_list_t needs a heap allocation
>  >    to use an iterator, which means that pdf_obj_equal_p could fail with an
>  >    ENOMEM error (but currently has no way to return that error). It would
>  >    be nice if the iterator could be kept on the stack -- struct
>  >    pdf_list_iterator_s only contains one member anyway.
...
> Hi hackers,
> 
> Well, before going further, I think the design of the list module is ok as it 
> is
> now. We can write any predicate procedure on top of any module and not for 
> that
> reason the module is to blame on status codes. We have to deal with status
> codes.

But callbacks like pdf_list_element_equals_fn_t aren't defined to return
any status codes, which is a problem if we need to compare two elements
which are lists.

> If we publish the iterator structure and allocate it from the stack, we also
> need to make public the gl_list_* structures which was in first place 
> something
> we didn't want. Correct me if I'm wrong.

I was reading the code wrong and didn't notice it was allocating a
gl_list_iterator_t (rather than a pdf_list_iterator_s). Still, we don't
necessarily need to publish gl_list_iterator_t -- as I mentioned in my
other mail, we need a struct that's large enough to hold that struct,
and we can use padding for future compatibility. Publically, it could
simply be declared like struct { void *x[12]; }.

Note that pdf_hash_t would need similar changes.

> Moreover, if code cleanliness is the issue ignoring the status code like it's
> being done in pdf_obj_equal_p() is fine. By pushing the iterator to the stack,
> we're not safe from anything since the process stack has a limit too (and
> smaller than the heap), and we can deal with heap limits while stack limits
> are simply fatal, we can't deal with them.

We have to assume there's a bit of stack memory available, since we
couldn't otherwise call functions at all (anyone calling into a library
with less than 1K free, for example, should expect problems). Obviously
we shouldn't allocate huge structs there, but allocating something small
like an iterator shouldn't be a problem.

Applications using GNU PDF can manage the size of their call stack by
avoiding deep call paths and huge stack allocations, but I'm not aware
of any portable way to ensure that a certain amount of heap space is
free.

If we ignore the status code in pdf_obj_equal_p, we'll either cause the
application using GNU PDF to crash, or we'll return an incorrect result
(which could itself cause a crash, or worse, some rare and difficult-to-
debug behaviour). It's difficult to return an error code from that
function since we may need to use it in callbacks, which themselves
can't return an error code.

> IMHO if we follow this philosophy of using the stack to avoid the check of
> status codes, soon we might be tempted to push more structures to the stack
> going nowhere and losing binary code compatibility (no opaque pointers).
> I guess the gnulib folks have gone through this discussion and settled on the
> abort() call for some reason, right?. And gnulib is also a library.

gnulib is not a shared library. The files are designed to be copied
directly into programs -- so if a programmer decides they don't want to
abort on OOM conditions, they can avoid or fix those modules. But GNU
PDF will usually be loaded dynamically, which means the same binary
could be used by many applications, each with a different OOM policy.
We shouldn't impose any policy on them.

> BTW An alternative would be to allocate an iterator along with the
> list, and still have the structures hidden from the public. What you
> think ?

This would make it impossible for two threads to iterate over the same
list concurrently. I'm not sure whether that's a problem, since it would
already be unwise if there's a chance the list could be modified while
iterating.

-- Michael

Attachment: signature.asc
Description: Digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]