Re: [pdf-devel] Problems in the stream implementation


From: gerel
Subject: Re: [pdf-devel] Problems in the stream implementation
Date: Wed, 01 Oct 2008 15:59:53 -0700 (PDT)

 > Date: Wed, 01 Oct 2008 22:23:58 +0200
 > From: Juan Pedro Bolivar Puente <address@hidden>
 > 
 > Hi,
 > 
 > While writing the LZW filter implementation some doubts arose (I'm a
 > corner case maniac, sorry :)):
 > 
 > Marchesi, why didn't you use circular buffers as we discussed in the
 > GHM instead of those fill-rewind buffers you are using? I was wondering
 > how the filter had to behave in case it needs to leave some data in the
 > input buffer. It can happen quite often, as the "compression ratio" of
 > most algorithms can easily go over 1.0 (they generate more data than
 > they take).

I guess the buffers are intended to be circular; check the pdf_stm_buffer.c
file. The filters should see them as circular. The problem is that the stream
code does not use the buffers as intended.
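
To make "circular" concrete, here is a minimal sketch of a ring buffer in
which unread bytes survive a refill. It is not the code in pdf_stm_buffer.c;
the names ring_t, ring_write, ring_read and RING_SIZE are made up for this
illustration.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 8   /* illustrative capacity, not a library constant */

/* Hypothetical ring buffer; NOT the real pdf_stm_buffer_t. */
typedef struct
{
  unsigned char data[RING_SIZE];
  size_t head;    /* index of the next byte to read */
  size_t count;   /* number of unread bytes */
} ring_t;

/* Append up to n bytes after the unread data, wrapping around the end.
   Returns how many bytes were actually stored. */
static size_t
ring_write (ring_t *r, const unsigned char *src, size_t n)
{
  size_t i;

  assert (r->count <= RING_SIZE);
  for (i = 0; i < n && r->count < RING_SIZE; i++)
    {
      r->data[(r->head + r->count) % RING_SIZE] = src[i];
      r->count++;
    }
  return i;
}

/* Consume up to n unread bytes. Returns how many bytes were read. */
static size_t
ring_read (ring_t *r, unsigned char *dst, size_t n)
{
  size_t i;

  for (i = 0; i < n && r->count > 0; i++)
    {
      dst[i] = r->data[r->head];
      r->head = (r->head + 1) % RING_SIZE;
      r->count--;
    }
  return i;
}
```

A filter that stops early simply leaves its leftover bytes in the ring, and
the next ring_write appends after them instead of overwriting them.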


 > Let's get to a deadly function:
 > 
 > pdf_status_t
 > pdf_stm_filter_apply (pdf_stm_filter_t filter,
 >                       pdf_bool_t finish_p)
 > {
 >   pdf_status_t ret;
 > 
 >   pdf_stm_buffer_rewind (filter->out);
 >   ret = PDF_OK;
 > 
 >   while ((!pdf_stm_buffer_full_p (filter->out))
 >          && (ret == PDF_OK))
 >     {
 >       /* If the input buffer is empty, refill it */
 >       if (pdf_stm_buffer_eob_p (filter->in))
 >         {
 >           ret = pdf_stm_filter_get_input (filter, finish_p);
 >         }
 > 
 >       if (ret != PDF_ERROR)
 >         {
 >           /* Generate output */
 >           ret = filter->impl.apply_fn (filter->params,
 >                                        filter->state,
 >                                        filter->in,
 >                                        filter->out,
 >                                        finish_p);
 >         }
 >     }
 > 
 >   return ret;
 > }
 > 
 > The recursion will get here and apply the filter chain forwards. When
 > the output buffer is full it returns, that's OK. But what happens if the
 > filter had to leave some data in the input buffer because it generated
 > too much output? When this function is invoked again, the input buffer
 > will be reset! All the remaining input data is lost...

This may be a flaw, but the filter's history (its internal state) could avoid
it by following one rule: read as much as you can, then try to output. That's
how I'm coding the filters right now.
Anyway, if this can be fixed, great!
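
A rough sketch of that rule, using a toy filter that emits every input byte
twice (so it always generates more data than it takes). The names
dup_state_t, dup_apply and IN_MAX are hypothetical and the signature is not
the library's apply_fn; it only shows the "read everything, then output"
order.

```c
#include <assert.h>
#include <stddef.h>

#define IN_MAX 4   /* illustrative input buffer size (assumption) */

/* Hypothetical filter state holding expanded bytes not yet emitted. */
typedef struct
{
  unsigned char pending[2 * IN_MAX];
  size_t len;   /* valid bytes in pending */
  size_t pos;   /* next byte of pending to emit */
} dup_state_t;

/* Returns the number of bytes written to out. *consumed tells the caller
   how much of the input was taken: all of it or none of it. */
static size_t
dup_apply (dup_state_t *st,
           const unsigned char *in, size_t n, size_t *consumed,
           unsigned char *out, size_t out_room)
{
  size_t written = 0;

  *consumed = 0;

  /* Step 1: read as much as we can. This is only possible once the
     previously expanded data has been fully emitted. */
  if (st->pos == st->len)
    {
      size_t i;

      assert (n <= IN_MAX);
      st->len = 0;
      st->pos = 0;
      for (i = 0; i < n; i++)
        {
          st->pending[st->len++] = in[i];
          st->pending[st->len++] = in[i];
        }
      *consumed = n;
    }

  /* Step 2: then try to output, as far as the output buffer allows. */
  while (written < out_room && st->pos < st->len)
    {
      out[written++] = st->pending[st->pos++];
    }

  return written;
}
```

Since the input is emptied in one go, a rewind of the input buffer between
calls loses nothing; the leftover lives in the filter state instead. The
cost is that the state must be able to hold the worst-case expansion of a
full input buffer.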

 > 
 > Another function that is buggy is the pdf_stm_finish function. It
 > directly applies the filter chain with finish_p set to PDF_TRUE, but
 > this is wrong again when the compression factor > 1.0. One has to loop
 > running the filter chain until the filters return EOF, and only then one
 > can be sure they will behave properly if finish_p = true.
 > 

Now, _this_ is the problem I've been talking about with jemarch,
except that I didn't say it explicitly. The "last apply call" :-)

We should _really_ fix this.
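
Something along these lines is what I have in mind, as a sketch only. The
toy_* names and the status enum are stand-ins invented for this mail, not
the real pdf_stm code: drain the chain until it reports EOF, and only then
make the last apply call with finish_p set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes standing in for the library's pdf_status_t. */
typedef enum { TOY_OK, TOY_EEOF, TOY_ERROR } toy_status_t;

/* Hypothetical stand-in for a filter chain that still holds data. */
typedef struct
{
  size_t buffered;   /* bytes still inside the filters */
  size_t out_size;   /* output buffer capacity per apply call */
  size_t flushed;    /* total bytes delivered so far */
  int finished;      /* set once the finishing call has been made */
} toy_chain_t;

/* One pass over the chain: delivers at most out_size bytes. Returns
   TOY_OK while data remains and TOY_EEOF once the chain is drained. */
static toy_status_t
toy_chain_apply (toy_chain_t *c, int finish_p)
{
  size_t n = c->buffered < c->out_size ? c->buffered : c->out_size;

  c->buffered -= n;
  c->flushed += n;
  if (c->buffered > 0)
    {
      return TOY_OK;
    }
  if (finish_p)
    {
      c->finished = 1;
    }
  return TOY_EEOF;
}

/* Finish the stream: loop until EOF first, then do the last apply call. */
static toy_status_t
toy_stm_finish (toy_chain_t *c)
{
  toy_status_t ret;

  assert (c->out_size > 0);
  do
    {
      ret = toy_chain_apply (c, 0);
    }
  while (ret == TOY_OK);

  if (ret == TOY_ERROR)
    {
      return ret;
    }
  return toy_chain_apply (c, 1);
}
```

With 10 buffered bytes and a 4-byte output buffer, the loop needs three
passes before the finishing call is safe; a single apply with finish_p set
would have delivered only the first 4 bytes.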

 > 
 > I have not prepared a patch for this because I know that jemarch wants
 > to maintain the stm infrastructure code. But if you want I can prepare a
 > patch with a fancy circular buffer and this problem solved tomorrow
 > afternoon.

IMHO the problem is not with the buffers, as I mentioned above. The problem is
the stream implementation.


cheers

-gerel