Re: [lmi] Testable alternative to compressed PDFs


From: Vadim Zeitlin
Subject: Re: [lmi] Testable alternative to compressed PDFs
Date: Sun, 11 Feb 2018 00:49:45 +0100

On Sat, 10 Feb 2018 23:29:18 +0000 Greg Chicares <address@hidden> wrote:

GC> Let me paste some specimen
GC> sections of the PDF I just created:
GC> 
GC> BT /F1 10.00 Tf ET
GC> BT /F1 8.00 Tf ET
GC> 
GC> Many lines like that. My guess is that they're loading fonts.

 Yes, "F1" is the name of the font resource and "10" or "8" is the size of
the font.
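
 To illustrate how those lines read, here is a minimal Python sketch that
pulls the font resource name and size out of "Tf" operators in an
uncompressed content stream. The function name fonts_used and the regular
expression are made up for this example and are not part of lmi. The sketch
assumes the simple "/Name size Tf" form shown in the quoted lines.

```python
import re

# Matches the "/Name size Tf" font-selection operator, e.g. "/F1 10.00 Tf".
# Assumption: the size is a plain non-negative decimal number.
TF_RE = re.compile(r"/(\S+)\s+([0-9]*\.?[0-9]+)\s+Tf")


def fonts_used(content: str) -> list[tuple[str, float]]:
    """Return (font resource name, size) pairs in order of appearance."""
    return [(name, float(size)) for name, size in TF_RE.findall(content)]


if __name__ == "__main__":
    sample = "BT /F1 10.00 Tf ET\nBT /F1 8.00 Tf ET\n"
    for name, size in fonts_used(sample):
        print(name, size)
```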

GC> Here's part of a footnote:
GC> 
GC> BT 1 0 0 -1 24.00 335.00 Tm 0 Tr (Premiums ) Tj ET
[...]
GC> BT 1 0 0 -1 213.00 335.00 Tm 0 Tr (basis ) Tj ET
GC> 
GC> Here's a row of numeric values:
GC> 
GC> BT 1 0 0 -1 57.00 324.00 Tm 0 Tr (1) Tj ET
[...]
GC> I don't think this is what we want:

 I agree that this is not the only thing that we want, but I still think it
could be useful to compare PDFs. Whether it's more convenient to do it by
running diff on uncompressed files or diffpdf on normal ones is another
question.
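
 As a rough sketch of the "diff on uncompressed files" variant, the
following Python uses the standard difflib module to compare two content
streams line by line. The function name diff_streams and the labels
"old.pdf" and "new.pdf" are placeholders chosen for this example. It assumes
the streams are already uncompressed and decoded to text.

```python
import difflib


def diff_streams(old: str, new: str) -> list[str]:
    """Return a unified diff of two uncompressed PDF content streams."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="old.pdf",  # placeholder label, not a real path
            tofile="new.pdf",    # placeholder label, not a real path
            lineterm="",
        )
    )


if __name__ == "__main__":
    old = "BT /F1 8.00 Tf ET\nBT 1 0 0 -1 57.00 324.00 Tm 0 Tr (1) Tj ET\n"
    new = "BT /F1 8.00 Tf ET\nBT 1 0 0 -1 57.00 324.00 Tm 0 Tr (2) Tj ET\n"
    print("\n".join(diff_streams(old, new)))
```

 An empty result means the two streams are identical, which is all a pure
refactoring check needs.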

GC> Option 1 (flat-text output) is the only option identified
GC> so far that can really be considered.

 Should I start working on this a.s.a.p. then?


GC> Differences might very well be failures--if we changed the PDF code in a
GC> way that was intended to be a pure refactoring, then any difference is
GC> an anomaly, which none of our other (existing) tests can find because
GC> they don't use PDFs. This is an important use case: we often refactor,
GC> and with strong system tests we can refactor boldly, so that material
GC> changes can often be smaller, and thus easier to review and to test.

 This is exactly the use case I thought diffing uncompressed PDFs could be
useful for. Failing that, we could at least use a fixed PDF creation date and
compare the binary files directly (or even just compare their checksums).
Right now the only tool we have for this is diffpdf. It is very useful,
especially when the files do differ and you want to see exactly what the
difference is, but it is much slower than diff or md5sum, so I think it
would be nice to use something else for such non-regression tests.
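
 The checksum idea could look roughly like the Python sketch below. Note
that it does not fix the date at creation time. Instead it masks the date
after the fact, which is a substitute technique used here only to keep the
example self-contained. The name stable_digest is made up, SHA-256 stands in
for md5sum, and the sketch assumes the dates appear as plain literal strings
after /CreationDate or /ModDate. Any other field that varies between runs
(possibly a trailer /ID, if the generator writes one) would need the same
treatment.

```python
import hashlib
import re

# PDF date strings look like (D:20180211004945+01'00').
# Assumption: they appear uncompressed and unencrypted in the file.
DATE_RE = re.compile(rb"/(CreationDate|ModDate)\s*\(D:[^)]*\)")


def stable_digest(pdf_bytes: bytes) -> str:
    """Hash a PDF after replacing its date stamps with a fixed value."""
    # The replacement may change the file length, which would break the
    # PDF itself but does not matter when the bytes are only being hashed.
    normalized = DATE_RE.sub(rb"/\1 (D:19700101000000Z)", pdf_bytes)
    return hashlib.sha256(normalized).hexdigest()


if __name__ == "__main__":
    a = b"<< /CreationDate (D:20180210232918Z) >> BT (1) Tj ET"
    b = b"<< /CreationDate (D:20180211004945+01'00') >> BT (1) Tj ET"
    print(stable_digest(a) == stable_digest(b))
```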


GC> Differences might also be intentional effects of desired changes. This
GC> is where uncompressed PDFs fall short.

 Yes, they won't help with this.

 Regards,
VZ

