[Texmacs-dev] [PATCH] Re: [TeXmacs] Large document size


From: Sam Liddicott
Subject: [Texmacs-dev] [PATCH] Re: [TeXmacs] Large document size
Date: Fri, 13 May 2011 10:00:09 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.18pre) Gecko/20110512 Lightning/1.0b2 Lanikai/3.1.11pre


If I disable the use of gs to "condition" the eps files written by qt_image_to_eps then the resulting .ps file seems to not be conformant:
1. gv can only display page 1
2. evince can only display some page in the middle.
3. basic old gs can display all pages (although of course it requires me to press enter after each page).

I guess this is why gs was being used to condition these .eps images.

I note that when ps2ps is invoked on the non-conformant output ps file, it reduces it from 158MB to 9MB and makes it work just fine in evince and so on.

Also, ps2pdf when invoked on the non-conformant output ps file produces a 2.5MB pdf file that also seems to work just fine.
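A minimal sketch of that conditioning step, assuming ps2ps is on the PATH and takes an input and an output file. The function name condition_ps and the PS_CONDITIONER override are made up for illustration and are not part of TeXmacs or Ghostscript:

```shell
# condition_ps SRC DST: pass SRC through ps2ps and print both sizes in bytes.
# PS_CONDITIONER is a hypothetical override, handy for trying this without gs.
condition_ps() {
    src=$1
    dst=$2
    "${PS_CONDITIONER:-ps2ps}" "$src" "$dst" || return 1
    before=$(wc -c < "$src")
    after=$(wc -c < "$dst")
    # $((...)) strips the padding some wc implementations add
    printf '%s -> %s bytes\n' "$((before))" "$((after))"
}

# Usage (needs Ghostscript installed):
#   condition_ps exported.ps conditioned.ps
#   ps2pdf conditioned.ps conditioned.pdf
```

Printing the before and after sizes makes it easy to see whether the conditioning is worth the extra pass on a given document.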

Primary conclusions
1. The .eps written by qt_image_to_eps is not satisfactory.
2. The .eps written by gs is not satisfactory.
3. The .eps written by the imagemagick "convert" utility is satisfactory.

Secondary conclusions
1. ps2ps might be invoked to some advantage on the output .ps in all cases.
2. qt_image_to_eps might do well to copy the format produced by imagemagick or that produced by gs version 8 when processing the imagemagick format.
3. gs 9 tools should not be used to output intermediate post-script files.

Actions - patch provided, tested on Ubuntu 11.04 - requires imagemagick to be present
1. In image_to_eps
 a. Disable qt_image_to_eps
 b. Disable gs_to_eps

Future actions:
1. Fix qt_image_to_eps to produce conformant eps files directly instead of using gs to sanitize them.
2. Maybe use the gs eps2write target instead of the epswrite target when it exists (the ps2write target doesn't work).
3. Consider invoking ps2ps on the generated ps to condition and trim the resultant .ps file. This can reduce a .ps file from 167MB to 10MB.
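The "when it exists" check for eps2write could be sketched roughly as below. This assumes that the help text printed by gs -h includes the list of compiled-in devices; pick_eps_device is a hypothetical helper name, not something in TeXmacs:

```shell
# pick_eps_device DEVICES: print eps2write if it occurs as a whole word in the
# given device list, otherwise fall back to epswrite.
pick_eps_device() {
    if printf '%s\n' "$1" | tr -s '[:space:]' '\n' | grep -qx 'eps2write'; then
        echo eps2write
    else
        echo epswrite
    fi
}

# Usage (needs Ghostscript installed):
#   device=$(pick_eps_device "$(gs -h)")
#   gs -dQUIET -dNOPAUSE -dBATCH -dSAFER -sDEVICE="$device" ...
```

Matching whole words only means that ps2write, which does not work here, is never mistaken for eps2write.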

Sam

On 12/05/11 18:05, Sam Liddicott wrote:


On 12/05/11 17:02, marc lalaude-labayle wrote:
Hi,

on a fresh Ubuntu 11.04 install, with the official TeXmacs package, I too get huge .ps files.

I've asked for advice on the gs mailing list but my own searches bring up this gs bug report that seems to describe the problem:
http://bugs.ghostscript.com/show_bug.cgi?id=691914

Which is pretty much this:

eps2eps and gs etc should not be used to "optimize" bitmaps for inclusion; as by the time it has finished it may no longer be a bitmap but rather a load of fill/stroke rectangle commands and the colours may already be altered too with respect to ICC colour profiles and all that...

ps2write is recommended but does not yet produce eps conformant output.

The answer is: texmacs should no longer use gs_to_eps on eps files.
gs_to_eps should be optimised to do a straight copy if the input file is also an eps file, and perhaps more correctly, it should not be called from qt_image_to_eps that knows full well the input file is eps.
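The "straight copy if the input is already eps" idea can be sketched in shell as follows. This is only an illustration of the logic, not the TeXmacs C++ code: is_eps, to_eps and the EPS_CONVERTER hook are hypothetical names, and the check assumes a plain-text EPS whose first line looks like "%!PS-Adobe-3.0 EPSF-3.0" (EPS files with a binary preview header would not be recognised):

```shell
# is_eps FILE: succeed if the first line carries the EPSF marker.
is_eps() {
    head -n 1 "$1" | grep -q '^%!PS-Adobe-[0-9.]* EPSF-[0-9.]*'
}

# to_eps SRC DST: copy SRC untouched when it is already EPS, otherwise hand it
# to a converter (hypothetical hook, defaulting to ImageMagick's convert).
to_eps() {
    if is_eps "$1"; then
        cp "$1" "$2"
    else
        "${EPS_CONVERTER:-convert}" "$1" "$2"
    fi
}
```

The point of the copy branch is that an existing eps never passes through gs, so its bitmap cannot be rewritten into fill commands along the way.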

I hope to submit a patch for texmacs tomorrow,

Sam


Marc

2011/5/12 Sam Liddicott <address@hidden>


    All lies! qt_image_to_eps is not a qt function but a texmacs
    function - and further, it is responsible for the "colorimage"
    format I was praising so highly on my 10.04 system.

    However disabling use of qt_image_to_eps did stop the problems, so
    I only have a short code path to investigate to find the true cause.

    Sam


    On 12/05/11 16:38, Sam Liddicott wrote:



        The problem seems to be that since I rebuilt TeXmacs under
        Ubuntu 11.04, qt_image_to_eps is being used to generate the
        eps in image_to_eps() in the file image_files.cpp

        I disabled that particular function and now the "convert"
        program from imagemagick is being used, so it is interesting
        to note that USE_GS is not defined.

        Convert is better than QT and produces 158MB of .ps compared
        to gs's 200MB and QT's 280MB

        This 158MB .ps shrinks down to 2.6MB

        So the culprit is qt_image_to_eps() on Ubuntu 11.04 doing
        something too clever that results in enormous files which
        often render badly.

        I'm also willing to swear in a very general sort of way that I
        don't think that -dPDFSETTINGS=/printer or
        -dPDFSETTINGS=/screen made any difference to final pdf from
        the .ps generated with qt_image_to_eps() - but when I take
        away qt_image_to_eps() the -dPDFSETTINGS makes a difference to
        the size of the pdf and quality of embedded images.

        I propose quite strongly that qt_image_to_eps() not be used.

        Sam


        On 12/05/11 16:00, Sam Liddicott wrote:



            With the help of this script running in my
            .TeXmacs/system/tmp folder:
            while : ; do ln *eps x/ ; ls -l x ; sleep 0.1 ; done

            I've got a load of the .eps files.

            I note that the eps files fed to gs on Ubuntu 11.04 are
            usually larger than those fed to gs on 10.04, and have a
            different form.

            Those on 11.04 have the bitmap defined with loads of
            lines like this:
            138 140 r5
            7420 5980 10 10 rf
            142 148 r5
            7430 5980 10 10 rf
            142 140 r5
            7440 5980 10 10 rf
            142 148 r5

            Those on 10.04 have the bitmap defined like this:
            colorimage
            [B?XWa2ZEF`...
            .....

            which appears to be perhaps a base64-style encoding of
            the bitmap.
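A rough way to sort captured .eps files into those two forms, assuming the patterns are exactly as quoted above (a line starting with "colorimage", or lines of four numbers followed by "rf"). classify_eps is a made-up helper name:

```shell
# classify_eps FILE: print "colorimage" for the image-operator form,
# "rectangles" for the "x y w h rf" form, and "unknown" otherwise.
classify_eps() {
    if grep -q '^colorimage' "$1"; then
        echo colorimage
    elif grep -Eq '^[0-9]+ [0-9]+ [0-9]+ [0-9]+ rf$' "$1"; then
        echo rectangles
    else
        echo unknown
    fi
}

# Usage, over the hard-linked copies collected in x/:
#   for f in x/*.eps; do printf '%s: %s\n' "$f" "$(classify_eps "$f")"; done
```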

            That would account for the difference in size of eps, and
            why the .ps is 50% bigger, but does not directly explain
            how this should cause a 20x bigger PDF.
            I don't know... maybe pstopdf recognized the "colorimage"
            form and optimised and jpeg'd it?

            Now I need to work out how/why texmacs gets the images
            emitted in different forms... anyone who knows should
            speak up now.

            Sam

            On 12/05/11 15:04, Sam Liddicott wrote:



                I copied the 279MB .ps file to an ubuntu 10.04 machine
                and ran pstopdf and it produced a 42MB pdf as before
                so I don't think pstopdf is to blame.

                I then copied over my .tm files (and my .TeXmacs dir)
                and exported the .ps and it came only to 99MB instead
                of 279MB - that's a big difference.
                So TeXmacs on my Ubuntu 11.04 produces enormous .ps files.

                I copied the 99MB .ps back to my 11.04 machine and ran
                ps2pdf and it produces a 2.6MB .pdf file instead of a
                42MB file.

                So I need to find out what texmacs does to export the
                .ps file to see why it is behaving differently on
                Ubuntu 11.04.
                Does anyone know? Please tell!

                I'm using my own git based build of texmacs.
                On 10.04 I'm using TeXmacs with top change vdhoeven
                Fri March 18th "Fix"
                On 11.04 I'm using top change vdhoeven April 27

                I'll now rule that out as a difference, I'm building
                the version I was using on 11.04. (My 11.04 was an
                upgrade from 10.10, not a re-install so I don't think
                it's a case of different code paths because of missing
                libs).

                I've reverted my TeXmacs and it makes no difference.

                I'm looking at the way TeXmacs invokes gs when
                exporting the .ps file, it calls like this:

                gs -dQUIET -dNOPAUSE -dBATCH -dSAFER -sDEVICE=epswrite
                -dEPSCrop

                I guess for each of the image files I use. This must
                be producing different files on Ubuntu 11.04

                Sam



                On 12/05/11 13:35, Sam Liddicott wrote:



                    Hmmmm... I tried viewing my PDF on windows in
                    foxit pdf view and it's great!

                    Evince under ubuntu is rubbish and shows all the
                    half-tone style dithering and the rest.

                    However I can't put all the blame there, I have a
                    PDF of a document this one was derived from, using
                    many of the same images and it still displays
                    fine, and from the artifacts when zooming closely,
                    must be using JPEG (and it's a much smaller file).

                    So I'm guessing that the PS/PDF internal image
                    format I'm now getting from TeXmacs is different
                    such that
                    1. images are larger
                    2. when converted to pdf, display badly in evince
                    pdf viewer (but as PS display fine).

                    I don't have any record of how big the .ps files
                    used to be.

                    I'll try and find an ubuntu 10.10 machine I can
                    run pstopdf on and see if that produces PDF of a
                    normal size.

                    Sam






                    On 12/05/11 11:53, Sam Liddicott wrote:



                        On 11/05/11 19:39, Sam Liddicott wrote:

                            I have a 48 page TeXmacs document, many
                            pages of which have images which are
                            screen-shots and so not very high
                            resolution - I think all are less than 800
                            pixels wide.

                            The png image source directory and .tm
                            file come to under 9MB, and the document
                            uses only about half of those images.

                            The exported post-script is 280MB and the
                            PDF is 40MB - which are horrific sizes.

                            (I'm using latest git repository from
                            git://gitorious.org/texmacs/texmacs.git)

                            I tried compiling with
                            --enable-pdf-renderer but then pdf export
                            fails with:


                            ** WARNING ** Failed to load AGL file
                            "pdfglyphlist.txt"...
                            ** WARNING ** Failed to load AGL file
                            "glyphlist.txt"...
/home/sam/.TeXmacs/system/tmp/tmp_2009563087.pdf /home/sam/.TeXmacs/system/tmp/tmp_514256944.pdf
                            ** ERROR ** TFM: Invalid TFM ID: -1

                            Does anyone have any tips on reducing the
                            PDF size? I tried changing from 600dpi to
                            300dpi but it only saved 2MB on the PDF

                            I'm quite certain from the PDF view (where
                            the screen shots are a bit washed out and
                            have a funny interference pattern) that
                            the images are being scaled up and even
                            screened in some way, which is why the
                            post script is so big.


                        Changing the dpi for the printer settings
                        affected the relative size of the images in
                        the document :-( so I had to change back.

                        Further examination lays the blame with
                        pstopdf, although I guess a png decoder could
                        have been embedded in the post-script which
                        would have kept the .ps file small - and
                        maybe the PDF small too, although I still have
                        to investigate what the PDF is doing.

                        This link talks about png decoders in post-script:
                        http://www.tek-tips.com/viewthread.cfm?qid=1050035&page=7

                        Here is a slice of one of my images:
                        http://mail.liddicott.com/blotchy-2-orig.png

                        It looks just as fine in the post-script view.

                        And here is a screenshot from my PDF viewer at
                        400%
                        http://mail.liddicott.com/blotchy-2.png

                        The areas of solid 24 bit colour have been
                        dot-ified, some kind of hatching or other
                        dithering it seems.

                        The change I observe may be different defaults
                        for pstopdf as I have just upgraded my ubuntu
                        release.

                        Sam











    --     [FSF Associate Member #2325]
<http://www.fsf.org/register_form?referrer=2325>

<http://www.openrightsgroup.org/>







Attachment: 0001-Disable-qt_image_to_eps-and-gs_to_eps-in-image_to_ep.patch
Description: Text Data

