lzip-bug

Re: plzip: manual gives very false numbers, real defaults are huge!


From: Antonio Diaz Diaz
Subject: Re: plzip: manual gives very false numbers, real defaults are huge!
Date: Wed, 08 May 2024 17:38:34 +0200

Hi Steffen,

Steffen Nurpmeso wrote:
   #?0|kent:plzip-1.11$ cp /x/balls/gcc-13.2.0.tar.xz X1
   #?0|kent:plzip-1.11$ cp X1 X2
[...]
   -rw-r----- 1 steffen steffen 89049959 May  7 22:14 X1.lz
   -rw-r----- 1 steffen steffen 89079463 May  7 22:14 X2.lz

Note that if you use incompressible files as input (gcc-13.2.0.tar.xz is already compressed), you'll always obtain similar compressed sizes, no matter the compression level or the dictionary size. Try the test with gcc-13.2.0.tar and you'll see the difference, as in your other test with /x/doc/coding/austin-group/202x_d4.txt.

I think dynamically scaling according to the processors, taking
into account the dictionary size, as you said above, is the sane
approach for "saturating" with plzip. In the above job there are
quite a lot of files, of varying size (the spam DB being very
large), and one recipe is not good for them all.

Maybe there is a better way (almost optimal for many files) to compress the spam DB, one that does not require a parallel compressor but still uses all the processors in your machine. As a bonus, it achieves maximum compression on files of any size and produces reproducible files:

  ls | xargs -n1 -P4 lzip -9

The command above should produce better results than a saturated plzip.

'ls' may be replaced by any command that generates the list of files to be compressed. See http://www.gnu.org/software/findutils/manual/html_node/find_html/xargs-options.html
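
For example, here is a minimal sketch of the same idea that copes with file names containing spaces and takes the number of jobs from the machine instead of a fixed 4. It assumes GNU findutils and coreutils ('find -maxdepth/-print0', 'xargs -0 -r -P', 'nproc'). The function name compress_dir and the LZIP variable are hypothetical, introduced only for this illustration.

```shell
# Compress every regular file directly inside a directory with 'lzip -9',
# one lzip process per file, running up to $jobs processes at once.
# Assumption: GNU find, xargs and nproc are available.
compress_dir() {
    dir=${1:-.}              # directory to process (default: current one)
    jobs=${2:-$(nproc)}      # parallel jobs (default: number of processors)
    # -print0 / -0 keep file names with spaces or newlines intact;
    # -r skips running the compressor when no file matches;
    # files already ending in .lz are left alone.
    find "$dir" -maxdepth 1 -type f ! -name '*.lz' -print0 |
        xargs -0 -r -n1 -P"$jobs" "${LZIP:-lzip}" -9
}
```

Calling 'compress_dir /path/to/dir' then behaves like the one-liner above, with each file still compressed by a single lzip process.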

Hope this helps,
Antonio.


