Re: [Lzip-bug] Want to Jettison xz(1), But Size Matters.
From: Antonio Diaz Diaz
Subject: Re: [Lzip-bug] Want to Jettison xz(1), But Size Matters.
Date: Wed, 18 Jul 2018 20:07:30 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.9.1.19) Gecko/20110420 SeaMonkey/2.0.14

Hi Ralph,
Ralph Corderoy wrote:
> Having read http://lzip.nongnu.org/xz_inadequate.html I'm happy to move
> away from xz(1), having been lured by coreutils adding it originally.
> So I picked a random Gimp XCF file already xz'd and compared sizes.
>
>     55,569138 gimp
>     21,001368 xz -9
>     23,299403 lzip -9        23,299403 / 21,001368 = 1.109
> [...]
> Is there a known reason why xz does noticeably better in some
> situations like this one?

I have never tried to compress .xcf files, but there is a known reason
why 'xz -9' compresses files larger than 32 MiB better than 'lzip -9'.
It is explained at http://www.nongnu.org/lzip/lzip_benchmark.html#xz2
----------------------------------------------------------------------
"xz -9" uses a dictionary size twice as large as "lzip -9" (and twice as
large as "lzma -9"). This makes it appear as if xz could compress large
files a little more than lzip. To find the truth just pass to lzip the
arguments equivalent to those of "xz -9" (or to xz the arguments
equivalent to those of "lzip -9"), and lzip will usually compress more
than xz:
linux-libre-3.12.5-gnu.tar (size 535347200)
"lzip -m64 -s64MiB"                 74192464    9m16s
"xz -9"                             74306080    9m 7s
"lzip -9"                           74330266   10m53s
"xz --lzma2=nice=273,dict=32MiB"    74563636   10m15s
Note that using plain "-9" on both compressors, lzip usually compresses
large files about as much as xz, but using half the RAM and requiring
half the RAM to decompress.
----------------------------------------------------------------------
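To put the figures above side by side, here is a small sketch that computes the size ratios. It assumes the sizes in the quoted message read as plain byte counts (21001368 and 23299403), and the variable names are only illustrative. On the kernel tarball the gap between plain "-9" runs is a few hundredths of a percent, while on the .xcf file it is about 11%.

```python
# Compressed sizes in bytes, copied from the message and the benchmark excerpt.
xcf_xz9 = 21001368        # .xcf file, "xz -9" (assumed reading of "21,001368")
xcf_lzip9 = 23299403      # .xcf file, "lzip -9" (assumed reading of "23,299403")
tar_xz9 = 74306080        # linux-libre tarball, "xz -9"
tar_lzip9 = 74330266      # linux-libre tarball, "lzip -9"
tar_lzip_64 = 74192464    # linux-libre tarball, "lzip -m64 -s64MiB"

# Ratio > 1 means lzip produced the larger file.
xcf_ratio = xcf_lzip9 / xcf_xz9
tar_ratio = tar_lzip9 / tar_xz9
tar_matched_ratio = tar_lzip_64 / tar_xz9

print(f"xcf: lzip -9 / xz -9           = {xcf_ratio:.3f}")
print(f"tar: lzip -9 / xz -9           = {tar_ratio:.4f}")
print(f"tar: lzip -m64 -s64MiB / xz -9 = {tar_matched_ratio:.4f}")
```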
The large difference in compression ratio in this file may be caused,
for example, by it containing two areas of similar data more than 32 MiB
apart. This is why the lzip manual states that:
http://www.nongnu.org/lzip/manual/lzip_manual.html#Invoking-lzip
The bidimensional parameter space of LZMA can't be mapped to a linear
scale optimal for all files. If your files are large, very repetitive,
etc, you may need to use the '--dictionary-size' and '--match-length'
options directly to achieve optimal performance.
I have just tried to compress a couple of small .xcf files from the gimp
distribution and lzip compresses both better than xz, so I guess that
'lzip -9 -s64MiB' should improve the compression ratio of this file.
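
The effect of two similar areas lying farther apart than the dictionary size can be reproduced at a small scale. The sketch below is only an illustration under stated assumptions: it uses Python's lzma module (which wraps liblzma from xz, not lzip), scales the sizes down from MiB to KiB, and builds synthetic input from a random block repeated once. The helper name xz_size is hypothetical.

```python
import lzma
import random

KIB = 1024

def xz_size(data, dict_size):
    """Return the size of data compressed with LZMA2 and the given dictionary size."""
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 6, "dict_size": dict_size}]
    return len(lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters))

# 200 KiB of pseudo-random (incompressible) bytes, followed by an exact copy.
# The copy starts 200 KiB after the original.
rng = random.Random(0)
block = rng.getrandbits(8 * 200 * KIB).to_bytes(200 * KIB, "little")
data = block + block

# A 64 KiB dictionary cannot reach back 200 KiB, so the copy is not matched.
small = xz_size(data, 64 * KIB)
# A 1 MiB dictionary covers the whole input, so the copy costs almost nothing.
large = xz_size(data, 1024 * KIB)

print(f"input size:          {len(data)}")
print(f"64 KiB dictionary:   {small}")
print(f"1 MiB dictionary:    {large}")
```

With the smaller dictionary the output stays close to the input size; with the larger one it drops to roughly half, because the second copy is coded as matches against the first.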
Hope this helps,
Antonio.