From: Antonio Diaz Diaz
Subject: Re: [Lzip-bug] lzip vs. zstd
Date: Thu, 20 Oct 2016 02:47:21 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.9.1.19) Gecko/20110420 SeaMonkey/2.0.14
address@hidden wrote:
> Do you already have numbers, opinions, and maybe a comparison in
> reliability, speed, compression ratio, etc. against the new zstd?
I have used unzcrash to test the ability of the zstd decoder to detect corruption by itself (without a checksum), and the results are not good. As an example, here are the results of repeatedly decompressing the file COPYING.zst (a copy of the GPLv3), inverting a different bit each time so as to test all possible bit flips:
  11913 bytes tested
  95304 total decompressions
  56058 decompressions returned with zero status, of which
  56017 comparisons failed

The zstd decoder detects the corruption less than half of the time. Compare this with the lzip decoder, which detects about 99.99995% of the bit flips even without the help of its 3-factor integrity checking.
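In essence, the test loop looks like this (a minimal Python sketch, assuming a 'zstd' binary in the PATH and a test file compressed without a checksum, e.g. with 'zstd --no-check'; unzcrash itself is a separate tool and differs in detail, and the names here are illustrative):

  import subprocess

  def bit_flip_test(compressed_path, original_path):
      data = bytearray(open(compressed_path, 'rb').read())
      original = open(original_path, 'rb').read()
      zero_status = comparisons_failed = 0
      for i in range(len(data) * 8):          # one decompression per bit
          data[i // 8] ^= 1 << (i % 8)        # flip one bit
          result = subprocess.run(['zstd', '-d', '-q', '-c'],
                                  input=bytes(data), capture_output=True)
          if result.returncode == 0:          # decoder reported no error...
              zero_status += 1
              if result.stdout != original:   # ...but the output is corrupt
                  comparisons_failed += 1
          data[i // 8] ^= 1 << (i % 8)        # restore the bit
      print(len(data), 'bytes tested')
      print(len(data) * 8, 'total decompressions')
      print(zero_status, 'decompressions returned with zero status, of which')
      print(comparisons_failed, 'comparisons failed')

Every run where the decoder returns zero status but the output differs from the original is a corruption that only an external comparison (or a checksum) would have caught.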
Using 'zstd --no-check' is significantly less safe than using 'xz --check=none'.
Even with integrity checking enabled, my guess is that it is at least a million times more likely to get a false negative (undetected corruption) from zstd than from lzip. (The decoders alone miss about 59% versus about 0.00005% of bit flips, a ratio of roughly a million, so with comparable checksums the overall false-negative rates should differ by a similar factor.)
The zstd file format has many of the defects of the xz format[1]: unprotected lengths, unprotected flags, unprotected dictionary IDs, optional integrity checking, and optional file concatenation; and it does not seem to admit trailing data. Also, the current version of the zstd file format is 0.2.0, which may mean that changes to the format are still expected.
Zstd is described as a "fast real-time compression algorithm". AFAIK, its author does not recommend zstd for long-term archiving.
So my advice is that you should not use zstd for long-term archiving.

[1] http://www.nongnu.org/lzip/xz_inadequate.html

Juan Francisco Cantero Hurtado asked me if I knew why the tests of zstd take so long to finish.
It seems that 'make test' takes a lot of time (17 minutes) because it is a full regression test, not just a small test with a few files to verify that the compilation went well, as most programs run. The theoretical basis of zstd[2] seems more complicated than that of LZMA, and the author probably wants to make sure that any possible bug is caught early.
[2] Asymmetric numeral systems: entropy coding combining speed of Huffman coding with compression rate of arithmetic coding. https://arxiv.org/abs/1311.2540
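To give an idea of what that paper describes, here is a toy range-ANS (rANS) coder. This is only a sketch of the idea, not zstd's implementation: zstd uses tabled ANS (FSE) with finite-precision, renormalized states, while this version leans on Python's unbounded integers and skips renormalization entirely; the frequencies and message are illustrative.

  FREQ = {'a': 3, 'b': 1}                 # illustrative symbol frequencies
  M = sum(FREQ.values())                  # total of all frequencies
  CUM, acc = {}, 0                        # cumulative frequency per symbol
  for sym, f in FREQ.items():
      CUM[sym], acc = acc, acc + f

  def encode(symbols):
      x = 1                               # any positive start state works here
      for s in reversed(symbols):         # ANS encodes in reverse order
          x = (x // FREQ[s]) * M + CUM[s] + (x % FREQ[s])
      return x                            # the whole message is one integer

  def decode(x, n):
      out = []
      for _ in range(n):                  # symbols come out in forward order
          slot = x % M
          s = next(t for t in FREQ if CUM[t] <= slot < CUM[t] + FREQ[t])
          out.append(s)
          x = FREQ[s] * (x // M) + slot - CUM[s]
      return ''.join(out)

  msg = 'abaa'
  assert decode(encode(msg), len(msg)) == msg

Making a coder like this both fast and exactly reversible at finite precision is where much of the complexity comes in, which would explain the desire for exhaustive testing.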
Best regards, Antonio.