Re: [Lzip-bug] Re: performance: gzip, lzip, xz

From:	Antonio Diaz Diaz
Subject:	Re: [Lzip-bug] Re: performance: gzip, lzip, xz
Date:	Tue, 13 Oct 2009 14:32:29 +0200
User-agent:	Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.7.11) Gecko/20050905

Hello Jim,

Thanks for your interest in lzip. I hope I'll be able to convince you that lzip is better than you think. :-)



Jim Meyering wrote:

> Claiming that xz has no clear goal seems mildly libelous.

I am not trying to discredit anybody. I am only stating that the xz format is far from ready for general use.

Maybe xz has a clear goal, but I have been unable to discover what it could be. Perhaps its goal is to find out the limit between format flexibility and format security, given the number of times the xz format had to be changed due to security problems.

Clearly long term stability is not the goal of xz. Just read the README file for 4.999.9beta, line 51: "Since the .xz format allows adding new filter IDs, it is possible that some day there will be a filter that is, for example, much faster to compress than LZMA2 (but probably with worse compression ratio). Similarly, it is possible that some day there is a filter that will compress better than LZMA2".

Will the old filters be removed as new ones are added, leaving users without support for old files, or will xz become increasingly bloated by old filters that almost nobody uses?

In any case, one does not need to be an IBM engineer to notice that xz's goal is not as clear as lzip's goal:

http://lpar.ath0.com/2009/09/25/documentation-as-an-indicator-of-code-quality/

"Comparing the two, I see that xz has many more options. It has all kinds of tweaks to specify how much memory it uses, tweak various internal details of the LZMA algorithm, and filter the data. None of these options are adequately explained. To quote Ted Nelson quoting Roger Gregory, "An option means the programmer didn't have a clear idea of what the module was supposed to do." Or as Steve Krug puts it, "Don't make me think."

In contrast, lzip's user interface is much simpler, and closer to the Unix philosophy of "do one thing, and do it well". The only two tweaks to the LZMA algorithm lzip provides are adequately explained if you know the basics of how compression algorithms tend to work, and there's a table showing how they correspond to the compression levels -0 to -9. The only borderline gratuitous option is to split the compressed file into chunks, and that's at least a useful one. It also gets the SI units right.


So, lzip wins by a landslide on UI and documentation".

> The .xz format is in no way an archive-like format. You cannot store
> file names in .xz, and .xz supports even less metadata than .gz.

By archiver-like I mean it is way too complicated for a general purpose compressor and it includes features I have only found in archiver formats, like the subblock filter.

> Regarding the possibility of recovery, there are not many differences
> between .xz and .lz.

There is an important difference; in case of data corruption, the xz format can fail in a thousand more ways than the much simpler lzip format. This is the reason lzip does have a recovery tool already, and XZ Utils does not. Just compare the formats to see what I mean.

http://www.nongnu.org/lzip/manual/lzip_manual.html#File-Format
http://tukaani.org/xz/xz-file-format-1.0.4.txt

One inconsistency that can make even the detection of data corruption in xz files difficult is that the format only requires implementations to support CRC32[1], but the xz tool uses CRC64 by default[2].

[1] see xz-file-format-1.0.4.txt, line 353.
[2] see "man xz", line 362.
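
To make the CRC32/CRC64 point concrete, here is a minimal sketch in Python. It assumes the Stream Header layout as I read it in the xz file format document (6 magic bytes, then 2 bytes of Stream Flags with the check ID in the low 4 bits of the second byte), and it uses Python's standard lzma module only as a convenient way to produce a test stream. The helper name xz_check_id is hypothetical and not part of any tool.

```python
import lzma

XZ_MAGIC = b"\xfd7zXZ\x00"  # the 6 header magic bytes of an .xz stream


def xz_check_id(blob):
    """Return the integrity-check ID recorded in the Stream Flags.

    Assumed layout: bytes 0-5 magic, byte 6 reserved (zero),
    byte 7 holds the check ID in its low 4 bits.
    """
    if len(blob) < 8 or blob[:6] != XZ_MAGIC:
        raise ValueError("not an .xz stream")
    return blob[7] & 0x0F


payload = b"hello, world\n" * 100

# CRC32 is the only check the format requires every decoder to support,
# so it is the safe choice for building an example stream.
as_crc32 = lzma.compress(payload, format=lzma.FORMAT_XZ, check=lzma.CHECK_CRC32)
print("check ID in stream:", xz_check_id(as_crc32))

# A decoder built with a reduced feature set may not be able to verify CRC64.
# The module lets a caller ask before relying on it.
print("CRC64 verifiable here:", lzma.is_check_supported(lzma.CHECK_CRC64))
```

A reader that finds a check ID it cannot verify can still decompress the data, but it cannot tell whether the result is intact, which is the detection problem described above.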

> Claiming long-term stability of the .lz format is a stretch.

Lzip format is definitive. It offers the same capabilities as bzip2. If some day I discover some better compression algorithm and decide to implement it, I'll write a new compressor and format. Remember, "do one thing, and do it well".

> The file format has changed at least once (probably twice, but I'm
> not sure) since the first stable release.  Older versions of lzip
> cannot decompress new format files.  The same can and (I'm sure) will
> happen with .xz too, but in case of .lz, it has been about adding basic
> features that .xz had in the first place.

Lzip format has changed exactly once since the first released version. The only two changes were:

The "member size" field was added to improve the recovery of undamaged members from multimember files.

Coding of dictionary size in member header was extended to support more fine-grained values.
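
As a rough illustration of those two fields, here is a sketch in Python. It assumes the member layout as described in the File Format section of the lzip manual: a 6-byte header ("LZIP", a version byte, a coded dictionary size byte) and a 20-byte trailer (CRC32, data size, member size, all little-endian). The function names are hypothetical, and this is not how lzip or any recovery tool is actually implemented.

```python
import struct

HEADER_SIZE = 6    # magic (4) + version (1) + coded dictionary size (1)
TRAILER_SIZE = 20  # CRC32 (4) + data size (8) + member size (8)


def decode_dict_size(coded):
    """Decode the one-byte dictionary size from a member header.

    Bits 4-0: base-2 logarithm of the base size.
    Bits 7-5: number of sixteenths of the base size to subtract.
    """
    base = 1 << (coded & 0x1F)
    sixteenths = (coded >> 5) & 0x07
    return base - sixteenths * (base // 16)


def member_offsets(blob):
    """Locate every member by walking backwards from the end of the file.

    Each trailer ends with the total size of its member, so the start of
    a member can be found without decoding any compressed data.
    """
    offsets = []
    end = len(blob)
    while end > 0:
        if end < HEADER_SIZE + TRAILER_SIZE:
            raise ValueError("truncated member")
        (member_size,) = struct.unpack_from("<Q", blob, end - 8)
        start = end - member_size
        if member_size < HEADER_SIZE + TRAILER_SIZE or start < 0:
            raise ValueError("implausible member size")
        if blob[start:start + 4] != b"LZIP":
            raise ValueError("no member header at computed offset")
        offsets.append(start)
        end = start
    offsets.reverse()
    return offsets


# 0xD3 -> 2**19 - 6 * 2**15 = 327680 bytes (320 KiB)
print(decode_dict_size(0xD3))
```

Because each member can be located from its neighbour's trailer, a damaged member does not prevent finding and extracting the undamaged ones around it.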

I do not see those changes as "basic features", and certainly data recovery is not present in xz even now.



Regards,
Antonio.
