
Re: plzip: manual gives very false numbers, real defaults are huge!


From: Antonio Diaz Diaz
Subject: Re: plzip: manual gives very false numbers, real defaults are huge!
Date: Mon, 06 May 2024 16:33:17 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.9.1.19) Gecko/20110420 SeaMonkey/2.0.14

Steffen Nurpmeso wrote:
> Thanks for the quick response on a Saturday.

You are welcome. :-)

>   Note that the number of usable threads is limited by file size; on
>   files larger than a few GB plzip can use hundreds of processors, but
>   on files of only a few MB plzip is no faster than lzip.

Ok "you get scaling effects", but 70 MiB is not "a few MiB".

The above means "on files of only a few MB plzip can't be faster than lzip, no matter what options you use". Of course, at high compression levels the "few MB" become "several tens of MB".
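A minimal sketch of the arithmetic (not plzip source), assuming the
defaults at -9 of a 32 MiB dictionary and a per-thread block of twice
that, 64 MiB (the "67 megabytes" figure quoted below): the number of
blocks the input splits into bounds the number of usable threads.

    // ceil( file_size / block_size ) bounds the usable threads.
    #include <cstdio>

    int main()
      {
      const long long MiB = 1024LL * 1024;
      const long long file_size = 70 * MiB;   // the 70 MiB file from this thread
      const long long block_size = 64 * MiB;  // assumed -9 default: 2 x 32 MiB dictionary
      const long long usable = ( file_size + block_size - 1 ) / block_size;
      std::printf( "usable threads: %lld\n", usable );  // prints 2
      return 0;
      }

So a 70 MiB file at -9 splits into just two blocks, which is why no
option can make plzip use more than two processors on it.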

> 67 megabytes per processor!  (How about doing a stat and somehow taking
> into account st_size?  Or fstat, after the file was opened?)

This would break reproducibility (obtaining identical compressed output from identical input) because the size of uncompressed data read from standard input (not from a file) can't be known in advance.
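A minimal sketch of the distinction (not plzip source): fstat reports a
usable st_size only for regular files, so a block size derived from it
could make 'plzip file' and 'plzip < file' produce different output.

    // st_size is only meaningful when the input is a regular file.
    #include <sys/stat.h>
    #include <unistd.h>
    #include <cstdio>

    int main()
      {
      struct stat st;
      if( fstat( STDIN_FILENO, &st ) != 0 )
        { std::perror( "fstat" ); return 1; }
      if( S_ISREG( st.st_mode ) )
        std::printf( "regular file, %lld bytes\n", (long long)st.st_size );
      else
        std::printf( "pipe or device; size unknown in advance\n" );
      return 0;
      }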

> A single sentence that the "defaults" are (of course?!!?!) dependent on
> the compression level would have been enlightening.

I'll try to document it better in the manual and in the man page.

> (Having read the referenced section in the .info file in the source
> tarball I would open an issue as "wishlist" asking for an option that
> would scale to "a reasonable number of" CPUs.)

As I said above, such an option would not work with data read from standard input, and would break reproducibility.

Best regards,
Antonio.


