Re: zlib-ng as a compat replacement for zlib
From: Richard W.M. Jones
Subject: Re: zlib-ng as a compat replacement for zlib
Date: Fri, 1 Sep 2023 11:03:54 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On Fri, Sep 01, 2023 at 10:55:50AM +0100, Daniel P. Berrangé wrote:
> On Fri, Sep 01, 2023 at 10:42:16AM +0100, Richard W.M. Jones wrote:
> > On Fri, Sep 01, 2023 at 10:48:14AM +0200, Kevin Wolf wrote:
> > > Am 31.08.2023 um 11:20 hat Richard W.M. Jones geschrieben:
> > > > On Thu, Aug 31, 2023 at 11:05:55AM +0200, Kevin Wolf wrote:
> > > > > [ Cc: qemu-block ]
> > > > >
> > > > > Am 30.08.2023 um 20:26 hat Richard W.M. Jones geschrieben:
> > > > > > On Tue, Aug 29, 2023 at 05:49:24PM -0000, Daniel Alley wrote:
> > > > > > > > The background to this is I've spent far too long trying
> > > > > > > > to optimize the conversion of qcow2 files to raw files.
> > > > > > > > Most existing qcow2 files that you can find online are
> > > > > > > > zlib compressed, including the qcow2 images provided by
> > > > > > > > Fedora. Each cluster in the file is separately compressed
> > > > > > > > as a zlib stream, and qemu uses zlib library functions to
> > > > > > > > decompress them. When downloading and decompressing these
> > > > > > > > files, I measured 40%+ of the total CPU time is doing zlib
> > > > > > > > decompression.
> > > > > > > >
> > > > > > > > [You don't need to tell me how great Zstd is, qcow2
> > > > > > > > supports this for compression also, but it is not widely
> > > > > > > > used by existing content.]
> > > > >
> > > > > > You make it sound like compressing each cluster individually has a big
> > > > > > impact. If so, does increasing the cluster size make a difference,
> > > > > > too? That could be a change with fewer compatibility concerns.
> > > >
> > > > The issue we're discussing in the original thread is speed of
> > > > decompression. We noted that using zlib-ng (a not-quite drop-in
> > > > replacement for zlib) improves decompression speed by 40% or more.
> > > >
> > > > Original thread:
> > > > https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/CDNPJ4SOTRQMYVCDI3ZSY4SP4FYESCWD/
> > > > zlib-ng proposed change:
> > > > https://src.fedoraproject.org/rpms/zlib-ng/pull-request/3
> > > >
> > > > Size of the compressed file is also a concern, but wasn't discussed.
> > >
> > > I understand the context and didn't really think about file size at all.
> > >
> > > My question was essentially if decompressing many small blocks (as we do
> > > today) performs significantly differently from decompressing fewer larger
> > > blocks (as we would do with a larger cluster size).
> >
> > I did a quick test just by adjusting the cluster size of a qcow2 file:
> >
> > $ virt-builder fedora-36
> > $ ls -lsh fedora-36.img
> > 1.2G -rw-r--r--. 1 rjones rjones 6.0G Sep 1 09:53 fedora-36.img
> > $ cat fedora-36.img fedora-36.img fedora-36.img fedora-36.img > test.raw
> > $ ls -lsh test.raw
> > 4.7G -rw-r--r--. 1 rjones rjones 24G Sep 1 09:53 test.raw
> > $ qemu-img convert -f raw test.raw -O qcow2 test.qcow2.zlib.4k -c \
> >     -o compression_type=zlib,cluster_size=4096
> >
> > (for cluster sizes 4k, 64k, 512k, 2048k, and
> > compression types zlib & zstd)
> >
> > I tested the speed of decompression using:
> >
> > $ hyperfine 'qemu-img convert -W -m 16 -f qcow2 test.qcow2.XXX -O raw test.out'
> > (qemu 8.0.0-4.fc39.x86_64)
> >
> > $ hyperfine 'nbdkit -U - --filter=qcow2dec file test.qcow2.XXX --run '\''nbdcopy --request-size "$uri" test.out'\'' '
> > (nbdkit-1.35.11-2.fc40.x86_64)
> >
> > Results:
> >
> > Cluster  Compression  Compressed size (bytes)  Prog    Decompression time
> >
> > 4k       zlib         3228811264               qemu    5.921 s ± 0.074 s
> > 4k       zstd         3258097664               qemu    5.189 s ± 0.158 s
> >
> > 4k       zlib/zstd                             nbdkit  failed, bug!!
> >
> > 64k      zlib         3164667904               qemu    3.579 s ± 0.094 s
> > 64k      zstd         3132686336               qemu    1.770 s ± 0.060 s
> >
> > 64k      zlib         3164667904               nbdkit  1.254 s ± 0.065 s
> > 64k      zstd         3132686336               nbdkit  1.315 s ± 0.037 s
> >
> > 512k     zlib         3158744576               qemu    4.008 s ± 0.058 s
> > 512k     zstd         3032697344               qemu    1.503 s ± 0.072 s
> >
> > 512k     zlib         3158744576               nbdkit  1.702 s ± 0.026 s
> > 512k     zstd         3032697344               nbdkit  1.593 s ± 0.039 s
> >
> > 2048k    zlib         3197569024               qemu    4.327 s ± 0.051 s
> > 2048k    zstd         2995143168               qemu    1.465 s ± 0.085 s
> >
> > 2048k    zlib         3197569024               nbdkit  3.660 s ± 0.011 s
> > 2048k    zstd         2995143168               nbdkit  3.368 s ± 0.057 s
> >
> > No great surprises - very small cluster size is inefficient, but
> > scaling up the cluster size gains performance, and zstd performs better
> > than zlib once the cluster size is sufficiently large.
>
> The default qcow2 cluster size is 64k, which means we've already
> got the vast majority of the performance and file size win. Going
> beyond 64k defaults doesn't seem massively compelling.
>
> zstd does have a small space win over zlib as expected, but again
> nothing so drastic that it seems compelling to change - that win
> will be line noise in the overall bigger picture of image storage
> and download times.

Yeah, I was a bit surprised by this. I expected zstd files to be
significantly smaller than the zlib ones, even though file size is not
what zstd is optimized for. I did not expect them to be about the same
size.

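To put numbers on that, here is a quick sketch that works out how much smaller (or larger) the zstd file is relative to the zlib one, using the compressed sizes from the results table above. The `pct_diff` helper is a made-up name for this illustration, not part of any tool.

```shell
#!/bin/sh
# pct_diff BASE OTHER: print the percentage change from BASE to OTHER.
# (Hypothetical helper, for illustration only.)
pct_diff() {
    awk -v base="$1" -v other="$2" \
        'BEGIN { printf "%.2f\n", (other - base) / base * 100 }'
}

# Columns: cluster size, zlib bytes, zstd bytes (copied from the table).
while read -r cluster zlib zstd; do
    printf '%-6s zstd vs zlib: %s%%\n' "$cluster" "$(pct_diff "$zlib" "$zstd")"
done <<EOF
4k    3228811264 3258097664
64k   3164667904 3132686336
512k  3158744576 3032697344
2048k 3197569024 2995143168
EOF
```

A negative figure means the zstd file is smaller. At the default 64k cluster size the difference is about 1%, and it only reaches about 6% at 2048k.
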
> The major difference here is that zstd is much faster than zlib
> at decompression. I'd be curious if zlib-ng closes that gap?

It's quite hard to use zlib-ng in Fedora (currently) since it requires
changes to the source code of each program that uses it. That is what
the pull request being discussed would change, as you could simply
install zlib-ng-compat which would replace libz.so. So I can't easily
get results for qemu + zlib-ng, but we expect it would be ~ 40% faster
at decompression, and decompression is what is taking most of the time
in the qemu numbers above.

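For a sense of the gap in question, this sketch divides the qemu zlib time by the qemu zstd time for each cluster size, using the mean times from the results table above. The `slowdown` helper is a made-up name for this example, and it says nothing about what zlib-ng itself would achieve.

```shell
#!/bin/sh
# slowdown ZLIB_SECS ZSTD_SECS: how many times longer zlib took than zstd.
# (Hypothetical helper, for illustration only.)
slowdown() {
    awk -v zlib="$1" -v zstd="$2" 'BEGIN { printf "%.2f\n", zlib / zstd }'
}

# Each row: cluster size, qemu zlib mean time (s), qemu zstd mean time (s).
for row in "4k 5.921 5.189" "64k 3.579 1.770" \
           "512k 4.008 1.503" "2048k 4.327 1.465"; do
    set -- $row   # split the row into $1 $2 $3
    printf '%-6s qemu zlib takes %sx as long as zstd\n' \
        "$1" "$(slowdown "$2" "$3")"
done
```

At the default 64k cluster size that comes out at roughly 2x. Whether zlib-ng gets there would need an actual qemu + zlib-ng run.
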
I forgot to say that nbdkit is using zlib-ng, since I made the
source-level changes a few weeks back (but most of the nbdkit
performance improvement comes from being able to use lots of threads).

> If it does, then for the sake of image portability it'd be better
> to stick with zlib compression in qcow2 and leverage zlib-ng for
> speed, and ignore zstd.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
nbdkit - Flexible, fast NBD server with plugins
https://gitlab.com/nbdkit/nbdkit