qemu-block

Re: [Qemu-block] QCOW2 support for LZO compression


From: Denis V. Lunev
Subject: Re: [Qemu-block] QCOW2 support for LZO compression
Date: Mon, 26 Jun 2017 23:56:08 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.1.1

On 06/26/2017 11:54 PM, Peter Lieven wrote:
>
>> On 26.06.2017 at 22:30, Denis V. Lunev <address@hidden> wrote:
>>
>>> On 06/26/2017 11:28 AM, Kevin Wolf wrote:
>>> [ Cc: qemu-devel; don't post to qemu-block only! ]
>>>
>>> On 26.06.2017 at 09:57, Peter Lieven wrote:
>>>> Hi,
>>>>
>>>> I am currently working on optimizing speed for compressed QCOW2
>>>> images. We use them for templates and would also like to use them for
>>>> backups, but the latter is almost infeasible because using gzip for
>>>> compression is horribly slow. I tried to experiment with different
>>>> options to deflate, but in the end I think it's better to use a
>>>> different compression algorithm for cases where speed matters. As we
>>>> already have probing for it in configure and as it is widely used I
>>>> would like to use LZO for that purpose. I think it would be best to
>>>> have a flag to indicate that compressed blocks use LZO compression,
>>>> but I would need a short explanation of which of the feature fields I
>>>> have to use to prevent an older (incompatible) QEMU from opening
>>>> LZO-compressed QCOW2 images.
>>>>
>>>> I also already have some numbers. I converted a fresh Debian 9 install,
>>>> which has an uncompressed QCOW2 size of 1158 MB, with qemu-img to a
>>>> compressed QCOW2. With GZIP compression the result is 356 MB, whereas
>>>> the LZO version is 452 MB. However, the current GZIP variant takes 35
>>>> seconds for this operation, where LZO only needs 4 seconds. I think this
>>>> is a good trade-off, especially when it's optional so the user can
>>>> choose.
>>>>
>>>> What are your thoughts?
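
For reference, the numbers above work out as follows. This is only a
sketch of the arithmetic in Python: the sizes and times are the ones
quoted above, and the helper name `summarize` is made up for
illustration.

```python
# Figures quoted in the message above (Debian 9 image, qemu-img convert).
UNCOMPRESSED_MB = 1158

RESULTS = {
    "gzip": {"size_mb": 356, "seconds": 35},
    "lzo": {"size_mb": 452, "seconds": 4},
}


def summarize(size_mb, seconds, uncompressed_mb=UNCOMPRESSED_MB):
    """Return (compression ratio, input throughput in MB/s)."""
    ratio = uncompressed_mb / size_mb
    throughput = uncompressed_mb / seconds
    return ratio, throughput


for name, r in RESULTS.items():
    ratio, throughput = summarize(r["size_mb"], r["seconds"])
    print(f"{name}: ratio {ratio:.2f}x, throughput {throughput:.1f} MB/s")
```

Put differently, the LZO image is about 27% larger (452 / 356), while
the conversion is 8.75 times faster (35 s / 4 s). This is a single
measurement on one image, so it only indicates the direction of the
trade-off.
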
>>> We had a related RFC patch by Den earlier this year, which never
>>> received many comments and never got out of RFC:
>>>
>>> https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg04682.html
>>>
>>> So he chose a different algorithm (zstd). When I asked, he posted a
>>> comparison of algorithms (a generic one, however, not measured in the
>>> context of qemu) that suggests that LZO would be slightly faster, but
>>> have a considerably worse compression ratio with the settings that were
>>> benchmarked.
>>>
>>> I think it's clear that if there is any serious interest in compression,
>>> we'll want to support at least one more algorithm. What we still need to
>>> evaluate is which one(s) to take, and whether a simple incompatible flag
>>> in the header like in Den's patch is enough or whether we should add a
>>> whole new header field for the compression algorithm (like we already
>>> have for encryption).
>>>
>>> Kevin
>> Yann Collet, the ZSTD maintainer, contacted me today and sent the
>> current status of ZSTD, which could be useful for this discussion:
>>
>> "_1. zstd package availability_
>>
>> We have been tracking distribution availability since Zstandard official
>> release, in September 2016 :
>> https://github.com/facebook/zstd/issues/320
>> There is also this tool which tracks availability of packages :
>> https://repology.org/metapackage/zstd/versions
>>
>> zstd seems now available as a package in most recent distributions.
>> It’s even part of “core” for recent BSD releases.
>> Zstandard v1.0 is still less than 1 year old, so older distributions
>> typically do not have it (or support a development version).
>> That’s the main limitation currently. We expect things to improve over time.
>>
>> _2. Compression speed is good but does not matter_
>>
>> For such scenarios, it’s possible to trade speed for more compression.
>> At its maximum compression level (--ultra -22), zstd compression
>> ratio (and speed) is close to lzma.
>> A nice property though is that decompression speed remains roughly
>> the same at all compression levels,
>> about 10x faster than lzma decompression speed (about 1 GB/s on
>> a modern CPU).
>>
>> _3. zstd is multi-threaded, and it’s dangerous_
>>
>> libzstd is single-threaded.
>> There is a multi-thread extension, which is enabled in the CLI, but not
>> in the library.
>> There is also an experimental target which makes it possible to produce
>> a MT-enabled library.
>> Even in this case, the API remains single-threaded by default.
>> It’s necessary to use dedicated entry points to enable multi-threading.
>> TL;DR : zstd supports multithreading, but is single threaded by default.
>>
>>
>>
>> _4. How to identify gz format from zstd one ?_
>>
>> Many implementations assume they need to add some custom header
>> in order to identify gz from zstd.
>> That’s not the case: well-formed compression formats already provide
>> a header with enough information to guarantee their identity.
>> Such “good” compression formats include gz, zstd, xz, lz4-frame, to
>> name a few.
>> For zstd, the identifier is a 4-byte value, documented in the
>> compression format :
>> https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md#zstandard-frames
>> As an example, zstd project provides a zlib-wrapper which is able to
>> dynamically recognize an input as gz or zstd, and route to
>> appropriate decoder, without any special header :
>> https://github.com/facebook/zstd/tree/dev/zlibWrapper
>>
>>
>> Unfortunately, not all compression algorithms provide an unambiguous
>> standard header.
>> LZO, for example, does not by default.
>> Behind a single name, lzo effectively groups multiple incompatible
>> variants, which must be correctly identified for proper decoding."
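
To illustrate the point about self-identifying formats, here is a
minimal sketch in Python that sniffs the leading magic bytes of a
framed stream. The magic values are the ones published for each
format; the function name `sniff_format` and the table layout are
made up for this example. It only applies to streams that actually
carry their standard frame header, and it says nothing about headerless
data such as raw LZO blocks.

```python
import gzip
import lzma

# Leading magic bytes of some self-identifying compression formats.
MAGICS = [
    ("zstd", bytes.fromhex("28b52ffd")),      # 0xFD2FB528, stored little-endian
    ("gzip", bytes.fromhex("1f8b")),
    ("xz", bytes.fromhex("fd377a585a00")),
    ("lz4-frame", bytes.fromhex("04224d18")),  # 0x184D2204, stored little-endian
]


def sniff_format(data: bytes) -> str:
    """Return the name of the format whose magic prefixes data, else 'unknown'."""
    for name, magic in MAGICS:
        if data.startswith(magic):
            return name
    return "unknown"


# Usage: route a buffer to a decoder based on its own header.
for sample in (gzip.compress(b"hello"), lzma.compress(b"hello"), b"hello"):
    print(sniff_format(sample))
```
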
>>
>> Den
>>
> Hi Den,
>
> thanks for the update.
>
> I am about to have an RFC patchset ready that adds the compression
> algorithm header. It will be easy to add support for zstd on top of that.
>
> I would prefer to have the same algorithm for all compressed clusters and
> avoid detecting it for each cluster. If you want to change the algorithm, you
> would have to re-encode the image. This way you can also easily detect at
> open time whether you support the image's compression algorithm, and fail
> directly if necessary.
>
> Peter
Completely agree.

Den
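
The open-time check discussed above could look roughly like the sketch
below (Python, for brevity). It relies on the existing QCOW2 version 3
header layout: magic "QFI\xfb", a big-endian version field at offset 4,
and the 64-bit incompatible_features bitmap at offset 72. Everything
about LZO here is an assumption for illustration: the bit number in
`INCOMPAT_LZO` is hypothetical, as are the function and parameter names.

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"

# Hypothetical bit, chosen only for this illustration.
INCOMPAT_LZO = 1 << 5


def check_header(header: bytes, supported: int) -> int:
    """Return the incompatible_features bitmap, or raise ValueError if the
    image uses an incompatible feature this reader does not support."""
    if len(header) < 8:
        raise ValueError("header too short")
    magic, version = struct.unpack_from(">4sI", header, 0)
    if magic != QCOW2_MAGIC:
        raise ValueError("not a qcow2 image")
    if version < 3:
        return 0  # version 2 headers carry no feature bitmaps
    if len(header) < 80:
        raise ValueError("header too short")
    (incompat,) = struct.unpack_from(">Q", header, 72)
    unknown = incompat & ~supported
    if unknown:
        raise ValueError(f"unsupported incompatible feature bits: {unknown:#x}")
    return incompat


# Usage: a minimal fake v3 header with only the hypothetical LZO bit set.
fake = struct.pack(">4sI", QCOW2_MAGIC, 3) + bytes(64) + struct.pack(">Q", INCOMPAT_LZO)
print(hex(check_header(fake, supported=INCOMPAT_LZO)))
```

A reader that does not know the bit (supported=0) refuses the image at
open time, before it touches any compressed cluster.
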


