Re: [Qemu-block] QCOW2 support for LZO compression


From: Denis V. Lunev
Subject: Re: [Qemu-block] QCOW2 support for LZO compression
Date: Mon, 26 Jun 2017 13:23:34 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.1.1

On 06/26/2017 01:16 PM, Laszlo Ersek wrote:
> On 06/26/17 11:33, Denis V. Lunev wrote:
>> On 06/26/2017 12:20 PM, Peter Lieven wrote:
>>> Am 26.06.2017 um 10:28 schrieb Kevin Wolf:
>>>> [ Cc: qemu-devel; don't post to qemu-block only! ]
>>>>
>>>> Am 26.06.2017 um 09:57 hat Peter Lieven geschrieben:
>>>>> Hi,
>>>>>
>>>>> I am currently working on optimizing speed for compressed QCOW2
>>>>> images. We use them for templates and would also like to use them for
>>>>> backups, but the latter is almost infeasible because using gzip for
>>>>> compression is horribly slow. I tried to experiment with different
>>>>> options to deflate, but in the end I think it's better to use a
>>>>> different compression algorithm for cases where speed matters. As we
>>>>> already have probing for it in configure and as it is widely used I
>>>>> would like to use LZO for that purpose. I think it would be best to
>>>>> have a flag to indicate that compressed blocks use LZO compression,
>>>>> but I would need a little explanation of which of the feature fields I
>>>>> have to use to prevent an older (incompatible) QEMU from opening LZO
>>>>> compressed QCOW2 images.
>>>>>
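On Peter's question about the feature fields: the qcow2 header carries an
incompatible-features bitmap for exactly this purpose, since an implementation
that finds an unknown incompatible bit set must refuse to open the image. A
minimal sketch of how such a bit could be added (the *_LZO name and bit number
are invented for illustration, not an actual patch):

    /* Sketch only: a new bit in the qcow2 incompatible-features bitmap
     * (block/qcow2.h).  An implementation that sees an unknown incompatible
     * bit set must refuse to open the image, which is exactly the behaviour
     * wanted here for LZO-compressed clusters. */
    enum {
        QCOW2_INCOMPAT_DIRTY_BITNR   = 0,
        QCOW2_INCOMPAT_CORRUPT_BITNR = 1,
        QCOW2_INCOMPAT_LZO_BITNR     = 2,    /* hypothetical */

        QCOW2_INCOMPAT_DIRTY         = 1 << QCOW2_INCOMPAT_DIRTY_BITNR,
        QCOW2_INCOMPAT_CORRUPT       = 1 << QCOW2_INCOMPAT_CORRUPT_BITNR,
        QCOW2_INCOMPAT_LZO           = 1 << QCOW2_INCOMPAT_LZO_BITNR,

        QCOW2_INCOMPAT_MASK          = QCOW2_INCOMPAT_DIRTY
                                     | QCOW2_INCOMPAT_CORRUPT
                                     | QCOW2_INCOMPAT_LZO,
    };
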
>>>>> I also have already some numbers. I converted a fresh Debian 9 install
>>>>> which has an uncompressed QCOW2 size of 1158 MB with qemu-img to a
>>>>> compressed QCOW2.  With GZIP compression the result is 356 MB whereas
>>>>> the LZO version is 452 MB. However, the current GZIP variant needs 35
>>>>> seconds for this operation where LZO only needs 4 seconds. I think that
>>>>> is a good trade-off, especially as it is optional so the user can
>>>>> choose.
>>>>>
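For scale, those numbers correspond to compression ratios of roughly
356/1158 ≈ 31% (gzip) versus 452/1158 ≈ 39% (LZO), and throughputs of roughly
1158 MB / 35 s ≈ 33 MB/s versus 1158 MB / 4 s ≈ 290 MB/s, i.e. LZO is close to
an order of magnitude faster at the cost of about a quarter more output.
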
>>>>> What are your thoughts?
>>>> We had a related RFC patch by Den earlier this year, which never
>>>> received many comments and never got out of RFC:
>>>>
>>>> https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg04682.html
>>> I was not aware of that one. Thanks for pointing out.
>>>
>>>> So he chose a different algorithm (zstd). When I asked, he posted a
>>>> comparison of algorithms (however a generic one, not measured in the
>>>> context of qemu) that suggests that LZO would be slightly faster, but
>>>> have a considerably worse compression ratio with the settings that
>>>> were benchmarked.
>>> My idea to choose LZO was that it is widely available in any distro you
>>> can think of. We already have probing for it in configure.
>>> My concern with ZSTD would be that there seem to be no packages
>>> available for most distros and that it seems to be multi-threaded. I
>>> don't know if this will cause any trouble.
>>>
>> We have had that compression working in a multi-threaded process.
>>
>>>> I think it's clear that if there is any serious interest in compression,
>>>> we'll want to support at least one more algorithm. What we still need to
>>>> evaluate is which one(s) to take, and whether a simple incompatible flag
>>>> in the header like in Den's patch is enough or whether we should add a
>>>> whole new header field for the compression algorithm (like we already
>>>> have for encryption).
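For comparison, a header-field variant might look something like the sketch
below; none of these names or values exist in the format today, the point is
just that a per-image selector would work like the existing crypt_method field
does for encryption:

    /* Purely illustrative - no such field exists in the format today.  A
     * per-image compression selector, analogous to the existing crypt_method
     * field for encryption; the whole image uses a single algorithm. */
    typedef enum QCow2CompressMethod {
        QCOW2_COMPRESS_ZLIB = 0,    /* current default behaviour */
        QCOW2_COMPRESS_LZO  = 1,    /* hypothetical */
        QCOW2_COMPRESS_ZSTD = 2,    /* hypothetical */
    } QCow2CompressMethod;
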
>>> From my side there clearly is interest in optimizing the compression.
>>> It's even possible to speed up zlib by 3-4x by choosing other parameters
>>> for deflate which unfortunately are not compatible with our inflate
>>> settings.
>>>
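To illustrate the deflate-parameter trade-off Peter mentions (a sketch, not
the qcow2 code itself): zlib's deflateInit2() lets the compressor choose a
faster level, memLevel or strategy without affecting the decoder, while
windowBits has to stay in sync with the inflate side - qcow2 decompresses with
a fixed raw-deflate window (windowBits = -12, to my recollection), which is
why the more aggressive speed-ups are not compatible. The helper name and
error handling below are invented:

    #include <stdint.h>
    #include <string.h>
    #include <zlib.h>

    /* Compress one cluster with raw deflate at a faster level.  Changing the
     * level (or memLevel/strategy) keeps the stream decodable by the existing
     * inflate settings; changing windowBits would not. */
    static int compress_cluster_fast(uint8_t *dst, size_t *dst_len,
                                     uint8_t *src, size_t src_len)
    {
        z_stream strm;
        int ret;

        memset(&strm, 0, sizeof(strm));
        /* level 1 instead of Z_DEFAULT_COMPRESSION: more speed, worse ratio */
        ret = deflateInit2(&strm, 1, Z_DEFLATED, -12, 9, Z_DEFAULT_STRATEGY);
        if (ret != Z_OK) {
            return -1;
        }

        strm.next_in = src;
        strm.avail_in = src_len;
        strm.next_out = dst;
        strm.avail_out = *dst_len;

        ret = deflate(&strm, Z_FINISH);
        *dst_len -= strm.avail_out;
        deflateEnd(&strm);

        return ret == Z_STREAM_END ? 0 : -1;
    }
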
>>> I don't know if it's worth creating a new header field. Even if we spent
>>> two bits in the end (one for LZO and one for ZSTD), I think this wouldn't
>>> hurt. However, new compression algorithms are likely to pop up in the
>>> future and a header field would be more flexible.
>>>
>>> I just don't want to make it too complicated, and as you pointed out,
>>> compression is not that interesting for most people - maybe due to its
>>> speed.
>>>
>> I think we need something generic but simple. I think that we should not
>> support compression with different algorithms in a single file.
>>
>> Speaking about compression, we do have different constraints for
>> different situations, e.g. backups are written once and rarely read while
>> generic compression in a backing store is read frequently but never
>> written. Thus the exact algorithm should be selectable.
> Pluggable / selectable compression methods are likely the most flexible
> and future-proof. A new header sounds good to me (... said by someone
> who comments on this from the sidelines.)
>
> I would advise caution against multi-threaded compression libraries.
> Unless they are coded very-very carefully with regard to signal handling
> and general error handling / propagation, they cannot be considered
> "opaque" enough.
>
> (I had written and maintained the original (0.x) branch of "lbzip2",
> which was extremely conscious of error handling and signals. That was a
> challenge even in a standalone program, and I didn't even attempt to
> retrofit the code to the existing libbz2 APIs (i.e. I never even tried
> to librarize the code).)
>
> This does not mean that people cannot get such a library right. It's
> just that *by default* such a library will have a number of obscure bugs
> related to: signals, forking, and general error handling. It could also
> have problems with unbounded memory allocation. An MT compression
> library that gets all of this right is the exception IMO, not the norm.
> (I don't know anything about the ZSTD library; it could be such a high
> quality library.)
>
> Another complication with MT *de*compression is that the CPU demand from
> the IO thread (which is by default responsible for handling IO, when not
> using dataplane -- is that right?) would "leak" to other physical
> processors. I believe this can interfere with use cases where people
> carefully isolate host CPUs between "QEMU" and "non-QEMU" workloads,
> plus pin QEMU's VCPU threads, and IO threads, to different host CPUs
> (see vcpupin / emulatorpin / iothreadpin under
> <http://libvirt.org/formatdomain.html#elementsCPUTuning>.) It's probably
> possible to figure out the right thing for "ZSTD threads" as well, but
> IMO it remains a complication nonetheless.
We already have a prototype delegating compression
processing to the thread pool that processes other IO commands
like flush/fallocate. In the long term all compression/decompression
routines MUST be moved out of the IO thread,
as they add too much latency.
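A rough sketch of that idea (not Den's actual prototype): hand the CPU-heavy
work to QEMU's existing thread pool from a coroutine, so the IO thread only
submits the request and collects the result. CompressData and do_compress()
are invented placeholders, and the thread_pool_submit_co()/aio_get_thread_pool()
interfaces are the ones from include/block/thread-pool.h and include/block/aio.h,
whose exact signatures may differ between QEMU versions:

    #include "qemu/osdep.h"
    #include "block/aio.h"
    #include "block/block_int.h"
    #include "block/thread-pool.h"

    /* Invented container for one compression request. */
    typedef struct CompressData {
        uint8_t *dst;
        size_t dst_size;
        const uint8_t *src;
        size_t src_size;
    } CompressData;

    /* Runs in a worker thread of the pool, not in the IO thread. */
    static int compress_worker(void *opaque)
    {
        CompressData *d = opaque;
        /* do_compress() is a placeholder for the zlib/LZO/zstd call. */
        return do_compress(d->dst, &d->dst_size, d->src, d->src_size);
    }

    /* Called from a coroutine: yields while the worker runs, so the IO
     * thread stays free to service flush/fallocate/other requests. */
    static int coroutine_fn compress_in_pool(BlockDriverState *bs,
                                             CompressData *d)
    {
        ThreadPool *pool = aio_get_thread_pool(bdrv_get_aio_context(bs));

        return thread_pool_submit_co(pool, compress_worker, d);
    }
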

>
> Personally I would recommend a new header, and LZO, as a starting point.
> As pointed out above, LZO is widely available in distros. It has good
> performance, and it is single-threaded similarly to zlib. I use LZO for
> two QEMU-related purposes ATM:
>
> - I use the kdump-lzo format when dumping guest memory
>   (virsh dump $DOMAIN $CORE_FILE --memory-only --format kdump-lzo)
>
> - I use LZO compression for "virsh managedsave"
>   (by setting "save_image_format" in "/etc/libvirt/qemu.conf" to "lzop")
>
> Thanks
> Laszlo