From: Max Reitz
Subject: Re: [Qemu-devel] [PATCH v2 1/2] qcow2: Give the refcount cache the minimum possible size by default
Date: Mon, 16 Apr 2018 16:05:21 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0

On 2018-04-16 15:56, Alberto Garcia wrote:
> On Fri 13 Apr 2018 05:00:48 PM CEST, Max Reitz wrote:
>> On 2018-03-14 09:29, Alberto Garcia wrote:
>>> The L2 and refcount caches have default sizes that can be overridden
>>> using the l2-cache-size and refcount-cache-size options (an additional
>>> parameter named cache-size sets the combined size of both caches).
>>>
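For reference, both sizes can be set explicitly on the command line (see docs/qcow2-cache.txt); the image name and byte values below are arbitrary, chosen only to show the options:

    # Set both caches explicitly:
    qemu-system-x86_64 -drive file=hd.qcow2,format=qcow2,l2-cache-size=4194304,refcount-cache-size=1048576

    # Or set only the combined budget and let QEMU split it:
    qemu-system-x86_64 -drive file=hd.qcow2,format=qcow2,cache-size=5242880
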
>>> Unless forced by one of the aforementioned parameters, QEMU will set
>>> the unspecified sizes so that the L2 cache is 4 times larger than the
>>> refcount cache.
>>>
>>> This is based on the premise that the refcount metadata needs to be
>>> only a fourth of the L2 metadata to cover the same amount of disk
>>> space. This is incorrect for two reasons:
>>>
>>>  a) The amount of disk covered by an L2 table depends solely on the
>>>     cluster size, but in the case of a refcount block it depends on
>>>     the cluster size *and* the width of each refcount entry.
>>>     The 4/1 ratio is only valid with 16-bit entries (the default).
>>>
>>>  b) When we talk about disk space and L2 tables we are talking about
>>>     guest space (L2 tables map guest clusters to host clusters),
>>>     whereas refcount blocks are used for host clusters (including
>>>     L1/L2 tables and the refcount blocks themselves). On a fully
>>>     populated (and uncompressed) qcow2 file, image size > virtual size
>>>     so there are more refcount entries than L2 entries.
>>>
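To put numbers on (a): with the default 64 KiB clusters an L2 table (8-byte entries) maps 512 MiB of guest data, while a refcount block with the default 16-bit entries covers 2 GiB of host clusters, which is where the 4/1 ratio comes from; with 64-bit entries both cover the same amount. A back-of-the-envelope check (illustrative Python, not QEMU code):

    cluster_size = 65536                                   # 64 KiB clusters (the default)
    l2_coverage = (cluster_size // 8) * cluster_size       # 8-byte L2 entries, one per cluster
    for refcount_bits in (16, 32, 64):
        entries = cluster_size * 8 // refcount_bits        # entries per refcount block
        refcount_coverage = entries * cluster_size
        print(refcount_bits, "bit entries:", refcount_coverage // l2_coverage, ": 1")
    # 16-bit entries -> 4 : 1, 32-bit -> 2 : 1, 64-bit -> 1 : 1
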
>>> Problem (a) could be fixed by adjusting the algorithm to take into
>>> account the refcount entry width. Problem (b) could be fixed by
>>> increasing the refcount cache size a bit to account for the clusters
>>> used for qcow2 metadata.
>>
>> Yeah, well, but I wouldn't give too much thought to (b).  In the case of
>> internal snapshots, you won't write to them, so those clusters don't
>> matter.  So, yes, you get a bit of overhead for the metadata, but come
>> on, that's not really much.
> 
> With that text I just wanted to point out that the calculation mixes
> host clusters and guest clusters. I agree that the ratio we're using is
> a good approximation (apart from the refcount width issue).
> 
>>> However this patch takes a completely different approach and instead
>>> of keeping a ratio between both cache sizes it assigns as much as
>>> possible to the L2 cache and the remainder to the refcount cache.
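The resulting split can be sketched roughly like this (a simplification with made-up names, not the actual code in block/qcow2.c; it assumes the four-cluster minimum for the refcount cache mentioned below):

    def split_cache(total_cache_size, cluster_size, virtual_disk_size):
        # One 8-byte L2 entry maps one cluster, so this is all the L2 cache
        # that could ever be useful for a disk of this size:
        max_useful_l2 = virtual_disk_size * 8 // cluster_size
        min_refcount = 4 * cluster_size                    # assumed minimum: four clusters
        l2_cache = min(max_useful_l2, total_cache_size - min_refcount)
        refcount_cache = total_cache_size - l2_cache
        return l2_cache, refcount_cache
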
>>
>> Seems reasonable in a sense, but I'm wary still.  I especially don't
>> like that it just changes the default when nothing has been specified.
>>
>> I can see that we want to limit the refcount cache size, so if you
>> want to give a lot of RAM to the L2 tables, then you don't want a
>> 25 % overhead for the refcount cache, but do we really need to limit
>> it to the minimum size?
>>
>> OTOH, four clusters doesn't seem extremely limited...  Did you do your
>> benchmark on an HDD?
> 
> I tried, I see no difference, and I don't see why there would be any.
> 
> Refcount entries are used to reference host clusters, and host clusters
> are always(*) allocated sequentially, so caching refcount blocks doesn't
> do much. You're always loading the same refcount block until it's full,
> then you move on to the next one. As I mentioned in a previous e-mail,
> having a very large refcount cache would even be detrimental because it
> would make cache hits slower (noticeable under tmpfs).
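The arithmetic behind that: with 64 KiB clusters and 16-bit entries a single refcount block holds 32768 entries, i.e. it covers 2 GiB of host clusters, so sequential allocation keeps hitting the same cached block for a long time before moving to the next one. In the same illustrative terms as above:

    cluster_size = 65536
    entries_per_block = cluster_size * 8 // 16             # 32768 16-bit entries per block
    print(entries_per_block * cluster_size // 2**30, "GiB of host clusters per refcount block")  # -> 2
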

Ah, right.  I'm not sure whether I want to ask you whether you have
tested internal snapshots.  I suppose it can be detrimental for them
because when taking or deleting a snapshot you need to update the
refcounts of a whole bunch of clusters that might be spread randomly
across the image.  But I suspect that it would be rather time-consuming
to produce an image with such a configuration; and I suppose if people
want to make heavy use of internal snapshots they can adapt the cache
size themselves?
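If someone does make heavy use of internal snapshots, enlarging the refcount cache explicitly is a one-option change, e.g. (arbitrary sizes, just to show the knob):

    -drive file=vm.qcow2,format=qcow2,l2-cache-size=4194304,refcount-cache-size=4194304
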

Max

> (*) when a cluster gets deallocated (e.g. when the refcount table needs
> to grow) then there's a hole in the file and QEMU starts looking for
> free clusters again from the beginning of the file. In that case old
> refcount blocks are loaded again, but the cache wouldn't help here
> either (unless we had all refcount blocks cached).


