qemu-devel


From: Max Reitz
Subject: Re: [Qemu-devel] [PATCH RFC 0/1] Allow storing the qcow2 L2 cache in disk
Date: Tue, 13 Dec 2016 09:02:34 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1

On 2016-12-12 at 15:13, Alberto Garcia wrote:
On Fri 09 Dec 2016 03:21:08 PM CET, Max Reitz wrote:

In some scenarios, however, there's a different alternative: if the
qcow2 image is stored in a slow backend (e.g. HDD), we could save
memory by putting the L2 cache in a faster one (SSD) instead of in
RAM.

Well, from a full design standpoint, it doesn't make a lot of sense to
me:

We have a two-level on-disk structure for cluster mapping so as to not
waste memory for unused areas and so that we don't need to keep one
large continuous chunk of metadata. Accessing the disk is slow, so we
also have an in-memory cache which is just a single level fully
associative cache replicating the same data (but just a part of it).
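
A minimal sketch of the two-level mapping described above, assuming 64 KiB clusters (the qcow2 default) and 8-byte L2 entries, so that one L2 table fills exactly one cluster. The function name is hypothetical and this is not QEMU's code; it only shows how a guest offset splits into an L1 index, an L2 index, and an offset within the cluster:

```python
CLUSTER_BITS = 16            # assumed: 64 KiB clusters
L2_BITS = CLUSTER_BITS - 3   # 8-byte entries -> 2**13 entries per L2 table


def split_guest_offset(offset):
    """Return (l1_index, l2_index, offset_in_cluster) for a guest offset."""
    offset_in_cluster = offset & ((1 << CLUSTER_BITS) - 1)
    l2_index = (offset >> CLUSTER_BITS) & ((1 << L2_BITS) - 1)
    l1_index = offset >> (CLUSTER_BITS + L2_BITS)
    return l1_index, l2_index, offset_in_cluster
```

Under these assumptions each L2 table maps 2**29 bytes (512 MiB) of guest data, so an L1 entry only needs to exist for 512 MiB regions that are actually in use.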

Now you want to replicate all of it and store it on disk. My mind
tells me that is duplicate data: We already have all of the metadata
elsewhere on disk, namely in the qcow2 file, and even better, it is
not stored in a fully associative structure there but directly mapped,
making finding the correct entry much quicker.

Yes, but the use case is that the qcow2 image is stored on a slow disk,
so things will be faster if we avoid having to read it too often.

But the data is there and it needs to be read, so we have three options:

  1) Read it every time we need it. It slows things down.
  2) Keep (part of) it in memory. It can use a lot of memory.
  3) Keep it on a faster disk.

We're talking about 3) here, and this is not about creating new
structures, but about changing the storage backend of the existing L2
cache (disk rather than RAM).
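
To put a rough number on option 2), here is a back-of-the-envelope sketch of how much L2 metadata is needed to cover a whole image, assuming 8-byte L2 entries and one entry per cluster. The helper name and the default cluster size are illustrative assumptions, not values taken from this thread:

```python
def l2_metadata_bytes(image_size, cluster_size=64 * 1024, entry_size=8):
    """Bytes of L2 entries needed to map every cluster of the image."""
    clusters = -(-image_size // cluster_size)  # ceiling division
    return clusters * entry_size


TIB = 1 << 40
MIB = 1 << 20
full_cache_mib = l2_metadata_bytes(TIB) // MIB  # 1 TiB image, 64 KiB clusters
```

With these assumptions a 1 TiB image needs 128 MiB of L2 entries, and the figure grows as the cluster size shrinks.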

I'm arguing that we already have an on-disk L2 structure and that is called simply the L1-L2 structure in the qcow2 file. The cache only makes sense because it is in RAM.

However, the thing is that the existing structures also only exist in
the original qcow2 file and cannot be just placed anywhere else, as
opposed to our cache. In order to solve this, we would need to
(incompatibly) modify the qcow2 format to allow storing data
independently from metadata. I think this would be certainly doable,
but the question is whether it is worth the effort.

You mean split the qcow2 file in two: data and metadata? I don't think
it's worth the effort.

That's the thing. I don't know.

I definitely like how simple your approach is, but from a design standpoint it is not exactly optimal, because O(n) for a cluster lookup is simply worse than O(1).

Maybe we can at least make the cache directly mapped if it is supposed
to cover the whole image? That is, we would basically just load all of
the L2 tables into memory and bypass the existing cache.
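
The difference between the two lookup strategies can be sketched as follows. This is a simplified illustration with hypothetical names and data layout, not the actual QEMU cache code: the fully associative variant may have to inspect every slot to find a table, while the directly mapped variant indexes by L1 index and needs a single access.

```python
def lookup_fully_associative(slots, table_offset):
    """slots: list of (table_offset, l2_table) pairs in arbitrary order.

    Any table may sit in any slot, so the search is O(n) in the slot count.
    """
    for cached_offset, l2_table in slots:
        if cached_offset == table_offset:
            return l2_table
    return None  # cache miss


def lookup_direct_mapped(tables, l1_index):
    """tables: list holding every L2 table, positioned by its L1 index.

    The position is known up front, so the lookup is O(1).
    """
    return tables[l1_index]
```

The directly mapped variant only works this simply when all L2 tables are loaded, which is the "cover the whole image" case.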

I don't see how this addresses the original use case that I described.

It just fixes the issue that the cache is fully associative. After that, the only issue I would have with your approach is that we are keeping duplicate data.

But leaving that aside, would that improve anything? I don't think the
cache itself adds any significant overhead here, IIRC even in your
presentation at KVM Forum 2015 qcow2 was comparable to raw as long as
all L2 tables were cached in memory.

I haven't compared CPU usage, though. That may have gone up quite a bit, I don't know. For large enough images, it may even become a bottleneck.

Max


