Re: [Qemu-block] Performance impact of the qcow2 overlap checks
From: Alberto Garcia
Subject: Re: [Qemu-block] Performance impact of the qcow2 overlap checks
Date: Tue, 24 Jan 2017 17:43:44 +0100
User-agent: Notmuch/0.18.2 (http://notmuchmail.org) Emacs/24.4.1 (i586-pc-linux-gnu)

On Mon 23 Jan 2017 05:29:49 PM CET, Max Reitz wrote:
> Refcount data will only be queried when writing data to the image. If
> that data has been overwritten, we have a chance that it is being set
> to 0 (which is rather large because 0 generally has a higher
> probability of being a part of data, admittedly). But we also have a
> chance that it is set to something else, which generally will be
> greater than the number of internal snapshots (+ 1). Therefore, such
> corruption should be easily detectable before much data is wrongly
> overwritten.
>
> The drawbacks with this approach would be the following:
> (1) Is printing a warning enough to make the user shut down the VM as
> fast as possible and run qemu-img check?
> (2) It is legal to have a greater refcount than the number of internal
> snapshots plus one. qemu never produces such images, though (or does
> it?). Could there be existing images where users will be just annoyed by
> such warnings? Should we add a runtime option to disable them?

I don't think it's legal, or is there any reason why it would be?

I'll try to summarize my opinion:

- If the refcount method that you propose can guarantee that the image
is corrupted, then that should clearly cause an I/O error, and I would
prevent further writes to the image (if that's possible).
- If this method cannot guarantee that the image is corrupted and can
only indicate that it might be, then I don't think I'd bother and I'd
simply keep the current overlap check.
- Printing a warning and expecting the user to see it doesn't seem like
a good way to deal with data corruption.
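
To make the distinction between "guarantee" and "indication" concrete, here is a minimal sketch of the heuristic from the quoted proposal. The function name and arguments are hypothetical and this is not QEMU code; it only encodes the rule "a refcount greater than the number of internal snapshots plus one looks suspicious".

```python
def refcount_looks_suspicious(refcount: int, num_internal_snapshots: int) -> bool:
    """Heuristic only (hypothetical helper, not a QEMU function).

    Returns True when the refcount exceeds the number of internal
    snapshots plus one (the extra one being the active layer).
    """
    return refcount > num_internal_snapshots + 1


# With two internal snapshots, a cluster referenced by the active layer
# and by both snapshots has refcount 3, which passes the heuristic.
print(refcount_looks_suspicious(3, 2))       # False
# A refcount overwritten with arbitrary data is likely to be flagged.
print(refcount_looks_suspicious(0x4141, 2))  # True
# A refcount overwritten with zeros is not flagged by this rule.
print(refcount_looks_suspicious(0, 2))       # False
```

The last case is the reason the heuristic is only an indication: data made of zeros slips through, so a clean result does not prove that the image is intact.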
> And of course another approach I already mentioned would be to scrap
> the overlap checks altogether once we have image locking (and I guess
> we can keep them around in their current form at least until then).

I think the overlap checks are fine; at least in my tests I only found
problems with one of them, and only in some scenarios(*). So if we
cannot optimize them easily I'd simply tell the user about the risks and
suggest disabling them. Maybe the only thing that we need is simply
good documentation. What are the chances of corrupted qcow2 images that
are not caused by the user messing up? Do we know how many such cases
there are?

I think the most obvious candidate for optimization is refcount-block,
and as I said it's the check that would create the bottleneck in the
most common scenarios. The optimization is simple: if the size of the
qcow2 image is 7GB then you only need to check the first 4 entries in
the refcount table (with the default 64KB clusters and 16-bit refcounts,
each refcount block covers 2GB of the image file).
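
The arithmetic behind the "first 4 entries" figure can be sketched as follows. This assumes the qcow2 defaults (64 KiB clusters, 16-bit refcounts); the function name is hypothetical and `file_size` stands for the current size of the image file.

```python
GiB = 1024 ** 3


def refcount_table_entries_in_use(file_size: int,
                                  cluster_size: int = 64 * 1024,
                                  refcount_bits: int = 16) -> int:
    """Number of leading refcount table entries needed to cover file_size.

    Hypothetical helper for illustration, not a QEMU function.
    """
    # One refcount block occupies one cluster and holds this many refcounts.
    refcounts_per_block = cluster_size * 8 // refcount_bits
    # Each refcount describes one cluster, so one block covers this many bytes.
    bytes_per_block = refcounts_per_block * cluster_size
    # Ceiling division: a partially covered range still needs its block.
    return -(-file_size // bytes_per_block)


print(refcount_table_entries_in_use(7 * GiB))  # 4
```

With these defaults a refcount block holds 32768 refcounts and covers 2 GiB, so a 7 GiB file needs ceil(7 / 2) = 4 refcount blocks.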
I can think of two problems with this, which is why I haven't sent a
patch yet:
(1) This needs the current size of the image every time we want to
perform that check, and that means I/O.
(2) The unused entries that we're skipping in the refcount table should
be 0, but what if they're not? That would be a sign of data
corruption. But should we bother? Those entries will be checked
before they're used if the image grows large enough.
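
A minimal sketch of what the bounded check and the open question in (2) could look like. It is an illustration under simplifying assumptions, not QEMU's implementation: the refcount table is modelled as a plain list of byte offsets (the reserved bits of real table entries are ignored), a zero entry means "no refcount block allocated", and both function names are hypothetical.

```python
def overlaps_refcount_block(write_offset: int, write_size: int,
                            refcount_table: list[int], entries_in_use: int,
                            cluster_size: int = 64 * 1024) -> bool:
    """Check a write against only the first entries_in_use table entries."""
    write_end = write_offset + write_size
    for block_offset in refcount_table[:entries_in_use]:
        if block_offset == 0:
            continue  # unallocated entry, nothing to protect
        # Half-open interval intersection with the one-cluster refcount block.
        if write_offset < block_offset + cluster_size and block_offset < write_end:
            return True
    return False


def nonzero_skipped_entries(refcount_table: list[int],
                            entries_in_use: int) -> list[int]:
    """Indexes of skipped entries that are unexpectedly non-zero."""
    return [i for i, offset in enumerate(refcount_table)
            if i >= entries_in_use and offset != 0]


table = [0x10000, 0x20000, 0, 0x50000, 0, 0]
print(overlaps_refcount_block(0x20000, 512, table, 4))      # True
print(overlaps_refcount_block(0x30000, 0x10000, table, 4))  # False
print(nonzero_skipped_entries(table, 4))                    # []
```

The second helper corresponds to problem (2): it scans the whole tail, so it would only make sense as an occasional sanity check rather than on every write.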

(*) I actually noticed (I'm talking about a qcow2 image stored in RAM
now) that disabling the refcount-block check dramatically increases
(+90%) the number of IOPS when using virtio-blk, but doesn't seem to
have any effect (my tests even show a slightly negative effect!!) when
using virtio-scsi. Does that make sense? Am I hitting a SCSI limit or
what would be the reason for this?

Berto