Re: [Qemu-devel] [PATCH 00/12] qcow2: Add new overlap check functions


From: Max Reitz
Subject: Re: [Qemu-devel] [PATCH 00/12] qcow2: Add new overlap check functions
Date: Fri, 14 Nov 2014 16:10:15 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0

On 2014-11-03 at 18:04, Max Reitz wrote:
As has been requested, this series adds new overlap check functions to
the qcow2 code. My local branch is called "qcow2-improved-overlap-v1",
but I am not so sure whether it is actually an improvement; that is left
for you to decide, dear reviewers.

See patch 1 for an explanation of why this series exists and what it
does. Patch 1 is basically the core of this series, the rest just
employs the functions introduced there.

I have yet to do benchmarks to test whether this series actually
improves things, but judging from the iotests it at least does not slow
things down (which it did at one point during development; test 044 is
particularly good for testing this, so this actually has some
significance).

In a later patch, we may want to change the meaning of the "constant"
overlap checking option to mean the same as "cached", which is
everything except for inactive L2 tables. This series does make
checking for overlaps with inactive L2 tables at runtime just as cheap
as everything else (constant time plus caching), but using these checks
means qemu has to read all the snapshot L1 tables when opening a qcow2
file. This does not take long, of course, but it does result in a bit of
overhead so I did not want to enable it by default.

I think just enabling all overlap checks by default after this series
should be fine and useful, though.

Rejoice, for I return with benchmarks; the kind of benchmarks which always show the result I want them to show, which I should be known for by now.

First, I basically tried the setup from last time, only this time I didn't care about the in-VM I/O performance but just used perf record -g to record the number of cycles used by the overlap check. This worked, more or less, but bonnie++ (which I used as the in-VM benchmark tool) runs a mix of tests (reads as well as writes, and writes of different sizes), so the overlap checks do not look that bad there.

So I did what's absolutely worst for the overlap checks: dd if=/dev/zero of=/dev/vda bs=65536 oflag=direct (even worse would be cluster_size=512 and bs=512, but I wanted to get the test over with today, so I just went for 64k). The image was a 1G qcow2 image in tmpfs with ten snapshots (each having 128 MB of data, all pointing to the same data clusters, which are different from the active clusters). The snapshots are there simply because having internal snapshots makes the overlap checks even more CPU intensive, of course.
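
For reference, an image like the one described could be prepared roughly as follows. This is only a sketch under assumptions: the mail does not say how the image was built, so the snapshot names, the use of qemu-io to write the data, and the dry-run switch RUN are all illustrative choices, and the image path is left to the caller.

```shell
# Hypothetical helper, not taken from the original mail.
# Set RUN=echo to print the commands instead of executing them (dry run).
make_test_image() {
    img=$1
    # 1G qcow2 image (place it on tmpfs, e.g. under /dev/shm)
    ${RUN:-} qemu-img create -f qcow2 "$img" 1G
    # 128 MB of data that every snapshot will reference
    ${RUN:-} qemu-io -c "write 0 128M" "$img"
    n=1
    while [ "$n" -le 10 ]; do
        # Ten internal snapshots, all sharing the same data clusters
        ${RUN:-} qemu-img snapshot -c "snap$n" "$img"
        n=$((n + 1))
    done
    # Rewriting the range triggers copy-on-write, so the active
    # clusters end up different from the snapshot clusters
    ${RUN:-} qemu-io -c "write 0 128M" "$img"
}
```

Calling it as `RUN=echo make_test_image /dev/shm/test.qcow2` only lists the commands, which is a cheap way to check the sequence before running it for real.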

I ran dd 42 times in a row (for i in $(seq 1 42); do ...; done) and started up perf record just after I hit enter and canceled it just before the last dd exited.
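
The loop above can be written out as a small guest-side helper. It is a sketch, not the exact command line that was used: the function name and its parameters are made up for illustration, and attaching perf to the qemu process by PID on the host is an assumption (the mail only mentions perf record -g).

```shell
# Hypothetical helper: run_dd_loop TARGET RUNS [extra dd arguments...]
run_dd_loop() {
    target=$1
    runs=$2
    shift 2
    i=1
    while [ "$i" -le "$runs" ]; do
        # On a block device dd stops with "No space left on device"
        # once the device is full; that is expected, so ignore it.
        dd if=/dev/zero of="$target" bs=65536 "$@" 2>/dev/null || true
        i=$((i + 1))
    done
}

# Inside the guest, the run described above would be:
#   run_dd_loop /dev/vda 42 oflag=direct
# On the host (assumption: profiling the running qemu process by PID):
#   perf record -g -p <qemu pid>    # start right after launching the loop
#   perf report                     # look for qcow2_check_metadata_overlap
```

Passing count=N as an extra argument limits each pass to N blocks, which is handy for trying the loop on a scratch file first.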

I don't remember the exact numbers, but for the currently existing overlap check function (using the default of overlap-check=cached), it used about 13.5 % in the first run, 10.5 % in the second and (this I do know) 12.41 % in the third run.

With these patches applied, I had 0.08 % in the first run with overlap-check=cached and 0.09 % in the second run with overlap-check=all.

(all percentages are referring to the fraction of cycles used by qcow2_check_metadata_overlap())
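
To put the two sets of figures side by side, the ratio of the cycle fractions can be computed directly. The helper below is purely illustrative (its name is made up), and the result is a ratio of profile shares for one function, not a wall-clock speedup of the guest workload.

```shell
# Hypothetical helper: ratio of two cycle fractions (both in percent)
reduction_factor() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", old / new }'
}

# Third run without the patches vs. first run with them (cached):
reduction_factor 12.41 0.08
# Third run without the patches vs. the overlap-check=all run:
reduction_factor 12.41 0.09
```

Going by these numbers, the share of cycles spent in the overlap check drops by a factor of well over a hundred in both configurations.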

So this series apparently is actually worth it. I could still run another benchmark to test what happens if the cache is too small, which means that the range list representation has to be converted to the bitmap and back all the time. The biggest problem with that would be to somehow fit the image into tmpfs (if it's not in tmpfs, I don't want to do CPU benchmarks)...

Max