[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 1/1] qcow2: Skip copy-on-write when allocating a zero cluster
From: |
Vladimir Sementsov-Ogievskiy |
Subject: |
Re: [PATCH 1/1] qcow2: Skip copy-on-write when allocating a zero cluster |
Date: |
Fri, 14 Aug 2020 21:07:08 +0300 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.11.0 |
14.08.2020 17:57, Alberto Garcia wrote:
Since commit c8bb23cbdbe32f5c326365e0a82e1b0e68cdcd8a when a write
request results in a new allocation QEMU first tries to see if the
rest of the cluster outside the written area contains only zeroes.
In that case, instead of doing a normal copy-on-write operation and
writing explicit zero buffers to disk, the code zeroes the whole
cluster efficiently using pwrite_zeroes() with BDRV_REQ_NO_FALLBACK.
This improves performance very significantly but it only happens when
we are writing to an area that was completely unallocated before. Zero
clusters (QCOW2_CLUSTER_ZERO_*) are treated like normal clusters and
are therefore slower to allocate.
This happens because the code uses bdrv_is_allocated_above() rather
bdrv_block_status_above(). The former is not as accurate for this
purpose but it is faster. However in the case of qcow2 the underlying
call does already report zero clusters just fine so there is no reason
why we cannot use that information.
After testing 4KB writes on an image that only contains zero clusters
this patch results in almost five times more IOPS.
Would be great to add this case to simplebench as well.
The idea is good, but I'm a bit confused with new interface.
As I understand the "The former is not as accurate for this
purpose but it is faster" is related to (and only to) want_zero
parameter of block_status. bdrv_is_allocated_above is about
allocation (in terms of backing chains), and is_unallocated() is
just wrong user (with wrong name:)): it actually want another kind
of information.
So, for me it looks like we need an interface to
bdrv_block_status_above with want_zero=false (add another
function, or add this parameter to public interface).
And even with your approach, I'd keep original
bdrv_is_allocated_above as is, and just add new function with *is_zero
argument, to not modify all users of bdrv_is_allocated_above to pass
additional NULL value, in this way patch will touch less files.
Also, note, I have a series about block_status & is_allocated:
"[PATCH v5 0/5] fix & merge block_status_above and is_allocated_above"
(me go and ping it), which a bit reduces all the mess around
block_status & is_allocated.
--
Best regards,
Vladimir