From: Max Reitz
Subject: [Qemu-devel] [PULL 18/21] qcow2: Optimize the refcount-block overlap check
Date: Sun, 12 Feb 2017 02:39:26 +0100

From: Alberto Garcia <address@hidden>

The metadata overlap checks introduced in commit a40f1c2add help detect
corruption in a qcow2 image by verifying that data writes don't
overlap with existing metadata sections.

The 'refcount-block' check in particular iterates over the refcount
table in order to get the addresses of all refcount blocks and check
that none of them overlap with the region where we want to write.

The problem with the refcount table is that, because it always occupies
complete clusters, it is usually very large. With the default
values of cluster_size=64KB and refcount_bits=16 this table holds 8192
entries, each of them enough to map 2GB worth of host clusters.

So unless we're using images with several TB of allocated data this
table is going to be mostly empty, and iterating over it is a waste of
CPU. If the storage backend is fast enough this can have an effect on
I/O performance.
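
The figures quoted above can be checked with a few lines of C. This is only an illustrative sketch: the helper names are hypothetical and do not exist in QEMU, and it assumes 8-byte refcount table entries and refcount blocks that are exactly one cluster in size.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many 8-byte refcount table entries fit in
 * one cluster. */
static uint64_t sketch_reftable_entries(uint64_t cluster_size)
{
    return cluster_size / sizeof(uint64_t);
}

/* Hypothetical helper: how many bytes of host clusters one refcount
 * block describes, assuming the block occupies exactly one cluster. */
static uint64_t sketch_bytes_per_refblock(uint64_t cluster_size,
                                          unsigned refcount_bits)
{
    uint64_t refcounts_per_block;

    assert(refcount_bits > 0);
    /* One refcount of refcount_bits bits per host cluster. */
    refcounts_per_block = cluster_size * 8 / refcount_bits;
    return refcounts_per_block * cluster_size;
}
```

With cluster_size = 65536 and refcount_bits = 16 this gives 8192 table entries and 2 GiB per refcount block, so a fully used one-cluster table would describe 16 TiB of host clusters.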

This patch keeps the index of the last used (i.e. non-zero) entry in
the refcount table and updates it every time the table changes. The
refcount-block overlap check then iterates only up to that index
instead of reading the whole table.

In my tests with a 4GB qcow2 file stored in RAM this doubles the
write IOPS.

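
The idea can be sketched outside of QEMU as follows. This is a simplified model, not the code from the patch: SketchTable and the sketch_* functions are hypothetical names, an entry value of 0 stands for "unused" (the real code masks entries with REFT_OFFSET_MASK), and the table is assumed to have at least one entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the refcount table fields of BDRVQcow2State. */
typedef struct {
    uint64_t *entries;   /* 0 means "unused" in this sketch */
    uint32_t size;       /* number of entries, assumed > 0 */
    uint32_t max_index;  /* index of the last non-zero entry, 0 if none */
} SketchTable;

/* Full rescan from the end; needed after the table is loaded or replaced. */
static void sketch_update_max_index(SketchTable *t)
{
    uint32_t i;

    assert(t->size > 0);
    i = t->size - 1;
    while (i > 0 && t->entries[i] == 0) {
        i--;
    }
    t->max_index = i;
}

/* Filling in one entry can only raise the index. If the table has a
 * hole, index may be smaller than max_index, which then stays as is. */
static void sketch_set_entry(SketchTable *t, uint32_t index, uint64_t offset)
{
    assert(index < t->size);
    assert(offset != 0);
    t->entries[index] = offset;
    if (index > t->max_index) {
        t->max_index = index;
    }
}

/* Does [offset, offset + length) intersect any block of block_size bytes?
 * Only entries up to max_index are visited. */
static bool sketch_overlaps(const SketchTable *t, uint64_t block_size,
                            uint64_t offset, uint64_t length)
{
    uint32_t i;

    for (i = 0; i <= t->max_index; i++) {
        uint64_t start = t->entries[i];
        if (start != 0 && offset < start + block_size &&
            start < offset + length) {
            return true;
        }
    }
    return false;
}
```

The cost of each overlap check then depends on the position of the last used entry rather than on the allocated size of the table, which is where the saving comes from on small images.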
Signed-off-by: Alberto Garcia <address@hidden>
Message-id: address@hidden
Reviewed-by: Max Reitz <address@hidden>
Signed-off-by: Max Reitz <address@hidden>
---
block/qcow2.h | 1 +
block/qcow2-refcount.c | 24 +++++++++++++++++++++++-
block/qcow2.c | 1 +
3 files changed, 25 insertions(+), 1 deletion(-)
diff --git a/block/qcow2.h b/block/qcow2.h
index 182341483a..f8aeb08794 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -251,6 +251,7 @@ typedef struct BDRVQcow2State {
     uint64_t *refcount_table;
     uint64_t refcount_table_offset;
     uint32_t refcount_table_size;
+    uint32_t max_refcount_table_index; /* Last used entry in refcount_table */
     uint64_t free_cluster_index;
     uint64_t free_byte_offset;
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index cbfb3fe064..3dbde18612 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -83,6 +83,16 @@ static Qcow2SetRefcountFunc *const set_refcount_funcs[] = {
 /*********************************************************/
 /* refcount handling */
+static void update_max_refcount_table_index(BDRVQcow2State *s)
+{
+    unsigned i = s->refcount_table_size - 1;
+    while (i > 0 && (s->refcount_table[i] & REFT_OFFSET_MASK) == 0) {
+        i--;
+    }
+    /* Set s->max_refcount_table_index to the index of the last used entry */
+    s->max_refcount_table_index = i;
+}
+
 int qcow2_refcount_init(BlockDriverState *bs)
 {
     BDRVQcow2State *s = bs->opaque;
@@ -111,6 +121,7 @@ int qcow2_refcount_init(BlockDriverState *bs)
         }
         for(i = 0; i < s->refcount_table_size; i++)
             be64_to_cpus(&s->refcount_table[i]);
+        update_max_refcount_table_index(s);
     }
     return 0;
 fail:
@@ -439,6 +450,10 @@ static int alloc_refcount_block(BlockDriverState *bs,
         }
         s->refcount_table[refcount_table_index] = new_block;
+        /* If there's a hole in s->refcount_table then it can happen
+         * that refcount_table_index < s->max_refcount_table_index */
+        s->max_refcount_table_index =
+            MAX(s->max_refcount_table_index, refcount_table_index);
         /* The new refcount block may be where the caller intended to put its
          * data, so let it restart the search. */
@@ -580,6 +595,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
     s->refcount_table = new_table;
     s->refcount_table_size = table_size;
     s->refcount_table_offset = table_offset;
+    update_max_refcount_table_index(s);
     /* Free old table. */
     qcow2_free_clusters(bs, old_table_offset, old_table_size * sizeof(uint64_t),
@@ -2171,6 +2187,7 @@ write_refblocks:
     s->refcount_table = on_disk_reftable;
     s->refcount_table_offset = reftable_offset;
     s->refcount_table_size = reftable_size;
+    update_max_refcount_table_index(s);
     return 0;
@@ -2383,7 +2400,11 @@ int qcow2_check_metadata_overlap(BlockDriverState *bs, int ign, int64_t offset,
     }
     if ((chk & QCOW2_OL_REFCOUNT_BLOCK) && s->refcount_table) {
-        for (i = 0; i < s->refcount_table_size; i++) {
+        unsigned last_entry = s->max_refcount_table_index;
+        assert(last_entry < s->refcount_table_size);
+        assert(last_entry + 1 == s->refcount_table_size ||
+               (s->refcount_table[last_entry + 1] & REFT_OFFSET_MASK) == 0);
+        for (i = 0; i <= last_entry; i++) {
             if ((s->refcount_table[i] & REFT_OFFSET_MASK) &&
                 overlaps_with(s->refcount_table[i] & REFT_OFFSET_MASK,
                 s->cluster_size)) {
@@ -2871,6 +2892,7 @@ int qcow2_change_refcount_order(BlockDriverState *bs, int refcount_order,
     /* Now update the rest of the in-memory information */
     old_reftable = s->refcount_table;
     s->refcount_table = new_reftable;
+    update_max_refcount_table_index(s);
     s->refcount_bits = 1 << refcount_order;
     s->refcount_max = UINT64_C(1) << (s->refcount_bits - 1);
diff --git a/block/qcow2.c b/block/qcow2.c
index 96fb8a8f16..3e274bd1ba 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2743,6 +2743,7 @@ static int make_completely_empty(BlockDriverState *bs)
     s->refcount_table_offset = s->cluster_size;
     s->refcount_table_size = s->cluster_size / sizeof(uint64_t);
+    s->max_refcount_table_index = 0;
     g_free(s->refcount_table);
     s->refcount_table = new_reftable;
--
2.11.0