[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-block] [PATCH v3 6/7] block/backup: teach backup_cow_with_boun
From: |
Vladimir Sementsov-Ogievskiy |
Subject: |
Re: [Qemu-block] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once |
Date: |
Tue, 13 Aug 2019 15:32:31 +0000 |
13.08.2019 18:02, Max Reitz wrote:
> On 13.08.19 17:00, Vladimir Sementsov-Ogievskiy wrote:
>> 13.08.2019 17:57, Max Reitz wrote:
>>> On 13.08.19 16:39, Vladimir Sementsov-Ogievskiy wrote:
>>>> 13.08.2019 17:23, Max Reitz wrote:
>>>>> On 13.08.19 16:14, Vladimir Sementsov-Ogievskiy wrote:
>>>>>> 12.08.2019 19:37, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>> 12.08.2019 19:11, Max Reitz wrote:
>>>>>>>> On 12.08.19 17:47, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>>>> 12.08.2019 18:10, Max Reitz wrote:
>>>>>>>>>> On 10.08.19 21:31, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>>>>>> backup_cow_with_offload can transfer more than one cluster. Let
>>>>>>>>>>> backup_cow_with_bounce_buffer behave similarly. It reduces the
>>>>>>>>>>> number
>>>>>>>>>>> of IO requests, since there is no need to copy cluster by cluster.
>>>>>>>>>>>
>>>>>>>>>>> Logic around bounce_buffer allocation changed: we can't just
>>>>>>>>>>> allocate
>>>>>>>>>>> one-cluster-sized buffer to share for all iterations. We can't also
>>>>>>>>>>> allocate buffer of full-request length it may be too large, so
>>>>>>>>>>> BACKUP_MAX_BOUNCE_BUFFER is introduced. And finally, allocation
>>>>>>>>>>> logic
>>>>>>>>>>> is to allocate a buffer sufficient to handle all remaining
>>>>>>>>>>> iterations
>>>>>>>>>>> at the point where we need the buffer for the first time.
>>>>>>>>>>>
>>>>>>>>>>> Bonus: get rid of pointer-to-pointer.
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
>>>>>>>>>>> ---
>>>>>>>>>>> block/backup.c | 65
>>>>>>>>>>> +++++++++++++++++++++++++++++++-------------------
>>>>>>>>>>> 1 file changed, 41 insertions(+), 24 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/block/backup.c b/block/backup.c
>>>>>>>>>>> index d482d93458..65f7212c85 100644
>>>>>>>>>>> --- a/block/backup.c
>>>>>>>>>>> +++ b/block/backup.c
>>>>>>>>>>> @@ -27,6 +27,7 @@
>>>>>>>>>>> #include "qemu/error-report.h"
>>>>>>>>>>> #define BACKUP_CLUSTER_SIZE_DEFAULT (1 << 16)
>>>>>>>>>>> +#define BACKUP_MAX_BOUNCE_BUFFER (64 * 1024 * 1024)
>>>>>>>>>>> typedef struct CowRequest {
>>>>>>>>>>> int64_t start_byte;
>>>>>>>>>>> @@ -98,44 +99,55 @@ static void cow_request_end(CowRequest *req)
>>>>>>>>>>> qemu_co_queue_restart_all(&req->wait_queue);
>>>>>>>>>>> }
>>>>>>>>>>> -/* Copy range to target with a bounce buffer and return the bytes
>>>>>>>>>>> copied. If
>>>>>>>>>>> - * error occurred, return a negative error number */
>>>>>>>>>>> +/*
>>>>>>>>>>> + * Copy range to target with a bounce buffer and return the bytes
>>>>>>>>>>> copied. If
>>>>>>>>>>> + * error occurred, return a negative error number
>>>>>>>>>>> + *
>>>>>>>>>>> + * @bounce_buffer is assumed to enough to store
>>>>>>>>>>
>>>>>>>>>> s/to/to be/
>>>>>>>>>>
>>>>>>>>>>> + * MIN(BACKUP_MAX_BOUNCE_BUFFER, @end - @start) bytes
>>>>>>>>>>> + */
>>>>>>>>>>> static int coroutine_fn
>>>>>>>>>>> backup_cow_with_bounce_buffer(BackupBlockJob *job,
>>>>>>>>>>> int64_t
>>>>>>>>>>> start,
>>>>>>>>>>> int64_t
>>>>>>>>>>> end,
>>>>>>>>>>> bool
>>>>>>>>>>> is_write_notifier,
>>>>>>>>>>> bool
>>>>>>>>>>> *error_is_read,
>>>>>>>>>>> - void
>>>>>>>>>>> **bounce_buffer)
>>>>>>>>>>> + void
>>>>>>>>>>> *bounce_buffer)
>>>>>>>>>>> {
>>>>>>>>>>> int ret;
>>>>>>>>>>> BlockBackend *blk = job->common.blk;
>>>>>>>>>>> - int nbytes;
>>>>>>>>>>> + int nbytes, remaining_bytes;
>>>>>>>>>>> int read_flags = is_write_notifier ?
>>>>>>>>>>> BDRV_REQ_NO_SERIALISING : 0;
>>>>>>>>>>> assert(QEMU_IS_ALIGNED(start, job->cluster_size));
>>>>>>>>>>> - bdrv_reset_dirty_bitmap(job->copy_bitmap, start,
>>>>>>>>>>> job->cluster_size);
>>>>>>>>>>> - nbytes = MIN(job->cluster_size, job->len - start);
>>>>>>>>>>> - if (!*bounce_buffer) {
>>>>>>>>>>> - *bounce_buffer = blk_blockalign(blk, job->cluster_size);
>>>>>>>>>>> - }
>>>>>>>>>>> + bdrv_reset_dirty_bitmap(job->copy_bitmap, start, end - start);
>>>>>>>>>>> + nbytes = MIN(end - start, job->len - start);
>>>>>>>>>>> - ret = blk_co_pread(blk, start, nbytes, *bounce_buffer,
>>>>>>>>>>> read_flags);
>>>>>>>>>>> - if (ret < 0) {
>>>>>>>>>>> - trace_backup_do_cow_read_fail(job, start, ret);
>>>>>>>>>>> - if (error_is_read) {
>>>>>>>>>>> - *error_is_read = true;
>>>>>>>>>>> +
>>>>>>>>>>> + remaining_bytes = nbytes;
>>>>>>>>>>> + while (remaining_bytes) {
>>>>>>>>>>> + int chunk = MIN(BACKUP_MAX_BOUNCE_BUFFER, remaining_bytes);
>>>>>>>>>>> +
>>>>>>>>>>> + ret = blk_co_pread(blk, start, chunk, bounce_buffer,
>>>>>>>>>>> read_flags);
>>>>>>>>>>> + if (ret < 0) {
>>>>>>>>>>> + trace_backup_do_cow_read_fail(job, start, ret);
>>>>>>>>>>> + if (error_is_read) {
>>>>>>>>>>> + *error_is_read = true;
>>>>>>>>>>> + }
>>>>>>>>>>> + goto fail;
>>>>>>>>>>> }
>>>>>>>>>>> - goto fail;
>>>>>>>>>>> - }
>>>>>>>>>>> - ret = blk_co_pwrite(job->target, start, nbytes, *bounce_buffer,
>>>>>>>>>>> - job->write_flags);
>>>>>>>>>>> - if (ret < 0) {
>>>>>>>>>>> - trace_backup_do_cow_write_fail(job, start, ret);
>>>>>>>>>>> - if (error_is_read) {
>>>>>>>>>>> - *error_is_read = false;
>>>>>>>>>>> + ret = blk_co_pwrite(job->target, start, chunk,
>>>>>>>>>>> bounce_buffer,
>>>>>>>>>>> + job->write_flags);
>>>>>>>>>>> + if (ret < 0) {
>>>>>>>>>>> + trace_backup_do_cow_write_fail(job, start, ret);
>>>>>>>>>>> + if (error_is_read) {
>>>>>>>>>>> + *error_is_read = false;
>>>>>>>>>>> + }
>>>>>>>>>>> + goto fail;
>>>>>>>>>>> }
>>>>>>>>>>> - goto fail;
>>>>>>>>>>> +
>>>>>>>>>>> + start += chunk;
>>>>>>>>>>> + remaining_bytes -= chunk;
>>>>>>>>>>> }
>>>>>>>>>>> return nbytes;
>>>>>>>>>>> @@ -301,9 +313,14 @@ static int coroutine_fn
>>>>>>>>>>> backup_do_cow(BackupBlockJob *job,
>>>>>>>>>>> }
>>>>>>>>>>> }
>>>>>>>>>>> if (!job->use_copy_range) {
>>>>>>>>>>> + if (!bounce_buffer) {
>>>>>>>>>>> + size_t len = MIN(BACKUP_MAX_BOUNCE_BUFFER,
>>>>>>>>>>> + MAX(dirty_end - start, end -
>>>>>>>>>>> dirty_end));
>>>>>>>>>>> + bounce_buffer =
>>>>>>>>>>> blk_try_blockalign(job->common.blk, len);
>>>>>>>>>>> + }
>>>>>>>>>>
>>>>>>>>>> If you use _try_, you should probably also check whether it
>>>>>>>>>> succeeded.
>>>>>>>>>
>>>>>>>>> Oops, you are right, of course.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Anyway, I wonder whether it’d be better to just allocate this buffer
>>>>>>>>>> once per job (the first time we get here, probably) to be of size
>>>>>>>>>> BACKUP_MAX_BOUNCE_BUFFER and put it into BackupBlockJob. (And maybe
>>>>>>>>>> add
>>>>>>>>>> a buf-size parameter similar to what the mirror jobs have.)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Once per job will not work, as we may have several guest writes in
>>>>>>>>> parallel and therefore
>>>>>>>>> several parallel copy-before-write operations.
>>>>>>>>
>>>>>>>> Hm. I’m not quite happy with that because if the guest just issues
>>>>>>>> many
>>>>>>>> large discards in parallel, this means that qemu will allocate a large
>>>>>>>> amount of memory.
>>>>>>>>
>>>>>>>> It would be nice if there was a simple way to keep track of the total
>>>>>>>> memory usage and let requests yield if they would exceed it.
>>>>>>>
>>>>>>> Agree, it should be fixed anyway.
>>>>>>>
>>>>>>
>>>>>>
>>>>>> But still..
>>>>>>
>>>>>> Synchronous mirror allocates full-request buffers on guest write. Is it
>>>>>> correct?
>>>>>>
>>>>>> If we assume that it is correct to double memory usage of guest
>>>>>> operations, than for backup
>>>>>> the problem is only in write_zero and discard where guest-assumed memory
>>>>>> usage should be zero.
>>>>>
>>>>> Well, but that is the problem. I didn’t say anything in v2, because I
>>>>> only thought of normal writes and I found it fine to double the memory
>>>>> usage there (a guest won’t issue huge write requests in parallel). But
>>>>> discard/write-zeroes are a different matter.
>>>>>
>>>>>> And if we should distinguish writes from write_zeroes and discard, it's
>>>>>> better to postpone this
>>>>>> improvement to be after backup-top filter merged.
>>>>>
>>>>> But do you need to distinguish it? Why not just keep track of memory
>>>>> usage and put the current I/O coroutine to sleep in a CoQueue or
>>>>> something, and wake that up at the end of backup_do_cow()?
>>>>>
>>>>
>>>> 1. Because if we _can_ allow doubling of memory, it's more effective to
>>>> not restrict allocations on
>>>> guest writes. It's just seems to be more effective technique.
>>>
>>> But the problem with backup and zero writes/discards is that the memory
>>> is not doubled. The request doesn’t need any memory, but the CBW
>>> operation does, and maybe lots of it.
>>>
>>> So the guest may issue many zero writes/discards in parallel and thus
>>> exhaust memory on the host.
>>
>> So this is the reason to separate writes from write-zeros/discrads. So at
>> least write will be happy. And I
>> think that write is more often request than write-zero/discard
>
> But that makes it complicated for no practical gain whatsoever.
>
>>>
>>>> 2. Anyway, I'd allow some always-available size to allocate - let it be
>>>> one cluster, which will correspond
>>>> to current behavior and prevent guest io hang in worst case.
>>>
>>> The guest would only hang if it we have to copy more than e.g. 64 MB at
>>> a time. At which point I think it’s not unreasonable to sequentialize
>>> requests.
>
> Because of this. How is it bad to start sequentializing writes when the
> data exceeds 64 MB?
>
So you want total memory limit of 64 MB? (with possible parameter like in
mirror)
And allocation algorithm to copy count bytes:
if free_mem >= count: allocate count bytes
else if free_mem >= cluster: allocate cluster and copy in a loop
else wait in co-queue until some memory available and retry
Is it OK for you?
--
Best regards,
Vladimir
- Re: [Qemu-block] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once, (continued)
- Re: [Qemu-block] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once, Max Reitz, 2019/08/12
- Re: [Qemu-block] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once, Vladimir Sementsov-Ogievskiy, 2019/08/12
- Re: [Qemu-block] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once, Max Reitz, 2019/08/12
- Re: [Qemu-block] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once, Vladimir Sementsov-Ogievskiy, 2019/08/12
- Re: [Qemu-block] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once, Vladimir Sementsov-Ogievskiy, 2019/08/13
- Re: [Qemu-block] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once, Max Reitz, 2019/08/13
- Re: [Qemu-block] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once, Vladimir Sementsov-Ogievskiy, 2019/08/13
- Re: [Qemu-block] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once, Max Reitz, 2019/08/13
- Re: [Qemu-block] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once, Vladimir Sementsov-Ogievskiy, 2019/08/13
- Re: [Qemu-block] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once, Max Reitz, 2019/08/13
- Re: [Qemu-block] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once,
Vladimir Sementsov-Ogievskiy <=
- Re: [Qemu-block] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once, Max Reitz, 2019/08/13
- Re: [Qemu-block] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once, Vladimir Sementsov-Ogievskiy, 2019/08/13
- Re: [Qemu-block] [PATCH v3 6/7] block/backup: teach backup_cow_with_bounce_buffer to copy more at once, Max Reitz, 2019/08/13