
Re: [Qemu-devel] [PATCH 07/12] block/backup: add 'always' bitmap sync po


From: John Snow
Subject: Re: [Qemu-devel] [PATCH 07/12] block/backup: add 'always' bitmap sync policy
Date: Thu, 20 Jun 2019 14:44:30 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.7.0


On 6/20/19 1:00 PM, Max Reitz wrote:
> On 20.06.19 03:03, John Snow wrote:
>> This adds an "always" policy for bitmap synchronization. Regardless of if
>> the job succeeds or fails, the bitmap is *always* synchronized. This means
>> that for backups that fail part-way through, the bitmap retains a record of
>> which sectors need to be copied out to accomplish a new backup using the
>> old, partial result.
>>
>> In effect, this allows us to "resume" a failed backup; however the new backup
>> will be from the new point in time, so it isn't a "resume" as much as it is
>> an "incremental retry." This can be useful in the case of extremely large
>> backups that fail considerably through the operation and we'd like to not
>> waste the work that was already performed.
>>
>> Signed-off-by: John Snow <address@hidden>
>> ---
>>  qapi/block-core.json |  5 ++++-
>>  block/backup.c       | 10 ++++++----
>>  2 files changed, 10 insertions(+), 5 deletions(-)
>>
>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>> index 0332dcaabc..58d267f1f5 100644
>> --- a/qapi/block-core.json
>> +++ b/qapi/block-core.json
>> @@ -1143,6 +1143,9 @@
>>  # An enumeration of possible behaviors for the synchronization of a bitmap
>>  # when used for data copy operations.
>>  #
>> +# @always: The bitmap is always synchronized with remaining blocks to copy,
>> +#          whether or not the operation has completed successfully or not.
>> +#
>>  # @conditional: The bitmap is only synchronized when the operation is successul.
>>  #               This is useful for Incremental semantics.
>>  #
>> @@ -1153,7 +1156,7 @@
>>  # Since: 4.1
>>  ##
>>  { 'enum': 'BitmapSyncMode',
>> -  'data': ['conditional', 'never'] }
>> +  'data': ['always', 'conditional', 'never'] }
>>  
>>  ##
>>  # @MirrorCopyMode:
>> diff --git a/block/backup.c b/block/backup.c
>> index 627f724b68..beb2078696 100644
>> --- a/block/backup.c
>> +++ b/block/backup.c
>> @@ -266,15 +266,17 @@ static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
>>      BlockDriverState *bs = blk_bs(job->common.blk);
>>  
>>      if (ret < 0 || job->bitmap_mode == BITMAP_SYNC_MODE_NEVER) {
>> -        /* Failure, or we don't want to synchronize the bitmap.
>> -         * Merge the successor back into the parent, delete nothing. */
>> +        /* Failure, or we don't want to synchronize the bitmap. */
>> +        if (job->bitmap_mode == BITMAP_SYNC_MODE_ALWAYS) {
>> +            bdrv_dirty_bitmap_claim(job->sync_bitmap, &job->copy_bitmap);
> 
> Hmm...  OK, bitmaps in backup always confuse me, so bear with me, please.
> 

I realize this is an extremely dense section that actually covers a
*lot* of pathways.

> (Hi, I’m a time traveler from the end of this section and I can tell you
> that everything is fine.  I was just confused.  I’ll still keep this
> here, because it was so much work.)
> 
> The copy_bitmap is copied from the sync_bitmap at the beginning, so the
> sync_bitmap can continue to be dirtied, but that won’t affect the job.
> In normal incremental mode, this means that the sync point is always at
> the beginning of the job.  (Well, naturally, because that’s how backup
> is supposed to go.)
> 

sync_bitmap: This is used as an initial manifest for which sectors to
copy out. It is the user-provided bitmap. We actually *never* edit this
bitmap in the body of the job.

copy_bitmap: This is the manifest for which blocks remain to be copied
out. We clear bits in this as we go, because we use it as our loop
condition.

So what you say is only half-true: the sync_bitmap remains static for
the duration of the job, and it has an anonymous child that accrues new
writes. This is a holdover from before we had a copy_bitmap, when we
used the sync_bitmap directly as our loop condition.

(This could be simplified upstream at present; but after this patch it
cannot be for reasons explained below. We do wish to maintain three
distinct sets of bits:
1. The bits at the start of the operation,
2. The bits accrued during the operation, and
3. The bits that remain to be, or were not, copied during the operation.)

So there are actually three bitmaps:

- sync_bitmap: actually just static and read-only
- sync_bitmap's anonymous child: accrues new writes.
- copy_bitmap: loop conditional.
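
As a rough illustration of that split, here is a toy model in which each
bit of a 64-bit word stands for one block. It is only a sketch: the real
objects are BdrvDirtyBitmap, and every name below (ToyBackupBitmaps,
toy_job_start, toy_guest_write, toy_block_copied) is hypothetical and not
part of QEMU. It also ignores copy-before-write handling.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: one bit per block. All names here are hypothetical. */
typedef struct ToyBackupBitmaps {
    uint64_t sync;       /* user-provided manifest; static while the job runs */
    uint64_t successor;  /* anonymous child: accrues writes made during the job */
    uint64_t copy;       /* blocks still to copy; the job's loop condition */
} ToyBackupBitmaps;

/* Job start: the copy manifest is seeded from the user's bitmap. */
static void toy_job_start(ToyBackupBitmaps *b, uint64_t user_bitmap)
{
    b->sync = user_bitmap;
    b->successor = 0;
    b->copy = user_bitmap;
}

/* In this model, a guest write during the job lands in the successor only. */
static void toy_guest_write(ToyBackupBitmaps *b, unsigned block)
{
    assert(block < 64);
    b->successor |= UINT64_C(1) << block;
}

/* Copying a block out clears its bit in the copy manifest only. */
static void toy_block_copied(ToyBackupBitmaps *b, unsigned block)
{
    assert(block < 64);
    b->copy &= ~(UINT64_C(1) << block);
}
```

Starting from a user bitmap of 0x0F, a guest write to block 5 and a copy
of block 0 leave sync at 0x0F, successor at 0x20 and copy at 0x0E.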

> But then replacing the sync_bitmap with the copy_bitmap here means that
> all of these dirtyings that happened during the job are lost.  Hmm, but
> that doesn’t matter, does it?  Because whenever something was dirtied in
> sync_bitmap, the corresponding area must have been copied to the backup
> due to the job.
> 

The new dirty bits were accrued very secretly in the anonymous child.
The new dirty bits are merged in via the reclaim() function.

So, what happens is:

- sync_bitmap gets the bit pattern of copy_bitmap (one way or another)
- sync_bitmap reclaims (merges with) its anonymous child.

> Ah, yes, it would actually be wrong to keep the new dirty bits, because
> in this mode, sync_bitmap should (on failure) reflect what is left to
> copy to make the backup complete.  Copying these newly dirtied sectors
> would be wrong.  (Yes, I know you wrote that in the documentation of
> @always.  I just tried to get a different perspective.)
> 
> Yes, yes, and copy_bitmap is always set whenever a CBW to the target
> fails before the source can be updated.  Good, good.
> 

You might have slightly the wrong idea; it's important to keep track of
what was dirtied during the operation because that data is important for
the next bitmap backup.

The merging of "sectors left to copy" (in the case of a failed backup)
and "sectors dirtied since we started the operation" forms the actual
minimal set needed to re-write to this target to achieve a new
functioning point in time. This is what you get with the "always" mode
in a failure case.

In a success case, it just so happens that "sectors left to copy" is the
empty set.

It's like an incremental on top of the incremental.

Consider this:

We have a 4TB drive and we have dirtied 3TB of it since our full backup.
We copy out 2TB as part of a new incremental backup before suffering
some kind of failure.

Today, you'd need to start a new incremental backup that copies that
entire 3TB *plus* whatever was dirtied since the job failed.

With this mode, you'd only need to copy the remaining 1TB + whatever was
dirtied since.
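
The arithmetic can be spelled out in a small sketch. The function names
are hypothetical, sizes are in whole TB, and the results are upper bounds
because newly dirtied data may overlap data that still had to be copied
anyway.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for the example above; all sizes are in TB. */

/* Without "always": the retry recopies everything that was dirty at the
 * start, plus whatever the guest dirtied since the job started. */
static uint64_t retry_cost_today(uint64_t dirty_at_start,
                                 uint64_t dirtied_since)
{
    return dirty_at_start + dirtied_since;
}

/* With "always": only the part not yet copied, plus what was dirtied since. */
static uint64_t retry_cost_always(uint64_t dirty_at_start,
                                  uint64_t copied_before_failure,
                                  uint64_t dirtied_since)
{
    assert(copied_before_failure <= dirty_at_start);
    return dirty_at_start - copied_before_failure + dirtied_since;
}
```

With 3TB dirty at the start, 2TB copied before the failure, and nothing
dirtied since, the first function gives 3 and the second gives 1.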

So, what this logic is really doing is:

If we failed, OR if we want the "never" sync policy:

Merge the anonymous child (bits written during op) back into sync_bitmap
(bits we were instructed to copy), leaving us as if we have never
started this operation.

If, however, we failed and we have the "always" sync policy, we destroy
the sync_bitmap (bits we were instructed to copy) and replace it with
the copy_bitmap (bits remaining to copy). Then, we merge that with the
anonymous child (bits written during op).

Or, in success cases (when sync policy is not never), we simply delete
the sync_bitmap (bits we were instructed to copy) and replace it with
its anonymous child (bits written during op).
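
Put together, the three cases can be sketched on plain bitmasks. This is
a toy restatement of the description above, not the code from the patch:
ToySyncMode and toy_cleanup_sync_bitmap are hypothetical names, and each
uint64_t stands in for a whole dirty bitmap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the sync policy enum. */
typedef enum ToySyncMode {
    TOY_SYNC_ALWAYS,
    TOY_SYNC_CONDITIONAL,
    TOY_SYNC_NEVER
} ToySyncMode;

/*
 * Returns what the user-visible bitmap would hold after the job ends.
 *   sync      - bits we were instructed to copy
 *   successor - bits written during the operation (anonymous child)
 *   copy      - bits remaining to copy
 */
static uint64_t toy_cleanup_sync_bitmap(ToySyncMode mode, int ret,
                                        uint64_t sync, uint64_t successor,
                                        uint64_t copy)
{
    if (ret < 0 || mode == TOY_SYNC_NEVER) {
        if (mode == TOY_SYNC_ALWAYS) {
            /* Failed job, "always": keep only what is left to copy. */
            sync = copy;
        }
        /* Merge the writes made during the job back in. */
        return sync | successor;
    }
    /* Success, policy is not "never": the successor replaces the bitmap. */
    return successor;
}
```

For sync = 0x0F, successor = 0x20 and copy = 0x0C, a failed "conditional"
job yields 0x2F (as if the job had never started), while a failed
"always" job yields 0x2C (what is left to copy plus the new writes).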

> 
> Hi, I’m the time traveler from above.  I also left the section here so I
> can give one of my trademark “Ramble, ramble,
> 
> Reviewed-by: Max Reitz <address@hidden>
> 
>
> 
>> +        }
>> +        /* Merge the successor back into the parent. */
>>          bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
>> -        assert(bm);
>>      } else {
>>          /* Everything is fine, delete this bitmap and install the backup. */
>>          bm = bdrv_dirty_bitmap_abdicate(bs, job->sync_bitmap, NULL);
>> -        assert(bm);
>>      }
>> +    assert(bm);
>>  }
>>  
>>  static void backup_commit(Job *job)
>>
> 
> 

-- 
—js


