qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [Qemu-block] [PATCH v3 01/10] qapi: Add transaction sup


From: John Snow
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH v3 01/10] qapi: Add transaction support to block-dirty-bitmap operations
Date: Fri, 08 May 2015 12:19:09 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0


On 05/08/2015 09:17 AM, Max Reitz wrote:
> On 08.05.2015 15:14, Stefan Hajnoczi wrote:
>> On Thu, May 07, 2015 at 01:22:26PM -0400, John Snow wrote:
>>>
>>> On 05/07/2015 10:54 AM, Stefan Hajnoczi wrote:
>>>> On Wed, Apr 22, 2015 at 08:04:44PM -0400, John Snow wrote:
>>>>> +static void block_dirty_bitmap_clear_prepare(BlkTransactionState
>>>>> *common, +                                             Error
>>>>> **errp) +{ +    BlockDirtyBitmapState *state =
>>>>> DO_UPCAST(BlockDirtyBitmapState, +
>>>>> common, common); +    BlockDirtyBitmap *action; + +    action =
>>>>> common->action->block_dirty_bitmap_clear; +    state->bitmap =
>>>>> block_dirty_bitmap_lookup(action->node, +
>>>>> action->name, +
>>>>> &state->bs, +
>>>>> &state->aio_context, +
>>>>> errp); +    if (!state->bitmap) { +        return; +    } + +
>>>>> if (bdrv_dirty_bitmap_frozen(state->bitmap)) { +
>>>>> error_setg(errp, "Cannot modify a frozen bitmap"); +
>>>>> return; +    } else if
>>>>> (!bdrv_dirty_bitmap_enabled(state->bitmap)) { +
>>>>> error_setg(errp, "Cannot clear a disabled bitmap"); +
>>>>> return; +    } + +    /* AioContext is released in .clean() */
>>>>> +} + +static void
>>>>> block_dirty_bitmap_clear_commit(BlkTransactionState *common) +{ +
>>>>> BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState, +
>>>>> common, common); +    bdrv_clear_dirty_bitmap(state->bitmap); +}
>>>> These semantics don't work in this example:
>>>>
>>>> [block-dirty-bitmap-clear, drive-backup]
>>>>
>>>> Since drive-backup starts the blockjob in .prepare() but
>>>> block-dirty-bitmap-clear only clears the bitmap in .commit() the
>>>> order is wrong.
> 
> Well, "starts the block job" is technically correct, but the block job
> doesn't run until later. If it were to really start in prepare, that
> would be wrong. Actually, the block job is initialized and yields,
> allowing the code handling the QMP transaction command to continue. I
> think in your example that means that the block job won't actually run
> until after block-dirty-bitmap-clear has been committed.
> 
> Max
> 

The important thing is that we get the contents of the drive as it was,
according to the bitmap as it was, when we start the job.

So if clear doesn't actually modify the bitmap until the commit() phase,
but drive-backup actually goes until the first yield() in prepare...

We're actually going to see an assertion() failure, likely. drive-backup
will freeze the bitmap with a successor, but then the clear transaction
will try to "commit" the changes and try to modify a frozen bitmap.

Oops.

What we (The royal we...) need to figure out is how we want to solve
this problem.

>>>>
>>>> .prepare() has to do something non-destructive, like stashing away
>>>> the HBitmap and replacing it with an empty one.  Then .commit() can
>>>> discard the old bitmap while .abort() can move the old bitmap back
>>>> to undo the operation.
>>>>
>>>> Stefan
>>>>
>>> Hmm, that's sort of gross. That means that any transactional command
>>> *ever* destined to be used with drive-backup in any conceivable way
>>> needs to move a lot more of its action forward to .prepare().
>>>
>>> That sort of defeats the premise of .prepare() and .commit(), no? And
>>> all because drive-backup jumped the gun.
>> No it doesn't.  Actions have to appear atomic to the qmp_transaction
>> caller.  Both approaches achieve that so they are both correct in
>> isolation.
>>
>> The ambiguity is whether "commit the changes" for .commit() means
>> "changes take effect" or "discard stashed state, making undo
>> impossible".
>>
>> I think the "discard stashed state, making undo impossible"
>> interpretation is good because .commit() is not allowed to fail.  That
>> function should only do things that never fail.
>>

To be clear, you are favoring drive-backup's interpretation of the
prepare and commit phases. I had been operating under the other
interpretation.

I think I like the semantics of my interpretation better, but have to
admit it's a lot harder programmatically to enforce "commit cannot fail"
for all of the transactions we support under mine, so your
interpretation is probably the right way to go for sanity's sake -- we
just need to add a bit of documentation to make it clear.

I suppose in this case clear() isn't too hard to modify -- just rip the
Hbitmap out of it and replace it with a new empty one. Don't free() the
old one until commit(). Should be a fairly inexpensive operation.

I will have to re-audit all the existing transactions to make sure these
semantics are consistent, though.

>>> That's going to get hard to maintain as we add more transactions.
>> Yes, we need to be consistent and stick to one of the interpretations in
>> order to guarantee ordering.
>>
>> Unfortunately, there is already an inconsistency:
>>
>> 1. internal_snapshot - snapshot taken in .prepare()
>> 2. external_snapshot - BDS node appended in .commit()
>> 3. drive_backup - block job started in .prepare()
>> 4. blockdev_backup - block job started in .prepare()
>>
>> external_snapshot followed by internal_snapshot acts like the reverse
>> ordering!
>>
>> Stefan
> 

What a mess!

--js



reply via email to

[Prev in Thread] Current Thread [Next in Thread]