From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH v10 12/14] block: add transactional properties
Date: Fri, 6 Nov 2015 16:36:20 +0000
User-agent: Mutt/1.5.23 (2015-06-09)

On Fri, Nov 06, 2015 at 09:32:19AM +0100, Kevin Wolf wrote:
> On 05.11.2015 at 19:52, John Snow wrote:
> > 
> > 
> > On 11/05/2015 05:47 AM, Stefan Hajnoczi wrote:
> > > On Tue, Nov 03, 2015 at 12:27:19PM -0500, John Snow wrote:
> > >>
> > >>
> > >> On 11/03/2015 10:17 AM, Stefan Hajnoczi wrote:
> > >>> On Fri, Oct 23, 2015 at 07:56:50PM -0400, John Snow wrote:
> > >>>> @@ -1732,6 +1757,10 @@ static void block_dirty_bitmap_add_prepare(BlkActionState *common,
> > >>>>      BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
> > >>>>                                               common, common);
> > >>>>  
> > >>>> +    if (action_check_cancel_mode(common, errp) < 0) {
> > >>>> +        return;
> > >>>> +    }
> > >>>> +
> > >>>>      action = common->action->block_dirty_bitmap_add;
> > >>>>      /* AIO context taken and released within qmp_block_dirty_bitmap_add */
> > >>>>      qmp_block_dirty_bitmap_add(action->node, action->name,
> > >>>> @@ -1767,6 +1796,10 @@ static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
> > >>>>                                               common, common);
> > >>>>      BlockDirtyBitmap *action;
> > >>>>  
> > >>>> +    if (action_check_cancel_mode(common, errp) < 0) {
> > >>>> +        return;
> > >>>> +    }
> > >>>> +
> > >>>>      action = common->action->block_dirty_bitmap_clear;
> > >>>>      state->bitmap = block_dirty_bitmap_lookup(action->node,
> > >>>>                                                action->name,
> > >>>
> > >>> Why do the bitmap add/clear actions not support err-cancel=all?
> > >>>
> > >>> I understand why other block jobs don't support it, but it's not clear
> > >>> why these non-block job actions cannot.
> > >>>
> > >>
> > >> Because they don't have a callback to invoke if the rest of the job 
> > >> fails.
> > >>
> > >> I could create a BlockJob for them complete with a callback to invoke,
> > >> but basically it's just because there's no interface to unwind them, or
> > >> an interface to join them with the transaction.
> > >>
> > >> They're small, synchronous non-job actions. Which makes them weird.
> > > 
> > > Funny, we've been looking at the same picture while seeing different
> > > things:
> > > https://en.wikipedia.org/wiki/Rabbit%E2%80%93duck_illusion
> > > 
> > > I think I understand your idea: the transaction should include both
> > > immediate actions as well as block jobs.
> > > 
> > > My mental model was different: immediate actions commit/abort along with
> > > the 'transaction' command.  Block jobs are separate and complete/cancel
> > > together in a group.
> > > 
> > > In practice I think the two end up being similar, because we won't be
> > > able to implement immediate-action commit/abort together with
> > > long-running block jobs: the immediate actions rely on
> > > quiescing/pausing the guest for atomic commit/abort.
> > > 
> > > So with your mental model the QMP client has to submit 2 'transaction'
> > > commands: 1 for the immediate actions, 1 for the block jobs.
> > > 
> > > In my mental model the QMP client submits 1 command but the immediate
> > > actions and block jobs are two separate transaction scopes.  This means
> > > if the block jobs fail, the client needs to be aware of the immediate
> > > actions that have committed.  Because of this, it becomes just as much
> > > client effort as submitting two separate 'transaction' commands in your
> > > model.
> > > 
> > > Can anyone see a practical difference?  I think I'm happy with John's
> > > model.
> > > 
> > > Stefan
> > > 
> > 
> > We discussed this off-list, but for the sake of the archive:
> > 
> > == How it is now ==
> > 
> > Currently, transactions have two implicit phases: the first is the
> > synchronous phase. If this phase completes successfully, we consider the
> > transaction a success. The second phase is the asynchronous phase where
> > jobs launched by the synchronous phase run to completion.
> > 
> > All synchronous commands must complete for the transaction to "succeed."
> > There are currently (pre-patch) no guarantees about asynchronous command
> > completion. As long as all synchronous actions complete, asynchronous
> > actions are free to succeed or fail individually.
> 
> You're overcomplicating this. All actions are currently synchronous and
> what you consider asynchronous transaction actions aren't actually part
> of the transaction at all. The action is "start block job X", not "run
> block job X".

Yes, this is how I've thought of it too.
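
For the archive, the "synchronous phase" being discussed is the
prepare/commit/abort loop inside qmp_transaction() in blockdev.c. Below is a
minimal sketch of that shape only; the real code uses BlkActionState,
BlkActionOps and a queue, and the hypothetical Action struct here just boils
that down:

/* Simplified sketch of the qmp_transaction() control flow; not the literal
 * blockdev.c code. */
typedef struct Error Error;            /* stand-in for the qapi Error type */

typedef struct Action Action;
struct Action {
    int  (*prepare)(Action *a, Error **errp); /* may fail; must be undoable */
    void (*commit)(Action *a);                /* point of no return         */
    void (*abort)(Action *a);                 /* undo whatever prepare did  */
};

static void transaction_run(Action **actions, int n, Error **errp)
{
    int i, prepared = 0;

    /* Synchronous phase: prepare everything; nothing is irrevocable yet. */
    for (i = 0; i < n; i++) {
        if (actions[i]->prepare(actions[i], errp) < 0) {
            goto fail;
        }
        prepared++;
    }

    /* Commit phase: block jobs merely get *started* here. */
    for (i = 0; i < n; i++) {
        actions[i]->commit(actions[i]);
    }
    return;

fail:
    /* Roll back, in reverse order, every action that had been prepared. */
    for (i = prepared - 1; i >= 0; i--) {
        actions[i]->abort(actions[i]);
    }
}

The key point is that nothing irrevocable happens until every prepare() has
succeeded, and block jobs are only started from commit(), which is why their
later completion currently falls outside the transaction.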

> > == My Model ==
> > 
> > The current behavior is my "err-cancel = none" scenario: we offer no
> > guarantee about the success or failure of the transaction as a whole
> > after the synchronous portion has completed.
> > 
> > What I was proposing is "err-cancel = all," which to me means that _ALL_
> > commands in this transaction must succeed (synchronous or not) before
> > _any_ action is irrevocably committed. This means that for a
> > hypothetical mixed synchronous-asynchronous transaction, even after the
> > transaction has succeeded (i.e. it passed the synchronous phase), if an
> > asynchronous action later fails, all actions, synchronous and
> > asynchronous alike, are rolled back -- a kind of retroactive failure of
> > the transaction. This is clearly not possible in all cases, so commands
> > that cannot support these semantics will refuse "err-cancel = all"
> > during the synchronous phase.
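
The refusal described here is the action_check_cancel_mode() guard visible in
the hunks quoted above. A rough sketch of the idea follows; the enum, field
and message names are approximations, not the exact v10 code:

#include "qapi/error.h"   /* Error, error_setg() */

typedef enum { ERR_CANCEL_NONE, ERR_CANCEL_ALL } ErrCancelMode;

/* An action that has no way to unwind itself once the synchronous phase is
 * over refuses any cancellation policy stricter than the default. */
static int action_check_cancel_mode(ErrCancelMode mode, const char *action_name,
                                    Error **errp)
{
    if (mode != ERR_CANCEL_NONE) {
        error_setg(errp, "Action '%s' does not support err-cancel = all",
                   action_name);
        return -1;
    }
    return 0;
}
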
> 
> Is this possible in any case? You lose transaction semantics at the
> latest when you drop the BQL that the monitor holds. At the very least,
> atomicity and isolation are no longer guaranteed.
> 
> You can try to undo some parts of what you did later on, but if any
> involved BDS was used in the meantime by anything other than the block
> job, you don't have transactional behaviour any more.
> 
> And isn't the management tool perfectly capable of cleaning up all the
> block jobs without magic happening in qemu if one of them fails? Do we
> actually need atomic failure later on? And if so, do we need atomic
> failure only of block jobs started in the same transaction? Why?

I think we do, because the backup block job, when given a dirty bitmap to
copy out, will either discard that bitmap on successful completion or, on
failure, merge it back into the currently active dirty bitmap for the
device (that way your incremental backup can be retried).

Now when you take an incremental backup of multiple drives at the same
instant in time, it would be a pain to have one or more jobs complete
(and discard the bitmap) but others fail.  Then you would no longer have
a single point in time when the incremental backup was taken...

That is the motivation for this feature.
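
For reference, the completion path in block/backup.c does roughly the
following with the bitmap it was handed; this is a sketch from memory using
QEMU-internal types, and the exact function names may differ:

/* On failure the pending changes are folded back into the active bitmap so
 * the incremental backup can simply be retried; on success they are dropped
 * because that point in time is now safely backed up. */
static void backup_bitmap_complete(BlockDriverState *bs, BlockJob *job,
                                   BdrvDirtyBitmap *sync_bitmap, int ret)
{
    if (ret < 0 || block_job_is_cancelled(job)) {
        /* Failure or cancel: merge the frozen bitmap back into the live one. */
        bdrv_reclaim_dirty_bitmap(bs, sync_bitmap, NULL);
    } else {
        /* Success: drop the old bits; only new writes remain dirty. */
        bdrv_dirty_bitmap_abdicate(bs, sync_bitmap, NULL);
    }
}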

Stefan
