
From: David Gibson
Subject: Re: [Qemu-devel] [PATCH v5 17/45] Add wrappers and handlers for sending/receiving the postcopy-ram migration messages.
Date: Mon, 30 Mar 2015 15:03:11 +1100
User-agent: Mutt/1.5.23 (2014-03-12)

On Fri, Mar 27, 2015 at 10:48:52AM +0000, Dr. David Alan Gilbert wrote:
> * David Gibson (address@hidden) wrote:
> > On Thu, Mar 26, 2015 at 04:33:28PM +0000, Dr. David Alan Gilbert wrote:
> > > (Only replying to some of the items in this mail - the others I'll get
> > > to another time).
[snip]
> > > > DISCARD will typically immediately precede LISTEN, won't it?  Is there
> > > > a reason not to put the discard data into the LISTEN command?
> > > 
> > > Discard data can be quite large, so I potentially send multiple discard
> > > commands.
> > > (Also as you can tell generally I've got a preference for one message does
> > > one thing, and thus I have tried to keep them separate).
> > 
> > So, I like the idea of one message per action in principle - but only
> > if that action really is well-defined without reference to what
> > operations come before and after it.  If there are hidden dependencies
> > about what actions have to come in what order, I'd rather bake that
> > into the command structure.
> 
> In no way is it hidden; the commands match the state transitions.

Not all of them.  In particular DISCARD's relation to state
transitions is not at all obvious.  Likewise I don't think the
connection of LISTEN to the internal state machines at each end is
terribly clear.
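
To make concrete what I mean by baking the dependency into the command
structure: an explicit "which incoming state may this command arrive in"
check, kept in one place, would make the ordering visible.  This is only a
sketch, not a request for this exact code; the command and state names are
taken from my reading of the series and the helper itself is hypothetical:

/* Hypothetical helper, not part of the patch: state in one place which
 * incoming postcopy state each command is legal in, instead of each
 * handler encoding its own piece of the ordering. */
static bool postcopy_cmd_allowed(PostcopyState ps, int cmd)
{
    switch (cmd) {
    case MIG_CMD_POSTCOPY_ADVISE:
        return ps == POSTCOPY_INCOMING_NONE;
    case MIG_CMD_POSTCOPY_RAM_DISCARD:
        /* Discard data can be split over several commands, so repeats
         * are fine. */
        return ps == POSTCOPY_INCOMING_ADVISE ||
               ps == POSTCOPY_INCOMING_DISCARD;
    case MIG_CMD_POSTCOPY_LISTEN:
        return ps == POSTCOPY_INCOMING_ADVISE ||
               ps == POSTCOPY_INCOMING_DISCARD;
    case MIG_CMD_POSTCOPY_RUN:
        return ps == POSTCOPY_INCOMING_LISTENING;
    default:
        return true;
    }
}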

> > > > > +static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis,
> > > > > +                                         uint64_t remote_hps,
> > > > > +                                         uint64_t remote_tps)
> > > > > +{
> > > > > +    PostcopyState ps = postcopy_state_get(mis);
> > > > > +    trace_loadvm_postcopy_handle_advise();
> > > > > +    if (ps != POSTCOPY_INCOMING_NONE) {
> > > > > +        error_report("CMD_POSTCOPY_ADVISE in wrong postcopy state 
> > > > > (%d)", ps);
> > > > > +        return -1;
> > > > > +    }
> > > > > +
> > > > > +    if (remote_hps != getpagesize())  {
> > > > > +        /*
> > > > > +         * Some combinations of mismatch are probably possible but 
> > > > > it gets
> > > > > +         * a bit more complicated.  In particular we need to place 
> > > > > whole
> > > > > +         * host pages on the dest at once, and we need to ensure 
> > > > > that we
> > > > > +         * handle dirtying to make sure we never end up sending part 
> > > > > of
> > > > > +         * a hostpage on it's own.
> > > > > +         */
> > > > > +        error_report("Postcopy needs matching host page sizes (s=%d 
> > > > > d=%d)",
> > > > > +                     (int)remote_hps, getpagesize());
> > > > > +        return -1;
> > > > > +    }
> > > > > +
> > > > > +    if (remote_tps != (1ul << qemu_target_page_bits())) {
> > > > > +        /*
> > > > > +         * Again, some differences could be dealt with, but for now 
> > > > > keep it
> > > > > +         * simple.
> > > > > +         */
> > > > > +        error_report("Postcopy needs matching target page sizes 
> > > > > (s=%d d=%d)",
> > > > > +                     (int)remote_tps, 1 << qemu_target_page_bits());
> > > > > +        return -1;
> > > > > +    }
> > > > > +
> > > > > +    postcopy_state_set(mis, POSTCOPY_INCOMING_ADVISE);
> > > > 
> > > > Should you be checking the return value here to make sure it's still
> > > > POSTCOPY_INCOMING_NONE?  Atomic xchgs seem overkill if you still have
> > > > a race between the fetch at the top and the set here.
> > > > 
> > > > Or, in fact, should you be just doing an atomic exchange-then-check at
> > > > the top, rather than checking at the top, then changing at the bottom.
> > > 
> > > There's no race at this point yet; going from None->advise we still only
> > > have one thread.  The check at the top is a check against protocol
> > > violations (e.g. getting two advise or something like that).
> > 
> > Yeah.. and having to have intimate knowledge of the thread structure
> > at each point in order to analyze the atomic operation correctness is
> > exactly what bothers me about it.
> 
> No, you don't; by always using the postcopy_state_set you don't need
> to worry about the thread structure or protocol to understand the atomic
> operation correctness.  The only reason you got into that comparison is
> because you worried that it might be overkill in this case; by keeping it
> consistent like it already is then you don't need to understand the
> thread structure.

You really do, though.  You may be using the same function in the
original, but not the same idiom: in this case you don't check the
return value and deal with it accordingly.

Because of that, this looks pretty much exactly like the classic
"didn't understand what was atomic in the atomic" race condition, and
you need to know that there's only one thread at this point to realize
it's correct after all.
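
To illustrate the idiom I have in mind, this is roughly what doing the
exchange-then-check at the top would look like for the ADVISE handler.
Only a sketch, not a patch: it assumes postcopy_state_set() atomically
swaps in the new state (e.g. via atomic_xchg()) and returns the state it
displaced.

static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis,
                                         uint64_t remote_hps,
                                         uint64_t remote_tps)
{
    /* Swap the state in first, then validate what we displaced.  There is
     * no window between a separate "check" and "set" for another thread
     * to slip a second transition into. */
    PostcopyState old = postcopy_state_set(mis, POSTCOPY_INCOMING_ADVISE);

    trace_loadvm_postcopy_handle_advise();
    if (old != POSTCOPY_INCOMING_NONE) {
        error_report("CMD_POSTCOPY_ADVISE in wrong postcopy state (%d)", old);
        return -1;
    }

    /* ... page size checks as in the patch ... */

    return 0;
}

The point is simply that the handler then reads as correct without any
knowledge of how many threads are live at this stage of the migration.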

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson


