qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [PATCH v10 05/12] migration: introduce postcopy-only pe


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-block] [PATCH v10 05/12] migration: introduce postcopy-only pending
Date: Tue, 13 Mar 2018 15:35:14 +0000
User-agent: Mutt/1.9.2 (2017-12-15)

* Vladimir Sementsov-Ogievskiy (address@hidden) wrote:
> 13.03.2018 13:30, Dr. David Alan Gilbert wrote:
> > * Vladimir Sementsov-Ogievskiy (address@hidden) wrote:
> > > 12.03.2018 18:30, Dr. David Alan Gilbert wrote:
> > > > * Vladimir Sementsov-Ogievskiy (address@hidden) wrote:
> > > > > There would be savevm states (dirty-bitmap) which can migrate only in
> > > > > postcopy stage. The corresponding pending is introduced here.
> > > > > 
> > > > > Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
> > > > > ---
> > > [...]
> > > 
> > > > >    static MigIterateState migration_iteration_run(MigrationState *s)
> > > > >    {
> > > > > -    uint64_t pending_size, pend_post, pend_nonpost;
> > > > > +    uint64_t pending_size, pend_pre, pend_compat, pend_post;
> > > > >        bool in_postcopy = s->state == 
> > > > > MIGRATION_STATUS_POSTCOPY_ACTIVE;
> > > > > -    qemu_savevm_state_pending(s->to_dst_file, s->threshold_size,
> > > > > -                              &pend_nonpost, &pend_post);
> > > > > -    pending_size = pend_nonpost + pend_post;
> > > > > +    qemu_savevm_state_pending(s->to_dst_file, s->threshold_size, 
> > > > > &pend_pre,
> > > > > +                              &pend_compat, &pend_post);
> > > > > +    pending_size = pend_pre + pend_compat + pend_post;
> > > > >        trace_migrate_pending(pending_size, s->threshold_size,
> > > > > -                          pend_post, pend_nonpost);
> > > > > +                          pend_pre, pend_compat, pend_post);
> > > > >        if (pending_size && pending_size >= s->threshold_size) {
> > > > >            /* Still a significant amount to transfer */
> > > > >            if (migrate_postcopy() && !in_postcopy &&
> > > > > -            pend_nonpost <= s->threshold_size &&
> > > > > -            atomic_read(&s->start_postcopy)) {
> > > > > +            pend_pre <= s->threshold_size &&
> > > > > +            (atomic_read(&s->start_postcopy) ||
> > > > > +             (pend_pre + pend_compat <= s->threshold_size)))
> > > > This change does something different from the description;
> > > > it causes a postcopy_start even if the user never ran the postcopy-start
> > > > command; so sorry, we can't do that; because postcopy for RAM is
> > > > something that users can enable but only switch into when they've given
> > > > up on it completing normally.
> > > > 
> > > > However, I guess that leaves you with a problem; which is what happens
> > > > to the system when you've run out of pend_pre+pend_compat but can't
> > > > complete because pend_post is non-0; so I don't know the answer to that.
> > > > 
> > > > 
> > > Hmm. Here, we go to postcopy only if "pend_pre + pend_compat <=
> > > s->threshold_size". Pre-patch, in this case we will go to
> > > migration_completion(). So, precopy stage is finishing anyway.
> > Right.
> > 
> > > So, we want
> > > in this case to finish ram migration like it was finished by
> > > migration_completion(), and then, run postcopy, which will handle only 
> > > dirty
> > > bitmaps, yes?
> > It's a bit tricky; the first important thing is that we can't change the
> > semantics of the migration without the 'dirty bitmaps'.
> > 
> > So then there's the question of how  a migration with both
> > postcopy-ram+dirty bitmaps should work;  again I don't think we should
> > enter the postcopy-ram phase until start-postcopy is issued.
> > 
> > Then there's the 3rd case; dirty-bitmaps but no postcopy-ram; in that
> > case I worry less about the semantics of how you want to do it.
> 
> I have an idea:
> 
> in postcopy_start(), in ram_has_postcopy() (and may be some other places?),
> check atomic_read(&s->start_postcopy) instead of migrate_postcopy_ram()

We've got to use migrate_postcopy_ram() to decide whether we should do
ram specific things, e.g. send the ram discard data.
I'm wanting to make sure that if we have another full postcopy device
(like RAM, maybe storage say) that we'll just add that in with
migrate_postcopy_whatever().

> then:
> 
> 1. behavior without dirty-bitmaps is not changed, as currently we cant go
> into postcopy_start and ram_has_postcopy without s->start_postcopy
> 2. dirty-bitmaps+ram: if user don't set s->start_postcopy, postcopy_start()
> will operate as if migration capability was not enabled, so ram should
> complete its migration
> 3. only dirty-bitmaps: again, postcopy_start() will operate as if migration
> capability was not enabled, so ram should complete its migration

Why can't we just remove the change to the trigger condition in this
patch?  Then I think everything works as long as the management layer
does eventually call migration-start-postcopy ?
(It might waste some bandwidth at the point where there's otherwise
nothing left to send).

Even with the use of migrate-start-postcopy, you're going to need to be
careful about the higher level story; you need to document when to do it
and what the higher levels should do after a migration failure - at the
moment they know that once postcopy starts migration is irrecoverable if
it fails; I suspect that's not true with your dirty bitmaps.

IMHO this still comes back to my original observation from ~18months ago
that in many ways this isn't very postcopy like; in the sense that all
the semantics are quite different from RAM.

Dave

> > 
> > > Hmm2. Looked through migration_completion(), I don't understand, how it
> > > finishes ram migration without postcopy. It calls
> > > qemu_savevm_state_complete_precopy(), which skips states with
> > > has_postcopy=true, which is ram...
> > Because savevm_state_complete_precopy only skips has_postcopy=true in
> > the in_postcopy case:
> > 
> >              (in_postcopy && se->ops->has_postcopy &&
> >               se->ops->has_postcopy(se->opaque)) ||
> > 
> > so when we call it in migration_completion(), if we've not entered
> > postcopy yet, then that test doesn't trigger.
> > 
> > (Apologies for not spotting this earlier; but I thought this patch was
> > a nice easy one just adding the postcopy_only_pending - I didn't realise it 
> > changed
> > existing semantics until I spotted that)
> 
> oh, yes, I was inattentive :(
> 
> > 
> > Dave
> > 
> > > -- 
> > > Best regards,
> > > Vladimir
> > > 
> > --
> > Dr. David Alan Gilbert / address@hidden / Manchester, UK
> 
> 
> -- 
> Best regards,
> Vladimir
> 
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK



reply via email to

[Prev in Thread] Current Thread [Next in Thread]