qemu-devel

Re: [PATCH RFC] memory: Don't allow to resize RAM while migrating


From: Dr. David Alan Gilbert
Subject: Re: [PATCH RFC] memory: Don't allow to resize RAM while migrating
Date: Fri, 14 Feb 2020 15:14:50 +0000
User-agent: Mutt/1.13.3 (2020-01-12)

* David Hildenbrand (address@hidden) wrote:
> 
> >> diff --git a/migration/ram.c b/migration/ram.c
> >> index ed23ed1c7c..f86f32b453 100644
> >> --- a/migration/ram.c
> >> +++ b/migration/ram.c
> >> @@ -52,6 +52,7 @@
> >>  #include "migration/colo.h"
> >>  #include "block.h"
> >>  #include "sysemu/sysemu.h"
> >> +#include "sysemu/runstate.h"
> >>  #include "savevm.h"
> >>  #include "qemu/iov.h"
> >>  #include "multifd.h"
> >> @@ -3710,8 +3711,49 @@ static SaveVMHandlers savevm_ram_handlers = {
> >>      .resume_prepare = ram_resume_prepare,
> >>  };
> >>  
> >> +static void ram_mig_ram_block_resized(RAMBlockNotifier *n, void *host,
> >> +                                      size_t old_size, size_t new_size)
> >> +{
> >> +    /*
> >> +     * We don't care about resizes triggered on incoming migration (when
> >> +     * syncing ram blocks), or of course, when no migration is going on.
> >> +     */
> >> +    if (migration_is_idle() || !runstate_is_running()) {
> >> +        return;
> >> +    }
> >> +
> >> +    if (!postcopy_is_running()) {
> >> +        Error *err = NULL;
> >> +
> >> +        /*
> >> +         * Precopy code cannot deal with the size of ram blocks changing at
> >> +         * random points in time. We're still running on the source, abort
> >> +         * the migration and continue running here. Make sure to wait until
> >> +         * migration was canceled.
> >> +         */
> >> +        error_setg(&err, "RAM resized during precopy.");
> >> +        migrate_set_error(migrate_get_current(), err);
> >> +        error_free(err);
> >> +        migration_cancel();
> > 
> > If we can't do anything else, this is reasonable.
> > 
> > But as discussed before, it is still not fully clear to me _why_
> > ramblocks are changing if we have disabled add/remove memory during migration.
> 
> 
> Ramblock add/remove is tied to device add/remove, which we block.
> 
> Resize, however, is not. Here, the resize happens while the guest is
> booting up. The content/size of the ram block also depends on previous
> guest actions AFAIK. There is no way to stop the guest from doing
> that. It's required for the guest to continue booting (with ACPI).
> 
> I'm currently working on a project which reuses resizable ram blocks in
> a different context. There, I can simply defer/avoid resizing when
> migration is active. In the ACPI case, however, we cannot avoid it.
> 
> Hope that answers your question.
> 
> > 
> >> +    } else {
> >> +        /*
> >> +         * Postcopy code cannot deal with the size of ram blocks changing at
> >> +         * random points in time. We're running on the target. Fail hard.
> >> +         *
> >> +         * TODO: How to handle this in a better way?
> >> +         */
> >> +        error_report("RAM resized during postcopy.");
> >> +        exit(-1);
> > 
> > Idea is good, but we also need to exit the destination, not only the source, no?
> 
> @Dave, any idea what could be the right thing to do here?

I think that's OK; postcopy_is_running() will return true on the
destination (e.g. see its use in ram_load()) and should work.
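To spell out the branches as I read them, here is a minimal standalone
sketch, not code from the patch. The enum and the function name are made
up for illustration, and the three booleans stand in for the results of
migration_is_idle(), runstate_is_running() and postcopy_is_running().

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only; not part of the patch. */
typedef enum {
    RESIZE_IGNORE,          /* no migration active, or guest not running */
    RESIZE_CANCEL_PRECOPY,  /* precopy: set an error and cancel the migration */
    RESIZE_FAIL_POSTCOPY    /* postcopy is running: report and exit */
} ResizeAction;

/*
 * Mirrors the order of checks in the quoted notifier: the early return
 * comes first, then the precopy/postcopy split.
 */
static ResizeAction resize_action(bool mig_idle, bool vm_running,
                                  bool postcopy_running)
{
    if (mig_idle || !vm_running) {
        return RESIZE_IGNORE;
    }
    return postcopy_running ? RESIZE_FAIL_POSTCOPY : RESIZE_CANCEL_PRECOPY;
}
```

The point is only that the postcopy branch is selected by
postcopy_is_running() alone once the early return has been passed.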

I'd really appreciate it if you could print the RAMBlock or something at
this point - when we hit this error we're going to want to try and
figure out why.
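
Something along these lines is what I have in mind; a rough standalone
sketch only. The helper name and the message layout are hypothetical,
and how the block name is obtained from the host pointer the notifier
receives is deliberately left out here.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, for illustration only: formats the block name
 * and the old/new sizes into buf. Returns what snprintf() returns.
 */
static int format_resize_error(char *buf, size_t len, const char *phase,
                               const char *idstr, size_t old_size,
                               size_t new_size)
{
    return snprintf(buf, len,
                    "RAM block '%s' resized during %s: 0x%zx -> 0x%zx",
                    idstr, phase, old_size, new_size);
}
```

The resulting string could then be handed to error_setg() in the precopy
branch or error_report() in the postcopy branch in place of the fixed
messages.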

Dave

> > 
> >> +    }
> >> +}
> > 
> > 
> > 
> >> +static RAMBlockNotifier ram_mig_ram_notifier = {
> >> +    .ram_block_resized = ram_mig_ram_block_resized,
> >> +};
> >> +
> >>  void ram_mig_init(void)
> >>  {
> >>      qemu_mutex_init(&XBZRLE.lock);
> >>      register_savevm_live("ram", 0, 4, &savevm_ram_handlers, &ram_state);
> >> +    ram_block_notifier_add(&ram_mig_ram_notifier);
> >>  }
> > 
> > Shouldn't we remove the notifier when we finish the migration?
> 
> It's called from main() unconditionally (so not when migration starts),
> so the notifier remains active for the whole QEMU lifetime (which should be
> fine AFAICT).
> 
> -- 
> Thanks,
> 
> David / dhildenb
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK
