qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH RFC] memory: Don't allow to resize RAM while migrating


From: Peter Xu
Subject: Re: [PATCH RFC] memory: Don't allow to resize RAM while migrating
Date: Fri, 14 Feb 2020 12:29:33 -0500

On Fri, Feb 14, 2020 at 01:02:46PM +0100, David Hildenbrand wrote:
> From c0049ac2e95d6756037db918852c507fb88297d9 Mon Sep 17 00:00:00 2001
> From: David Hildenbrand <address@hidden>
> Date: Fri, 14 Feb 2020 13:01:03 +0100
> Subject: [PATCH v1] tmp
> 
> Signed-off-by: David Hildenbrand <address@hidden>
> ---
>  migration/migration.c |  9 +++++++--
>  migration/migration.h |  1 +
>  migration/ram.c       | 42 ++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 50 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 3a21a4686c..0e7efe2920 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -175,13 +175,18 @@ void migration_object_init(void)
>      }
>  }
>  
> +void migration_cancel(void)
> +{
> +    migrate_fd_cancel(current_migration);
> +}
> +
>  void migration_shutdown(void)
>  {
>      /*
>       * Cancel the current migration - that will (eventually)
>       * stop the migration using this structure
>       */
> -    migrate_fd_cancel(current_migration);
> +    migration_cancel();
>      object_unref(OBJECT(current_migration));
>  }
>  
> @@ -2019,7 +2024,7 @@ void qmp_migrate(const char *uri, bool has_blk, bool 
> blk,
>  
>  void qmp_migrate_cancel(Error **errp)
>  {
> -    migrate_fd_cancel(migrate_get_current());
> +    migration_cancel();
>  }
>  
>  void qmp_migrate_continue(MigrationStatus state, Error **errp)
> diff --git a/migration/migration.h b/migration/migration.h
> index 8473ddfc88..79fd74afa5 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -343,5 +343,6 @@ int foreach_not_ignored_block(RAMBlockIterFunc func, void 
> *opaque);
>  void migration_make_urgent_request(void);
>  void migration_consume_urgent_request(void);
>  bool migration_rate_limit(void);
> +void migration_cancel(void);
>  
>  #endif
> diff --git a/migration/ram.c b/migration/ram.c
> index ed23ed1c7c..f86f32b453 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -52,6 +52,7 @@
>  #include "migration/colo.h"
>  #include "block.h"
>  #include "sysemu/sysemu.h"
> +#include "sysemu/runstate.h"
>  #include "savevm.h"
>  #include "qemu/iov.h"
>  #include "multifd.h"
> @@ -3710,8 +3711,49 @@ static SaveVMHandlers savevm_ram_handlers = {
>      .resume_prepare = ram_resume_prepare,
>  };
>  
> +static void ram_mig_ram_block_resized(RAMBlockNotifier *n, void *host,
> +                                      size_t old_size, size_t new_size)
> +{
> +    /*
> +     * We don't care about resizes triggered on incoming migration (when
> +     * syncing ram blocks), or of course, when no migration is going on.
> +     */
> +    if (migration_is_idle() || !runstate_is_running()) {
> +        return;
> +    }

I feel like migration_is_idle() check is enough.  Firstly, I feel like
we allow migration even with VM stopped.  At the meantime, if VM is
not running, I see no reason that this resizing will happen after all? :)

> +
> +    if (!postcopy_is_running()) {
> +        Error *err = NULL;
> +
> +        /*
> +         * Precopy code cannot deal with the size of ram blocks changing at
> +         * random points in time. We're still running on the source, abort
> +         * the migration and continue running here. Make sure to wait until
> +         * migration was canceled.
> +         */
> +        error_setg(&err, "RAM resized during precopy.");
> +        migrate_set_error(migrate_get_current(), err);
> +        error_free(err);
> +        migration_cancel();
> +    } else {
> +        /*
> +         * Postcopy code cannot deal with the size of ram blocks changing at
> +         * random points in time. We're running on the target. Fail hard.
> +         *
> +         * TODO: How to handle this in a better way?
> +         */
> +        error_report("RAM resized during postcopy.");
> +        exit(-1);

Now I'm rethinking the postcopy case....

ram_dirty_bitmap_reload() should only happen during the postcopy
recovery, and when that happens the VM should be stopped on both
sides.  Which means, ram resizing should not trigger there...

Thanks,

> +    }
> +}
> +
> +static RAMBlockNotifier ram_mig_ram_notifier = {
> +    .ram_block_resized = ram_mig_ram_block_resized,
> +};
> +
>  void ram_mig_init(void)
>  {
>      qemu_mutex_init(&XBZRLE.lock);
>      register_savevm_live("ram", 0, 4, &savevm_ram_handlers, &ram_state);
> +    ram_block_notifier_add(&ram_mig_ram_notifier);
>  }
> -- 
> 2.24.1
> 
> 
> Thoughts?
> 
> -- 
> Thanks,
> 
> David / dhildenb
> 
> 

-- 
Peter Xu




reply via email to

[Prev in Thread] Current Thread [Next in Thread]