qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC PATCH v2 06/12] mc: introduce state machine change


From: Juan Quintela
Subject: Re: [Qemu-devel] [RFC PATCH v2 06/12] mc: introduce state machine changes for MC
Date: Tue, 11 Mar 2014 22:57:45 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

address@hidden wrote:
> From: "Michael R. Hines" <address@hidden>
>
> This patch sets up the initial changes to the migration state
> machine and prototypes to be used by the checkpointing code
> to interact with the state machine so that we can later handle
> failure and recovery scenarios.
>
> Signed-off-by: Michael R. Hines <address@hidden>
> ---
>  arch_init.c                   | 29 ++++++++++++++++++++++++-----
>  include/migration/migration.h |  2 ++
>  migration.c                   | 37 +++++++++++++++++++++----------------
>  3 files changed, 47 insertions(+), 21 deletions(-)
>
> diff --git a/arch_init.c b/arch_init.c
> index db75120..e9d4d9e 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -658,13 +658,13 @@ static void ram_migration_cancel(void *opaque)
>      migration_end();
>  }
>  
> -static void reset_ram_globals(void)
> +static void reset_ram_globals(bool reset_bulk_stage)
>  {
>      last_seen_block = NULL;
>      last_sent_block = NULL;
>      last_offset = 0;
>      last_version = ram_list.version;
> -    ram_bulk_stage = true;
> +    ram_bulk_stage = reset_bulk_stage;
>  }
>  
>  #define MAX_WAIT 50 /* ms, half buffered_file limit */
> @@ -674,6 +674,15 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>      RAMBlock *block;
>      int64_t ram_pages = last_ram_offset() >> TARGET_PAGE_BITS;
>  
> +    /*
> +     * RAM stays open during micro-checkpointing for the next transaction.
> +     */
> +    if (migration_is_mc(migrate_get_current())) {
> +        qemu_mutex_lock_ramlist();
> +        reset_ram_globals(false);
> +        goto skip_setup;
> +    }
> +
>      migration_bitmap = bitmap_new(ram_pages);
>      bitmap_set(migration_bitmap, 0, ram_pages);
>      migration_dirty_pages = ram_pages;
> @@ -710,12 +719,14 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>      qemu_mutex_lock_iothread();
>      qemu_mutex_lock_ramlist();
>      bytes_transferred = 0;
> -    reset_ram_globals();
> +    reset_ram_globals(true);
>  
>      memory_global_dirty_log_start();
>      migration_bitmap_sync();
>      qemu_mutex_unlock_iothread();
>  
> +skip_setup:
> +
>      qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE);
>  
>      QTAILQ_FOREACH(block, &ram_list.blocks, next) {
> @@ -744,7 +755,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>      qemu_mutex_lock_ramlist();
>  
>      if (ram_list.version != last_version) {
> -        reset_ram_globals();
> +        reset_ram_globals(true);
>      }
>  
>      ram_control_before_iterate(f, RAM_CONTROL_ROUND);
> @@ -825,7 +836,15 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>      }
>  
>      ram_control_after_iterate(f, RAM_CONTROL_FINISH);
> -    migration_end();
> +
> +    /*
> +     * Only cleanup at the end of normal migrations
> +     * or if the MC destination failed and we got an error.
> +     * Otherwise, we are (or will soon be) in MIG_STATE_CHECKPOINTING.
> +     */
> +    if(!migrate_use_mc() || migration_has_failed(migrate_get_current())) {
> +        migration_end();
> +    }
>  
>      qemu_mutex_unlock_ramlist();
>      qemu_put_be64(f, RAM_SAVE_FLAG_EOS);



I haven't looked at the code in detail, but what we have here is
esentially:


ram_save_complete()
{
   code not needed for mc
   common codo for migration and mc
   code not needed for mc
}

Similar code on ram_save_setup.  Yes, I know that there are some locking
issues here.


SHould we be able do do something like

__ram_save_complete()
{
    common code
}

mc_ram_save_complete()
{
    # Possible something else here
    __ram_save_complete()
}

rest_ram_save_complete()
{
    code not needed for mc
    __ram_save_complete()
    code not needed for mc
}

My problem here is that current code is already quite complex and
convoluted.  At some point we are going to need to change it to
something that is easier to understand?


> -enum {
> -    MIG_STATE_ERROR = -1,
> -    MIG_STATE_NONE,
> -    MIG_STATE_SETUP,
> -    MIG_STATE_CANCELLING,
> -    MIG_STATE_CANCELLED,
> -    MIG_STATE_ACTIVE,
> -    MIG_STATE_COMPLETED,
> -};
> -

Here comes the code seen on the previous patch O:-)

>  
> -static void migrate_set_state(MigrationState *s, int old_state, int 
> new_state)
> +bool migration_is_active(MigrationState *s)
> +{
> +    return (s->state == MIG_STATE_ACTIVE) || s->state == MIG_STATE_SETUP
> +            || s->state == MIG_STATE_CHECKPOINTING;
> +}

The whole idea of moving MIG_STATE_* to this file was to "force" all
other users to use accessor functions.  This way we know what the others
expect frum us.

> -    assert(s->state != MIG_STATE_ACTIVE);
> +    assert(!migration_is_active(s));

I can understand that we want here MIG_STATE_CHECKPOINTING, but _SETUP?
Or it is a bug on upstream?

Thanks, Juan.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]