From: Jitendra Kolhe
Subject: Re: [Qemu-devel] [PATCH v2] migration: skip sending ram pages released by virtio-balloon driver.
Date: Tue, 5 Apr 2016 15:34:18 +0530
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.7.0

On 3/31/2016 10:09 PM, Dr. David Alan Gilbert wrote:
> * Jitendra Kolhe (address@hidden) wrote:
>> While measuring live migration performance for a qemu/kvm guest, it
>> was observed that qemu doesn't maintain any intelligence for guest
>> ram pages which are released by the guest balloon driver, and treats
>> such pages as any other normal guest ram pages. This has a direct
>> impact on the overall migration time for a guest which has released
>> (ballooned out) memory to the host.

Sorry for the delayed response.

> 
> Hi Jitendra,
>   I've read over the patch and I've got a mix of comments; I've not read
> it in full detail:
> 
>    a) It does need splitting up; it's a bit big to review in one go;
>       I suggest you split it into the main code, and separately the bitmap
>       save/load.  It might be worth splitting it up even more.

Sure, will post a patch series in the next version.

> 
>    b) in balloon_bitmap_load check the next and len fields; since it's read
>       over the wire we've got to treat them as hostile; so check they don't
>       run over the length of the bitmap.

Will check the validity of the next and len fields. Would an assert
make sense in case the values read from the wire are wrong?
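
Or, if an assert is too harsh for hostile wire data, qemu_balloon_bitmap_load()
could fail the load cleanly instead - an untested sketch (the error value and
exact placement are just illustrative):

    while (tmpoffset < tmplimit) {
        next = qemu_get_be64(f);
        len  = qemu_get_be64(f);
        if (len == 0) {
            break;
        }
        /* next/len come off the wire; treat them as hostile and reject
         * anything that would run past the end of the bitmap */
        if (next >= balloon_bitmap_pages ||
            len > balloon_bitmap_pages - next) {
            rcu_read_unlock();
            qemu_mutex_unlock_balloon_bitmap();
            return -EINVAL;
        }
        bitmap_set(bitmap, next, len);
        tmpoffset = next + len;
    }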

> 
>    c) The bitmap_load/save needs to be tied to the machine type or something
>       that means that if you were migrating in a stream from an older qemu
>       it wouldn't get upset when it tried to read the extra data you read.
>       I prefer if it's tied to either the config setting or the new machine
>       type (that way backwards migration works as well).

Thanks, migration from an older qemu to qemu+patch and back is a problem.
We already have a configuration option “skip-balloon” to disable the
optimization (testing against the bitmap during migration), but bitmap_load/save
are independent of this option. Will extend the configuration option to cover
bitmap_load/save.
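
On the save side that could be as simple as the following (a rough sketch,
assuming the migrate_skip_balloon() capability check is an acceptable guard;
the load side would also need the machine-type / stream versioning you
mention, so the destination knows whether the bitmap data is present at all):

    /* in virtio_balloon_save_device(), sketch only: emit the bitmap
     * only when the skip-balloon capability has been negotiated */
    if (migrate_skip_balloon()) {
        qemu_balloon_bitmap_save(f);
    }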

> 
>    d) I agree with the other comments that the stuff looking up the ram blocks
>       addressing looks wrong; you use last_ram_offset() to size the bitmap,
>       so it makes me think it's the whole of the ram_addr_t; but I think you're
>       saying you're not interested in all of it.  However remember that
>       the order of ram_addr_t is not stable between two qemu's - even something
>       like hotplugging an ethercard in one qemu vs having it on the command
>       line on the other can change that order; so anything going over the wire
>       has to be block+offset-into-block. Also remember that last_ram_offset()
>       includes annoying things like firmware RAM, video memory and all those
>       things;
> 

Yes, we are using last_ram_offset() to create the bitmap, but the offsetting
into the bitmap was done wrongly in the patch (especially for a hotplugged
ramblock). Will change the code to calculate the ram_addr_t using
qemu_ram_block_from_host() rather than just using section.offset_within_region.
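
Roughly like this in virtio_balloon_handle_output() (an untested sketch,
assuming the existing qemu_ram_block_from_host(ptr, round_offset, &ram_addr)
helper; variable names are from the patch):

    void *host = memory_region_get_ram_ptr(section.mr) + addr;
    ram_addr_t ram_addr;
    RAMBlock *rb = qemu_ram_block_from_host(host, false, &ram_addr);

    balloon_page(host, !!(vq == s->dvq));
    if (rb) {
        /* index the bitmap by the global ram_addr_t rather than the
         * offset within the (possibly hotplugged) memory region */
        qemu_balloon_bitmap_update(ram_addr, !!(vq == s->dvq));
    }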

>    e) It should be possible to get it to work for postcopy if you just use
>       it as a way to speed up the zero detection but still send the zero page
>       messages.
> 

In case of postcopy, bitmap_save will happen after vm_stop(),
which will have a direct impact on downtime. Would that be fine?
In case of precopy the major chunk of the bitmap is copied as
part of ram_save_iterate itself.
Zero page detection can be enabled for postcopy. Can't ballooned-out
zero page messages be skipped in postcopy as is done for
precopy, or am I missing something?
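
Concretely, my understanding of your suggestion is something like this in
save_zero_page() when postcopy is active (sketch only): use the bitmap purely
to short-circuit the is_zero_range() scan, but still emit the zero-page
message so the destination's page state stays in sync:

    static int save_zero_page(QEMUFile *f, RAMBlock *block, ram_addr_t offset,
                              uint8_t *p, uint64_t *bytes_transferred)
    {
        int pages = -1;
        /* a ballooned-out page is known to read back as zero */
        bool ballooned = (qemu_balloon_bitmap_test(block, offset) == 1);

        if (ballooned || is_zero_range(p, TARGET_PAGE_SIZE)) {
            acct_info.dup_pages++;
            *bytes_transferred += save_page_header(f, block,
                                                   offset | RAM_SAVE_FLAG_COMPRESS);
            qemu_put_byte(f, 0);
            *bytes_transferred += 1;
            pages = 1;
        }
        return pages;
    }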

> Dave
> 
>>
>> In case of large systems, where we can configure large guests with 1TB,
>> and with a considerable amount of memory released by the balloon driver to
>> the host, the migration time gets worse.
>>
>> The solution proposed below is local to qemu only (and does not require
>> any modification to the Linux kernel or any guest driver). We have verified
>> the fix for large guests (>= 1TB) on HPE Superdome X (which can support up
>> to 240 cores and 12TB of memory); in a case where 90% of memory is
>> released by the balloon driver, the migration time for an idle guest reduces
>> to ~600 secs from ~1200 secs.
>>
>> Detail: During live migration, as part of the 1st iteration, ram_save_iterate()
>> -> ram_find_and_save_block() will try to migrate ram pages which are
>> released by the virtio-balloon driver as part of dynamic memory delete.
>> Even though the pages which are returned to the host by the virtio-balloon
>> driver are zero pages, the migration algorithm will still end up
>> scanning each entire page: ram_find_and_save_block() -> ram_save_page/
>> ram_save_compressed_page -> save_zero_page() -> is_zero_range(). We
>> also end up sending some control information over the network for these
>> pages during migration. This adds to the total migration time.
>>
>> The proposed fix uses the existing bitmap infrastructure to create
>> a virtio-balloon bitmap. Each bit in the bitmap represents a guest ram
>> page of size 1UL << VIRTIO_BALLOON_PFN_SHIFT. The bitmap covers the
>> entire guest ram up to the maximum configured memory. Guest ram pages
>> claimed by the virtio-balloon driver are represented by 1 in the
>> bitmap. During live migration, each guest ram page (host VA offset)
>> is checked against the virtio-balloon bitmap; if the bit is set, the
>> corresponding ram page is excluded from scanning and from sending
>> control information during migration. The bitmap is also migrated to
>> the target as part of every ram_save_iterate loop, and after the
>> guest is stopped the remaining balloon bitmap is migrated as part of
>> the balloon driver save/load interface.
>>
>> With the proposed fix, the average migration time for an idle guest
>> with 1TB maximum memory and 64 vCPUs:
>>  - reduces from ~1200 secs to ~600 secs, with guest memory ballooned
>>    down to 128GB (~10% of 1TB).
>>  - reduces from ~1300 secs to ~1200 secs (7%), with guest memory ballooned
>>    down to 896GB (~90% of 1TB).
>>  - with no ballooning configured, we don't expect to see any impact
>>    on total migration time.
>>
>> The optimization gets temporarily disabled if a balloon operation is
>> in progress. Since the optimization skips scanning and migrating control
>> information for ballooned-out pages, we might skip guest ram pages in
>> cases where the guest balloon driver has freed the ram page to the guest
>> but not yet informed the host/qemu about the ram page
>> (VIRTIO_BALLOON_F_MUST_TELL_HOST). In such a case, with the optimization,
>> we might skip migrating ram pages which the guest is using. Since this
>> problem is specific to a balloon leak, we can restrict the
>> balloon-operation-in-progress check to only a balloon leak in progress.
>>
>> The optimization also gets permanently disabled (for all subsequent
>> migrations) in case any migration uses the postcopy capability. In case
>> of postcopy the balloon bitmap would need to be sent after vm_stop,
>> which has a significant impact on the downtime. Moreover, applications
>> in the guest won't actually be faulting on ram pages which are
>> already ballooned out, so the proposed optimization would not show any
>> improvement in migration time during postcopy.
>>
>> Signed-off-by: Jitendra Kolhe <address@hidden>
>> ---
>> Changed in v2:
>>  - Resolved compilation issue for qemu-user binaries in exec.c
>>  - Localize balloon bitmap test to save_zero_page().
>>  - Updated version string for newly added migration capability to 2.7.
>>  - Made minor modifications to patch commit text.
>>
>>  balloon.c                          | 253 ++++++++++++++++++++++++++++++++++++-
>>  exec.c                             |   3 +
>>  hw/virtio/virtio-balloon.c         |  35 ++++-
>>  include/hw/virtio/virtio-balloon.h |   1 +
>>  include/migration/migration.h      |   1 +
>>  include/sysemu/balloon.h           |  15 ++-
>>  migration/migration.c              |   9 ++
>>  migration/ram.c                    |  31 ++++-
>>  qapi-schema.json                   |   5 +-
>>  9 files changed, 341 insertions(+), 12 deletions(-)
>>
>> diff --git a/balloon.c b/balloon.c
>> index f2ef50c..1c2d228 100644
>> --- a/balloon.c
>> +++ b/balloon.c
>> @@ -33,11 +33,34 @@
>>  #include "qmp-commands.h"
>>  #include "qapi/qmp/qerror.h"
>>  #include "qapi/qmp/qjson.h"
>> +#include "exec/ram_addr.h"
>> +#include "migration/migration.h"
>> +
>> +#define BALLOON_BITMAP_DISABLE_FLAG -1UL
>> +
>> +typedef enum {
>> +    BALLOON_BITMAP_DISABLE_NONE = 1, /* Enabled */
>> +    BALLOON_BITMAP_DISABLE_CURRENT,
>> +    BALLOON_BITMAP_DISABLE_PERMANENT,
>> +} BalloonBitmapDisableState;
>>  
>>  static QEMUBalloonEvent *balloon_event_fn;
>>  static QEMUBalloonStatus *balloon_stat_fn;
>> +static QEMUBalloonInProgress *balloon_in_progress_fn;
>>  static void *balloon_opaque;
>>  static bool balloon_inhibited;
>> +static unsigned long balloon_bitmap_pages;
>> +static unsigned int  balloon_bitmap_pfn_shift;
>> +static QemuMutex balloon_bitmap_mutex;
>> +static bool balloon_bitmap_xfered;
>> +static unsigned long balloon_min_bitmap_offset;
>> +static unsigned long balloon_max_bitmap_offset;
>> +static BalloonBitmapDisableState balloon_bitmap_disable_state;
>> +
>> +static struct BitmapRcu {
>> +    struct rcu_head rcu;
>> +    unsigned long *bmap;
>> +} *balloon_bitmap_rcu;
>>  
>>  bool qemu_balloon_is_inhibited(void)
>>  {
>> @@ -49,6 +72,21 @@ void qemu_balloon_inhibit(bool state)
>>      balloon_inhibited = state;
>>  }
>>  
>> +void qemu_mutex_lock_balloon_bitmap(void)
>> +{
>> +    qemu_mutex_lock(&balloon_bitmap_mutex);
>> +}
>> +
>> +void qemu_mutex_unlock_balloon_bitmap(void)
>> +{
>> +    qemu_mutex_unlock(&balloon_bitmap_mutex);
>> +}
>> +
>> +void qemu_balloon_reset_bitmap_data(void)
>> +{
>> +    balloon_bitmap_xfered = false;
>> +}
>> +
>>  static bool have_balloon(Error **errp)
>>  {
>>      if (kvm_enabled() && !kvm_has_sync_mmu()) {
>> @@ -65,9 +103,12 @@ static bool have_balloon(Error **errp)
>>  }
>>  
>>  int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
>> -                             QEMUBalloonStatus *stat_func, void *opaque)
>> +                             QEMUBalloonStatus *stat_func,
>> +                             QEMUBalloonInProgress *in_progress_func,
>> +                             void *opaque, int pfn_shift)
>>  {
>> -    if (balloon_event_fn || balloon_stat_fn || balloon_opaque) {
>> +    if (balloon_event_fn || balloon_stat_fn ||
>> +        balloon_in_progress_fn || balloon_opaque) {
>>          /* We're already registered one balloon handler.  How many can
>>           * a guest really have?
>>           */
>> @@ -75,17 +116,39 @@ int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
>>      }
>>      balloon_event_fn = event_func;
>>      balloon_stat_fn = stat_func;
>> +    balloon_in_progress_fn = in_progress_func;
>>      balloon_opaque = opaque;
>> +
>> +    qemu_mutex_init(&balloon_bitmap_mutex);
>> +    balloon_bitmap_disable_state = BALLOON_BITMAP_DISABLE_NONE;
>> +    balloon_bitmap_pfn_shift = pfn_shift;
>> +    balloon_bitmap_pages = (last_ram_offset() >> balloon_bitmap_pfn_shift);
>> +    balloon_bitmap_rcu = g_new0(struct BitmapRcu, 1);
>> +    balloon_bitmap_rcu->bmap = bitmap_new(balloon_bitmap_pages);
>> +    bitmap_clear(balloon_bitmap_rcu->bmap, 0, balloon_bitmap_pages);
>> +
>>      return 0;
>>  }
>>  
>> +static void balloon_bitmap_free(struct BitmapRcu *bmap)
>> +{
>> +    g_free(bmap->bmap);
>> +    g_free(bmap);
>> +}
>> +
>>  void qemu_remove_balloon_handler(void *opaque)
>>  {
>> +    struct BitmapRcu *bitmap = balloon_bitmap_rcu;
>>      if (balloon_opaque != opaque) {
>>          return;
>>      }
>> +    atomic_rcu_set(&balloon_bitmap_rcu, NULL);
>> +    if (bitmap) {
>> +        call_rcu(bitmap, balloon_bitmap_free, rcu);
>> +    }
>>      balloon_event_fn = NULL;
>>      balloon_stat_fn = NULL;
>> +    balloon_in_progress_fn = NULL;
>>      balloon_opaque = NULL;
>>  }
>>  
>> @@ -116,3 +179,189 @@ void qmp_balloon(int64_t target, Error **errp)
>>      trace_balloon_event(balloon_opaque, target);
>>      balloon_event_fn(balloon_opaque, target);
>>  }
>> +
>> +/* Handle Ram hotplug case, only called in case old < new */
>> +int qemu_balloon_bitmap_extend(ram_addr_t old, ram_addr_t new)
>> +{
>> +    struct BitmapRcu *old_bitmap = balloon_bitmap_rcu, *bitmap;
>> +    unsigned long old_offset, new_offset;
>> +
>> +    if (!balloon_bitmap_rcu) {
>> +        return -1;
>> +    }
>> +
>> +    old_offset = (old >> balloon_bitmap_pfn_shift);
>> +    new_offset = (new >> balloon_bitmap_pfn_shift);
>> +
>> +    bitmap = g_new(struct BitmapRcu, 1);
>> +    bitmap->bmap = bitmap_new(new_offset);
>> +
>> +    qemu_mutex_lock_balloon_bitmap();
>> +    bitmap_clear(bitmap->bmap, 0,
>> +                 balloon_bitmap_pages + new_offset - old_offset);
>> +    bitmap_copy(bitmap->bmap, old_bitmap->bmap, old_offset);
>> +
>> +    atomic_rcu_set(&balloon_bitmap_rcu, bitmap);
>> +    balloon_bitmap_pages += new_offset - old_offset;
>> +    qemu_mutex_unlock_balloon_bitmap();
>> +    call_rcu(old_bitmap, balloon_bitmap_free, rcu);
>> +
>> +    return 0;
>> +}
>> +
>> +/* Should be called with balloon bitmap mutex lock held */
>> +int qemu_balloon_bitmap_update(ram_addr_t addr, int deflate)
>> +{
>> +    unsigned long *bitmap;
>> +    unsigned long offset = 0;
>> +
>> +    if (!balloon_bitmap_rcu) {
>> +        return -1;
>> +    }
>> +    offset = (addr >> balloon_bitmap_pfn_shift);
>> +    if (balloon_bitmap_xfered) {
>> +        if (offset < balloon_min_bitmap_offset) {
>> +            balloon_min_bitmap_offset = offset;
>> +        }
>> +        if (offset > balloon_max_bitmap_offset) {
>> +            balloon_max_bitmap_offset = offset;
>> +        }
>> +    }
>> +
>> +    rcu_read_lock();
>> +    bitmap = atomic_rcu_read(&balloon_bitmap_rcu)->bmap;
>> +    if (deflate == 0) {
>> +        set_bit(offset, bitmap);
>> +    } else {
>> +        clear_bit(offset, bitmap);
>> +    }
>> +    rcu_read_unlock();
>> +    return 0;
>> +}
>> +
>> +void qemu_balloon_bitmap_setup(void)
>> +{
>> +    if (migrate_postcopy_ram()) {
>> +        balloon_bitmap_disable_state = BALLOON_BITMAP_DISABLE_PERMANENT;
>> +    } else if ((!balloon_bitmap_rcu || !migrate_skip_balloon()) &&
>> +               (balloon_bitmap_disable_state !=
>> +                BALLOON_BITMAP_DISABLE_PERMANENT)) {
>> +        balloon_bitmap_disable_state = BALLOON_BITMAP_DISABLE_CURRENT;
>> +    }
>> +}
>> +
>> +int qemu_balloon_bitmap_test(RAMBlock *rb, ram_addr_t addr)
>> +{
>> +    unsigned long *bitmap;
>> +    ram_addr_t base;
>> +    unsigned long nr = 0;
>> +    int ret = 0;
>> +
>> +    if (balloon_bitmap_disable_state == BALLOON_BITMAP_DISABLE_CURRENT ||
>> +        balloon_bitmap_disable_state == BALLOON_BITMAP_DISABLE_PERMANENT) {
>> +        return 0;
>> +    }
>> +    balloon_in_progress_fn(balloon_opaque, &ret);
>> +    if (ret == 1) {
>> +        return 0;
>> +    }
>> +
>> +    rcu_read_lock();
>> +    bitmap = atomic_rcu_read(&balloon_bitmap_rcu)->bmap;
>> +    base = rb->offset >> balloon_bitmap_pfn_shift;
>> +    nr = base + (addr >> balloon_bitmap_pfn_shift);
>> +    if (test_bit(nr, bitmap)) {
>> +        ret = 1;
>> +    }
>> +    rcu_read_unlock();
>> +    return ret;
>> +}
>> +
>> +int qemu_balloon_bitmap_save(QEMUFile *f)
>> +{
>> +    unsigned long *bitmap;
>> +    unsigned long offset = 0, next = 0, len = 0;
>> +    unsigned long tmpoffset = 0, tmplimit = 0;
>> +
>> +    if (balloon_bitmap_disable_state == BALLOON_BITMAP_DISABLE_PERMANENT) {
>> +        qemu_put_be64(f, BALLOON_BITMAP_DISABLE_FLAG);
>> +        return 0;
>> +    }
>> +
>> +    qemu_mutex_lock_balloon_bitmap();
>> +    if (balloon_bitmap_xfered) {
>> +        tmpoffset = balloon_min_bitmap_offset;
>> +        tmplimit  = balloon_max_bitmap_offset;
>> +    } else {
>> +        balloon_bitmap_xfered = true;
>> +        tmpoffset = offset;
>> +        tmplimit  = balloon_bitmap_pages;
>> +    }
>> +
>> +    balloon_min_bitmap_offset = balloon_bitmap_pages;
>> +    balloon_max_bitmap_offset = 0;
>> +
>> +    qemu_put_be64(f, balloon_bitmap_pages);
>> +    qemu_put_be64(f, tmpoffset);
>> +    qemu_put_be64(f, tmplimit);
>> +    rcu_read_lock();
>> +    bitmap = atomic_rcu_read(&balloon_bitmap_rcu)->bmap;
>> +    while (tmpoffset < tmplimit) {
>> +        unsigned long next_set_bit, start_set_bit;
>> +        next_set_bit = find_next_bit(bitmap, balloon_bitmap_pages, tmpoffset);
>> +        start_set_bit = next_set_bit;
>> +        if (next_set_bit == balloon_bitmap_pages) {
>> +            len = 0;
>> +            next = start_set_bit;
>> +            qemu_put_be64(f, next);
>> +            qemu_put_be64(f, len);
>> +            break;
>> +        }
>> +        next_set_bit = find_next_zero_bit(bitmap,
>> +                                          balloon_bitmap_pages,
>> +                                          ++next_set_bit);
>> +        len = (next_set_bit - start_set_bit);
>> +        next = start_set_bit;
>> +        qemu_put_be64(f, next);
>> +        qemu_put_be64(f, len);
>> +        tmpoffset = next + len;
>> +    }
>> +    rcu_read_unlock();
>> +    qemu_mutex_unlock_balloon_bitmap();
>> +    return 0;
>> +}
>> +
>> +int qemu_balloon_bitmap_load(QEMUFile *f)
>> +{
>> +    unsigned long *bitmap;
>> +    unsigned long next = 0, len = 0;
>> +    unsigned long tmpoffset = 0, tmplimit = 0;
>> +
>> +    if (!balloon_bitmap_rcu) {
>> +        return -1;
>> +    }
>> +
>> +    qemu_mutex_lock_balloon_bitmap();
>> +    balloon_bitmap_pages = qemu_get_be64(f);
>> +    if (balloon_bitmap_pages == BALLOON_BITMAP_DISABLE_FLAG) {
>> +        balloon_bitmap_disable_state = BALLOON_BITMAP_DISABLE_PERMANENT;
>> +        qemu_mutex_unlock_balloon_bitmap();
>> +        return 0;
>> +    }
>> +    tmpoffset = qemu_get_be64(f);
>> +    tmplimit  = qemu_get_be64(f);
>> +    rcu_read_lock();
>> +    bitmap = atomic_rcu_read(&balloon_bitmap_rcu)->bmap;
>> +    while (tmpoffset < tmplimit) {
>> +        next = qemu_get_be64(f);
>> +        len  = qemu_get_be64(f);
>> +        if (len == 0) {
>> +            break;
>> +        }
>> +        bitmap_set(bitmap, next, len);
>> +        tmpoffset = next + len;
>> +    }
>> +    rcu_read_unlock();
>> +    qemu_mutex_unlock_balloon_bitmap();
>> +    return 0;
>> +}
>> diff --git a/exec.c b/exec.c
>> index f398d21..7a448e5 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -43,6 +43,7 @@
>>  #else /* !CONFIG_USER_ONLY */
>>  #include "sysemu/xen-mapcache.h"
>>  #include "trace.h"
>> +#include "sysemu/balloon.h"
>>  #endif
>>  #include "exec/cpu-all.h"
>>  #include "qemu/rcu_queue.h"
>> @@ -1610,6 +1611,8 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
>>      if (new_ram_size > old_ram_size) {
>>          migration_bitmap_extend(old_ram_size, new_ram_size);
>>          dirty_memory_extend(old_ram_size, new_ram_size);
>> +        qemu_balloon_bitmap_extend(old_ram_size << TARGET_PAGE_BITS,
>> +                                   new_ram_size << TARGET_PAGE_BITS);
>>      }
>>      /* Keep the list sorted from biggest to smallest block.  Unlike QTAILQ,
>>       * QLIST (which has an RCU-friendly variant) does not have insertion at
>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
>> index 22ad25c..9f3a4c8 100644
>> --- a/hw/virtio/virtio-balloon.c
>> +++ b/hw/virtio/virtio-balloon.c
>> @@ -27,6 +27,7 @@
>>  #include "qapi/visitor.h"
>>  #include "qapi-event.h"
>>  #include "trace.h"
>> +#include "migration/migration.h"
>>  
>>  #if defined(__linux__)
>>  #include <sys/mman.h>
>> @@ -214,11 +215,13 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>>      VirtQueueElement *elem;
>>      MemoryRegionSection section;
>>  
>> +    qemu_mutex_lock_balloon_bitmap();
>>      for (;;) {
>>          size_t offset = 0;
>>          uint32_t pfn;
>>          elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
>>          if (!elem) {
>> +            qemu_mutex_unlock_balloon_bitmap();
>>              return;
>>          }
>>  
>> @@ -242,6 +245,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>>              addr = section.offset_within_region;
>>              balloon_page(memory_region_get_ram_ptr(section.mr) + addr,
>>                           !!(vq == s->dvq));
>> +            qemu_balloon_bitmap_update(addr, !!(vq == s->dvq));
>>              memory_region_unref(section.mr);
>>          }
>>  
>> @@ -249,6 +253,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>>          virtio_notify(vdev, vq);
>>          g_free(elem);
>>      }
>> +    qemu_mutex_unlock_balloon_bitmap();
>>  }
>>  
>>  static void virtio_balloon_receive_stats(VirtIODevice *vdev, VirtQueue *vq)
>> @@ -303,6 +308,16 @@ out:
>>      }
>>  }
>>  
>> +static void virtio_balloon_migration_state_changed(Notifier *notifier,
>> +                                                   void *data)
>> +{
>> +    MigrationState *mig = data;
>> +
>> +    if (migration_has_failed(mig)) {
>> +        qemu_balloon_reset_bitmap_data();
>> +    }
>> +}
>> +
>>  static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
>>  {
>>      VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
>> @@ -382,6 +397,16 @@ static void virtio_balloon_stat(void *opaque, BalloonInfo *info)
>>                                               VIRTIO_BALLOON_PFN_SHIFT);
>>  }
>>  
>> +static void virtio_balloon_in_progress(void *opaque, int *status)
>> +{
>> +    VirtIOBalloon *dev = VIRTIO_BALLOON(opaque);
>> +    if (cpu_to_le32(dev->actual) != cpu_to_le32(dev->num_pages)) {
>> +        *status = 1;
>> +        return;
>> +    }
>> +    *status = 0;
>> +}
>> +
>>  static void virtio_balloon_to_target(void *opaque, ram_addr_t target)
>>  {
>>      VirtIOBalloon *dev = VIRTIO_BALLOON(opaque);
>> @@ -409,6 +434,7 @@ static void virtio_balloon_save_device(VirtIODevice *vdev, QEMUFile *f)
>>  
>>      qemu_put_be32(f, s->num_pages);
>>      qemu_put_be32(f, s->actual);
>> +    qemu_balloon_bitmap_save(f);
>>  }
>>  
>>  static int virtio_balloon_load(QEMUFile *f, void *opaque, int version_id)
>> @@ -426,6 +452,7 @@ static int virtio_balloon_load_device(VirtIODevice *vdev, QEMUFile *f,
>>  
>>      s->num_pages = qemu_get_be32(f);
>>      s->actual = qemu_get_be32(f);
>> +    qemu_balloon_bitmap_load(f);
>>      return 0;
>>  }
>>  
>> @@ -439,7 +466,9 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
>>                  sizeof(struct virtio_balloon_config));
>>  
>>      ret = qemu_add_balloon_handler(virtio_balloon_to_target,
>> -                                   virtio_balloon_stat, s);
>> +                                   virtio_balloon_stat,
>> +                                   virtio_balloon_in_progress, s,
>> +                                   VIRTIO_BALLOON_PFN_SHIFT);
>>  
>>      if (ret < 0) {
>>          error_setg(errp, "Only one balloon device is supported");
>> @@ -453,6 +482,9 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
>>  
>>      reset_stats(s);
>>  
>> +    s->migration_state_notifier.notify = virtio_balloon_migration_state_changed;
>> +    add_migration_state_change_notifier(&s->migration_state_notifier);
>> +
>>      register_savevm(dev, "virtio-balloon", -1, 1,
>>                      virtio_balloon_save, virtio_balloon_load, s);
>>  }
>> @@ -462,6 +494,7 @@ static void virtio_balloon_device_unrealize(DeviceState *dev, Error **errp)
>>      VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>>      VirtIOBalloon *s = VIRTIO_BALLOON(dev);
>>  
>> +    remove_migration_state_change_notifier(&s->migration_state_notifier);
>>      balloon_stats_destroy_timer(s);
>>      qemu_remove_balloon_handler(s);
>>      unregister_savevm(dev, "virtio-balloon", s);
>> diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
>> index 35f62ac..1ded5a9 100644
>> --- a/include/hw/virtio/virtio-balloon.h
>> +++ b/include/hw/virtio/virtio-balloon.h
>> @@ -43,6 +43,7 @@ typedef struct VirtIOBalloon {
>>      int64_t stats_last_update;
>>      int64_t stats_poll_interval;
>>      uint32_t host_features;
>> +    Notifier migration_state_notifier;
>>  } VirtIOBalloon;
>>  
>>  #endif
>> diff --git a/include/migration/migration.h b/include/migration/migration.h
>> index ac2c12c..6c1d1af 100644
>> --- a/include/migration/migration.h
>> +++ b/include/migration/migration.h
>> @@ -267,6 +267,7 @@ void migrate_del_blocker(Error *reason);
>>  
>>  bool migrate_postcopy_ram(void);
>>  bool migrate_zero_blocks(void);
>> +bool migrate_skip_balloon(void);
>>  
>>  bool migrate_auto_converge(void);
>>  
>> diff --git a/include/sysemu/balloon.h b/include/sysemu/balloon.h
>> index 3f976b4..5325c38 100644
>> --- a/include/sysemu/balloon.h
>> +++ b/include/sysemu/balloon.h
>> @@ -15,14 +15,27 @@
>>  #define _QEMU_BALLOON_H
>>  
>>  #include "qapi-types.h"
>> +#include "migration/qemu-file.h"
>>  
>>  typedef void (QEMUBalloonEvent)(void *opaque, ram_addr_t target);
>>  typedef void (QEMUBalloonStatus)(void *opaque, BalloonInfo *info);
>> +typedef void (QEMUBalloonInProgress) (void *opaque, int *status);
>>  
>>  int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
>> -                         QEMUBalloonStatus *stat_func, void *opaque);
>> +                             QEMUBalloonStatus *stat_func,
>> +                             QEMUBalloonInProgress *progress_func,
>> +                             void *opaque, int pfn_shift);
>>  void qemu_remove_balloon_handler(void *opaque);
>>  bool qemu_balloon_is_inhibited(void);
>>  void qemu_balloon_inhibit(bool state);
>> +void qemu_mutex_lock_balloon_bitmap(void);
>> +void qemu_mutex_unlock_balloon_bitmap(void);
>> +void qemu_balloon_reset_bitmap_data(void);
>> +void qemu_balloon_bitmap_setup(void);
>> +int qemu_balloon_bitmap_extend(ram_addr_t old, ram_addr_t new);
>> +int qemu_balloon_bitmap_update(ram_addr_t addr, int deflate);
>> +int qemu_balloon_bitmap_test(RAMBlock *rb, ram_addr_t addr);
>> +int qemu_balloon_bitmap_save(QEMUFile *f);
>> +int qemu_balloon_bitmap_load(QEMUFile *f);
>>  
>>  #endif
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 034a918..cb86307 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -1200,6 +1200,15 @@ int migrate_use_xbzrle(void)
>>      return s->enabled_capabilities[MIGRATION_CAPABILITY_XBZRLE];
>>  }
>>  
>> +bool migrate_skip_balloon(void)
>> +{
>> +    MigrationState *s;
>> +
>> +    s = migrate_get_current();
>> +
>> +    return s->enabled_capabilities[MIGRATION_CAPABILITY_SKIP_BALLOON];
>> +}
>> +
>>  int64_t migrate_xbzrle_cache_size(void)
>>  {
>>      MigrationState *s;
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 704f6a9..161ab73 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -40,6 +40,7 @@
>>  #include "trace.h"
>>  #include "exec/ram_addr.h"
>>  #include "qemu/rcu_queue.h"
>> +#include "sysemu/balloon.h"
>>  
>>  #ifdef DEBUG_MIGRATION_RAM
>>  #define DPRINTF(fmt, ...) \
>> @@ -65,6 +66,7 @@ static uint64_t bitmap_sync_count;
>>  #define RAM_SAVE_FLAG_XBZRLE   0x40
>>  /* 0x80 is reserved in migration.h start with 0x100 next */
>>  #define RAM_SAVE_FLAG_COMPRESS_PAGE    0x100
>> +#define RAM_SAVE_FLAG_BALLOON  0x200
>>  
>>  static const uint8_t ZERO_TARGET_PAGE[TARGET_PAGE_SIZE];
>>  
>> @@ -702,13 +704,17 @@ static int save_zero_page(QEMUFile *f, RAMBlock *block, ram_addr_t offset,
>>  {
>>      int pages = -1;
>>  
>> -    if (is_zero_range(p, TARGET_PAGE_SIZE)) {
>> -        acct_info.dup_pages++;
>> -        *bytes_transferred += save_page_header(f, block,
>> +    if (qemu_balloon_bitmap_test(block, offset) != 1) {
>> +        if (is_zero_range(p, TARGET_PAGE_SIZE)) {
>> +            acct_info.dup_pages++;
>> +            *bytes_transferred += save_page_header(f, block,
>>                                                 offset | RAM_SAVE_FLAG_COMPRESS);
>> -        qemu_put_byte(f, 0);
>> -        *bytes_transferred += 1;
>> -        pages = 1;
>> +            qemu_put_byte(f, 0);
>> +            *bytes_transferred += 1;
>> +            pages = 1;
>> +        }
>> +    } else {
>> +        pages = 0;
>>      }
>>  
>>      return pages;
>> @@ -773,7 +779,7 @@ static int ram_save_page(QEMUFile *f, PageSearchStatus *pss,
>>               * page would be stale
>>               */
>>              xbzrle_cache_zero_page(current_addr);
>> -        } else if (!ram_bulk_stage && migrate_use_xbzrle()) {
>> +        } else if (pages != 0 && !ram_bulk_stage && migrate_use_xbzrle()) {
> 
> Is this test the right way around - don't you want to try xbzrle if you've
> NOT sent a page?
> 

The page is ballooned out on the source itself, so we need not xbzrle-encode
and send the page. But I will change the code to cache_insert()
ZERO_TARGET_PAGE, so that even when the page gets ballooned back in on the
source (during migration), get_cached_data() would return the zero page.
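
i.e. something along these lines (sketch only; 'ballooned' here is a
hypothetical flag carried over from the earlier bitmap test in
save_zero_page()):

    if (ballooned && !ram_bulk_stage && migrate_use_xbzrle()) {
        /* seed the XBZRLE cache with the zero page so that, if the guest
         * balloons the page back in during migration, get_cached_data()
         * diffs against zeroes; ram.c already has a helper for this,
         * which boils down to cache_insert(XBZRLE.cache, current_addr,
         * ZERO_TARGET_PAGE, bitmap_sync_count) */
        xbzrle_cache_zero_page(current_addr);
    }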

Thanks,
- Jitendra

>>              pages = save_xbzrle_page(f, &p, current_addr, block,
>>                                       offset, last_stage, bytes_transferred);
>>              if (!last_stage) {
>> @@ -1355,6 +1361,9 @@ static int ram_find_and_save_block(QEMUFile *f, bool last_stage,
>>          }
>>  
>>          if (found) {
>> +            /* skip saving ram host page if the corresponding guest page
>> +             * is ballooned out
>> +             */
>>              pages = ram_save_host_page(ms, f, &pss,
>>                                         last_stage, bytes_transferred,
>>                                         dirty_ram_abs);
>> @@ -1959,6 +1968,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>  
>>      rcu_read_unlock();
>>  
>> +    qemu_balloon_bitmap_setup();
>>      ram_control_before_iterate(f, RAM_CONTROL_SETUP);
>>      ram_control_after_iterate(f, RAM_CONTROL_SETUP);
>>  
>> @@ -1984,6 +1994,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>>  
>>      ram_control_before_iterate(f, RAM_CONTROL_ROUND);
>>  
>> +    qemu_put_be64(f, RAM_SAVE_FLAG_BALLOON);
>> +    qemu_balloon_bitmap_save(f);
>> +
>>      t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>>      i = 0;
>>      while ((ret = qemu_file_rate_limit(f)) == 0) {
>> @@ -2493,6 +2506,10 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
>>              }
>>              break;
>>  
>> +        case RAM_SAVE_FLAG_BALLOON:
>> +            qemu_balloon_bitmap_load(f);
>> +            break;
>> +
>>          case RAM_SAVE_FLAG_COMPRESS:
>>              ch = qemu_get_byte(f);
>>              ram_handle_compressed(host, ch, TARGET_PAGE_SIZE);
>> diff --git a/qapi-schema.json b/qapi-schema.json
>> index 7f8d799..38163ca 100644
>> --- a/qapi-schema.json
>> +++ b/qapi-schema.json
>> @@ -544,11 +544,14 @@
>> #          been migrated, pulling the remaining pages along as needed. NOTE: If
>> #          the migration fails during postcopy the VM will fail.  (since 2.6)
>>  #
>> +# @skip-balloon: Skip scanning ram pages released by virtio-balloon driver.
>> +#          (since 2.7)
>> +#
>>  # Since: 1.2
>>  ##
>>  { 'enum': 'MigrationCapability',
>>    'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
>> -           'compress', 'events', 'postcopy-ram'] }
>> +           'compress', 'events', 'postcopy-ram', 'skip-balloon'] }
>>  
>>  ##
>>  # @MigrationCapabilityStatus
>> -- 
>> 1.8.3.1
>>
> --
> Dr. David Alan Gilbert / address@hidden / Manchester, UK
> 


