Re: [Qemu-devel] [PATCH V2] vhost: fix a migration failed because of vhost region merge
From: peng.hao2
Subject: Re: [Qemu-devel] [PATCH V2] vhost: fix a migration failed because of vhost region merge
Date: Mon, 24 Jul 2017 23:15:12 +0800 (CST)
>On Mon, 24 Jul 2017 18:32:35 +0800 (CST)
><address@hidden> wrote:
>> > On Sun, 23 Jul 2017 20:46:11 +0800
>> > Peng Hao <address@hidden> wrote:
>>
>> > > When a guest with several hotplugged DIMMs is migrated, it fails
>> > > to resume on the destination. Regions on the source are merged,
>> > > but on the destination the order of realizing devices differs
>> > > from the source's, so while only part of the devices are realized
>> > > some regions cannot be merged yet. That may exceed the vhost
>> > > slot limit.
>> > >
>> > > Signed-off-by: Peng Hao <address@hidden>
>> > > Signed-off-by: Wang Yechao <address@hidden>
>> > > ---
>> > > hw/mem/pc-dimm.c | 2 +-
>> > > include/sysemu/sysemu.h | 1 +
>> > > vl.c | 5 +++++
>> > > 3 files changed, 7 insertions(+), 1 deletion(-)
>> > >
>> > > diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
>> > > index ea67b46..13f3db5 100644
>> > > --- a/hw/mem/pc-dimm.c
>> > > +++ b/hw/mem/pc-dimm.c
>> > > @@ -101,7 +101,7 @@ void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
>> > >          goto out;
>> > >      }
>> > >
>> > > -    if (!vhost_has_free_slot()) {
>> > > +    if (!vhost_has_free_slot() && qemu_is_machine_init_done()) {
>> > >          error_setg(&local_err, "a used vhost backend has no free"
>> > >                     " memory slots left");
>> > that doesn't fix the issue:
>> > 1st: the number of used entries keeps changing after machine_init_done()
>> > is called, as regions continue to be mapped/unmapped at runtime
>> > 2nd: it introduces a regression and allows starting QEMU with more
>> > memory regions than the backend supports, which, combined with the
>> > missing error handling in vhost, will lead to QEMU crashes or obscure
>> > bugs in the guest breaking vhost-enabled drivers.
>> > i.e. the patch undoes what was fixed by
>> >
>> > https://lists.gnu.org/archive/html/qemu-devel/2015-10/msg00789.html
>> I don't think I undo the previous patch. There are two scenarios:
>>
>> Hotplugging memory happens after machine_init_done(), so the modified
>> code has no influence there.
>>
>> A VM with hotplugged memory is migrated. On the source its regions are
>> fewer than the backend supports, so the destination should also satisfy
>> the limit. While the VM is being restored its regions may temporarily
>> exceed what the backend supports, but after machine_init_done() they
>> are again within the limit.
>here is a simulation with vhost-kernel where
>/sys/module/vhost/parameters/max_mem_regions is set to 8
>so the limit looks like vhost-user's:
>
>qemu-system-x86_64 --enable-kvm -m 128,slots=256,maxmem=1T \
> -netdev type=tap,id=guest0,vhost=on,script=/bin/true,vhostforce \
> -device virtio-net-pci,netdev=guest0 \
> `i=0; while [ $i -lt 10 ]; do echo "-object memory-backend-ram,id=m$i,size=128M -device pc-dimm,id=d$i,memdev=m$i"; i=$(($i + 1)); done`
>
> it ends up with 12 used_memslots and prints the following error messages:
> qemu-system-x86_64: vhost_set_mem_table failed: Argument list too long (7)
> qemu-system-x86_64: unable to start vhost net: 7: falling back on userspace virtio
>
> the above CLI should fail to start up, as it is above the supported limit
> even with merging (with merging the number of available slots is a
> 'random' number, and merging could happen regardless of the order in
> which devices are created).
>
> vhost_dev_init() also has vhost_backend_memslots_limit() checks, and
> skipping the check in pc_dimm_memory_plug() might lead to a failure later
> in vhost_dev_init() - I'm not sure when it's called and what consequences
> that would have.
I don't think we are describing the same scenario. I mean that the VM's
hotplugged memory on the source is below the limit, and after migration
the VM on the destination should also satisfy the limit.

Before vhost_dev_init(), device initialization does not affect the vhost
used_memslots statistics (because there is no vhost device yet), so when
vhost_dev_init() is called the used_memslots statistics start from zero.

I tested my migration scenario. After migration, printing used_memslots
in vhost_set_memory on the destination gives this (with my patch):
used_memslots : 1
used_memslots : 2
used_memslots : 1
used_memslots : 2
used_memslots : 3
used_memslots : 4
used_memslots : 5
used_memslots : 6
used_memslots : 7
used_memslots : 8
used_memslots : 9
used_memslots : 10
used_memslots : 9
used_memslots : 10
used_memslots : 9
used_memslots : 10
used_memslots : 11
used_memslots : 12
used_memslots : 11
used_memslots : 10
used_memslots : 9
used_memslots : 9
used_memslots : 9
used_memslots : 9
used_memslots : 8
> >          goto out;
> > diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> > index b213696..48228ad 100644
> > --- a/include/sysemu/sysemu.h
> > +++ b/include/sysemu/sysemu.h
> > @@ -88,6 +88,7 @@ void qemu_system_guest_panicked(GuestPanicInformation *info);
> >  void qemu_add_exit_notifier(Notifier *notify);
> >  void qemu_remove_exit_notifier(Notifier *notify);
> >
> > +bool qemu_is_machine_init_done(void);
> >  void qemu_add_machine_init_done_notifier(Notifier *notify);
> >  void qemu_remove_machine_init_done_notifier(Notifier *notify);
> >
> > diff --git a/vl.c b/vl.c
> > index fb6b2ef..43aee22 100644
> > --- a/vl.c
> > +++ b/vl.c
> > @@ -2681,6 +2681,11 @@ static void qemu_run_exit_notifiers(void)
> >
> >  static bool machine_init_done;
> >
> > +bool qemu_is_machine_init_done(void)
> > +{
> > +    return machine_init_done;
> > +}
> > +
> >  void qemu_add_machine_init_done_notifier(Notifier *notify)
> >  {
> >      notifier_list_add(&machine_init_done_notifiers, notify);