From: peng.hao2
Subject: Re: [Qemu-devel] [PATCH V2] vhost: fix a migration failed because of vhost region merge
Date: Mon, 24 Jul 2017 23:15:12 +0800 (CST)

> On Mon, 24 Jul 2017 18:32:35 +0800 (CST)
> <address@hidden> wrote:

>> > On Sun, 23 Jul 2017 20:46:11 +0800  

>> > Peng Hao <address@hidden> wrote:  
>> 
>> > > When a guest with several hotplugged DIMMs is migrated, it fails to
>> > > resume on the destination. Regions on the source are merged, but on the
>> > > destination the order in which devices are realized differs from the
>> > > source, so while only part of the devices are realized some regions
>> > > cannot be merged yet. That may exceed the vhost slot limit.
>> > > 
>> > > Signed-off-by: Peng Hao <address@hidden>
>> > > Signed-off-by: Wang Yechao <address@hidden>
>> > > ---
>> > >  hw/mem/pc-dimm.c        | 2 +-
>> > >  include/sysemu/sysemu.h | 1 +
>> > >  vl.c                    | 5 +++++
>> > >  3 files changed, 7 insertions(+), 1 deletion(-)
>> > > 
>> > > diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
>> > > index ea67b46..13f3db5 100644
>> > > --- a/hw/mem/pc-dimm.c
>> > > +++ b/hw/mem/pc-dimm.c
>> > > @@ -101,7 +101,7 @@ void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
>> > >          goto out;
>> > >      }
>> > >  
>> > > -    if (!vhost_has_free_slot()) {
>> > > +    if (!vhost_has_free_slot() && qemu_is_machine_init_done()) {
>> > >          error_setg(&local_err, "a used vhost backend has no free"
>> > >                                 " memory slots left");
>> > that doesn't fix the issue:
>> >    1st: the number of used entries changes after machine_init_done() is
>> >         called, as regions continue to be mapped/unmapped during runtime
>> >    2nd: it introduces a regression and allows QEMU to start with more
>> >         memory regions than the backend supports, which combined with the
>> >         missing error handling in vhost will lead to QEMU crashes or
>> >         obscure bugs in the guest breaking vhost-enabled drivers.
>> >         i.e. the patch undoes what was fixed by
>> >         https://lists.gnu.org/archive/html/qemu-devel/2015-10/msg00789.html
>> I don't think I undo the previous patch. There are two scenarios:
>>
>> 1. Hotplugging memory happens after machine_init_done(), so the modified
>>    code has no influence there.
>>
>> 2. A VM with hotplugged memory is migrated. On the source its regions are
>>    fewer than the backend supports, so the destination should satisfy the
>>    limit as well. While the VM is being restored, its regions may
>>    temporarily exceed what the backend supports, but after
>>    machine_init_done() the VM's regions are again fewer than the backend
>>    supports.

> Here is a simulation with vhost-kernel where
> /sys/module/vhost/parameters/max_mem_regions is set to 8,
> so the limit looks like vhost-user.

> qemu-system-x86_64 --enable-kvm -m 128,slots=256,maxmem=1T \
>   -netdev type=tap,id=guest0,vhost=on,script=/bin/true,vhostforce \
>   -device virtio-net-pci,netdev=guest0 \
>   `i=0; while [ $i -lt 10 ]; do echo "-object memory-backend-ram,id=m$i,size=128M \
>    -device pc-dimm,id=d$i,memdev=m$i"; i=$(($i + 1)); done`

> It ends up with 12 used_memslots and prints the following error messages:
> qemu-system-x86_64: vhost_set_mem_table failed: Argument list too long (7)
> qemu-system-x86_64: unable to start vhost net: 7: falling back on userspace virtio

> The above CLI should fail to start up, as it is above the supported limit
> even with merging (with merging, the number of available slots is a 'random'
> number, and merging can happen regardless of the order in which devices are
> created).


> vhost_dev_init() also checks vhost_backend_memslots_limit(), and skipping the
> check in pc_dimm_memory_plug() might lead to a failure later in
> vhost_dev_init() - I'm not sure when it's called and what consequences that
> would lead to.
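
(For context, a rough sketch of the kind of limit check vhost_dev_init() is
being referred to here; the exact names and error wording in hw/virtio/vhost.c
are assumed, not quoted verbatim:)

    /* Sketch only: vhost_dev_init() refuses to bring up a vhost device when
     * the current number of memory slots already exceeds what the backend
     * reports as its limit. */
    if (used_memslots > hdev->vhost_ops->vhost_backend_memslots_limit(hdev)) {
        error_report("vhost backend memory slots limit is less than"
                     " current number of present memory slots");
        goto fail;
    }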

I don't think we are describing the same scenario. I mean that the VM with
hotplugged memory is below the limit on the source, and after migration the VM
on the destination should satisfy the limit as well.

Before vhost_dev_init() is called, device initialization does not affect the
vhost used_memslots statistics (because there is no vhost device yet), so when
vhost_dev_init() is called the vhost used_memslots count is zero.
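
(A minimal sketch of why that is, with assumed names: the counter is only
written from the vhost memory listener, and that listener is only registered
once a vhost device is initialized, so it stays at zero until then:)

    /* Assumed simplification of hw/virtio/vhost.c: used_memslots is only
     * written from the vhost memory listener, which vhost_dev_init()
     * registers, so nothing updates it before the first vhost device is
     * brought up. */
    static unsigned int used_memslots;

    static void vhost_set_memory(MemoryListener *listener,
                                 MemoryRegionSection *section, bool add)
    {
        struct vhost_dev *dev = container_of(listener, struct vhost_dev,
                                             memory_listener);

        /* ... add/remove the region and merge adjacent regions (elided) ... */

        used_memslots = dev->mem->nregions;  /* the only place it is updated */
    }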

I tested this in my migration scenario.

After migration, on the destination, used_memslots printed in
vhost_set_memory() looks like this (with my patch):

used_memslots : 1
used_memslots : 2
used_memslots : 1
used_memslots : 2
used_memslots : 3
used_memslots : 4
used_memslots : 5
used_memslots : 6
used_memslots : 7
used_memslots : 8
used_memslots : 9
used_memslots : 10
used_memslots : 9
used_memslots : 10
used_memslots : 9
used_memslots : 10
used_memslots : 11
used_memslots : 12
used_memslots : 11
used_memslots : 10
used_memslots : 9
used_memslots : 9
used_memslots : 9
used_memslots : 9
used_memslots : 8
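
(The trace above comes from a debug print along these lines, added where
vhost_set_memory() updates the counter; the exact placement is an assumption:)

    /* Hypothetical debug print used to produce the used_memslots trace above. */
    fprintf(stderr, " used_memslots : %u\n", used_memslots);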


> >          goto out;
> > diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> > index b213696..48228ad 100644
> > --- a/include/sysemu/sysemu.h
> > +++ b/include/sysemu/sysemu.h
> > @@ -88,6 +88,7 @@ void qemu_system_guest_panicked(GuestPanicInformation *info);
> >  void qemu_add_exit_notifier(Notifier *notify);
> >  void qemu_remove_exit_notifier(Notifier *notify);
> >  
> > +bool qemu_is_machine_init_done(void);
> >  void qemu_add_machine_init_done_notifier(Notifier *notify);
> >  void qemu_remove_machine_init_done_notifier(Notifier *notify);
> >  
> > diff --git a/vl.c b/vl.c
> > index fb6b2ef..43aee22 100644
> > --- a/vl.c
> > +++ b/vl.c
> > @@ -2681,6 +2681,11 @@ static void qemu_run_exit_notifiers(void)
> >  
> >  static bool machine_init_done;
> >  
> > +bool qemu_is_machine_init_done(void)
> > +{
> > +    return machine_init_done;
> > +}
> > +
> >  void qemu_add_machine_init_done_notifier(Notifier *notify)
> >  {
> >      notifier_list_add(&machine_init_done_notifiers, notify);
