Re: [Qemu-devel] dpdk/vpp and cross-version migration for vhost


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] dpdk/vpp and cross-version migration for vhost
Date: Tue, 22 Nov 2016 16:53:05 +0200

On Tue, Nov 22, 2016 at 09:02:23PM +0800, Yuanhan Liu wrote:
> On Thu, Nov 17, 2016 at 07:37:09PM +0200, Michael S. Tsirkin wrote:
> > On Thu, Nov 17, 2016 at 05:49:36PM +0800, Yuanhan Liu wrote:
> > > On Thu, Nov 17, 2016 at 09:47:09AM +0100, Maxime Coquelin wrote:
> > > > 
> > > > 
> > > > On 11/17/2016 09:29 AM, Yuanhan Liu wrote:
> > > > > >As usual, sorry for the late response :/
> > > > >
> > > > >On Thu, Oct 13, 2016 at 08:50:52PM +0300, Michael S. Tsirkin wrote:
> > > > >>Hi!
> > > > >>So it looks like we face a problem with cross-version
> > > > >>migration when using vhost. It's not new but became more
> > > > >>acute with the advent of vhost user.
> > > > >>
> > > > >>For users to be able to migrate between different versions
> > > > >>of the hypervisor the interface exposed to guests
> > > > >>by hypervisor must stay unchanged.
> > > > >>
> > > > >>The problem is that a qemu device is connected
> > > > >>to a backend in another process, so the interface
> > > > >>exposed to guests depends on the capabilities of that
> > > > >>process.
> > > > >>
> > > > >>Specifically, for vhost user interface based on virtio, this includes
> > > > >>the "host features" bitmap that defines the interface, as well as more
> > > > >>host values such as the max ring size.  Adding new features/changing
> > > > > >>values to this interface is required to make progress, but on the other
> > > > > >>hand we need the ability to get the old host features to be compatible.
> > > > >
> > > > > >It looks like the same issue as vhost-user reconnect to me. For example,
> > > > >
> > > > >- start dpdk 16.07 & qemu 2.5
> > > > >- kill dpdk
> > > > >- start dpdk 16.11
> > > > >
> > > > > >Though DPDK 16.11 has more features compared to DPDK 16.07 (say,
> > > > > >indirect), the above should work, because qemu saves the negotiated
> > > > > >features before the disconnect and restores them after the reconnection.
> > > > >
> > > > >    commit a463215b087c41d7ca94e51aa347cde523831873
> > > > >    Author: Marc-André Lureau <address@hidden>
> > > > >    Date:   Mon Jun 6 18:45:05 2016 +0200
> > > > >
> > > > >        vhost-net: save & restore vhost-user acked features
> > > > >
> > > > > >        The initial vhost-user connection sets the features to be negotiated
> > > > > >        with the driver. Renegotiation isn't possible without device reset.
> > > > > >
> > > > > >        To handle reconnection of vhost-user backend, ensure the same set of
> > > > > >        features are provided, and reuse already acked features.
> > > > >
> > > > >        Signed-off-by: Marc-André Lureau <address@hidden>
> > > > >
> > > > >
> > > > > >So could we do something similar to the vhost-user reconnect? I mean, save
> > > > > >the acked features before migration and restore them after it. This should
> > > > > >be able to keep the compatibility. If the user downgrades the DPDK version,
> > > > > >it could also be easily detected, and then exit with an error to the user:
> > > > > >migration failed due to incompatible vhost features.
> > > > >
> > > > >Just some rough thoughts. Makes tiny sense?
> > > > 
> > > > My understanding is that the management tool has to know whether
> > > > versions are compatible before initiating the migration:
> > > 
> > > Makes sense. How about getting and restoring the acked features through
> > > qemu command lines then, say, through the monitor interface?
> > > 
> > > With that, it would be something like:
> > > 
> > > - start vhost-user backend (DPDK, VPP, or whatever) & qemu in the src host
> > > 
> > > - read the acked features (through monitor interface)
> > > 
> > > - start vhost-user backend in the dst host
> > > 
> > > - start qemu in the dst host with the just queried acked features
> > > 
> > >   QEMU then is expected to use this feature set for the later vhost-user
> > >   feature negotiation. Exit if feature compatibility is broken.
> > > 
> > > Thoughts?
> > > 
> > >   --yliu
> > 
> > 
> > You keep assuming that you have the VM started first and
> > figure out things afterwards, but this does not work.
> > 
> > Think about a cluster of machines. You want to start a VM in
> > a way that will ensure compatibility with all hosts
> > in a cluster.
> 
> I see. I was thinking more about the case where the dst
> host (including the qemu and dpdk combo) is given, and
> we then determine whether the migration will be successful
> or not.
> 
> And you are saying that we need to know which host could
> be a good candidate before starting the migration. In that
> case, we indeed need some input from both qemu and the
> vhost-user backend.
> 
> For DPDK, I think it could be simple, just as you said, it
> could be either a tiny script, or even a macro defined in
> the source code file (we extend it every time we add a
> new feature) to let libvirt read it. Or something
> else.

There's the issue of APIs that tweak features, as Maxime
suggested. Maybe the only thing to do is to deprecate them,
but I feel some way for the application to pass info into
the guest might be beneficial.


> > If you don't, guest visible interface will change
> > and you won't be able to migrate.
> > 
> > It does not make sense to discuss feature bits specifically
> > since that is not the only part of interface.
> > For example, max ring size supported might change.
> 
> I don't quite understand why we have to consider the max ring
> size here. Isn't it a virtio device attribute, for which QEMU could
> provide such compatibility information?
>
> I mean, DPDK is supposed to support varying vring sizes; it's QEMU
> that gives a specific value.

If backend supports s/g of any size up to 2^16, there's no issue.

ATM some backends might be assuming up to 1K s/g since
QEMU never supported bigger ones. We might classify this
as a bug, or we might not, and add a feature flag instead.

But it's just an example. There might be more values at issue
in the future.
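
To make the point concrete, here is a rough Python sketch of a compatibility
check that looks at more than the feature bitmap. The `BackendCaps` type, the
`can_host` helper and the `max_sg` field are hypothetical names made up for
illustration; nothing like this is exposed by QEMU or any backend today. Only
the feature bit numbers are taken from the virtio spec.

```python
from dataclasses import dataclass

# Feature bit numbers as defined by the virtio specification.
VIRTIO_F_ANY_LAYOUT = 27
VIRTIO_RING_F_INDIRECT_DESC = 28


@dataclass(frozen=True)
class BackendCaps:
    """Hypothetical summary of what a vhost-user backend can offer."""
    features: int  # bitmap of supported feature bits
    max_sg: int    # largest scatter/gather list the backend handles


def can_host(guest_visible: BackendCaps, backend: BackendCaps) -> bool:
    """True if `backend` can provide the interface the guest already sees."""
    # Any bit the guest was offered but the backend lacks breaks compatibility.
    missing = guest_visible.features & ~backend.features
    # Non-bitmap values matter too: the backend must cope with s/g lists
    # at least as large as what the guest was promised.
    return missing == 0 and backend.max_sg >= guest_visible.max_sg


# Illustrative values only: an older backend assuming up to 1K s/g,
# and a newer one that adds indirect descriptors and handles up to 2^16.
old = BackendCaps(features=1 << VIRTIO_F_ANY_LAYOUT, max_sg=1024)
new = BackendCaps(
    features=(1 << VIRTIO_F_ANY_LAYOUT) | (1 << VIRTIO_RING_F_INDIRECT_DESC),
    max_sg=1 << 16,
)
```

With these values, a guest started against `old` could move to `new`, but not
the other way around, and a guest promised 2^16 s/g could not land on `old`
even if the feature bits matched.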

> > Let me describe how it works in qemu/libvirt.
> > When you install a VM, you can specify compatibility
> > level (aka "machine type"), and you can query the supported compatibility
> > levels. Management uses that to find the supported compatibility
> > and stores the compatibility in XML that is migrated with the VM.
> > There's also a way to find the latest level which is the
> > default unless overridden by user, again this level
> > is recorded and then
> > - management can make sure migration destination is compatible
> > - management can avoid migration to hosts without that support
> 
> Thanks for the info, it helps.
> 
> ...
> > > > >>As version here is an opaque string for libvirt and qemu,
> > > > >>anything can be used - but I suggest either a list
> > > > >>of values defining the interface, e.g.
> > > > >>any_layout=on,max_ring=256
> > > > >>or a version including the name and vendor of the backend,
> > > > >>e.g. "org.dpdk.v4.5.6".
> 
> The version scheme may not be ideal here. Assume a QEMU is supposed
> to work with a specific DPDK version; however, the user may disable some
> newer features through the qemu command line, so that it could also work
> with an older DPDK version. Using the version scheme would not allow us to
> do such a migration to an older DPDK version. MTU could be a live example
> here (when the MTU feature is provided by QEMU but is actually disabled
> by the user, it could also work with an older DPDK without MTU support).
> 
>       --yliu

OK, so does a list of values look better to you then?
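
For illustration, a list of values in the "any_layout=on,max_ring=256" style
could be checked roughly as in the Python sketch below. The helper names
`parse_interface` and `backend_satisfies`, and the `mtu` key, are assumptions
made up for this example; no such format is defined anywhere yet.

```python
def parse_interface(spec: str) -> dict:
    """Parse 'any_layout=on,max_ring=256' into a dict of bools and ints."""
    result = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty specs and trailing commas
        key, _, value = item.partition("=")
        if value in ("on", "off"):
            result[key] = (value == "on")
        else:
            result[key] = int(value)
    return result


def backend_satisfies(requested: dict, backend: dict) -> bool:
    """Check a requested guest-visible interface against a backend.

    Boolean entries: a feature that is on must be supported by the backend;
    a feature that is off places no requirement on it.
    Integer entries: the backend must support at least the requested value.
    """
    for key, want in requested.items():
        if isinstance(want, bool):
            if want and not backend.get(key, False):
                return False
        elif backend.get(key, 0) < want:
            return False
    return True


# Hypothetical older backend that has no MTU support at all.
older_backend = parse_interface("any_layout=on,max_ring=256")
```

Under these assumptions, a VM started with "any_layout=on,mtu=off,max_ring=256"
would still be accepted by `older_backend`, while one started with "mtu=on"
would not, which is the case a plain version string cannot express.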



> > > > >>
> > > > >>Note that typically the list of supported versions can only be
> > > > >>extended, not shrunk. Also, if the host/guest interface
> > > > >>does not change, don't change the current version as
> > > > >>this just creates work for everyone.
> > > > >>
> > > > >>Thoughts? Would this work well for management? dpdk? vpp?
> > > > >>
> > > > >>Thanks!
> > > > >>
> > > > >>--
> > > > >>MST
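
As a rough sketch of the management side, assuming each host can report its
list of supported interface versions as opaque strings, picking what is safe
for a whole cluster is just an intersection. The `common_versions` helper and
the version strings in the usage note are hypothetical.

```python
def common_versions(hosts: dict) -> list:
    """Return the versions supported by every host.

    `hosts` maps a host name to its list of supported version strings.
    The result keeps the order used by the first host's list.
    """
    lists = list(hosts.values())
    if not lists:
        return []
    shared = set(lists[0]).intersection(*lists[1:])
    return [version for version in lists[0] if version in shared]
```

For example, with one host reporting ["org.dpdk.v1", "org.dpdk.v2"] and
another reporting only ["org.dpdk.v1"], a VM that must be able to migrate
between both would have to be started with "org.dpdk.v1".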
