From: Daniel P. Berrange
Subject: Re: [Qemu-devel] [libvirt] RFC decoupling VM NIC provisioning from VM NIC connection to backend networks
Date: Tue, 1 Nov 2011 10:34:17 +0000
User-agent: Mutt/1.5.21 (2010-09-15)

On Mon, Oct 31, 2011 at 04:23:35PM -0500, Christian Benvenuti (benve) wrote:
> > -----Original Message-----
> > From: address@hidden [mailto:qemu-devel-
> > address@hidden On Behalf Of Daniel P. Berrange
> > Sent: Monday, October 31, 2011 3:49 AM
> > To: Sumit Naiksatam (snaiksat)
> > Cc: address@hidden; David Wang (dwang2); Ram Durairaj
> > (radurair); address@hidden
> > Subject: Re: [Qemu-devel] [libvirt] RFC decoupling VM NIC provisioning
> > from VM NIC connection to backend networks
> > 
> > On Fri, Oct 28, 2011 at 04:15:41PM -0700, Sumit Naiksatam (snaiksat)
> > wrote:
> > > Hi,
> > >
> > > In its current implementation, Libvirt makes sure that the network
> > > interfaces it provisions for a VM (for example, for qemu[-kvm]) are
> > > already connected to their backends (interfaces/networks) by the
> > > time the VM starts its boot process. In a non-virtualized setup this
> > > would be like booting a machine with the Ethernet cable already
> > > plugged into a router/switch port. In a non-virtualized setup you
> > > can boot a machine first (with no physical connection to a
> > > router/switch) and connect its NIC/s to the switch/router later,
> > > but when you boot a VM via Libvirt it is not possible to decouple
> > > the two actions (VM boot, cable plug/unplug).
> > >
> > > One case where the ability to decouple the two actions mentioned
> > > above is a requirement is Quantum/NetStack, the network service
> > > leveraged by OpenStack. The modular design of OpenStack allows
> > > you to:
> > > - provision VMs with NIC/s
> > > - create networks
> > > - create ports on networks
> > > - plug/unplug a VM NIC into/from a given port on a network (at
> > >   runtime)
> > >
> > > Note that this runtime plug/unplug requirement has nothing to do with
> > > hot plug/unplug of NICs.
> > > The idea is rather to decouple the provisioning of a VM from the
> > > connection of the VM to the network/s.
> > > This would make it possible to change (at runtime too) the networks
> > > that the NIC/s of a given VM are connected to.
> > >
> > > For example, when a VM boots, its interfaces should be in link down
> > > state if the network admin has not connected the VM NIC/s to any
> > > "network" yet.
> > > Even though libvirt already provides a way to change the link state
> > > of a VM NIC, link state and physical connection are two different
> > > things and should be manageable independently.
> > >
> > > Ideally the configuration syntax should be interface type and
> > > hypervisor type agnostic.
> > >
> > > Let's take QEMU[-kvm] as an example - when Libvirt starts a QEMU VM,
> > > it passes to QEMU a number of file descriptors that map to host
> > > backend interfaces (for example macvtap interfaces).
> > >
> > > In order to introduce this runtime plug/unplug capability, we need a
> > > mechanism that allows delaying the binding between the host macvtap
> > > interfaces and the guest taps (because you cannot know the fd of the
> > > macvtap interfaces before you create them). This means you need a
> > > mechanism that allows you to change such fd/s at runtime:
> > >
> > > - you can close/reset an fd (ie, when you disconnect a VM NIC from
> > >   its network)
> > > - you can open/set an fd (ie, when you connect a VM NIC to a network)
> > >
> > > This could probably be a libvirt command that translates to a QEMU
> > > monitor command.
> > >
> > > Can the runtime plug/unplug capability described above be achieved
> > > (cleanly) with another mechanism?
> > >
> > > Is anybody working on implementing something similar?
> > 
> > No, but I've long thought about doing this & it is quite
> > straightforward to do really. Ordinarily when we start QEMU we do
> > 
> >    qemu ...  -device e1000,id=nic0,netdev=netdevnic0 \
> >              -netdev user,id=netdevnic0
> > 
> > To do what you describe we need to be able to:
> > 
> >  1. Start QEMU with a NIC, but no netdev
> >  2. Add a netdev to running QEMU.
> >  3. Remove a netdev from a running QEMU
> >  4. Associate a netdev with a NIC in running QEMU
> > 
> > We can do 1:
> > 
> >   $ qemu ...  -device e1000,id=nic0
> > 
> > But QEMU prints an annoying warning
> > 
> >   Warning: nic nic0 has no peer
> 
> If we introduce this new functionality, can this warning change?
> If we change it, would it break any test/script?
> Actually it is just a warning (not an error). Why do you think it
> is annoying? (I guess it is supposed to catch misconfigurations)
> 
> > We can do 2 via the monitor:
> > 
> >   (qemu) netdev_add type=user,id=netdevnic0
> > 
> > We can do 3 via the monitor:
> > 
> >   (qemu) netdev_del netdevnic0
> > 
> > 
> > The problem is 4 - AFAICT we can't connect the existing NIC up to the
> > newly hotplugged netdev, since we can't update the 'netdev' property
> > in the NIC device. Also if we delete the netdev, we can't clear out
> > the 'netdev' property in the NIC, so it's left dangling, pointing to
> > a netdev that no longer exists. The latter is fairly harmless, since
> > we can just re-use the name if adding a new backend later. The first
> > problem is a bit of a pain, unless we plug in a 'user' backend on the
> > CLI, and immediately netdev_del it before starting the CPUs. Ideally
> > we'd have some way to set qdev properties for devices so we can
> > associate the NIC with the new netdev.
> > 
> > eg when adding a netdev:
> > 
> >    (qemu) netdev_add type=user,id=netdevnic0
> >    (qemu) set nic0 netdev=netdevnic0
> > 
> > Or removing one:
> > 
> >    (qemu) netdev_del netdevnic0
> >    (qemu) unset nic0 netdev
> > 
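To make the two sequences above concrete, here is a rough Python sketch that just builds the monitor command strings for plugging and unplugging a backend. Note that 'set' and 'unset' are only the commands proposed in this mail, not something the QEMU monitor provides today, and the helper names are made up for illustration.

```python
from typing import List


def plug_commands(nic_id: str, netdev_id: str, backend: str = "user") -> List[str]:
    """Commands to add a backend and bind it to an existing NIC."""
    return [
        f"netdev_add type={backend},id={netdev_id}",
        # Hypothetical: 'set' is the qdev property command proposed in this thread.
        f"set {nic_id} netdev={netdev_id}",
    ]


def unplug_commands(nic_id: str, netdev_id: str) -> List[str]:
    """Commands to remove the backend and clear the NIC's netdev property."""
    return [
        f"netdev_del {netdev_id}",
        # Hypothetical: 'unset' is likewise only a proposal.
        f"unset {nic_id} netdev",
    ]


if __name__ == "__main__":
    sequence = plug_commands("nic0", "netdevnic0") + unplug_commands("nic0", "netdevnic0")
    for cmd in sequence:
        print(f"(qemu) {cmd}")
```

The ordering mirrors the examples above: the netdev is created before the NIC is pointed at it, and deleted before the property is cleared.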
> > 
> > WRT the libvirt XML config. Normally you specify a NIC like
> > 
> >      <interface type='network'>
> >        <mac address='52:54:00:0f:7d:ad'/>
> >        <source network='default'/>
> >        <model type='virtio'/>
> >      </interface>
> > 
> > To boot a guest without any netdev backend present, we'd introduce a
> > new interface type='none'. eg
> > 
> >      <interface type='none'>
> >        <mac address='52:54:00:0f:7d:ad'/>
> >        <model type='virtio'/>
> >      </interface>
> > 
> > The existing API 'virDomainUpdateDeviceFlags' can then be used to
> > change the interface config on the fly, adding or removing the netdev
> > by passing in new XML with a different 'type' attribute & <source>
> > element.
> > 
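As a minimal sketch of what a management app might generate for that update call, the Python below builds the two <interface> variants shown above. It assumes the proposed type='none' (which libvirt does not define at the time of writing), and the function name is purely illustrative.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def interface_xml(mac: str, model: str, network: Optional[str] = None) -> str:
    """Build <interface> XML; with no network, use the proposed type='none'."""
    # Assumption: type='none' is the value proposed in this thread,
    # not an existing libvirt interface type.
    iface = ET.Element("interface", {"type": "network" if network else "none"})
    ET.SubElement(iface, "mac", {"address": mac})
    if network:
        ET.SubElement(iface, "source", {"network": network})
    ET.SubElement(iface, "model", {"type": model})
    return ET.tostring(iface, encoding="unicode")


if __name__ == "__main__":
    mac = "52:54:00:0f:7d:ad"
    print(interface_xml(mac, "virtio"))             # guest boots unplugged
    print(interface_xml(mac, "virtio", "default"))  # later: plug into 'default'
```

Keeping the MAC address identical in both documents is what would let the update be matched to the existing NIC rather than treated as a new device.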
> > Finally, when adding & removing the netdev backends to a running guest,
> > we likely want to be able to set the NIC's link carrier, so the guest
> > OS sees that it has lost / gained its network connection & will thus
> > retry DHCP / IPv6 autoconfig.
> 
> I assume this is what you meant:
> 
> - when the NIC does not have a backend netdev configured, its link
>   state is DOWN by default. Right?
>   (*) Would it make sense to still allow setting the link state in this
>       case too? (ie, the ability to force link UP even when there is no
>       backend netdev configured).
>       There may be use cases where this could be needed, but I can't think
>       of any right now.
> 
> - When you unbind a NIC from a backend netdev (ie, "unset" command above)
>   the NIC's link goes down (unless the above (*) option said otherwise)
> 
> - When you bind a NIC to a backend netdev (ie, "set" command above), the
>   NIC link state should be copied from the backend netdev link state.
>   For example, when you plug an Ethernet NIC to a switch, the NIC receives
>   link UP only if the switch's port is UP.
>   This means that the NIC link state reflects the backend netdev's link
>   state (ie, the NIC is not always UP regardless of the backend netdev's
>   link state).

Yeah, that sounds about right.
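
As a rough sketch of those rules, the Python below models a NIC whose carrier follows its backend. The class and field names are invented for illustration and nothing like this exists in QEMU or libvirt; the override flag stands in for the (*) option and is assumed to matter only while no backend is bound.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Backend:
    """Stand-in for a backend netdev with its own link state."""
    link_up: bool = True


@dataclass
class Nic:
    """Stand-in for a guest NIC that may or may not have a backend bound."""
    backend: Optional[Backend] = None
    # Assumption: models the (*) option, only consulted when unbound.
    force_up_without_backend: bool = False

    def bind(self, backend: Backend) -> None:
        self.backend = backend

    def unbind(self) -> None:
        self.backend = None

    def carrier(self) -> bool:
        """Link state the guest would see."""
        if self.backend is None:
            # No backend: DOWN by default, unless explicitly forced UP.
            return self.force_up_without_backend
        # Bound: mirror the backend's link state.
        return self.backend.link_up


if __name__ == "__main__":
    nic = Nic()
    print("unbound:", nic.carrier())
    nic.bind(Backend(link_up=True))
    print("bound to UP backend:", nic.carrier())
    nic.unbind()
    print("unbound again:", nic.carrier())
```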

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


