

From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [RFC 0/2] virtio-vhost-user: add virtio-vhost-user device
Date: Tue, 6 Feb 2018 14:13:34 +0000
User-agent: Mutt/1.9.1 (2017-09-22)

On Tue, Feb 06, 2018 at 12:42:36PM +0000, Wang, Wei W wrote:
> On Tuesday, February 6, 2018 5:32 PM, Stefan Hajnoczi wrote:
> > On Tue, Feb 06, 2018 at 01:28:25AM +0000, Wang, Wei W wrote:
> > > On Tuesday, February 6, 2018 12:26 AM, Stefan Hajnoczi wrote:
> > > > On Fri, Feb 02, 2018 at 09:08:44PM +0800, Wei Wang wrote:
> > > > > On 02/02/2018 01:08 AM, Michael S. Tsirkin wrote:
> > > > > > On Tue, Jan 30, 2018 at 08:09:19PM +0800, Wei Wang wrote:
> > > > > > > Issues:
> > > > > > > Suppose we have both the vhost and virtio-net set up, and 
> > > > > > > vhost pmd <-> virtio-net pmd communication works well. Now, 
> > > > > > > vhost pmd exits (virtio-net pmd is still there). Some time 
> > > > > > > later, we re-run vhost pmd, the vhost pmd doesn't know the 
> > > > > > > virtqueue addresses of the virtio-net pmd, unless the 
> > > > > > > virtio-net pmd reloads to start the 2nd phase of the 
> > > > > > > vhost-user protocol. So the second run of the vhost pmd 
> > > > > > > won't work.
> > > > > > >
> > > > > > > Any thoughts?
> > > > > > >
> > > > > > > Best,
> > > > > > > Wei
> > > > > > So vhost in qemu must resend all configuration on reconnect.
> > > > > > Does this address the issues?
> > > > > >
> > > > >
> > > > > Yes, but the issues are
> > > > > 1) there is no reconnecting when a pmd exits (the socket 
> > > > > connection seems still on at the device layer);
> > > >
> > > > This is how real hardware works too.  If the driver suddenly stops 
> > > > running then the device remains operational.  When the driver is 
> > > > started again it resets the device and initializes it.
> > > >
> > > > > 2) If we find a way to break the QEMU layer socket connection 
> > > > > when pmd exits and get it reconnect, virtio-net device still 
> > > > > won't send all the configure when reconnecting, because socket 
> > > > > connecting only triggers phase 1 of vhost-user negotiation (i.e.
> > > > > vhost_user_init). Phase 2 is triggered after the driver loads 
> > > > > (i.e. vhost_net_start). If the virtio-net pmd doesn't reload, 
> > > > > there are no phase 2 messages (like virtqueue addresses which 
> > > > > > are allocated by the pmd). I think we need to think more about 
> > > > > > this before moving forward.
> > > >
> > > > Marc-André: How does vhost-user reconnect work when the master 
> > > > goes away and a new master comes online?  Wei found that the QEMU 
> > > > slave implementation only does partial vhost-user initialization 
> > > > upon reconnect, so the new master doesn't get the virtqueue 
> > > > address and related information.
> > > > Is this a QEMU bug?
> > >
> > > Actually we are discussing the slave (vhost is the slave, right?) going 
> > > away. When a slave exits and some moment later a new slave runs, the 
> > > master (virtio-net) won't send the virtqueue addresses to the new vhost 
> > > slave.
> > 
> > Yes, apologies for the typo.  s/QEMU slave/QEMU master/
> > 
> > Yesterday I asked Marc-André for help on IRC and we found the code 
> > path where the QEMU master performs phase 2 negotiation upon 
> > reconnect.  It's not obvious but the qmp_set_link() calls in 
> > net_vhost_user_event() will do it.
> > 
> > I'm going to try to reproduce the issue you're seeing now.  Will let 
> > you know what I find.
> > 
> 
> OK. Thanks. I observed no messages after re-running the virtio-vhost-user 
> pmd, and found there is no re-connection event happening on the device side. 
> 
> I also tried to switch the role of client/server - virtio-net to run a server 
> socket, and virtio-vhost-user to run the client, and it seems the current 
> code fails to run that way. The reason is the virtio-net side 
> vhost_user_get_features() doesn't return. On the vhost side, I don't see 
> virtio_vhost_user_deliver_m2s being invoked to deliver the GET_FEATURES 
> message. I'll come back to continue later.

This morning I reached the conclusion that reconnection is currently
broken in the QEMU vhost-user master.  It's a bug in the QEMU vhost-user
master implementation, not a design or protocol problem.

On my machine the following QEMU command-line does not launch because
vhost-user.c gets stuck while trying to connect/negotiate:

  qemu -M accel=kvm -cpu host -m 1G \
       -object memory-backend-file,id=mem0,mem-path=/var/tmp/foo,size=1G,share=on \
       -numa node,memdev=mem0 \
       -drive if=virtio,file=test.img,format=raw \
       -chardev socket,id=chardev0,path=vhost-user.sock,reconnect=1 \
       -netdev vhost-user,chardev=chardev0,id=netdev0 \
       -device virtio-net-pci,netdev=netdev0

Commit c89804d674e4e3804bd3ac1fe79650896044b4e8 ("vhost-user: wait until
backend init is completed") broke reconnect by introducing a call to
qemu_chr_fe_wait_connected().

qemu_chr_fe_wait_connected() doesn't work together with -chardev
...,reconnect=1.  This is because reconnect=1 connects asynchronously
and then qemu_chr_fe_wait_connected() connects synchronously (if the
async connect hasn't completed yet).  This means there will be 2 sockets
connecting to the vhost-user slave!

The virtio-vhost-user slave accepts the first connection but never
receives any data because the QEMU master is trying to use the 2nd
socket instead.
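
The failure mode can be illustrated outside QEMU with a rough Python
sketch.  This is not QEMU code: the function name demo_double_connect()
and the payload are made up for illustration, and both connects are
plain blocking calls here (in QEMU the first one is asynchronous).  It
only shows that a listener which serves the first accepted connection
never sees data that the client sends on a second one.

```python
import os
import socket
import tempfile


def demo_double_connect():
    """Return (connections accepted, bytes seen on the first connection)."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "vhost-user.sock")

        # Stand-in for the vhost-user slave: one listening UNIX socket.
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)
        server.listen(2)

        # Stand-in for the connection started by reconnect=1.
        first = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        first.connect(path)

        # Stand-in for the extra synchronous connect.
        second = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        second.connect(path)

        # The "master" only ever talks on the second socket.
        second.sendall(b"hypothetical request")

        # The "slave" accepts the first connection and waits for data.
        conn, _ = server.accept()
        conn.settimeout(0.2)
        try:
            data = conn.recv(64)
        except socket.timeout:
            data = b""  # nothing arrived on the first connection

        # Count whether another connection is sitting in the backlog.
        accepted = 1
        server.settimeout(0.2)
        try:
            extra, _ = server.accept()
            accepted += 1
            extra.close()
        except socket.timeout:
            pass

        for s in (conn, first, second, server):
            s.close()
        return accepted, data


if __name__ == "__main__":
    print(demo_double_connect())
```

The listener ends up with two connections, and the one it serves first
stays silent, which matches the hang described above.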

Reconnection probably worked when Marc-André implemented it since QEMU
wasn't using qemu_chr_fe_wait_connected().

Marc-André: How do you think this should be fixed?

Stefan
