Re: [Qemu-devel] Hotplug ram and vhost-user


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] Hotplug ram and vhost-user
Date: Thu, 7 Dec 2017 18:57:29 +0000
User-agent: Mutt/1.9.1 (2017-09-22)

* Maxime Coquelin (address@hidden) wrote:
> 
> 
> On 12/07/2017 07:23 PM, Dr. David Alan Gilbert wrote:
> > * Maxime Coquelin (address@hidden) wrote:
> > > 
> > > 
> > > On 12/07/2017 05:25 PM, Dr. David Alan Gilbert wrote:
> > > > * Maxime Coquelin (address@hidden) wrote:
> > > > > Hi David,
> > > > > 
> > > > > On 12/05/2017 06:41 PM, Dr. David Alan Gilbert wrote:
> > > > > > Hi,
> > > > > >      Since I'm reworking the memory map update code I've been
> > > > > > trying to test it with hot-adding RAM; but even on upstream
> > > > > > I'm finding that hot-adding RAM causes the guest to stop passing
> > > > > > packets with vhost-user-bridge; have either of you seen the same
> > > > > > thing?
> > > > > 
> > > > > No, I have never tried this.
> > > > 
> > > > Would you know if it works on DPDK?
> > > 
> > > We have a known issue in DPDK: the PMD threads might be accessing the
> > > guest memory while the vhost-user protocol thread is unmapping it.
> > > 
> > > We have a similar problem with the dirty logging area, and Victor is
> > > working on a patch that will fix both issues.
> > > 
> > > Once ready, I'll have a try and let you know.
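> > > 
> > > For illustration, one generic way to close that kind of race (this is
> > > just a sketch, not Victor's actual patch): have the PMD threads take a
> > > lock for read around guest-memory accesses, and the protocol thread
> > > take it for write around the remap:
> > > 
> > >   #include <pthread.h>
> > > 
> > >   static pthread_rwlock_t mem_lock = PTHREAD_RWLOCK_INITIALIZER;
> > > 
> > >   /* PMD / datapath thread */
> > >   void datapath_touch_guest_mem(void)
> > >   {
> > >       pthread_rwlock_rdlock(&mem_lock);
> > >       /* ... translate guest addresses and copy packet data ... */
> > >       pthread_rwlock_unlock(&mem_lock);
> > >   }
> > > 
> > >   /* vhost-user protocol thread handling SET_MEM_TABLE */
> > >   void replace_mem_table(void)
> > >   {
> > >       pthread_rwlock_wrlock(&mem_lock);
> > >       /* ... munmap() the old regions, mmap() the new table ... */
> > >       pthread_rwlock_unlock(&mem_lock);
> > >   }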
> > > 
> > > > > > I'm doing:
> > > > > > ./tests/vhost-user-bridge -u /tmp/vubrsrc.sock
> > > > > > $QEMU -enable-kvm -m 1G,maxmem=2G,slots=4 -smp 2 \
> > > > > >   -object memory-backend-file,id=mem,size=1G,mem-path=/dev/shm,share=on \
> > > > > >   -numa node,memdev=mem -mem-prealloc -trace events=vhost-trace-file \
> > > > > >   -chardev socket,id=char0,path=/tmp/vubrsrc.sock \
> > > > > >   -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce \
> > > > > >   -device virtio-net-pci,netdev=mynet1 $IMAGE -net none
> > > > > > 
> > > > > > (with a f27 guest) and then doing:
> > > > > > (qemu) object_add memory-backend-file,id=mem1,size=256M,mem-path=/dev/shm
> > > > > > (qemu) device_add pc-dimm,id=dimm1,memdev=mem1
> > > > > > 
> > > > > > but then not getting any responses inside the guest.
> > > > > > 
> > > > > > I can see the code sending another set-mem-table with the
> > > > > > extra chunk of RAM and fd, and I think I can see the bridge
> > > > > > mapping it.
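> > > > > > 
> > > > > > (For reference, the shape of what that set-mem-table message
> > > > > > carries, per the vhost-user spec - one entry per region, with the
> > > > > > backing fds passed alongside as SCM_RIGHTS ancillary data:
> > > > > > 
> > > > > >   #include <stdint.h>
> > > > > > 
> > > > > >   typedef struct {
> > > > > >       uint64_t guest_phys_addr;  /* GPA of the region */
> > > > > >       uint64_t memory_size;
> > > > > >       uint64_t userspace_addr;   /* QEMU's virtual address */
> > > > > >       uint64_t mmap_offset;      /* offset into the passed fd */
> > > > > >   } VhostUserMemoryRegion;
> > > > > > 
> > > > > >   typedef struct {
> > > > > >       uint32_t nregions;         /* grows by one after hotplug */
> > > > > >       uint32_t padding;
> > > > > >       VhostUserMemoryRegion regions[8]; /* VHOST_MEMORY_MAX_NREGIONS */
> > > > > >   } VhostUserMemory;
> > > > > > 
> > > > > > so the hotplugged DIMM shows up as one extra region plus one extra
> > > > > > fd.)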
> > > > > 
> > > > > I think there are at least two problems.
> > > > > The first one is that vhost-user-bridge does not support vhost-user
> > > > > protocol's reply-ack feature. So when QEMU sends a request, it
> > > > > cannot know whether/when it has been handled by the backend.
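> > > > > 
> > > > > A rough sketch of what the reply-ack handshake looks like from the
> > > > > master side, assuming VHOST_USER_PROTOCOL_F_REPLY_ACK has been
> > > > > negotiated (constants follow the spec; the helper is illustrative,
> > > > > not QEMU's code, and it omits the SCM_RIGHTS fd passing a real
> > > > > SET_MEM_TABLE needs):
> > > > > 
> > > > >   #include <stdint.h>
> > > > >   #include <sys/socket.h>
> > > > > 
> > > > >   #define VHOST_USER_VERSION        0x1
> > > > >   #define VHOST_USER_NEED_REPLY     (1u << 3)  /* flags bit 3 */
> > > > >   #define VHOST_USER_SET_MEM_TABLE  5
> > > > > 
> > > > >   typedef struct {
> > > > >       uint32_t request;
> > > > >       uint32_t flags;  /* version | reply | need_reply */
> > > > >       uint32_t size;   /* payload bytes that follow */
> > > > >   } VhostUserHdr;
> > > > > 
> > > > >   static int set_mem_table_with_ack(int sock, const void *payload,
> > > > >                                     uint32_t len)
> > > > >   {
> > > > >       VhostUserHdr hdr = {
> > > > >           .request = VHOST_USER_SET_MEM_TABLE,
> > > > >           .flags   = VHOST_USER_VERSION | VHOST_USER_NEED_REPLY,
> > > > >           .size    = len,
> > > > >       };
> > > > >       if (send(sock, &hdr, sizeof(hdr), 0) < 0 ||
> > > > >           send(sock, payload, len, 0) < 0)
> > > > >           return -1;
> > > > > 
> > > > >       /* the backend must answer with a u64 payload: 0 = success */
> > > > >       VhostUserHdr rhdr;
> > > > >       uint64_t ack;
> > > > >       if (recv(sock, &rhdr, sizeof(rhdr), MSG_WAITALL) <= 0 ||
> > > > >           recv(sock, &ack, sizeof(ack), MSG_WAITALL) <= 0)
> > > > >           return -1;
> > > > >       return ack == 0 ? 0 : -1;
> > > > >   }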
> > > > 
> > > > Wouldn't you have to be unlucky for that to cause a problem - i.e. the
> > > > descriptors would have to get allocated in the new RAM?
> > > 
> > > Yes, you may be right. I think it is worth debugging it to understand
> > > what is going on.
> > > 
> > > > > It had been fixed by sending a GET_FEATURES request to be sure the
> > > > > SET_MEM_TABLE was handled, as messages are processed in order. The
> > > > > problem is that it caused some test failures when using TCG, so it
> > > > > got reverted.
> > > > > 
> > > > > The initial fix:
> > > > > 
> > > > > commit 28ed5ef16384f12500abd3647973ee21b03cbe23
> > > > > Author: Prerna Saxena <address@hidden>
> > > > > Date:   Fri Aug 5 03:53:51 2016 -0700
> > > > > 
> > > > >       vhost-user: Attempt to fix a race with set_mem_table.
> > > > > 
> > > > > The revert:
> > > > > 
> > > > > commit 94c9cb31c04737f86be29afefbff401cd23bc24d
> > > > > Author: Michael S. Tsirkin <address@hidden>
> > > > > Date:   Mon Aug 15 16:35:24 2016 +0300
> > > > > 
> > > > >       Revert "vhost-user: Attempt to fix a race with set_mem_table."
> > > > > 
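> > > > > The idea of that fix, in sketch form (the helper names here are
> > > > > invented, not the actual QEMU functions): the backend processes
> > > > > messages in order and GET_FEATURES always demands a reply, so
> > > > > waiting for that reply guarantees the earlier SET_MEM_TABLE has
> > > > > been consumed:
> > > > > 
> > > > >   #include <stdint.h>
> > > > > 
> > > > >   struct vhost_dev;  /* opaque device handle */
> > > > >   int vhost_user_set_mem_table(struct vhost_dev *d);  /* no reply */
> > > > >   int vhost_user_get_features(struct vhost_dev *d, uint64_t *f);
> > > > > 
> > > > >   static int set_mem_table_sync(struct vhost_dev *dev)
> > > > >   {
> > > > >       if (vhost_user_set_mem_table(dev) < 0) {
> > > > >           return -1;
> > > > >       }
> > > > >       /* GET_FEATURES demands a reply; messages are handled in
> > > > >        * order, so once the reply arrives SET_MEM_TABLE is done. */
> > > > >       uint64_t features;
> > > > >       return vhost_user_get_features(dev, &features);
> > > > >   }
> > > > > 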
> > > > 
> > > > Do we know which tests fail?
> > > 
> > > vhost-user-test, but it should no longer be failing now that it no
> > > longer uses TCG.
> > > 
> > > I think we could consider reverting the revert, i.e. send get_features
> > > in set_mem_table to be sure it has been handled.
> > 
> > How does it fail? Does it fail every time or only sometimes?
> > (The postcopy test in migration-test.c also fails under TCG under
> > very heavy load and I've not figured out why yet).
> 
> I'm trying to remember the analysis I did one year ago... I don't yet
> have the full picture, but I found some notes I took at that time:
> 
> "
> I have managed to reproduce the hang by adding some debug prints into
> vhost_user_get_features().
> 
> Doing this, the issue is reproducible quite easily.
> Another way to reproduce it in one shot is to strace (with fork
> following) the /vhost-user-test execution.
> 
> So, by adding debug prints at vhost_user_get_features() entry and exit,
> we can see we never return from this function when the hang happens.
> Strace of the QEMU instance shows that its thread keeps retrying to
> receive the GET_FEATURES reply:
> 
> write(1, "vhost_user_get_features IN: \n", 29) = 29
> sendmsg(11, {msg_name=NULL, msg_namelen=0,
>         msg_iov=[{iov_base="\1\0\0\0\1\0\0\0\0\0\0\0", iov_len=12}],
>         msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 12
> recvmsg(11, {msg_namelen=0}, MSG_CMSG_CLOEXEC) = -1 EAGAIN
> nanosleep({0, 100000}, 0x7fff29f8dd70)  = 0
> ...
> recvmsg(11, {msg_namelen=0}, MSG_CMSG_CLOEXEC) = -1 EAGAIN
> nanosleep({0, 100000}, 0x7fff29f8dd70)  = 0
> 
> The reason is that vhost-user-test never replies to Qemu,
> because its thread handling the GET_FEATURES command is waiting for
> the s->data_mutex lock.
> This lock is held by the other vhost-user-test thread, executing
> read_guest_mem().
> 
> The lock is never released because that thread is blocked in the read
> syscall, while read_guest_mem() is doing the readl().
> 
> This is because on Qemu side, the thread polling the qtest socket is
> waiting for the qemu_global_mutex (in os_host_main_loop_wait()), but
> the mutex is held by the thread trying to get the GET_FEATURES reply
> (the TCG one).
> "
> 
> Those notes do not explain why it would only fail with TCG; I would
> need to spend some time investigating to find out why I claimed this.

That sounds like a very repeatable bug; much more so than my postcopy
case. Is your case that it's just anything that slows it down - be it
some debug, an strace, or slow TCG?

Dave

> Maxime
> > Dave
> > 
> > > > > Another problem is that memory mmapped by the previous call does not
> > > > > seem to be unmapped, but that should not cause problems other than
> > > > > leaking virtual memory.
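> > > > > 
> > > > > (A sketch of what freeing the stale mappings could look like in
> > > > > the backend's set-mem-table handler - the field names here are
> > > > > illustrative, not vhost-user-bridge's own structures:
> > > > > 
> > > > >   #include <sys/mman.h>
> > > > >   #include <unistd.h>
> > > > > 
> > > > >   struct region {
> > > > >       void   *mmap_addr;   /* what mmap() returned */
> > > > >       size_t  size;        /* region size */
> > > > >       off_t   mmap_offset; /* offset into the fd */
> > > > >       int     fd;          /* fd received with SET_MEM_TABLE */
> > > > >   };
> > > > > 
> > > > >   static void drop_old_regions(struct region *regs, int n)
> > > > >   {
> > > > >       for (int i = 0; i < n; i++) {
> > > > >           /* undo the original mmap(size + offset), drop the fd */
> > > > >           munmap(regs[i].mmap_addr,
> > > > >                  regs[i].size + regs[i].mmap_offset);
> > > > >           close(regs[i].fd);
> > > > >       }
> > > > >   }
> > > > > 
> > > > > to be called before mmap()ing the regions from the new table.)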
> > > > 
> > > > Oh, leaks are the least of our problem there!
> > > 
> > > Sure.
> > > 
> > > Maxime
> > > > Dave
> > > > 
> > > > > Maxime
> > > > > > Dave
> > > > > > 
> > > > > > --
> > > > > > Dr. David Alan Gilbert / address@hidden / Manchester, UK
> > > > > > 
> > > > --
> > > > Dr. David Alan Gilbert / address@hidden / Manchester, UK
> > > > 
> > --
> > Dr. David Alan Gilbert / address@hidden / Manchester, UK
> > 
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


