qemu-devel

Re: [Qemu-devel] [bug] busy-loop in send_all()


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [bug] busy-loop in send_all()
Date: Fri, 23 May 2014 13:55:40 +0200
User-agent: Mutt/1.5.23 (2014-03-12)

On Thu, May 15, 2014 at 11:23:54AM -0600, Chris Friesen wrote:
> I've run into a situation that seems like a bug.  I'm using qemu 1.4.2 (with
> additional patches) from within openstack.
> 
> I'm using virtio-serial-pci to provide a channel between the guest and host.
> 
> On occasion when doing suspend/resume I run into a case where the main qemu
> thread ends up chewing 100% of a cpu.
> 
> I attached strace to the thread and it showed qemu just spitting messages:
> 
> write(35, "HRBT\0\1\0\3d<\230k\0\0\0\0\0\0\1\330\0\0\0\0enqueue\0"..., 472)
> = -1 EAGAIN (Resource temporarily unavailable)
> write(35, "HRBT\0\1\0\3d<\230k\0\0\0\0\0\0\1\330\0\0\0\0enqueue\0"..., 472)
> = -1 EAGAIN (Resource temporarily unavailable)
> write(35, "HRBT\0\1\0\3d<\230k\0\0\0\0\0\0\1\330\0\0\0\0enqueue\0"..., 472)
> = -1 EAGAIN (Resource temporarily unavailable)
> write(35, "HRBT\0\1\0\3d<\230k\0\0\0\0\0\0\1\330\0\0\0\0enqueue\0"..., 472)
> = -1 EAGAIN (Resource temporarily unavailable)
> 
> File descriptor 35 is the unix socket corresponding to the virtio-serial
> port.
> 
> I broke in with gdb and got a backtrace showing it was in send_all().
> Looking at the implementation of send_all(), the core loop looks like:
> 
>      while (len > 0) {
>          ret = write(fd, buf, len);
>          if (ret < 0) {
>              if (errno != EINTR && errno != EAGAIN)
>                  return -1;
>          } else if (ret == 0) {
>              break;
>          } else {
>              buf += ret;
>              len -= ret;
>          }
>      }
> 
> 
> So if we get EAGAIN, we'll just immediately retry.
> 
> I'm not sure where the unix socket would get opened, but I'm assuming it's
> set as non-blocking?  And by default /proc/sys/net/unix/max_dgram_qlen is
> set to 10.
> 
> So if the other end of that unix socket is connected but isn't actually
> paying attention to the messages then the first 10 messages will get
> buffered but after that we'll end up with qemu spinning forever in a
> busy-loop trying to send a message into a full buffer.
> 
> This seems less than ideal.  Either we should block, or else we should
> discard the data.  And I don't think discarding the data makes sense.

CCed Amit Shah for virtio-serial.
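
For illustration, here is a minimal sketch of the "block" option from the
quoted message: on EAGAIN, wait with poll() until the descriptor is writable
instead of retrying write() immediately. The name send_all_wait() is
hypothetical, and this is not the actual QEMU code or a proposed patch.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical variant of the quoted send_all() loop.
 * Returns the number of bytes written, or -1 on a hard error.
 */
static ssize_t send_all_wait(int fd, const void *data, size_t len)
{
    const char *buf = data;
    size_t left = len;

    assert(data != NULL || len == 0);

    while (left > 0) {
        ssize_t ret = write(fd, buf, left);

        if (ret > 0) {
            /* Partial or full write: advance past what was accepted. */
            buf += ret;
            left -= (size_t)ret;
        } else if (ret == 0) {
            break;
        } else if (errno == EAGAIN) {
            /* Buffer full: sleep until writable rather than spinning. */
            struct pollfd pfd = { .fd = fd, .events = POLLOUT };

            if (poll(&pfd, 1, -1) < 0 && errno != EINTR) {
                return -1;
            }
        } else if (errno != EINTR) {
            return -1;
        }
    }
    return (ssize_t)(len - left);
}
```

This removes the 100% CPU spin, but the calling thread still stalls for as
long as the peer does not read, so it only trades a busy-loop for a sleep.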

Stefan


