
Re: [Qemu-devel] [Qemu-stable] Recent patches for 2.4


From: Peter Lieven
Subject: Re: [Qemu-devel] [Qemu-stable] Recent patches for 2.4
Date: Tue, 04 Aug 2015 14:53:13 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0

On 04.08.2015 at 14:29, Peter Lieven wrote:
On 04.08.2015 at 14:09, Paolo Bonzini wrote:

On 04/08/2015 13:57, Peter Lieven wrote:
Okay, what I found out is that in aio_poll I get revents = POLLIN for
the nfs file descriptor. But there is no data available on the socket.

Does read return 0 or EAGAIN?

If it returns EAGAIN, the bug is in the QEMU main loop or the kernel.
It should never happen that poll returns POLLIN and read returns EAGAIN.

If it returns 0, it means the other side called shutdown(fd, SHUT_WR).
Then I think the bug is in the libnfs driver or more likely libnfs.  You
should stop polling the POLLIN event after read has returned 0 once.
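
The distinction above can be sketched in C roughly as follows. This is only an illustration, not code from QEMU or libnfs: the names read_outcome, classify_read and read_when_readable are hypothetical, and the events mask stands in for whatever the event loop passes to poll().

```c
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes for one read attempt after POLLIN. */
enum read_outcome { READ_DATA, READ_EOF, READ_RETRY, READ_ERROR };

/* Map the return value of read() and the saved errno to an outcome. */
static enum read_outcome classify_read(ssize_t n, int err)
{
    if (n > 0)
        return READ_DATA;   /* payload arrived */
    if (n == 0)
        return READ_EOF;    /* peer shut down its write side */
    if (err == EAGAIN || err == EWOULDBLOCK || err == EINTR)
        return READ_RETRY;  /* nothing to read right now, try again later */
    return READ_ERROR;      /* genuine failure */
}

/* Read once from fd; after end-of-file, drop POLLIN from the caller's mask. */
static enum read_outcome read_when_readable(int fd, short *events,
                                            char *buf, size_t len,
                                            ssize_t *nread)
{
    ssize_t n = read(fd, buf, len);
    int err = errno;        /* only meaningful when n == -1 */
    enum read_outcome r = classify_read(n, err);

    if (r == READ_EOF)
        *events &= ~POLLIN; /* stop polling for input after the first 0 */
    if (nread)
        *nread = n;
    return r;
}
```

Whether to reconnect or merely stop polling after READ_EOF is left to the caller.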

You might be right. Ronnie originally used the FIONREAD ioctl before every
read and considered the socket disconnected if the number of available
bytes returned was 0. I found that I get available bytes == 0 from that
ioctl even if the socket was not closed. This seems to be some kind of bug
in Linux; at least that is what I thought.

See BUGS in the select(2) manpage.

       Under Linux, select() may report a socket file descriptor as
       "ready for reading", while nevertheless a subsequent read blocks.
       This could for example happen when data has arrived but upon
       examination has wrong checksum and is discarded. There may be
       other circumstances in which a file descriptor is spuriously
       reported as ready. Thus it may be safer to use O_NONBLOCK on
       sockets that should not block.
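
As a minimal sketch of the man page's advice, the descriptor can be switched to non-blocking mode with fcntl() so that a spurious readiness report yields EAGAIN instead of a blocked read. The helper names set_nonblocking and read_would_block are hypothetical and not taken from libnfs or QEMU.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Add O_NONBLOCK to the descriptor's file status flags; 0 on success. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Try one read. Returns 1 if the descriptor had nothing to deliver
 * (the "spuriously ready" case), 0 otherwise; *n holds read()'s result.
 */
static int read_would_block(int fd, char *buf, size_t len, ssize_t *n)
{
    *n = read(fd, buf, len);
    return *n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK);
}
```

With this in place, a readiness report that turns out to be empty costs one failed read and the loop simply returns to poll.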

I will debug further, but it seems that I receive a POLLIN even if there
is no data available. I see 0 bytes from the recv call inside libnfs and
continue without a deadlock, at least so far.

Would it be a good idea to count the 0-byte results from recv and react
once recv has returned 0 bytes several times in a row?

And then: stop polling POLLIN or reconnect?

Okay, got it. Ronnie was using FIONREAD without checking for EAGAIN or EINTR.

I will send a patch for libnfs to reconnect if count == 0. libiscsi is not
affected; it already reconnects if count is 0.
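
Roughly, the check could look like the sketch below. This is not the actual libnfs patch: avail_status and check_available are hypothetical names, and treating count == 0 as "reconnect" assumes the call is made only after poll has reported POLLIN.

```c
#include <errno.h>
#include <sys/ioctl.h>

/* Hypothetical outcomes of asking how many bytes are queued on fd. */
enum avail_status { AVAIL_DATA, AVAIL_RETRY, AVAIL_RECONNECT, AVAIL_ERROR };

/* Query FIONREAD and tell a transient failure apart from a dead connection. */
static enum avail_status check_available(int fd, int *count)
{
    *count = 0;
    if (ioctl(fd, FIONREAD, count) == -1) {
        /* The ioctl itself failed: count says nothing about the peer. */
        if (errno == EINTR || errno == EAGAIN)
            return AVAIL_RETRY;
        return AVAIL_ERROR;
    }
    /* Assumption: the caller saw POLLIN, so 0 bytes means the peer closed. */
    return *count > 0 ? AVAIL_DATA : AVAIL_RECONNECT;
}
```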

Peter



