Re: [Qemu-devel] net/tap.c: Possibly a way to stall tap input


From: Jan Kiszka
Subject: Re: [Qemu-devel] net/tap.c: Possibly a way to stall tap input
Date: Fri, 02 Aug 2013 21:41:24 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); de; rv:1.8.1.12) Gecko/20080226 SUSE/2.0.0.12-1.1 Thunderbird/2.0.0.12 Mnenhy/0.7.5.666

On 2013-08-02 14:45, Jan Kiszka wrote:
> On 2013-08-02 13:46, Stefan Hajnoczi wrote:
>> On Thu, Aug 01, 2013 at 07:15:54PM +0200, Jan Kiszka wrote:
>>> I was digging into the involved code and found something fishy:
>>>
>>> net/tap.c:
>>> static void tap_send(void *opaque)
>>> {
>>>     ...
>>>         size = qemu_send_packet_async(&s->nc, buf, size,
>>>                                       tap_send_completed);
>>>         if (size == 0) {
>>>             tap_read_poll(s, false);
>>>         }
>>>
>>> So, if tap_send is registered for the mainloop polling (ie. can_receive
>>> returned true before starting to poll) but qemu_send_packet_async
>>> returns 0 now as qemu_can_send_packet/can_receive happens to report
>>> false in the meantime, we will disable read polling. If also write
>>> polling is off, the fd will be completely removed from the iohandler
>>> list. But even if write polling remains on, I wonder what should bring
>>> read polling back?
>>
>> This behavior seems fine to me.  Once the peer (pcnet) is able to
>> receive again it must flush the queue, this will re-enable
>> tap_read_poll().
>>
>> Can you explain a bit more why this would be a problem?
> 
> The problem is that I don't see at all what will call tap_read_poll(s,
> 1), neither in theory nor in reality.
> 
> As long as the real test case is out of reach, I tried to emulate the
> faulty behaviour by letting tap_can_send always return 1. Result:
> reception stalls during boot as even qemu_flush_queued_packets cannot
> get it running again once tap_read_poll(s, 0) was called.

OK, false alarm. The issue was most likely fixed by commit 199ee608
(net: fix qemu_flush_queued_packets() in presence of a hub), which is
present in 1.5.x but not 1.3.x. We initially tried to test on 1.5 but
had to roll back to 1.3 due to other issues - and missed this fix.

My understanding of the networking maze was confused by the unfortunate
naming of the incoming net client queues ("send_queue") - will propose a
renaming.

This still needs to be confirmed on the target, but I'm quite
optimistic now.
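
For reference, here is a toy model of the mechanism discussed above: a
send that returns 0 switches read polling off, and only a flush that
actually reaches the queued packet runs the completion callback that
switches it back on. This is a sketch under stated assumptions, not QEMU
code. Every toy_* name is hypothetical, and the one-slot queue merely
stands in for the real net client queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every toy_* name is hypothetical, not a QEMU API. */
typedef void (*toy_sent_cb)(void *opaque);

struct toy_peer {
    bool can_receive;       /* stands in for the NIC's can_receive result */
    bool has_queued;        /* one-slot stand-in for the packet queue */
    toy_sent_cb queued_cb;  /* completion callback of the queued packet */
    void *queued_opaque;
    int delivered;          /* number of packets handed to the peer */
};

struct toy_tap {
    bool read_poll;         /* true while the tap fd is polled for input */
    struct toy_peer *peer;
};

/* Completion callback: delivery finished, so resume reading from the tap. */
static void toy_send_completed(void *opaque)
{
    struct toy_tap *t = opaque;
    t->read_poll = true;
}

/* Returns size if delivered at once, 0 if the packet had to be queued. */
static size_t toy_send_async(struct toy_peer *p, size_t size,
                             toy_sent_cb cb, void *opaque)
{
    if (!p->can_receive) {
        p->has_queued = true;
        p->queued_cb = cb;
        p->queued_opaque = opaque;
        return 0;
    }
    p->delivered++;
    return size;
}

/* Mirrors the quoted tap_send() logic: stop read polling when queued. */
static void toy_tap_send(struct toy_tap *t, size_t size)
{
    if (toy_send_async(t->peer, size, toy_send_completed, t) == 0) {
        t->read_poll = false;
    }
}

/* Flush: deliver the queued packet, then run its completion callback. */
static void toy_flush_queued(struct toy_peer *p)
{
    if (!p->has_queued || !p->can_receive) {
        return;             /* nothing queued, or peer still not ready */
    }
    p->has_queued = false;
    p->delivered++;
    if (p->queued_cb) {
        p->queued_cb(p->queued_opaque);
    }
}
```

In this model, read polling stays off for as long as no flush reaches
the queue holding the packet, which matches the stall seen in the
emulated test on 1.3.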

Jan
