qemu-devel
Re: [Qemu-devel] net/tap.c: Possibly a way to stall tap input


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] net/tap.c: Possibly a way to stall tap input
Date: Mon, 5 Aug 2013 13:38:12 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Fri, Aug 02, 2013 at 09:41:24PM +0200, Jan Kiszka wrote:
> On 2013-08-02 14:45, Jan Kiszka wrote:
> > On 2013-08-02 13:46, Stefan Hajnoczi wrote:
> >> On Thu, Aug 01, 2013 at 07:15:54PM +0200, Jan Kiszka wrote:
> >>> I was digging into the involved code and found something fishy:
> >>>
> >>> net/tap.c:
> >>> static void tap_send(void *opaque)
> >>> {
> >>>     ...
> >>>         size = qemu_send_packet_async(&s->nc, buf, size,
> >>>                                       tap_send_completed);
> >>>         if (size == 0) {
> >>>             tap_read_poll(s, false);
> >>>         }
> >>>
> >>> So, if tap_send is registered for the mainloop polling (i.e. can_receive
> >>> returned true before starting to poll) but qemu_send_packet_async
> >>> returns 0 now as qemu_can_send_packet/can_receive happens to report
> >>> false in the meantime, we will disable read polling. If also write
> >>> polling is off, the fd will be completely removed from the iohandler
> >>> list. But even if write polling remains on, I wonder what should bring
> >>> read polling back?
> >>
> >> This behavior seems fine to me.  Once the peer (pcnet) is able to
> >> receive again it must flush the queue, this will re-enable
> >> tap_read_poll().
> >>
> >> Can you explain a bit more why this would be a problem?
> > 
> > The problem is that I don't see at all what will call tap_read_poll(s,
> > 1), neither in theory nor in reality.
> > 
> > As long as the real test case is out of reach, I tried to emulate the
> > faulty behaviour by letting tap_can_send always return 1. Result:
> > reception stalls during boot as even qemu_flush_queued_packets cannot
> > get it running again once tap_read_poll(s, 0) was called.
> 
> OK, false alarm. The issue was most likely fixed by commit 199ee608
> (net: fix qemu_flush_queued_packets() in presence of a hub) which is
> present in 1.5.x but not 1.3.x. We initially tried to test on 1.5 but
> had to roll back to 1.3 due to other issues - and missed this fix.
> 
> My understanding of the networking maze was confused by the unfortunate
> naming of the incoming net client queues ("send_queue") - will propose a
> renaming.
> 
> This still requires a confirmation on the target, but I'm quite
> optimistic now.

Okay, good to hear.  It makes more sense now and I agree that
"send_queue" is not a great name.
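
For anyone following the thread, here is a minimal sketch of the
stall-and-recover cycle discussed above. It is a toy model, not the
actual net/tap.c or net/queue.c code: every toy_* name, the one-slot
queue, and the struct layouts are assumptions made up for illustration.
It only tries to show the idea that a send returning 0 turns read
polling off, and that flushing the peer's incoming queue runs the
completion callback, which turns it back on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical stand-ins, not QEMU's real data structures. */

typedef void (*toy_sent_cb)(void *opaque);

struct toy_tap {
    bool read_poll;            /* is the tap fd in the main loop's read set? */
};

struct toy_peer {
    bool        can_receive;   /* e.g. the NIC model has free RX buffers */
    int         delivered;     /* packets that actually reached the peer */
    bool        queued;        /* one-slot incoming queue: is it occupied? */
    toy_sent_cb queued_cb;     /* completion callback of the queued packet */
    void       *queued_opaque; /* argument passed to that callback */
};

static void toy_tap_read_poll(struct toy_tap *t, bool enable)
{
    t->read_poll = enable;
}

/* Completion callback: the queued packet was delivered, resume reading. */
static void toy_tap_send_completed(void *opaque)
{
    toy_tap_read_poll((struct toy_tap *)opaque, true);
}

/* Returns 1 if delivered immediately, 0 if the packet had to be queued. */
static int toy_send_packet_async(struct toy_peer *p, toy_sent_cb cb,
                                 void *opaque)
{
    if (!p->can_receive) {
        p->queued = true;
        p->queued_cb = cb;
        p->queued_opaque = opaque;
        return 0;
    }
    p->delivered++;
    return 1;
}

/* Mirrors the quoted pattern: a return value of 0 disables read polling. */
void toy_tap_send(struct toy_tap *t, struct toy_peer *p)
{
    if (toy_send_packet_async(p, toy_tap_send_completed, t) == 0) {
        toy_tap_read_poll(t, false);
    }
}

/* Called once the peer can receive again: deliver and notify the sender. */
void toy_flush_queued_packets(struct toy_peer *p)
{
    if (!p->queued || !p->can_receive) {
        return;
    }
    p->queued = false;
    p->delivered++;
    if (p->queued_cb != NULL) {
        p->queued_cb(p->queued_opaque);
    }
}
```

In this model, calling toy_tap_send() while the peer cannot receive
leaves read_poll off, and only a toy_flush_queued_packets() call that
actually reaches the queued packet switches it back on. If the flush
never gets to the queue that holds the packet, nothing else re-enables
polling, which is the kind of stall described earlier in the thread.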

Stefan


