Re: [Qemu-devel] tap networking - how?


From: Max Filippov
Subject: Re: [Qemu-devel] tap networking - how?
Date: Thu, 13 Feb 2014 18:17:38 +0400

On Thu, Feb 13, 2014 at 6:06 PM, Alexey Kardashevskiy <address@hidden> wrote:
> On 02/14/2014 01:02 AM, Max Filippov wrote:
>> On Thu, Feb 13, 2014 at 5:42 PM, Alexey Kardashevskiy <address@hidden> wrote:
>>> On 02/13/2014 11:23 PM, Max Filippov wrote:
>>>> On Thu, Feb 13, 2014 at 2:34 PM, Alexey Kardashevskiy <address@hidden> wrote:
>>>>> On 02/13/2014 07:40 PM, Max Filippov wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Thu, Feb 13, 2014 at 11:34 AM, Alexey Kardashevskiy <address@hidden> wrote:
>>>>>>> Hi!
>>>>>>>
>>>>>>> I am debugging spapr-vlan and hit the following issue.
>>>>>>>
>>>>>>> When I run QEMU as below, the kernel's DHCP client does not continue
>>>>>>> till I hit any key in console. If I replace spapr-vlan with
>>>>>>> e1000/rtl8139/virtio-net, everything is just fine. If I use "user"
>>>>>>> network - everything is fine too. So the problem is with combination
>>>>>>> of spapr-vlan + tap.
>>>>>>>
>>>>>>> The issue looks like - the guest kernel boots and then prints:
>>>>>>> Sending DHCP requests ..
>>>>>>> and it keeps printing dots till I press key or timeout expires. tcpdump
>>>>>>> (running on the tap interface) shows one DHCP request and one DHCP
>>>>>>> response.
>>>>>>>
>>>>>>> What normally happens is that QEMU calls os_host_main_loop_wait() which
>>>>>>> calls qemu_poll_ns() and it is sitting there till eventfd signals.
>>>>>>> This eventfd is registered via qemu_init_main_loop() ->
>>>>>>> aio_context_new() -> aio_set_event_notifier() but I cannot find where
>>>>>>> it gets passed to the kernel (otherwise why would we need eventfd?).
>>>>>>> When eventfd signals, QEMU calls qemu_iohandler_poll() which checks if
>>>>>>> TAP device has something to read and eventually calls tap_send().
>>>>>>>
>>>>>>> However in my bad example QEMU does not exit qemu_poll_ns() on eventfd,
>>>>>>> only on stdin event.
>>>>>>>
>>>>>>> I can see AIO eventfd created and event_notifier_test_and_clear() is
>>>>>>> called on it before the kernel starts using spapr-vlan.
>>>>>>>
>>>>>>> So. h_send_logical_lan() is called to send a DHCP request packet. Now I
>>>>>>> expect eventfd to signal but this does not happen. Have I missed some
>>>>>>> reset or notification request or "bottom half" (virtio-net uses them
>>>>>>> but e1000/rtl8139 do not)?
>>>>>>
>>>>>> Sounds pretty much like the problem I had recently with opencores
>>>>>> 10/100 MAC: 
>>>>>> https://lists.gnu.org/archive/html/qemu-devel/2014-02/msg00073.html
>>>>>>
>>>>>> Does the following help?:
>>>>>
>>>>> Yes, it does, thanks a lot!
>>>>>
>>>>> While we are here and you seem to understand this stuff -
>>>>> how is tap expected to work to deliver a packet from the external network
>>>>> to the guest? I mean what event should be triggered in what order? My
>>>>> brain is melting :( I just cannot see how receiving a packet on "tap" in
>>>>> the host kernel can make os_host_main_loop_wait() exit in QEMU so it could
>>>>> call qemu_iohandler_poll() and do the job. Thanks!
>>>>
>>>> I'm not very experienced in this area of QEMU, so the following may not be
>>>> 100% accurate.
>>>> The tap file descriptor is registered among other file descriptors in an
>>>> array that os_host_main_loop_wait uses to poll for events. So normally a
>>>> packet arrives at the host, the fd becomes readable, the poll function
>>>> completes and the registered handler (see tap_update_fd_handler) is called.
>>>> The handler reads packets and calls the attached NIC's
>>>> NetClientInfo::receive callback through the network queuing infrastructure.
>>>> But once the NIC doesn't process a packet or its NetClientInfo::can_receive
>>>> returns false, tap stops polling for new packets by updating the handlers
>>>> associated with its fd. So the NIC needs to inform the networking core when
>>>> it can receive more packets by calling qemu_flush_queued_packets, which
>>>> will also complete polling and deliver already queued packets.
>>>
>>>
>>> I am more interested in details :)
>>> os_host_main_loop_wait() calls glib_pollfds_fill() which puts actual fds
>>> into gpollfds GArray thing. Before the tap device started, its fd is not
>>> there but after the patch you proposed, tap's fd gets to the list.
>>> The actual fds are put into array by g_main_context_query() (if I read gdb
>>> output correctly). So there must be some callback somewhere which tells
>>> this g_main_context_query() what to poll for. I put a million breakpoints
>>> to know what is called but to no avail.
>>
>> I see that qemu_iohandler_fill puts fds into this array. And it only puts
>> those that have write handler or read handler and can read at the moment.
>
>
> os_host_main_loop_wait() - when things work, it waits on the tap device
> too. Without your patch, it does not wait on the tap device fd (i.e. this
> fd is not put to the array of fds by glib_pollfds_fill()). Where does this
> difference happen - this is my question...

It is triggered by the guest adding a new descriptor to the NIC RX ring.
The added qemu_flush_queued_packets call completes the poll that doesn't
have the TAP fd in its array, and (assuming there were no packets queued)
the next main_loop_wait -> qemu_iohandler_fill puts the TAP fd into
that array:

        if (ioh->fd_read &&
            (!ioh->fd_read_poll ||
             ioh->fd_read_poll(ioh->opaque) != 0)) {
            events |= G_IO_IN | G_IO_HUP | G_IO_ERR;
        }

because now the NIC's can_receive (called here through ioh->fd_read_poll)
returns true.
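
To make the sequence concrete, here is a small standalone model of it.
This is only a sketch, not QEMU code: ToyNic, ToyIOHandler and all toy_*
functions are made-up names for illustration. It shows that the TAP fd is
left out of the poll set while can_receive is false and is included again
once the guest posts an RX descriptor. In the real device that transition
is the point where qemu_flush_queued_packets would be called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy NIC: just counts free RX descriptors. */
typedef struct ToyNic {
    int rx_free;                        /* buffers posted by the guest */
} ToyNic;

/* Hypothetical stand-in for an I/O handler record. */
typedef struct ToyIOHandler {
    int fd;
    bool has_fd_read;                   /* models ioh->fd_read != NULL */
    int (*fd_read_poll)(void *opaque);  /* models ioh->fd_read_poll */
    void *opaque;
} ToyIOHandler;

/* Models the NIC's can_receive: true only if an RX buffer is available. */
static int toy_can_receive(void *opaque)
{
    const ToyNic *nic = opaque;
    return nic->rx_free > 0;
}

/*
 * Models the condition quoted above: an fd is polled for input only if it
 * has a read handler and its fd_read_poll (if any) returns non-zero.
 * Returns the number of fds written to out_fds (caller provides n slots).
 */
static size_t toy_iohandler_fill(const ToyIOHandler *h, size_t n, int *out_fds)
{
    size_t count = 0;

    assert(h != NULL && out_fds != NULL);
    for (size_t i = 0; i < n; i++) {
        if (h[i].has_fd_read &&
            (!h[i].fd_read_poll || h[i].fd_read_poll(h[i].opaque) != 0)) {
            out_fds[count++] = h[i].fd;
        }
    }
    return count;
}

/*
 * Models the guest adding an RX descriptor. Returns true when the NIC went
 * from "cannot receive" to "can receive", i.e. when the device model would
 * have to notify the networking core so that the current poll completes.
 */
static bool toy_guest_add_rx_descriptor(ToyNic *nic)
{
    bool was_empty = (nic->rx_free == 0);

    nic->rx_free++;
    return was_empty;
}
```

With one handler that has no fd_read_poll (like stdin) and one TAP-like
handler using toy_can_receive, toy_iohandler_fill returns only the first fd
while rx_free is 0, and both fds after toy_guest_add_rx_descriptor. Without
the notification the main loop would keep sleeping on the old fd set, which
matches "only wakes up on a key press".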

-- 
Thanks.
-- Max
