qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 1/2] Ignore RX tail kicks when RX disabled.


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH 1/2] Ignore RX tail kicks when RX disabled.
Date: Thu, 18 Oct 2012 16:31:47 +0200

On Thu, Oct 18, 2012 at 10:34 AM, Dmitry Fleytman <address@hidden> wrote:
> The real purpose of check_rxov it a bit confusing indeed, mainly
> because of unclear name (rename?),
> however it works as following:
>
> There are 2 possible when RDT == RDH for RX ring:
>     1. Device used all the buffers from ring, no empty buffers available
>     2. Driver fully refilled the ring and all buffers are empty and ready to 
> use
>
> check_rxov is used to distinguish these 2 cases:
>     1. It must be 1 initially (init, reset, etc.)
>     2. It must be set to one when device uses buffer
>     3. It must be set to 0 when driver adds buffer to the ring
> check_rxov == 1 - ring is empty
> check_rxov == 0 - ring is full
>
> Indeed, RX init sequence doesn't look logical, however this is the way
> all Intel driver behave from e1000 and up to ixgbe.
> Also see some explanation here:
> http://permalink.gmane.org/gmane.linux.kernel/1375917
>
> If we drop check_rxov and always treat RDH == RDT as empty ring we'll
> probably get correct behavior for current Linux driver's code (needs
> testing of course),
> however we have no idea how Windows drivers work.

Thanks, for the great explanation, Dmitry.

Alexander: I CCed you because I hope you might be able to explain what
the 82540EM card does when a driver sets RDT to the value of RDH.  The
QEMU NIC emulation code treats this as a full ring (i.e. the
descriptors are valid and will be filled in by the hardware).  Does
the real hardware act like this or will it treat this condition as
ring empty (i.e. if the driver sets RDT to the value of RDH then the
hardware stops receive because there are no descriptors available)?

I can't find a statement in the Intel datasheet about what happens
when the driver sets RDT = RDH.  The QEMU check_rxov variable is
trying to distinguish between ring empty (RDH has moved to RDT) and
ring full (driver has set RDH = RDT because the full descriptor ring
is available).

Dmitry: At this point we'd need to test what happens on real hardware
when RDH = RDT in order to be able to remove check_rxov.  As you
mentioned, with the Linux e1000 driver we don't see ring full RDH =
RDT:

        /* call E1000_DESC_UNUSED which always leaves
         * at least 1 descriptor unused to make sure
         * next_to_use != next_to_clean */
        for (i = 0; i < adapter->num_rx_queues; i++) {
                struct e1000_rx_ring *ring = &adapter->rx_ring[i];
                adapter->alloc_rx_buf(adapter, ring,
                                      E1000_DESC_UNUSED(ring));
        }

Here some sample output from a QEMU printf, notice how RDH is never
the same as RDT once rx begins:

set_rdt rdh=0 rdt_old=0 rdt_new=0
set_rdt rdh=0 rdt_old=0 rdt_new=254
set_rdt rdh=1 rdt_old=254 rdt_new=255
set_rdt rdh=2 rdt_old=255 rdt_new=0
set_rdt rdh=3 rdt_old=0 rdt_new=1
set_rdt rdh=4 rdt_old=1 rdt_new=2
set_rdt rdh=5 rdt_old=2 rdt_new=3
set_rdt rdh=6 rdt_old=3 rdt_new=4
set_rdt rdh=7 rdt_old=4 rdt_new=5
set_rdt rdh=9 rdt_old=5 rdt_new=7
set_rdt rdh=10 rdt_old=7 rdt_new=8
set_rdt rdh=11 rdt_old=8 rdt_new=9
set_rdt rdh=12 rdt_old=9 rdt_new=10
set_rdt rdh=13 rdt_old=10 rdt_new=11
set_rdt rdh=14 rdt_old=11 rdt_new=12

The iPXE 'intel' driver (supports e1000 cards) also does not set RDH =
RDT for full rx ring, instead it only uses 4 out of 8 descriptors at a
time.

The reason I'm digging into the need for check_rxov is because it's a
dangerous piece of code to have.  If check_rxov logic is ever out of
sync we risk memory corruption.  I'd really like to drop it
completely.

Stefan



reply via email to

[Prev in Thread] Current Thread [Next in Thread]