qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] e1000: Fix the unexpected assumption that the receive buffer


From: Jason Wang
Subject: Re: [PATCH] e1000: Fix the unexpected assumption that the receive buffer is full
Date: Mon, 8 Jul 2024 11:21:18 +0800

On Sat, Jul 6, 2024 at 4:30 AM Hyman Huang <yong.huang@smartx.com> wrote:
>
> Unexpected work by certain Windows guests equipped with the e1000
> interface can cause the network to go down and never come back up
> again unless the guest's interface is reset.
>
> To reproduce the failure:
> 1. Set up two guests with a Windows 2016 or 2019 server operating
>    system.

I vaguely remember e1000 support for Windows has been deprecated for
several years...

That's why e1000e or igb is implemented in Qemu.

> 2. Set up the e1000 interface for the guests.
> 3. Pressurize the network slightly between two guests using the iPerf tool.
>
> The network goes down after a few days (2-5days), and the issue
> is the result of not adhering to the e1000 specification. Refer
> to the details of the specification at the following link:
> https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf
>
> Chapter 3.2.6 describe the Receive Descriptor Tail register(RDT)
> as following:
> This register holds a value that is an offset from the base, and
> identifies the location beyond the last descriptor hardware can
> process. Note that tail should still point to an area in the
> descriptor ring (somewhere between RDBA and RDBA + RDLEN).
> This is because tail points to the location where software writes
> the first new descriptor.
>
> This means that if the provider—in this case, QEMU—has not yet
> loaded the packet,

What do you mean by "load" here?

> RDT should never point to that place.

And "that place"?

> When
> implementing the emulation of the e1000 interface, QEMU evaluates
> if the receive ring buffer is full once the RDT equals the RDH,
> based on the assumption that guest drivers adhere to this
> criterion strictly.
>
> We applied the following log patch to assist in analyzing the
> issue and eventually obtained the unexpected information.
>
> Log patch:
> -----------------------------------------------------------------
> |--- a/hw/net/e1000.c
> |+++ b/hw/net/e1000.c
> |@@ -836,6 +836,9 @@ e1000_set_link_status(NetClientState *nc)
> | static bool e1000_has_rxbufs(E1000State *s, size_t total_size)
> | {
> |     int bufs;
> |+    DBGOUT(RX, "rxbuf_size = %u, s->mac_reg[RDLEN] = %u, s->mac_reg[RDH] = 
> %u, s->mac_reg[RDT] = %u\n",
> |+           s->rxbuf_size, s->mac_reg[RDLEN], s->mac_reg[RDH], 
> s->mac_reg[RDT]);
> |+
> |     /* Fast-path short packets */
> |     if (total_size <= s->rxbuf_size) {
> |         if (s->mac_reg[RDH] == s->mac_reg[RDT] && s->last_overrun)
> |@@ -1022,6 +1025,9 @@ e1000_receive_iov(NetClientState *nc, const struct 
> iovec *iov, int iovcnt)
> |         s->rxbuf_min_shift)
> |         n |= E1000_ICS_RXDMT0;
> |
> |+    DBGOUT(RX, "rxbuf_size = %u, s->mac_reg[RDLEN] = %u, s->mac_reg[RDH] = 
> %u, s->mac_reg[RDT] = %u\n",
> |+           s->rxbuf_size, s->mac_reg[RDLEN], s->mac_reg[RDH], 
> s->mac_reg[RDT]);
> |+
> -----------------------------------------------------------------
>
> The last few logs of information when the network is down:
>
> e1000: total_size = 1, rxbuf_size = 2048, s->mac_reg[RDLEN] = 16384, 
> s->mac_reg[RDH] = 897, s->mac_reg[RDT] = 885
> <- the receive ring buffer is checked for fullness in the
> e1000_has_rxbufs function, not full.
>
> e1000: total_size = 64, rxbuf_size = 2048, s->mac_reg[RDLEN] = 16384, 
> s->mac_reg[RDH] = 898, s->mac_reg[RDT] = 885
> <- RDT stays the same, RDH updates to 898, and 1 descriptor
> utilized after putting the packet to ring buffer.
>
> e1000: total_size = 1, rxbuf_size = 2048, s->mac_reg[RDLEN] = 16384, 
> s->mac_reg[RDH] = 898, s->mac_reg[RDT] = 885
> <- the receive ring buffer is checked for fullness in the
> e1000_has_rxbufs function, not full.
>
> e1000: total_size = 64, rxbuf_size = 2048, s->mac_reg[RDLEN] = 16384, 
> s->mac_reg[RDH] = 899, s->mac_reg[RDT] = 885
> <- RDT stays the same, RDH updates to 899, and 1 descriptor
> utilized after putting the packet to ring buffer.
>
> e1000: total_size = 1, rxbuf_size = 2048, s->mac_reg[RDLEN] = 16384, 
> s->mac_reg[RDH] = 899, s->mac_reg[RDT] = 885
> <- the receive ring buffer is checked for fullness in the
> e1000_has_rxbufs function, not full.
>
> e1000: total_size = 64, rxbuf_size = 2048, s->mac_reg[RDLEN] = 16384, 
> s->mac_reg[RDH] = 900, s->mac_reg[RDT] = 885
> <- RDT stays the same, RDH updates to 900 , and 1 descriptor
> utilized after putting the packet to ring buffer.
>
> e1000: total_size = 1, rxbuf_size = 2048, s->mac_reg[RDLEN] = 16384, 
> s->mac_reg[RDH] = 900, s->mac_reg[RDT] = 900
> <- The ring is full, according to e1000_has_rxbufs, because
> of the RDT update to 900 and equals RDH !

Just to make sure I understand this, RDT==RDH means the ring is empty I think?

See commit:

commit e5b8b0d4ba29fe1268ba049519a1b0cf8552a21a
Author: Dmitry Fleytman <dmitry@daynix.com>
Date:   Fri Oct 19 07:56:55 2012 +0200

    e1000: drop check_rxov, always treat RX ring with RDH == RDT as empty

    Real HW always treats RX ring with RDH == RDT as empty.
    Emulation is supposed to behave the same.

    Reported-by: Chris Webb <chris.webb@elastichosts.com>
    Reported-by: Richard Davies <richard.davies@elastichosts.com>
    Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
    Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Or did you mean we need to revert the above commit in fact?

Thanks




reply via email to

[Prev in Thread] Current Thread [Next in Thread]