Re: [Qemu-devel] [PATCH v2 2/2] e1000: decrement RDT if equal to RDH


From: Richard Tollerton
Subject: Re: [Qemu-devel] [PATCH v2 2/2] e1000: decrement RDT if equal to RDH
Date: Tue, 13 Jan 2015 15:06:13 -0600
User-agent: Notmuch/0.19 (http://notmuchmail.org) Emacs/24.4.1 (x86_64-unknown-linux-gnu)

Jason Wang <address@hidden> writes:

> On Tue, Jan 13, 2015 at 3:12 AM, Richard Tollerton <address@hidden> wrote:
>> On Thu, Dec 18, 2014 at 12:01:48AM -0500, Jason Wang wrote:
>> 
>>>  > Some drivers set RDT=RDH. Oddly, this works on real hardware. To
>>>  > work around this, autodecrement RDT when this happens.
>>>  
>>> Please describe the issue in more detail. The spec, section 3.2.6,
>>> says: "When the head pointer is equal to the tail pointer, the ring
>>> is empty." So RDT=RDH in fact empties the ring, no?
>> 
>> That is incorrect; the spec effectively states that RDT=RDH means the
>> ring is full (no descriptors available to hardware). The Linux e1000
>> driver more or less implies the same thing.
>> 
>> You forgot to include the sentence after that in section 3.2.6 :)
>> 
>> "When the head pointer is equal to the tail pointer, the ring is
>> empty. Hardware stops storing packets in system memory until software
>> advances the tail pointer, making more receive buffers available."
>> 
>> Yeah, this seems really poorly worded to me too. :( You appear to be
>> interpreting "ring is empty" in the usual sense, i.e. "all N elements
>> of the ring buffer are available for use by hardware". In fact, the
>> correct interpretation [1] is the exact opposite, "none of the
>> elements are available for use by hardware".
>
> Yes, I do think 'empty' means no available buffers for the device to
> receive into :)

Ah, ok. But if you're concerned about breaking drivers with this... what
legitimate reason could a driver possibly have to set RDT=RDH? (Besides a
mistaken attempt to free the ring for hardware use, as I posit?)
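
(For anyone following along: here is roughly the accounting the device
model does. This is a simplified sketch of the e1000_has_rxbufs() logic
in hw/net/e1000.c, not the literal code, and the helper name is mine.)

    #include <stdint.h>

    /* Simplified sketch of the RDH/RDT accounting in the device model
     * (not the literal hw/net/e1000.c code).  Legacy rx descriptors are
     * 16 bytes, so RDLEN/16 is the ring size in descriptors. */
    static unsigned rx_descs_free_for_hw(uint32_t rdh, uint32_t rdt,
                                         uint32_t rdlen)
    {
        uint32_t ring_size = rdlen / 16;

        if (rdh == rdt) {
            return 0;                     /* nothing owned by hardware */
        }
        if (rdh < rdt) {
            return rdt - rdh;             /* contiguous run of free descriptors */
        }
        return ring_size - rdh + rdt;     /* free run wraps past the end */
    }

Under that reading, a driver that writes RDT=RDH has just told the NIC it
owns zero descriptors, which is exactly the situation the patch papers over.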

The only reason I can think of is that maybe a driver is trying to
implement some sort of crude flow control. But if that actually worked,
then major packet loss would be observed under load, as any packets
received by hardware but not yet processed by software would get
dropped.

I'm going to go out (further) on a limb and assert that no driver ever
sets RDT=RDH to stop receives, because no hardware implements that
behavior. The driver I'm trying to get working appears to have been
setting RDT=RDH at the end of *every* iteration of the receive loop,
since it was originally written in 2003. If I am to trust the comments,
it's been ported/supported on 28 different chipset variants, and it's
definitely been tested for performance and packet loss under load for a
good number of those, including under polling modes where the ring is
only checked every few milliseconds. If RDT=RDH ever did anything except
free all buffers for hardware use, surely catastrophic packet loss would
have been observed?
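
For concreteness, the receive path in question boils down to something like
the following. (This is a hypothetical paraphrase from my reading of the
driver; every identifier here is illustrative, not taken from the real
source.)

    #include <stdint.h>

    struct nic {
        uint32_t next_to_clean;     /* driver's software head index */
        uint32_t ring_size;         /* number of rx descriptors in the ring */
    };

    /* Register accessors and per-descriptor work, defined elsewhere. */
    enum { RDH, RDT };
    uint32_t read_reg(struct nic *nic, int reg);
    void write_reg(struct nic *nic, int reg, uint32_t val);
    void process_rx_descriptor(struct nic *nic, uint32_t idx);
    void refill_rx_buffer(struct nic *nic, uint32_t idx);

    static void rx_poll(struct nic *nic)
    {
        uint32_t head = read_reg(nic, RDH);

        while (nic->next_to_clean != head) {
            process_rx_descriptor(nic, nic->next_to_clean);
            refill_rx_buffer(nic, nic->next_to_clean);
            nic->next_to_clean = (nic->next_to_clean + 1) % nic->ring_size;

            /* The step at issue: the tail register is advanced to the
             * software head after every descriptor, so once the loop
             * catches up this leaves RDT == RDH.  Real hardware evidently
             * keeps receiving; QEMU's model reads RDT == RDH as "software
             * owns the whole ring". */
            write_reg(nic, RDT, nic->next_to_clean);
        }
    }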

>> The last sentence in the quote makes this explicit. See also linux
>> e1000 driver sources at [2] [3] [4].
>
> Btw, [2], [3], [4] are all code that deals with the driver's internal
> variables, not the value that the hardware uses.

No, it's directly used by hardware -- current_count increments RDT. See
e1000_alloc_rx_buffers().
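
From memory, the tail of e1000_alloc_rx_buffers() looks roughly like the
snippet below (paraphrased, not quoted verbatim from the kernel tree): the
driver's internal refill index is what ultimately gets written into the
hardware RDT register.

    /* Paraphrase of the end of e1000_alloc_rx_buffers() in the Linux e1000
     * driver (not a verbatim quote).  'i' is the next descriptor index to
     * refill; after updating the software bookkeeping the driver writes
     * one less than that index into the hardware RDT register. */
    if (rx_ring->next_to_use != i) {
            rx_ring->next_to_use = i;
            if (i-- == 0)
                    i = rx_ring->count - 1;
            wmb();    /* descriptor writes must be visible before RDT moves */
            writel(i, hw->hw_addr + rx_ring->rdt);
    }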

>> I'm *guessing* (?) that the DMA engine isn't correspondingly stopped
>> when software sets RDT=RDH, so that once packets start getting
>> received,
>
> Do you mean in QEMU? I/O is single-threaded, so it looks like we are
> safe.

I'm referring to the possibility that physical hardware is doing this.

Interesting side note: while this driver sets RDT=RDH on every
iteration, it *initializes* RDT=0 and RDH=1...

>> the hardware can more or less ignore it. In this context, my patch
>> makes sense.
>> 
>> (Yes, this is totally an ex-post-facto justification for the patch; it
>> arrived to me secondhand, and I had not been familiar with the driver
>> source before now.)
>
> True, we've found many undocumented behaviors in the past (some even
> conflict with the spec). I don't have an 82540EM on hand, but I think
> the best thing is to check this behavior on real hardware, to prevent
> this patch from breaking many existing drivers.

Can you be more specific about what information you're requesting? Do you
want additional confirmation (i.e. via instrumented code in addition to
code inspection) that setting RDT=RDH does not stop packet reception once
it has started?

>> [1] http://sourceforge.net/p/e1000/mailman/message/29280078/
>
> The issue mentioned in that thread seems to have been solved.

Indeed; I was citing this thread (and the other thread) for the
discussions, not the issues themselves.

>> [2] http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/e1000/e1000_main.c#L398
>> [3] http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/e1000/e1000.h#L215
>> [4] http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/e1000/e1000_main.c#L4302
>> [5] http://sourceforge.net/p/e1000/mailman/message/29969887/
>
> Looks like what was mentioned in this thread was also solved.
>
> We check both RCTL and e1000_has_rxbufs() in e1000_can_receive().
> And flush queued packets in set_rx_control().
>> 
>>>>  > diff --git a/hw/net/e1000.c b/hw/net/e1000.c
>>>>  > index 44ae3a8..b8cbfc1 100644
>>>>  > --- a/hw/net/e1000.c
>>>>  > +++ b/hw/net/e1000.c
>>>>  > @@ -1152,6 +1152,12 @@ mac_writereg(E1000State *s, int index, uint32_t val)
>>>>  >  static void
>>>>  >  set_rdt(E1000State *s, int index, uint32_t val)
>>>>  >  {
>>>>  > +    if (val == s->mac_reg[RDH]) {     /* Decrement RDT if it's too big */
>>>>  > +        if (val == 0) {
>>>>  > +            val = s->mac_reg[RDLEN] / sizeof(struct e1000_rx_desc);
>>>>  > +        }
>>>>  > +        val--;
>>>>  > +    }
>>>>  >      s->mac_reg[index] = val & 0xffff;
>>>>  >      if (e1000_has_rxbufs(s, 1)) {
>>>>  >          qemu_flush_queued_packets(qemu_get_queue(s->nic));
>>>>  > --
>>>>  > 2.1.3
>> 
>> -- 
>> Richard Tollerton <address@hidden>
>> 
>

-- 
Richard Tollerton <address@hidden>


