Re: [lwip-devel] Issue with Cortex Emac driver


From: Ivan Delamer
Subject: Re: [lwip-devel] Issue with Cortex Emac driver
Date: Thu, 05 Nov 2015 10:48:45 -0700

Hi,

What Cortex are you using?

I've had this problem for years with the Atmel SAM7x512 EMAC driver. It is basically the same in the newer Atmel Cortex parts.

I added several checks and from time to time reset all buffers, because there are sync issues.

Cheers
Ivan

PS: not a big deal, but this message belongs in lwip-users, as it is not part of the internal LwIP development.

Date: Thu, 5 Nov 2015 06:51:15 -0700 (MST)
From: RAc <address@hidden>
To: address@hidden
Subject: [lwip-devel] Issue with Cortex Emac driver
Message-ID: <address@hidden>
Content-Type: text/plain; charset=us-ascii

Hi everybody,

I have come across this issue several times now and believe it may be of interest to some of you; if not, or if this is a dupe, sorry for the noise...

When running lwip on top of an RTOS, you may find that under heavy stress load (e.g. several concurrent instances of fast pings running for several minutes) your network response deteriorates, with your sniffer revealing that no packet is lost but some are delayed significantly. For example, if your system is in that state and you cut down the stress test to one fast ping, n-1 out of every n pings may time out, but the sniffer trace will reveal that they are in fact processed, only well after the timeout has expired.

As far as I can tell, this is simply a bug in the emac driver. Typically,
the control flow is something like this (pseudo code of course):

Ethernet ISR:

if (receiver has caused an interrupt) signal Rx semaphore;

Ethernet input task:

for(;;)
{
    wait on Rx semaphore;
    if (!(current_descriptor->Status & ETH_DMARXDESC_OWN))
    {
        copy descriptor contents to allocated pbuf;
        current_descriptor->Status |= ETH_DMARXDESC_OWN;
        forward current_descriptor to next in chain;
        process packet or signal tcp thread to process the packet asynchronously
    }
}

(Of course, there is more work due to possibly fragmented packets.)

All of this works fine as long as the (leading) hardware descriptor pointer and the (trailing) software descriptor pointer are in sync. However, if for some reason (race conditions or the like) the software descriptor pointer goes out of sync and points to a descriptor still owned by the DMA, the infinite loop will simply go back to sleep and wait for the next interrupt to fire the semaphore. The pending packet will then only be processed once the DMA has completely filled up the ring buffer, making the current position of the trailing pointer valid. This causes a sawtooth pattern in which one packet sort of pushes the entire chain of outstanding buffers to be processed.

Interestingly enough, in networks with lots of traffic (e.g. those with lots of ARP broadcasts) there will be no visible adverse effects, as the chain of packets will be retriggered frequently enough, eventually serving all packets more or less in a timely fashion. It will only show up in well-behaved nets, in which your traffic appears to "hang" until retriggered.

Even though the best solution would be to fix the race condition that caused the ring buffer descriptor pointers to go out of sync, I found that the following addition appears to solve the problem:

Ethernet input task:

for(;;)
{
    wait on Rx semaphore;
    if (!(current_descriptor->Status & ETH_DMARXDESC_OWN))
    {
        copy descriptor contents to allocated pbuf;
        current_descriptor->Status |= ETH_DMARXDESC_OWN;
        forward current_descriptor to next in chain;
        process packet or signal tcp thread to process the packet asynchronously
    }
    else       // NEW!!!
        forward current_descriptor to next in chain until a descriptor
        not owned by DMA is encountered;
}
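To make the resynchronisation step more concrete, here is a minimal C sketch of it. This is an illustration under stated assumptions, not code from any actual EMAC driver: the descriptor layout, the names rx_desc_t, rx_ring_init and rx_resync, the ring length RX_RING_SIZE and the position of the OWN bit (bit 31) are all hypothetical. Unlike the pseudo code, the search is bounded to one lap around the ring, so it terminates even when the DMA still owns every descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: OWN is bit 31 of the status word; set means the DMA owns it. */
#define RX_DESC_OWN   (1u << 31)
#define RX_RING_SIZE  4u            /* hypothetical ring length */

/* Hypothetical, simplified receive descriptor in a chained ring. */
typedef struct rx_desc {
    volatile uint32_t status;
    struct rx_desc *next;
} rx_desc_t;

/* Link the descriptors into a ring and hand all of them to the DMA. */
static void rx_ring_init(rx_desc_t ring[RX_RING_SIZE])
{
    for (unsigned i = 0; i < RX_RING_SIZE; i++) {
        ring[i].status = RX_DESC_OWN;
        ring[i].next = &ring[(i + 1u) % RX_RING_SIZE];
    }
}

/*
 * Starting at 'cur', return the first descriptor that the DMA has released
 * (OWN clear), i.e. one that holds a received frame. At most RX_RING_SIZE
 * descriptors are visited; NULL means the DMA still owns the whole ring.
 */
static rx_desc_t *rx_resync(rx_desc_t *cur)
{
    assert(cur != NULL);
    for (unsigned i = 0; i < RX_RING_SIZE; i++) {
        if (!(cur->status & RX_DESC_OWN))
            return cur;
        cur = cur->next;
    }
    return NULL;
}
```

In the input task, the else branch would then do something like "next = rx_resync(current_descriptor); if (next != NULL) current_descriptor = next;" before handling the frame as usual.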

Any input or suggestions for better fixes are welcome, thanks!