Re: [lwip-users] sporadic PCB corruption


From: address@hidden
Subject: Re: [lwip-users] sporadic PCB corruption
Date: Wed, 25 Jan 2017 20:44:55 +0100
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.7.2

Sandra,

I'm afraid problems like yours have happened often in the past, and only seldom has there actually been a problem in lwIP. Most of the time, it's wrong usage of lwIP (by the lwIP port). Now I don't know the zero copy driver from Stephane, but zero copy imposes some special requirements on alignment due to cache line size - maybe you have a problem there? (Unless your system has no data cache, of course - I don't know the Blackfin...)
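
For illustration, a minimal check of what that alignment requirement means in practice. This is only a sketch, not part of the driver: the 32-byte line size and the helper name are assumptions, and the idea is simply that zero-copy RX buffers handed to the DMA engine should start on a cache-line boundary and span whole lines, otherwise cache invalidation around the transfer can corrupt adjacent heap data such as pbuf or PCB structures.

    #include "lwip/arch.h"   /* mem_ptr_t */
    #include "lwip/pbuf.h"

    #define DCACHE_LINE_SIZE 32u   /* assumed line size, adjust for the real CPU */

    /* Hypothetical helper: returns 1 if the RX pbuf is safe to hand to DMA
     * on a system with a data cache (payload cache-line aligned and the
     * buffer occupying whole cache lines). */
    static int rx_buffer_is_cache_safe(const struct pbuf *p)
    {
      mem_ptr_t start = (mem_ptr_t)p->payload;
      return ((start % DCACHE_LINE_SIZE) == 0) &&
             ((p->len % DCACHE_LINE_SIZE) == 0);
    }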

Simon


Sandra Gilge wrote:
Hello,

I'm using the lwIP stack for a kind of SIP phone application on Blackfin
BF527 and BF561 with the VDK operating system.
I use the zero-copy driver implemented by Stephane Lesage.
I currently use a Git version from 2016-04-05 (2.0 was not released then).

It works well most of the time, but tcp_input sometimes fails because the
pcb queue is corrupted. This happens mostly under heavy load, though it also
occurs occasionally without it.
The error shows up in several ways:
- Either there is an endless loop because the queue references a previous
pcb (the queue has become a ring),
- or an assert fires because the state of the PCB is not as expected (for
example the pcb is in the tw list but its state is not TIME_WAIT:
"tcp_input: TIME-WAIT pcb->state == TIME-WAIT").

The error always seems to be in the TCP reception path (mainly when all
TCP PCBs are in use, see stats below).
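
For reference, a debug check along these lines can detect such a ring early. This is hypothetical code, not something from the port; it assumes the tcp_active_pcbs/tcp_tw_pcbs globals declared in lwip/priv/tcp_priv.h (lwIP 2.x layout) and would be called from the tcpip thread, e.g. from a timer or hook.

    #include "lwip/priv/tcp_priv.h"   /* tcp_active_pcbs, tcp_tw_pcbs */

    /* Walk a TCP PCB list with two pointers at different speeds; if they
     * ever meet, the list references an earlier pcb, i.e. it has become a
     * ring and is corrupted. */
    static int tcp_pcb_list_is_ring(struct tcp_pcb *list)
    {
      struct tcp_pcb *slow = list;
      struct tcp_pcb *fast = list;

      while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast) {
          return 1;   /* cycle found: pcb list corrupted */
        }
      }
      return 0;       /* list terminates normally */
    }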

I checked really thoroughly whether I (or the driver) call any
callback-style API functions from outside the tcpip thread, but I don't see
any such calls.
The driver implementation is not exactly what lwIP proposes, but in my
opinion it should be OK.
Here are the relevant lines in the code:

bfemac_init:
     bfemac_txmsg = tcpip_callbackmsg_new(bfemac_tx_callback, netif);
     bfemac_rxmsg = tcpip_callbackmsg_new(bfemac_rx_callback, netif);

     for (i=0; i<RX_DESCRIPTORS; i++)
     {
        rxdescriptor[i].packet = pbuf_alloc(PBUF_RAW, 1520, PBUF_POOL);   // alloc pbufs for later use
     }

RX-Interrupt:
     tcpip_trycallback(bfemac_rxmsg);

TX-Interrupt:
     tcpip_trycallback(bfemac_txmsg);


bfemac_rx_callback:
     ethernet_input(p, netif);        // forward the packet to the stack (the pbuf is then deallocated by the stack)
     bfemac_alloc_rxdesc(rxdesc);     // get a new pbuf for further reception


low_level_output:
        // DMA supports only one contiguous packet buffer
        if (p->next)
        {
                q = pbuf_alloc(PBUF_RAW, p->tot_len, PBUF_RAM);
                if (!q) return ERR_MEM;
                pbuf_copy(q, p);
        }
        else
        {
                // Increment reference count because we queue the packet to the DMA chain and return immediately
                // After transmission is complete, the interrupt callback will free the pbuf
                pbuf_ref(q);
        }
        // Then it is appended to the TX descriptor for sending

bfemac_tx_callback:
        pbuf_free(txdesc->packet);
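
For clarity, here is a minimal sketch of how the intended ref/free pairing looks when written out in full. It is not the real driver: dma_queue_packet() is a hypothetical stand-in for the actual descriptor code, which is assumed to remember the queued pbuf so bfemac_tx_callback() can free it later. The point it illustrates is that the pointer queued to the DMA chain must be the same one that gains a reference here and is freed in the TX-complete callback, otherwise the reference counts get unbalanced.

    #include "lwip/err.h"
    #include "lwip/netif.h"
    #include "lwip/pbuf.h"

    /* Hypothetical: enqueues one contiguous buffer and stores the pbuf
     * pointer in the TX descriptor for the TX-complete callback. */
    extern void dma_queue_packet(struct netif *netif, struct pbuf *q);

    static err_t low_level_output_sketch(struct netif *netif, struct pbuf *p)
    {
      struct pbuf *q = p;

      if (p->next != NULL) {
        /* chained pbuf: DMA needs one contiguous buffer, so copy */
        q = pbuf_alloc(PBUF_RAW, p->tot_len, PBUF_RAM);
        if (q == NULL) {
          return ERR_MEM;
        }
        pbuf_copy(q, p);   /* q already has ref count 1 from pbuf_alloc */
      } else {
        pbuf_ref(q);       /* keep p alive until the TX interrupt has fired */
      }

      dma_queue_packet(netif, q);
      return ERR_OK;       /* the TX callback later calls pbuf_free(q) */
    }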


Does anybody see a problem in this kind of implementation, or could there be
other reasons for the pcb-chain to be corrupted?

The application strictly uses the sockets API (each thread has its own
sockets).

Best regards,
Sandra



