From: Pettinato, Jim
Subject: RE: [lwip-devel] long term stability - ping no response after few days - HELP ME
Date: Fri, 7 Nov 2008 09:31:36 -0500
Even if you've done stress testing, I still would not rule out a driver issue - if your corporate LAN traffic is anything like ours, it's about as random a stress test as one could ever design. Tolerating a steady high rate of packets generated by another device is not the same as handling, for example, a sudden flood of several dozen or more broadcast packets within a few milliseconds. I found a bug, for instance, that only occurred when I had completely filled my incoming packet queue (128 packets deep!) and then got one more LAN interrupt between two specific lines of code in the driver. Since it is rare to even fill the queue, this was very difficult to reproduce and locate.
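
(A minimal sketch of that kind of hazard, with made-up names - rx_queue, rx_count, QUEUE_DEPTH and the function names are illustrative, not from any real driver. The ISR and the foreground code share the queue bookkeeping, and one interrupt landing between the read and the write of the counter is enough to corrupt it:)

/* Hypothetical driver fragment illustrating the race described above. */
#include <stddef.h>

#define QUEUE_DEPTH 128

static void *rx_queue[QUEUE_DEPTH];
static volatile unsigned rx_head, rx_tail, rx_count;

/* Ethernet ISR: queue one received frame. */
void eth_isr_enqueue(void *frame)
{
  if (rx_count < QUEUE_DEPTH) {
    rx_queue[rx_head] = frame;
    rx_head = (rx_head + 1u) % QUEUE_DEPTH;
    rx_count++;
  }
  /* else: queue full, frame is dropped */
}

/* Main loop: take one frame off the queue. If the ISR fires between the
 * read and the write of rx_count below, the counter is corrupted - the
 * "one more interrupt between two specific lines of code" failure mode. */
void *eth_dequeue(void)
{
  void *frame = NULL;
  if (rx_count > 0) {
    frame = rx_queue[rx_tail];
    rx_tail = (rx_tail + 1u) % QUEUE_DEPTH;
    rx_count--;   /* non-atomic read-modify-write shared with the ISR */
  }
  return frame;
}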
Since you are using the sockets API, you are likely in a different situation than I am... I have no pre-emptive multitasking OS and use SYS_LIGHTWEIGHT_PROT (i.e. enabling/disabling interrupts) to ensure critical sections run to completion, so I would likely see different failure modes than you would.
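
(For reference, a rough sketch of that style of protection applied to the dequeue above, assuming your sys_arch port provides the usual SYS_ARCH_DECL_PROTECT / SYS_ARCH_PROTECT / SYS_ARCH_UNPROTECT macros, typically implemented as "save interrupt state and disable" / "restore":)

/* The same hypothetical dequeue, with the critical section guarded by
 * lwIP's lightweight protection macros (SYS_LIGHTWEIGHT_PROT). */
#include <stddef.h>
#include "lwip/sys.h"

void *eth_dequeue_protected(void)
{
  void *frame = NULL;
  SYS_ARCH_DECL_PROTECT(lev);

  SYS_ARCH_PROTECT(lev);          /* interrupts off: the ISR cannot preempt */
  if (rx_count > 0) {
    frame = rx_queue[rx_tail];
    rx_tail = (rx_tail + 1u) % QUEUE_DEPTH;
    rx_count--;
  }
  SYS_ARCH_UNPROTECT(lev);        /* restore previous interrupt state */

  return frame;
}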
As for stress testing your applications using the stats, what I was suggesting was to look at lwip_stats.pbuf.used after exercising each application and verify that, post test (in a no-traffic situation, i.e. disconnected from the network), the used count has returned to 0, indicating there were no pbuf pool memory leaks.
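
(Roughly, assuming LWIP_STATS and the pbuf statistics are compiled in - the exact counter names in lwip_stats vary a little between lwIP versions, so treat this as illustrative:)

/* Illustrative post-test check: exercise one application, then disconnect
 * from the network so nothing is legitimately in flight; the pool usage
 * should fall back to zero. Counter names follow lwip_stats.pbuf.* as
 * used above; adjust for your lwIP version. */
#include <stdio.h>
#include "lwip/stats.h"

void check_pbuf_pool_idle(const char *test_name)
{
  printf("%s: pbuf used=%u max=%u err=%u\n", test_name,
         (unsigned)lwip_stats.pbuf.used,
         (unsigned)lwip_stats.pbuf.max,
         (unsigned)lwip_stats.pbuf.err);

  if (lwip_stats.pbuf.used != 0) {
    printf("%s: %u pbuf(s) still held - possible pool leak\n",
           test_name, (unsigned)lwip_stats.pbuf.used);
  }
}

Run it once per application test; if the used count creeps upward from test to test, the leak is in whatever that test exercised.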
From: address@hidden [mailto:address@hidden] On Behalf Of Piero 74
Sent: Thursday, November 06, 2008 10:02 AM
To: lwip-devel
Subject: Re: [lwip-devel] long term stability - ping no response after few days - HELP ME

2008/11/6 Pettinato, Jim <address@hidden>
> As far as I recall, all of the problems reported that were related to
> long-term stability ended up being either driver issues or application
> issues...

I'm using the debugger to understand where the bug is...

> i.e. not handling incoming packet buffer overruns properly,
I had boards connected to the business LAN, and no requests were made to the application for a few days... I'd exclude an application problem.
> not properly protecting critical code (both resulting in a corrupt pbuf
> pool)

I didn't understand... could it be related to my driver implementation?
> or having application execution paths which did not free received pbufs
> properly in all cases (memory leaks eventually leading to pool depletion).

I'm using the BSD socket API, so the high-level API... I didn't write any code that touches pbufs (except the driver).
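
(For what it's worth: with the socket API the stack frees received pbufs itself, so a leak usually hides in the driver's receive path. A rough sketch of the usual hand-off, loosely modelled on the ethernetif skeleton that ships with lwIP - low_level_input() is the skeleton's placeholder name for the hardware-specific receive routine; the easy mistake is dropping the pbuf_free() on the error path:)

/* Sketch of a driver receive path. low_level_input() (driver-specific,
 * assumed here) allocates a PBUF_POOL pbuf and copies the frame into it.
 * If the stack rejects the packet, the driver still owns the pbuf and
 * must free it, or every rejected frame leaks one pool entry. */
#include <stddef.h>
#include "lwip/pbuf.h"
#include "lwip/netif.h"
#include "lwip/err.h"

extern struct pbuf *low_level_input(struct netif *netif);

void ethernetif_input(struct netif *netif)
{
  struct pbuf *p = low_level_input(netif);

  if (p == NULL) {
    return;                 /* nothing received, or pbuf_alloc() failed */
  }

  if (netif->input(p, netif) != ERR_OK) {
    pbuf_free(p);           /* stack did not take ownership: free it here */
  }
}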
> Since these issues are often hard to nail down as they take time to
> occur, I would suggest enabling the stats,

Yes... I have stats enabled.
> and checking that you aren't slowly leaking pools by exercising each
> application and verifying the pbuf pool count returns to starting
> conditions afterward.

Sorry... can you explain that better??? How should I check?? :O(
> If the instability is a result of broadcast traffic only, I'd suspect a
> hole in the driver...

OK... I'll check my driver code again... but, in the past, I did some stress tests (sending a lot of packets very fast to my application through a TCP connection)... do you think the bug in the driver should have shown up then?
> If you do get the error to occur on your debug setup, and your stats show
> that you should have pbufs remaining in the pool but your pbuf_pool
> pointer is NULL (resulting in pbuf_alloc() always failing) - that's a
> sure sign of pool corruption.
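
(A hypothetical spot check along those lines, using only the public pbuf API plus the stats counters mentioned earlier - field names again depend on the lwIP version:)

/* Hypothetical spot check, run from the lwIP context with the network
 * quiet: if the counters say pbufs are free but an allocation still
 * fails, the pool free list has probably been corrupted as described. */
#include <stdio.h>
#include "lwip/pbuf.h"
#include "lwip/stats.h"

void pbuf_pool_spot_check(void)
{
  struct pbuf *p = pbuf_alloc(PBUF_RAW, 1, PBUF_POOL);

  if (p != NULL) {
    pbuf_free(p);           /* pool looks healthy; give the pbuf back */
  } else if (lwip_stats.pbuf.used < lwip_stats.pbuf.avail) {
    /* counters claim free entries remain, yet pbuf_alloc() failed */
    printf("pbuf pool corruption suspected: used=%u avail=%u err=%u\n",
           (unsigned)lwip_stats.pbuf.used,
           (unsigned)lwip_stats.pbuf.avail,
           (unsigned)lwip_stats.pbuf.err);
  }
}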
OK... I hope the problem will appear again on my board with the debugger attached, and I will post everything I see here.
In that case I suppose I will need your help... if you can! :O)
Thanks
Piero
__
James M. Pettinato, Jr.
Software Engineer
E: address@hidden | P: 814 898 5250
FMC Technologies Measurement Solutions Inc.
1602 Wagner Avenue | Erie PA | 16510 USA
Phone: 814 898 5000 | Fax: 814 899-3414
www.fmctechnologies.com