[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [lwip-users] Bug in pbuf.c regarding PBUF_POOL
From: |
Jonathan Larmour |
Subject: |
Re: [lwip-users] Bug in pbuf.c regarding PBUF_POOL |
Date: |
Fri, 20 Oct 2006 23:37:23 +0100 |
User-agent: |
Mozilla Thunderbird 1.0.8-1.1.fc3.4.legacy (X11/20060515) |
[I saw Simon's recent mail but thought it was better replying to this one...]
Goldschmidt Simon wrote:
Hi,
1:
Sorry to be so 'unprofessional' :)
But:
The 1.1.0 version I was using was provided by Altera with the NIOS2
devkit. I just compared this version to the oringinal 1.1.0 and saw that
Altera changed that part... They simply just used the semaphore used by
PBUF_POOL_FREE() in pbuf_pool_alloc() as well. I can submit a patch
doing this if anyone cares...
I'm relatively new at lwIP and haven't done much with it. But this is my
take from what I've seen so far....
I don't think you can portably wait on a semaphore in pbuf_pool_alloc
which is probably why it isn't. You might be (and probably will be) in a
device driver context and so cannot block.
2:
The other question is: Why did nobody get corrupted pbufs like I do?
Have you all set SYS_LIGHTWEIGHT_PROT to 1? (since it's by default set
to 0...)
I don't know the history, but thread-safety is almost certainly why
SYS_LIGHTWEIGHT_PROT even exists. Without it, the code is inherently *not*
thread safe.
3:
OK, maybe the stack is not supposed to be multithreaded, BUT: I have a
separate thread for receiveing,
As does SLIP and PPP, so this would be a wider stack issue.
and that thread must allocate fresh
pbufs (PBUF_POOL in my case) so where should I do that if not in a
seperate thread? With SYS_LIGHTWEIGHT_PROT=1, INTs are diables (=
threadsafe); in PBUF_POOL_FREE(), a semaphore is used (=threadsafe)
With SYS_LIGHTWEIGHT_PROT=1, semaphores are not used in PBUF_POOL_FREE().
Just SYS_ARCH_PROTECT() etc.
, so
I still think the design goal is to make pbufs and memory access thread
safe!
4:
The reason pbuf_pool_free_lock is never set to anything except 0 is that
it is not used on the FREE-side. What strikes me more is, that
pbuf_pool_alloc_lock is never tested against, only set to a value. But
that seems only to be a typo (in v1.1.1, pbuf.c line 146 should use
pbuf_pool_alloc_lock instead of pbuf_pool_free_lock).
Although there is a problem here, I don't believe that is a typo. I
believe pbuf_pool_alloc was originally written expecting to be called from
drivers potentially in an interrupt and can't block. I believe that is the
expected model and is probably an important thing to preserve. And
pbuf_pool_free is only expected to be called from thread-level stack code.
So with that model an alloc doesn't have to worry about interrupting
another alloc, only a free. And if it finds it is interrupting a free, it
should just bomb out and return no buffer.
But first of all, indeed nothing increments pbuf_pool_free_lock. And
second of all, there are calls to pbuf_alloc for allocations from
PBUF_POOL in other places that may themselves then be interrupted by a
driver interrupt:
$ egrep -r 'pbuf_alloc.*POOL' .
./core/ipv4/ip_frag.c: p = pbuf_alloc(PBUF_LINK, ip_reasslen, PBUF_POOL);
./core/pbuf.c: q = pbuf_alloc(PBUF_RAW, p->len, PBUF_POOL);
./netif/ppp/ppp.c: tb = pbuf_alloc(PBUF_RAW, 0, PBUF_POOL);
./netif/ppp/ppp.c: headMB = pbuf_alloc(PBUF_RAW, 0, PBUF_POOL);
./netif/ppp/ppp.c: headMB = pbuf_alloc(PBUF_RAW, 0, PBUF_POOL);
./netif/ppp/ppp.c: nextNBuf = pbuf_alloc(PBUF_RAW, 0,
PBUF_POOL);
./netif/ppp/vj.c: np = pbuf_alloc(PBUF_RAW, n0->len +
cs->cs_hlen, PBUF_POOL);
./netif/ppp/vj.c: np = pbuf_alloc(PBUF_RAW, cs->cs_hlen,
PBUF_POOL);
./netif/slipif.c: p = pbuf_alloc(PBUF_LINK, PBUF_POOL_BUFSIZE,
PBUF_POOL);
So the (likely) intended form of protection is bogus. You can't shield an
allocation from another allocation. As both would set pbuf_pool_alloc_lock
to 1 and you couldn't tell which one "won". If you instead tries to do
"pbuf_pool_alloc_lock++" that wouldn't help as that is not guaranteed
atomic (and usually isn't).
The only solutions I can think of without !SYS_LIGHTWEIGHT_PROT would
involve adding new elements to the sys_arch abstraction. In which case you
may as well just implement SYS_LIGHWEIGHT_PROT.
With the current code, I think the only solution is SYS_LIGHTWEIGHT_PROT.
My opinion is that the !SYS_LIGHTWEIGHT_PROT stuff should probably just be
removed.
Unless someone knows that the above allocs could avoid using PBUF_POOL.
Why they're being allocated from there is a bit of a mystery to me.
Then again the above might be complete rubbish.
Jifl
--
eCosCentric http://www.eCosCentric.com/ The eCos and RedBoot experts
------["The best things in life aren't things."]------ Opinions==mine