
[lwip-devel] Fast ARP for a "mini" system


From: Małowidzki, Marek
Subject: [lwip-devel] Fast ARP for a "mini" system
Date: Wed, 14 Jan 2009 12:45:39 +0100

Dear Developers,
 
First, I would like to say hello and thank you for such great work.
 
We are considering using the lwIP stack for our embedded system. The device is a kind of simple router that forwards packets between interfaces and does some additional magic with them :) Our system is a "mini" rather than a "micro" system: we have a fairly powerful processor and can afford even a few hundred kB of RAM for the IP stack, whereas a "micro" system would have to fit within a few kB. Nonetheless, since the device has other important and time-consuming tasks, our main concern is performance in terms of CPU usage.
 
The current ARP implementation (lwIP 1.3.0, etharp.c) uses a fixed-size array of ARP entries:
 
static struct etharp_entry arp_table[ARP_TABLE_SIZE];
 
For ARP, we need the following operations:
1. ARP table lookup (also used in update)
2. ARP resend
3. ARP expiration
 
Currently, operation 1 requires scanning the array for the matching entry or for an unused one, while operations 2 and 3 are handled together in etharp_tmr(). So far, so good. The problem appears when our system needs to talk to more than ARP_TABLE_SIZE hosts at the same time. In that case, the oldest (but still valid) entry is evicted and additional ARP messages have to be sent. Of course, we could raise ARP_TABLE_SIZE (to, say, 32 or 64), but this would slow the lookup down considerably, and if the cache fills up anyway we would pay twice: a slower lookup plus the extra ARP requests.
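
To make the cost concrete, here is a minimal sketch of a linear-scan lookup of the kind described above. It is not the etharp.c code: the names sketch_entry, sketch_find and SKETCH_TABLE_SIZE are hypothetical, and the entry layout and the "evict the oldest stable entry" rule are simplified assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TABLE_SIZE 10   /* hypothetical; stands in for ARP_TABLE_SIZE */

enum sketch_state { SKETCH_EMPTY = 0, SKETCH_PENDING, SKETCH_STABLE };

struct sketch_entry {
  uint32_t ipaddr;             /* IPv4 address (any consistent byte order) */
  uint8_t ethaddr[6];
  enum sketch_state state;
  uint8_t age;                 /* assumed to be bumped by a timer; larger = older */
};

/* Returns the index of the entry matching ipaddr; otherwise the first empty
 * slot; otherwise the oldest stable entry (which the caller would evict).
 * Returns -1 if every slot is pending. Cost is O(SKETCH_TABLE_SIZE). */
int sketch_find(const struct sketch_entry *table, uint32_t ipaddr)
{
  int empty = -1;
  int oldest = -1;
  uint8_t oldest_age = 0;
  int i;

  assert(table != NULL);
  for (i = 0; i < SKETCH_TABLE_SIZE; i++) {
    const struct sketch_entry *e = &table[i];
    if (e->state == SKETCH_EMPTY) {
      if (empty < 0) {
        empty = i;             /* remember the first free slot */
      }
    } else if (e->ipaddr == ipaddr) {
      return i;                /* hit */
    } else if (e->state == SKETCH_STABLE &&
               (oldest < 0 || e->age > oldest_age)) {
      oldest = i;              /* best eviction candidate so far */
      oldest_age = e->age;
    }
  }
  return empty >= 0 ? empty : oldest;
}
```

A miss always walks the whole table, so the worst case grows linearly with the table size, and a full table additionally costs an eviction and a fresh ARP exchange.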
 
Given the nature of our system, we have an idea for an alternative implementation that trades some memory for processing speed. We assume we would handle class C networks only (and their subnets, of course). The basic idea is to use one ARP entry per IP address on a given IP Ethernet interface. Thus, the array would have 256 entries:
 
struct etharp_entry arp_table[256];   /* one ARP cache per IP Ethernet interface */
 
For an IP address whose last byte (the host part) equals X, the corresponding entry would be arp_table[X] (we do not in fact need the entries at 0 and 255). Simple. Entries that are still being resolved (ETHARP_STATE_PENDING) would additionally be placed on a linked list that is traversed in etharp_tmr(), mainly to resend the ARP query (optional; not currently implemented in 1.3.0) and to free resources (buffered packets, if any) when no ARP response arrives. Expiration would be handled together with lookup: if an entry has become stale, a refresh is needed (stable entries hold no cached packets, so there is nothing to free).
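
The scheme above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a patch against etharp.c: all names (fast_entry, fast_arp_cache, fast_lookup, and so on), the tick-based timestamps, and both timeout constants are hypothetical; the IP address is taken in host byte order so that its low byte is the host part; packet queueing and the actual sending of ARP requests are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FAST_MAX_AGE     240u  /* hypothetical lifetime of a stable entry, in ticks */
#define FAST_MAX_PENDING 2u    /* hypothetical wait for an ARP reply, in ticks */

enum fast_state { FAST_EMPTY = 0, FAST_PENDING, FAST_STABLE };

struct fast_entry {
  uint8_t ethaddr[6];
  enum fast_state state;
  uint32_t stamp;                   /* tick of last update or of the request */
  struct fast_entry *next_pending;  /* link in the pending list */
};

struct fast_arp_cache {             /* one per IP Ethernet interface */
  struct fast_entry table[256];     /* indexed by the last byte of the IP address */
  struct fast_entry *pending;       /* entries awaiting an ARP reply */
};

/* O(1) lookup; expiration is checked here instead of in the timer. */
struct fast_entry *fast_lookup(struct fast_arp_cache *c, uint32_t ipaddr, uint32_t now)
{
  struct fast_entry *e;
  assert(c != NULL);
  e = &c->table[ipaddr & 0xffu];
  if (e->state == FAST_STABLE && now - e->stamp > FAST_MAX_AGE) {
    e->state = FAST_EMPTY;          /* stale: caller has to resolve again */
  }
  return e;
}

/* Mark an entry as being resolved and put it on the pending list. */
void fast_start_query(struct fast_arp_cache *c, struct fast_entry *e, uint32_t now)
{
  if (e->state == FAST_PENDING) {
    return;                         /* already on the list */
  }
  e->state = FAST_PENDING;
  e->stamp = now;
  e->next_pending = c->pending;
  c->pending = e;
}

/* An ARP reply arrived: unlink from the pending list and store the address. */
void fast_resolve(struct fast_arp_cache *c, uint32_t ipaddr,
                  const uint8_t eth[6], uint32_t now)
{
  struct fast_entry *e = &c->table[ipaddr & 0xffu];
  struct fast_entry **pp = &c->pending;
  while (*pp != NULL && *pp != e) {
    pp = &(*pp)->next_pending;
  }
  if (*pp == e) {
    *pp = e->next_pending;
  }
  e->next_pending = NULL;
  memcpy(e->ethaddr, eth, 6);
  e->state = FAST_STABLE;
  e->stamp = now;
}

/* Timer: walks only the pending list, never the whole table. */
void fast_tmr(struct fast_arp_cache *c, uint32_t now)
{
  struct fast_entry **pp = &c->pending;
  while (*pp != NULL) {
    struct fast_entry *e = *pp;
    if (now - e->stamp > FAST_MAX_PENDING) {
      *pp = e->next_pending;        /* give up; real code would free queued packets */
      e->next_pending = NULL;
      e->state = FAST_EMPTY;
    } else {
      pp = &e->next_pending;        /* real code could resend the ARP query here */
    }
  }
}
```

With this layout the timer cost depends only on the number of unresolved addresses, and the per-packet lookup is a single array index plus one age comparison.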
 
The memory overhead grows significantly (which we can afford), but I believe the ARP cache would perform a lot faster. One potential problem is that, in some situations, many packets could be buffered while awaiting ARP responses, which could exhaust memory. A simple remedy would be to cap their number: count the buffered packets and drop new ones once the limit has been reached.
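
The cap on buffered packets could be as simple as a shared counter. The sketch below assumes a single global budget; the names arp_queue_budget, arp_queue_reserve, arp_queue_release and the value of QUEUE_LIMIT are hypothetical and only illustrate the counting logic, not any existing lwIP API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_LIMIT 16u   /* hypothetical cap on packets awaiting ARP replies */

struct arp_queue_budget {
  unsigned queued;        /* packets currently buffered */
  unsigned dropped;       /* packets refused because the cap was reached */
};

/* Call before buffering a packet; returns false if it must be dropped. */
bool arp_queue_reserve(struct arp_queue_budget *b)
{
  assert(b != NULL);
  if (b->queued >= QUEUE_LIMIT) {
    b->dropped++;
    return false;
  }
  b->queued++;
  return true;
}

/* Call when n buffered packets are sent or freed (reply arrived or timed out). */
void arp_queue_release(struct arp_queue_budget *b, unsigned n)
{
  assert(b != NULL && n <= b->queued);
  b->queued -= n;
}
```

The dropped counter is not needed for correctness, but it would show whether the chosen limit is too tight for a given traffic pattern.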
 
What do you think? Please treat this post as an invitation to a discussion. We would like to find out whether someone has already considered such an approach and whether there are any issues that we do not currently see. Of course, assuming we successfully port lwIP to our system, we would provide a reference implementation in the future.
 
Best regards,
 
Marek
