Re: [lwip-devel] [task #7896] Support zero-copy drivers


From: Jonathan Larmour
Subject: Re: [lwip-devel] [task #7896] Support zero-copy drivers
Date: Thu, 17 Jun 2010 16:30:54 +0100
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.9) Gecko/20100430 Fedora/3.0.4-2.fc12 Lightning/1.0b2pre Thunderbird/3.0.4

On 24/05/10 13:44, Bill Auerbach wrote:
> I solved this in the Ethernet driver.  I used a chained DMA.  I copied
> 8 bytes to a stack-based array (which is aligned) and set up DMA to
> copy for 4 plus what it takes to align the second segment.  So it's not
> 0 copy, but takes only 4 instructions to do the copy to the array.
> Then I set up the second DMA transfer to start at the aligned address
> in the data for count minus what is in the first DMA segment.  For a 1k
> or larger transmit, this was far more efficient than copying the whole
> packet to an aligned stack-based array (which is what the driver I
> started with was doing). 

I'm coming to this late because I've not been in a position to work on
lwIP for over a year. But, out of interest, I have already implemented
zero-copy for ColdFire m68k with lwIP, although not in a sufficiently
generic way.

> I think it would still be more efficient if
> you don't have chained DMA if you had to do this in 2 separate
> transfers to the MAC.

Not all MACs would be able to do that.

>> Follow-up Comment #2, task #7896 (project lwip): Anyway, with those
>> limitations I cannot see how zero-copy transmission could work
>> without imposing severe restrictions on how the application supplies
>> data to lwip. Basically, it would not be practical. Am I right in
>> this?

It's not true that it isn't practical. What is true is that if an
application sends non-aligned data, a lower layer (perhaps the driver)
will have to fall back to copying it before transmission.
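
As a rough sketch of that fallback in a driver's transmit path: the
names, the 4-byte alignment and the buffer size below are all made up
for illustration, and none of it is lwIP API. Aligned data is handed
to the MAC untouched; only misaligned data pays for a copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_TX_ALIGN 4u     /* assumed MAC DMA alignment, in bytes */
#define BOUNCE_SIZE  1536u  /* assumed big enough for one Ethernet frame */

/* Bounce buffer, declared as words so that it is at least 4-byte aligned. */
static uint32_t bounce_words[BOUNCE_SIZE / 4u];

/* Non-zero if p is a multiple of align (align must be a power of two). */
static int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

/*
 * Return the address the MAC should DMA from, or NULL if the data is
 * misaligned and too large for the bounce buffer. *copied is set to 1
 * when the fallback copy was needed, 0 for the zero-copy path.
 */
static const uint8_t *tx_prepare(const uint8_t *data, size_t len, int *copied)
{
    if (is_aligned(data, MAC_TX_ALIGN)) {
        *copied = 0;            /* zero-copy: DMA straight from caller data */
        return data;
    }
    if (len > BOUNCE_SIZE) {
        return NULL;
    }
    memcpy(bounce_words, data, len);  /* fallback: copy into aligned storage */
    *copied = 1;
    return (const uint8_t *)bounce_words;
}
```

A real driver would also need one bounce buffer per in-flight
descriptor, which this sketch ignores.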

What can be done is to change how pbufs are allocated so that they _do_
fit the alignment requirements of the MAC. Then when the user (via raw
API or netconn API) allocates space for their data, the data ends up in
correctly aligned memory from the start (although I'm glossing over a few
issues to do with scatter-gather even then, and then what if you have
multiple devices with different constraints). We can also provide a means
for the user to get the alignment constraints from the stack if they want
zero-copy.
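
To make that a bit more concrete, here is a minimal sketch of the
idea, not of how lwIP allocates pbufs. Everything in it is
hypothetical: struct tx_constraints, the static pool, and the policy
of taking the strictest alignment across all devices, which works as
long as every alignment is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device constraint that a driver would report. */
struct tx_constraints {
    size_t align;  /* required payload alignment, a power of two */
};

/* Strictest alignment over all devices; it satisfies each of them. */
static size_t worst_case_align(const struct tx_constraints *devs, size_t n)
{
    size_t align = 1;
    for (size_t i = 0; i < n; i++) {
        assert(devs[i].align != 0 &&
               (devs[i].align & (devs[i].align - 1)) == 0);
        if (devs[i].align > align) {
            align = devs[i].align;
        }
    }
    return align;
}

/* Toy bump allocator standing in for the pbuf payload pool. */
static uint8_t pool[4096];
static size_t pool_used;

/* Return size bytes starting on an align boundary, or NULL if full. */
static void *pool_alloc_aligned(size_t size, size_t align)
{
    uintptr_t base = (uintptr_t)pool + pool_used;
    uintptr_t start = (base + (align - 1)) & ~(uintptr_t)(align - 1);
    size_t offset = (size_t)(start - (uintptr_t)pool);

    if (offset > sizeof pool || size > sizeof pool - offset) {
        return NULL;
    }
    pool_used = offset + size;
    return (void *)start;
}
```

An application wanting zero-copy would ask for the alignment once and
then request every payload buffer with it, so the driver never has to
fall back to copying for data that came from this pool.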

It's pretty much intrinsic to the BSD sockets API that it can almost
never be zero-copy. That is one of its limitations and why netconn/raw
APIs can be superior (unless you're using e.g. netbuf_ref()).

I note that there was also discussion about the alignment requirements of
the MAC. But that is not the only issue - with DMA in use, you also have
to consider the alignment requirements of the processor data cache (for
those processors with a data cache anyway, which is more likely if they're
high-end enough to have DMA, but isn't true for all ColdFires e.g. 5272).
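
The reason is that cache maintenance works on whole lines: flushing
or invalidating a buffer that starts or ends mid-line also touches
whatever else lives in that line. A small sketch of the arithmetic,
assuming a 16-byte line (the real size depends on the part) and with
all names invented here:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 16u  /* assumed data cache line size, in bytes */

struct cache_range {
    uintptr_t start;  /* first byte of the first line touched */
    size_t    len;    /* whole number of lines, in bytes */
};

/* Widen [addr, addr + len) to the cache lines that cover it. */
static struct cache_range cache_span(uintptr_t addr, size_t len)
{
    uintptr_t mask = CACHE_LINE - 1;
    struct cache_range r;

    r.start = addr & ~mask;                       /* round start down */
    r.len = (size_t)(((addr + len + mask) & ~mask) - r.start); /* end up */
    return r;
}

/*
 * Non-zero if the buffer owns every cache line it touches, so a flush
 * or invalidate cannot affect neighbouring data.
 */
static int dma_cache_safe(uintptr_t addr, size_t len)
{
    return (addr % CACHE_LINE) == 0 && (len % CACHE_LINE) == 0;
}
```

So a buffer pool meant for zero-copy DMA would want both its start
addresses and its element sizes rounded to the larger of the MAC
alignment and the cache line size.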

Jifl
-- 
eCosCentric Limited      http://www.eCosCentric.com/     The eCos experts
Barnwell House, Barnwell Drive, Cambridge, UK.       Tel: +44 1223 245571
Registered in England and Wales: Reg No 4422071.
------["Si fractum non sit, noli id reficere"]------       Opinions==mine


