octave-maintainers

Re: more permute optimizations


From: David Bateman
Subject: Re: more permute optimizations
Date: Wed, 25 Mar 2009 19:20:20 +0100
User-agent: Mozilla-Thunderbird 2.0.0.17 (X11/20081018)

Jaroslav Hajek wrote:
> Unless David (or anyone else) objects, I'll replace the transpose
> implementation by a call to the routine using pointers. I also
> tinkered with the block size a bit, but it seems to me that 8 was a
> good guess from David.
> The best block size seems to depend on the problem dimensions, and I
> bet that dependence is nontrivial.

If it's faster I have no objections. The 8x8 block size arises because a typical cache has 16 independent lines, and not all of them are available, as some will be taken by your code. Each line of the block takes a cache line, so 8x8 is the largest reasonable size before cache misses start happening within the copying of the block itself.
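
For illustration only, the blocking idea can be sketched roughly as below. This is not the actual Octave routine: the function name blocked_transpose, the use of std::vector, and the column-major layout are assumptions made for the example, and the block size of 8 is simply the value discussed above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not Octave's implementation): transpose a
// column-major rows x cols matrix into a column-major cols x rows
// matrix, working on 8x8 tiles so that one tile of the source and
// one tile of the destination are touched at a time.
template <typename T>
std::vector<T>
blocked_transpose (const std::vector<T>& in, std::size_t rows,
                   std::size_t cols)
{
  const std::size_t bs = 8;  // block size discussed in the thread

  assert (in.size () == rows * cols);
  std::vector<T> out (rows * cols);

  for (std::size_t jj = 0; jj < cols; jj += bs)
    {
      const std::size_t jend = std::min (jj + bs, cols);
      for (std::size_t ii = 0; ii < rows; ii += bs)
        {
          const std::size_t iend = std::min (ii + bs, rows);
          // Copy one tile; edge tiles may be smaller than bs x bs.
          for (std::size_t j = jj; j < jend; ++j)
            for (std::size_t i = ii; i < iend; ++i)
              out[j + i * cols] = in[i + j * rows];
        }
    }

  return out;
}
```

Making bs a parameter of such a sketch would be one way to time different block sizes against the problem dimensions.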

D.

--
David Bateman                                address@hidden
35 rue Gambetta                              +33 1 46 04 02 18 (Home)
92100 Boulogne-Billancourt FRANCE            +33 6 72 01 06 33 (Mob)


