Re: strange FFT timing
From: Sergei Steshenko
Subject: Re: strange FFT timing
Date: Mon, 21 Apr 2008 04:06:16 -0700 (PDT)
--- David Bateman <address@hidden> wrote:
> Sergei Steshenko wrote:
> > 1e6 is faster than 2^20.
> >
> > FFTW3 is with SSE2 support.
> >
> But SSE2 instructions are only used if the block of memory passed to
> FFTW3 of type fftw_complex is 16-byte aligned. If it's not, then FFTW3
> falls back to a slower non-SSE2 version... Octave attempts to flag
> memory blocks that have the correct alignment for use of the SSE2
> instructions and force FFTW3 to use them. However, Octave makes no attempt
> to enforce a 16-byte alignment in its own Array class, and so SSE2 cannot
> always be used with Octave. This is what I meant by "got lucky".
>
> D.
David, I understand what you said; I only mentioned the SSE2 support for clarity.
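
For readers following the thread, the 16-byte alignment condition discussed above can be illustrated with a minimal sketch. This is not how Octave or FFTW3 implement the check; it only shows the arithmetic, using Python's standard ctypes module. The names is_aligned and aligned_buffer are hypothetical helpers introduced for this example.

```python
import ctypes

ALIGNMENT = 16  # bytes; the SSE2 alignment requirement mentioned in the thread


def is_aligned(address, alignment=ALIGNMENT):
    """Return True if the address is a multiple of the alignment."""
    return address % alignment == 0


def aligned_buffer(nbytes, alignment=ALIGNMENT):
    """Hypothetical helper: over-allocate, then locate an aligned start.

    Returns (raw, address), where raw is the backing ctypes buffer (keep a
    reference to it so the memory stays alive) and address is the first
    aligned address inside it with at least nbytes available after it.
    """
    raw = ctypes.create_string_buffer(nbytes + alignment - 1)
    base = ctypes.addressof(raw)
    offset = (-base) % alignment  # bytes to skip to reach the next boundary
    return raw, base + offset


raw, addr = aligned_buffer(1024)
print("raw buffer aligned:    ", is_aligned(ctypes.addressof(raw)))
print("adjusted start aligned:", is_aligned(addr))
```

Whether the raw buffer happens to be aligned depends on the allocator, which is the "got lucky" situation described above; the adjusted start address is aligned by construction. In C code that calls FFTW3 directly, fftw_malloc serves the purpose of returning suitably aligned memory.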
Regards,
Sergei.
Applications From Scratch: http://appsfromscratch.berlios.de/