Re: A faster FFT of real matrices ??
From: Dmitri A. Sergatskov
Subject: Re: A faster FFT of real matrices ??
Date: Wed, 28 Jan 2004 13:14:06 -0700
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040115
It appears to be an FFTW problem. First of all, here are my results:
octave:5> testfft2
Loading Data done
Testing fft( 512, 512) 2.46e-02 sec (4.51e-01) rerr 2.19e-13
Testing fft2( 512, 512) 4.52e-02 sec (2.64e-01) rerr 1.83e-13
Testing fft( 513, 513) 5.20e-02 sec (5.78e-01) rerr 4.07e-13
Testing fft2( 513, 513) 9.10e-02 sec (7.07e-01) rerr 9.38e-14
Testing fft( 514, 512) 2.20e-01 sec (2.57e+00) rerr 1.54e-13
Testing fft2( 514, 512) 2.41e-01 sec (2.68e+00) rerr 1.35e-12
Testing fft( 512, 514) 2.47e-02 sec (4.49e-01) rerr 1.13e-13
Testing fft2( 512, 514) 6.34e-02 sec (3.65e-01) rerr 3.39e-13
Testing fft(65536, 1) 2.64e-02 sec (7.76e-01) rerr 1.37e-13
Testing fft2(65536, 1) 2.67e-02 sec (7.42e-01) rerr 1.37e-13
=====
If you build FFTW, it also builds a few test programs. Running them, I get the
following:
address@hidden tests]$ ./fftw_test -s 514x514
Please wait (and remember, this is faster than Java).
SPEED TEST: 514x514, FFTW_FORWARD, in place, generic
time for one fft: 97.128188 ms (367.636859 ns/point)
"mflops" = 5 (N log2 N) / (t in microseconds) = 244.959784
SPEED TEST: 514x514, FFTW_BACKWARD, in place, generic
time for one fft: 97.292187 ms (368.257610 ns/point)
"mflops" = 5 (N log2 N) / (t in microseconds) = 244.546869
SPEED TEST: 514x514, FFTW_FORWARD, in place, specific
time for one fft: 95.786844 ms (362.559780 ns/point)
"mflops" = 5 (N log2 N) / (t in microseconds) = 248.390060
SPEED TEST: 514x514, FFTW_BACKWARD, in place, specific
time for one fft: 96.811188 ms (366.436992 ns/point)
"mflops" = 5 (N log2 N) / (t in microseconds) = 245.761884
address@hidden tests]$ ./rfftw_test -s 514x514
Please wait (and remember, this is faster than Java).
SPEED TEST: 514x514, FFTW_FORWARD, in place, generic
time for one fft: 228.993500 ms (863.396601 ns/point)
"mflops" = 5/2 (N log2 N) / (t in microseconds) = 52.168528
SPEED TEST: 514x514, FFTW_BACKWARD, in place, generic
time for one fft: 226.757937 ms (854.967641 ns/point)
"mflops" = 5/2 (N log2 N) / (t in microseconds) = 52.682847
SPEED TEST: 514x514, FFTW_FORWARD, in place, specific
time for one fft: 229.113437 ms (863.848813 ns/point)
"mflops" = 5/2 (N log2 N) / (t in microseconds) = 52.141218
SPEED TEST: 514x514, FFTW_BACKWARD, in place, specific
time for one fft: 226.630375 ms (854.486679 ns/point)
"mflops" = 5/2 (N log2 N) / (t in microseconds) = 52.712501
=======
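As a cross-check on the "mflops" figures printed above, here is a minimal Python sketch of the formula the test programs print, 5 (N log2 N) / t for complex transforms and 5/2 (N log2 N) / t for real ones, with t in microseconds. The function name `fftw_mflops` is made up for illustration. One assumption: for the real 514x514 case N is taken as 514*516 rather than 514*514, which is what the printed time divided by the printed ns/point suggests (it looks like the padded in-place array size).

```python
import math

def fftw_mflops(n_points, time_us, real_transform=False):
    """Nominal "mflops" in the form printed by the FFTW test programs.

    5 * N * log2(N) / t for complex transforms, half that for real ones,
    with t in microseconds. This is a normalized speed figure, not a count
    of the floating-point operations actually executed.
    """
    factor = 2.5 if real_transform else 5.0
    return factor * n_points * math.log2(n_points) / time_us

# Complex 514x514, forward, generic: 97.128188 ms per FFT (from fftw_test).
complex_514 = fftw_mflops(514 * 514, 97128.188)

# Real 514x514, forward, generic: 228.993500 ms per FFT (from rfftw_test).
# Assumption: N = 514 * 516, inferred from 228.993500 ms / 863.396601 ns.
real_514 = fftw_mflops(514 * 516, 228993.5, real_transform=True)

print(f"complex 514x514: {complex_514:.2f}")
print(f"real    514x514: {real_514:.2f}")
```

Both values should land close to the 244.96 and 52.17 reported above. The mflops ratio therefore reflects both the 5 vs 5/2 prefactor and the longer run time of the real transform.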
Now 512x512 for comparison:
=======
address@hidden tests]$ ./rfftw_test -s 512x512
Please wait (while Windows NT reboots).
SPEED TEST: 512x512, FFTW_FORWARD, in place, generic
time for one fft: 16.340711 ms (62.092317 ns/point)
"mflops" = 5/2 (N log2 N) / (t in microseconds) = 724.953801
SPEED TEST: 512x512, FFTW_BACKWARD, in place, generic
time for one fft: 16.710453 ms (63.497284 ns/point)
"mflops" = 5/2 (N log2 N) / (t in microseconds) = 708.913182
SPEED TEST: 512x512, FFTW_FORWARD, in place, specific
time for one fft: 16.597938 ms (63.069741 ns/point)
"mflops" = 5/2 (N log2 N) / (t in microseconds) = 713.718828
SPEED TEST: 512x512, FFTW_BACKWARD, in place, specific
time for one fft: 16.689492 ms (63.417635 ns/point)
"mflops" = 5/2 (N log2 N) / (t in microseconds) = 709.803532
address@hidden tests]$ ./fftw_test -s 512x512
Please wait (exorcising evil spirits).
SPEED TEST: 512x512, FFTW_FORWARD, in place, generic
time for one fft: 146.983063 ms (560.695887 ns/point)
"mflops" = 5 (N log2 N) / (t in microseconds) = 160.514821
SPEED TEST: 512x512, FFTW_BACKWARD, in place, generic
time for one fft: 147.316875 ms (561.969280 ns/point)
"mflops" = 5 (N log2 N) / (t in microseconds) = 160.151103
SPEED TEST: 512x512, FFTW_FORWARD, in place, specific
time for one fft: 48.528656 ms (185.122132 ns/point)
"mflops" = 5 (N log2 N) / (t in microseconds) = 486.165532
SPEED TEST: 512x512, FFTW_BACKWARD, in place, specific
time for one fft: 47.631297 ms (181.698978 ns/point)
"mflops" = 5 (N log2 N) / (t in microseconds) = 495.324745