discuss-gnuradio

From: Eric Blossom
Subject: Re: [Discuss-gnuradio] Problem with adopting ccomplex_dotprod_sse() in my block
Date: Wed, 14 Mar 2007 06:59:20 -0700
User-agent: Mutt/1.5.9i

On Wed, Mar 14, 2007 at 05:26:02PM +0800, hanwen wrote:
> Hi, everyone,
> 
> I'm implementing a synchronization algorithm in a block that needs to run
> efficiently. I tried to use the function ccomplex_dotprod_sse() to speed up
> the block, but I always get a segmentation fault.
> I'm using a PC with a Pentium D CPU, and I'm sure ccomplex_dotprod_sse()
> works well in the FIR filter blocks.
> I simply include ccomplex_dotprod_x86.h and call the
> ccomplex_dotprod_sse() function. Maybe I missed something, but I have no
> idea what it is. Please give me some hints.


You're probably not honoring its alignment requirements.
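
As a rough illustration of what "honoring alignment" means here, the sketch below checks the preconditions before a buffer is handed to an SSE kernel. The helper names `is_aligned_16` and `check_dotprod_args` are hypothetical and not part of GNU Radio. Note that a complex float sample is 8 bytes, so stepping an aligned buffer forward by one complex sample (as a sliding correlator would) lands on an address that is no longer 16-byte aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of GNU Radio):
 * returns 1 if p sits on a 16-byte boundary, 0 otherwise. */
static int is_aligned_16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical guard to run before calling an SSE dot product.
 * It checks the two preconditions stated in the routine's header comment:
 * both pointers 16-byte aligned, and a non-zero block count. */
static void check_dotprod_args(const float *input, const float *taps,
                               unsigned n_2_ccomplex_blocks)
{
    assert(is_aligned_16(input));
    assert(is_aligned_16(taps));
    assert(n_2_ccomplex_blocks != 0);
}
```

With a 16-byte aligned `float buf[]`, `buf` and `buf + 4` pass the check, while `buf + 2` (one complex sample in) does not.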


# input and taps are guaranteed to be 16 byte aligned.
# n_2_ccomplex_blocks is != 0
#       
#
#  ccomplex_dotprod_generic (const float *input,
#                            const float *taps, unsigned n_2_ccomplex_blocks,
#                            float *result)
#  {
#    float sum0 = 0;
#    float sum1 = 0;
#    float sum2 = 0;
#    float sum3 = 0;
#  
#    do {
#
#      sum0 += input[0] * taps[0] - input[1] * taps[1];
#      sum1 += input[0] * taps[1] + input[1] * taps[0];
#      sum2 += input[2] * taps[2] - input[3] * taps[3];
#      sum3 += input[2] * taps[3] + input[3] * taps[2];
#  
#      input += 4;
#      taps += 4;  
#  
#    } while (--n_2_ccomplex_blocks != 0);
#  
#  
#    result[0] = sum0 + sum2;
#    result[1] = sum1 + sum3;
#  }
#

There's a reason for all that other code that you are ignoring...
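
One way to meet both requirements from plain C is sketched below, assuming a C11 toolchain that provides `aligned_alloc`. The reference loop is transcribed from the comment above; `make_aligned_padded` is a hypothetical helper, not GNU Radio's actual FIR code. It copies interleaved complex samples into a 16-byte aligned buffer and zero-pads to a whole number of 2-complex blocks, so the kernel never reads past the data.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* C transcription of the reference loop quoted above.
 * input and taps hold interleaved (real, imag) floats;
 * each iteration consumes two complex values from each. */
static void ccomplex_dotprod_generic(const float *input, const float *taps,
                                     unsigned n_2_ccomplex_blocks, float *result)
{
    float sum0 = 0, sum1 = 0, sum2 = 0, sum3 = 0;

    assert(n_2_ccomplex_blocks != 0);   /* precondition from the header comment */

    do {
        sum0 += input[0] * taps[0] - input[1] * taps[1];
        sum1 += input[0] * taps[1] + input[1] * taps[0];
        sum2 += input[2] * taps[2] - input[3] * taps[3];
        sum3 += input[2] * taps[3] + input[3] * taps[2];
        input += 4;
        taps += 4;
    } while (--n_2_ccomplex_blocks != 0);

    result[0] = sum0 + sum2;   /* real part */
    result[1] = sum1 + sum3;   /* imaginary part */
}

/* Hypothetical helper: copy n_complex interleaved samples into a freshly
 * allocated 16-byte aligned buffer, zero-padded to an even number of
 * complex values. Stores the number of 2-complex blocks in *n_blocks.
 * Returns NULL on allocation failure or if n_complex is 0.
 * The caller releases the buffer with free(). */
static float *make_aligned_padded(const float *src, size_t n_complex,
                                  unsigned *n_blocks)
{
    if (n_complex == 0)
        return NULL;

    size_t blocks = (n_complex + 1) / 2;           /* round up to whole blocks */
    size_t bytes = blocks * 4 * sizeof(float);     /* 4 floats per block */

    float *dst = aligned_alloc(16, bytes);
    if (dst == NULL)
        return NULL;

    memset(dst, 0, bytes);                         /* zero the padding */
    memcpy(dst, src, n_complex * 2 * sizeof(float));
    *n_blocks = (unsigned)blocks;
    return dst;
}
```

Input and taps prepared this way can be passed to the generic loop directly, and should also satisfy the stated preconditions of an SSE variant with the same signature, provided both buffers are padded to the same block count.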

Eric



