Re: Help with Hand-Optimized Assembly
From: Markus Wichmann
Subject: Re: Help with Hand-Optimized Assembly
Date: Wed, 28 Mar 2012 18:29:58 -0000
User-agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:9.0) Gecko/20111222 Thunderbird/9.0.1
On 12.01.2012 22:38, Bill Woessner wrote:
> I'm a 100% total newbie at writing assembly. But I figured it would
> be a good exercise. And besides, this tiny chunk of code is
> definitely in the critical path of something I'm working on. Any and
> all advice would be appreciated.
>
> I'm trying to rewrite the following function in x86 assembly:
>
> inline double DiffAngle(double theta1, double theta2)
> {
> double delta(theta1 - theta2);
>
> return std::abs(delta) <= M_PI ? delta : delta - copysign(2 * M_PI,
> delta);
> }
>
> To my great surprise, I've actually been somewhat successful. Here's
> what I have so far:
>
> double DiffAngle(double theta1, double theta2)
> {
> asm(
> "fldl 4(%esp);"
> "fsubl 12(%esp);"
> "fxam;"
> "fnstsw %ax;"
> "fldl TWO_PI;"
> "testb $2, %ah;"
> "fldl NEG_TWO_PI;"
> "fcmovne %st(1), %st;"
> "fstp %st(1);"
> "fsubr %st(1), %st;"
> "fldpi;"
> "fld %st(2);"
> "fabs;"
> "fcomip %st(1), %st;"
> "fstp %st(0);"
> "fcmovbe %st(1), %st;"
> "fstp %st(1);"
> "rep;"
> "ret;"
> "NEG_TWO_PI:;"
> ".long 1413754136;"
> ".long 1075388923;"
> "TWO_PI:;"
> ".long 1413754136;"
> ".long -1072094725;"
> );
> }
>
> This compiles, runs and produces the correct answers. But I have a
> few issues with it:
>
> 1) If I declare this function inline, it gives me garbage (like
> 10^-304)
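That is because your code only works as a real, out-of-line function
call. It reads its arguments at fixed offsets from %esp (4(%esp) and
12(%esp)) and executes its own ret. Once the function is inlined there
is no call, so those stack slots hold something else, and the compiler
is never told where the inputs are or where the result goes.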
I'm rewriting your C++ first, so I can put it into assembly more easily:
double DiffAngle(double theta1, double theta2)
{
double diff = theta1 - theta2;
if (fabs(diff) <= M_PI)
return diff;
else if (diff < 0)
return diff + 2 * M_PI;
else
return diff - 2 * M_PI;
}
//Or, in a more SSE-like manner:
double DiffAngle(double theta1, double theta2)
{
double diff = theta1 - theta2;
double subtract;
subtract = copysign(2*M_PI, diff);
if (fabs(diff) <= M_PI) subtract = 0;
return diff - subtract;
}
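A quick way to convince yourself that the two forms above agree is to
compile both and compare them on a few inputs. The sketch below is plain
C++ with no assembly. The names diff_angle_branchy, diff_angle_select
and kPi are made up for this illustration; kPi stands in for M_PI, which
the C++ standard does not guarantee.

```cpp
#include <cmath>

// Assumption: kPi replaces M_PI so the sketch builds in strict standard modes.
static const double kPi = 3.14159265358979323846;

// Branching form: pick one of three results.
double diff_angle_branchy(double theta1, double theta2)
{
    double diff = theta1 - theta2;
    if (std::fabs(diff) <= kPi)
        return diff;
    else if (diff < 0)
        return diff + 2 * kPi;
    else
        return diff - 2 * kPi;
}

// Select form: always subtract, but subtract zero when no wrap is needed.
double diff_angle_select(double theta1, double theta2)
{
    double diff = theta1 - theta2;
    double subtract = std::copysign(2 * kPi, diff);
    if (std::fabs(diff) <= kPi)
        subtract = 0;
    return diff - subtract;
}
```

The select form is the one that maps onto SSE2 masking: the "if" only
chooses between two values, so it can become an AND with a compare mask
instead of a jump.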
Because you might want to rewrite the stuff anyway in SSE2, I'd change
it to something like:
//needs <stdint.h> and <math.h>
double DiffAngle(double theta1, double theta2)
{
double res;
static const uint64_t no_sign_mask = 0x7fffffffffffffffULL;
static const uint64_t sign_mask = 0x8000000000000000ULL;
static const double pi = M_PI, two_pi = 2 * M_PI;
asm("movsd %1, %%xmm0\n\t"
"subsd %2, %%xmm0\n\t" //xmm0 = diff
"movsd %%xmm0, %%xmm1\n\t" //xmm1 = diff
"movq %3, %%xmm2\n\t"
"andpd %%xmm2, %%xmm0\n\t" //xmm0 = abs
"cmpnlesd %5, %%xmm0\n\t" //if abs(diff) <= M_PI
// %xmm0 = 0, else %xmm0 == 0xffff...
"movsd %6, %%xmm3\n\t" //xmm3 = 2 * M_PI
"movq %4, %%xmm2\n\t"
"movsd %%xmm1, %%xmm4\n\t"
"andpd %%xmm2, %%xmm4\n\t" //xmm4 = sign bit of diff
"orpd %%xmm4, %%xmm3\n\t" //xmm3 = copysign(2 * M_PI, diff)
"andpd %%xmm0, %%xmm3\n\t" //xmm3 = 0 if no wrap is needed
"subsd %%xmm3, %%xmm1\n\t"
"movsd %%xmm1, %0"
: "=m" (res)
: "m" (theta1), "m" (theta2), "m" (no_sign_mask), "m" (sign_mask),
"m" (pi), "m" (two_pi)
: "xmm0", "xmm1", "xmm2", "xmm3", "xmm4");
return res;
}
Does that work for you? It's untested!
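If you would rather not manage registers by hand, the same masking idea
can be written with the SSE2 intrinsics from <emmintrin.h> and left to
the compiler to schedule. This is only a sketch: diff_angle_sse2 and kPi
are names invented here, and the plain C++ branch is a fallback I added
so the snippet still builds where SSE2 is not available.

```cpp
#include <cmath>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Assumption: kPi replaces M_PI so the sketch builds in strict standard modes.
static const double kPi = 3.14159265358979323846;

double diff_angle_sse2(double theta1, double theta2)
{
#if defined(__SSE2__)
    // -0.0 has only the sign bit set, so it doubles as the sign mask.
    const __m128d sign_mask = _mm_set_sd(-0.0);
    __m128d diff = _mm_sub_sd(_mm_set_sd(theta1), _mm_set_sd(theta2));
    // (~sign_mask) & diff clears the sign bit: |diff|
    __m128d absd = _mm_andnot_pd(sign_mask, diff);
    // Low lane is all ones if |diff| > pi, all zeros otherwise.
    __m128d wrap = _mm_cmpgt_sd(absd, _mm_set_sd(kPi));
    // Copy the sign of diff onto 2*pi: copysign(2*pi, diff)
    __m128d sign = _mm_and_pd(sign_mask, diff);
    __m128d two_pi = _mm_or_pd(_mm_set_sd(2 * kPi), sign);
    // Keep 2*pi only where a wrap is needed, else subtract 0.0.
    __m128d sub = _mm_and_pd(two_pi, wrap);
    return _mm_cvtsd_f64(_mm_sub_sd(diff, sub));
#else
    // Fallback without SSE2: same select logic in plain C++.
    double diff = theta1 - theta2;
    double sub = std::fabs(diff) <= kPi ? 0.0 : std::copysign(2 * kPi, diff);
    return diff - sub;
#endif
}
```

Because nothing here is inline asm, the compiler knows exactly where the
inputs and the result live, so the function can be inlined safely.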
> 2) If I compile with -Wall, I get a warning that the function doesn't
> return a value, which is absolutely true, but I don't know how to fix
> it.
double ret;
asm("fldl %1; fldl %2; blablabla; fstpl %0"
: "=m" (ret)
: "m" (theta1), "m" (theta2)
: );
return ret;
This should also clear up your previous question: once the compiler
knows the operands, it can inline the function.
> 3) I don't like how TWO_PI and NEG_TWO_PI are defined. I had to steal
> it from some generated assembly. It would be nice to use M_PI,
> 4*atan(1) or something like that.
>
Just define it as new inputs and let the compiler worry. Like:
double ret;
const double two_pi = 2*M_PI, neg_two_pi = -2*M_PI;
asm("fldl %1; fldl %2; blabla; fldl %3; blabli; fldl %4; bla; fstpl %0"
: "=m" (ret)
: "m" (theta1), "m" (theta2), "m" (two_pi), "m" (neg_two_pi)
: );
return ret;
The "m" means "memory operand" (let the compiler worry about the
addresses!), the "=" means "write only operand".
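To make the operand syntax concrete, here is a tiny complete example
that only computes theta1 - theta2 on the x87 stack. The function name
sub_x87 is made up, and the "st" clobber and the non-x86 fallback are my
additions. Outputs come after the first colon, inputs after the second,
clobbers after the third. Note the doubled %% is only needed for
register names, which this snippet does not use.

```cpp
// Hypothetical helper: subtracts two doubles using GCC-style extended asm.
double sub_x87(double theta1, double theta2)
{
    double ret;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    asm("fldl %1\n\t"   // push theta1 onto the x87 stack
        "fsubl %2\n\t"  // st(0) = theta1 - theta2
        "fstpl %0"      // pop the result into ret
        : "=m" (ret)                   // %0: write-only memory operand
        : "m" (theta1), "m" (theta2)   // %1, %2: memory inputs
        : "st");                       // the asm uses the top x87 slot
#else
    // Fallback for other compilers or CPUs: plain C++ subtraction.
    ret = theta1 - theta2;
#endif
    return ret;
}
```

The compiler picks the addresses for %0, %1 and %2, so the same code
works whether the function is called or inlined.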
> Thanks in advance,
> Bill
HTH,
Markus