Re: [ft-devel] FT_MulFix assembly


From: Miles Bader
Subject: Re: [ft-devel] FT_MulFix assembly
Date: Mon, 06 Sep 2010 18:49:13 +0900

Incidentally, you wrote:

> The assembly generated by the C code is 45 lines and 158 octets long,
> contains six conditional jumps, three each of explicit compares and
> tests, and still benchmarks are just as fast.  Out-of-order processing
> wins out over hand-coded asm. :-/

... but when I follow your original suggestion, and just do the following:

   typedef long s64;
   typedef unsigned int u32;

   static __inline__ u32
   FT_MulFix_x86_64 (u32 a, u32 b)
   {
     return (((s64)a * (s64)b) + 0x8000) >> 16;
   }

The compiler generates the following assembly:

        mov     %esi, %eax
        mov     %edi, %edi
        imulq   %rdi, %rax
        addq    $32768, %rax
        shrq    $16, %rax

The movs there are obviously a bit silly (compiler bug?), but that
output seems reasonably close to the asm() version, and obviously much
more schedulable since the compiler knows what the insns do...

[I tried it with gcc-4.x for a few different xs, and the results are the
same, and optimization flags don't seem to make much difference either.]
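A note on the two movs and on signedness, with a small sketch. Because the arguments above are declared as unsigned 32-bit values, the casts to s64 zero-extend them, and on x86-64 a 32-bit mov such as "mov %edi, %edi" is the usual way to clear the upper half of a register, so the movs are most likely the zero-extension the C code asks for rather than a compiler bug. The sketch below contrasts that unsigned form with a signed one. The names mulfix_signed_sketch and mulfix_unsigned_sketch are hypothetical, the fixed-width types come from <stdint.h> instead of the typedefs above, and the signed variant assumes the compiler shifts negative values arithmetically (as gcc does). It is not FreeType's actual FT_MulFix, whose handling of negative ties may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 16.16 fixed-point multiply on signed operands.
 * The 64-bit product is biased by 0x8000 (one half in 16.16) and then
 * shifted right by 16.  Right-shifting a negative value is
 * implementation-defined in C; this sketch assumes an arithmetic shift. */
static inline int32_t
mulfix_signed_sketch(int32_t a, int32_t b)
{
    int64_t prod = (int64_t)a * (int64_t)b;   /* sign-extends both operands */
    return (int32_t)((prod + 0x8000) >> 16);
}

/* Hypothetical helper mirroring the unsigned form quoted in the message:
 * the operands are zero-extended, so a bit pattern that represents a
 * negative 16.16 value is treated as a large positive number. */
static inline uint32_t
mulfix_unsigned_sketch(uint32_t a, uint32_t b)
{
    int64_t prod = (int64_t)a * (int64_t)b;   /* zero-extends both operands */
    return (uint32_t)((prod + 0x8000) >> 16);
}

/* Small self-check: 1.0 * 1.0, 1.5 * 2.0, and -1.0 * 0.5 in 16.16. */
static inline void
mulfix_sketch_selfcheck(void)
{
    assert(mulfix_signed_sketch(0x10000, 0x10000) == 0x10000);
    assert(mulfix_signed_sketch(0x18000, 0x20000) == 0x30000);
    assert(mulfix_signed_sketch(-0x10000, 0x8000) == -0x8000);
    /* Same bit patterns as -1.0 * 0.5, but zero-extended: not -0.5. */
    assert(mulfix_unsigned_sketch(0xFFFF0000u, 0x8000u) != 0xFFFF8000u);
}
```

With signed operands one would expect the compiler to emit sign-extending moves and an arithmetic shift in place of the mov/shrq pair shown above, though the exact output depends on the compiler version and flags.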

-Miles

-- 
Scriptures, n. The sacred books of our holy religion, as distinguished from
the false and profane writings on which all other faiths are based.



