[Rapp-dev] Full SIMD implementation of integral


From: Willie Betschart
Subject: [Rapp-dev] Full SIMD implementation of integral
Date: Fri, 14 Dec 2012 16:28:55 +0100

Hello RAPP-dev!

Here's a patch containing a full SIMD implementation of integral. 

I also added SWAR macros for the type conversions, mainly because the 
benchmarking build didn't get through the SWAR build. I'm not sure why this 
was built.

I added two new macros, SPLAT_U16 and SPLAT_U32. I have tested them separately, 
but I had difficulties adding unit tests, so I will wait with that. The splat 
is used to add the previous state when the next buffers are processed.
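
To illustrate the role of the splat, here is a minimal portable sketch in plain 
C. It is not the code from the patch: the vec_u16 type, the helper functions 
and the 8-lane width are all assumptions made for the example, with a small 
struct standing in for a SIMD register. Each block is prefix-summed inside the 
"register" with shift-and-add steps, and the last sum of the previous block is 
splatted across all lanes and added on top.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 8  /* assumed vector width: 8 x 16-bit lanes */

/* Hypothetical stand-in for a SIMD register. */
typedef struct { uint16_t lane[LANES]; } vec_u16;

/* Stand-in for a splat: copy one scalar into every lane. */
static vec_u16 splat_u16(uint16_t x)
{
    vec_u16 v;
    for (int i = 0; i < LANES; i++)
        v.lane[i] = x;
    return v;
}

/* Lane-wise add, wrapping modulo 2^16 like a SIMD add. */
static vec_u16 add_u16(vec_u16 a, vec_u16 b)
{
    vec_u16 r;
    for (int i = 0; i < LANES; i++)
        r.lane[i] = (uint16_t)(a.lane[i] + b.lane[i]);
    return r;
}

/* Move lanes n positions towards higher indices, filling with zeros. */
static vec_u16 shift_lanes(vec_u16 a, int n)
{
    vec_u16 r;
    for (int i = 0; i < LANES; i++)
        r.lane[i] = (i >= n) ? a.lane[i - n] : 0;
    return r;
}

/* Inclusive prefix sum within one vector: shifts of 1, 2 and 4 lanes. */
static vec_u16 prefix_sum(vec_u16 v)
{
    for (int n = 1; n < LANES; n *= 2)
        v = add_u16(v, shift_lanes(v, n));
    return v;
}

/* Running sum of one row of 8-bit pixels.
 * len is assumed to be a multiple of LANES. */
static void row_integral_u8_u16(const uint8_t *src, uint16_t *dst, size_t len)
{
    uint16_t carry = 0;  /* previous state: last sum of the previous block */

    for (size_t x = 0; x < len; x += LANES) {
        vec_u16 v;
        for (int i = 0; i < LANES; i++)
            v.lane[i] = src[x + i];   /* widen 8-bit pixels to 16 bits */

        /* Block-local prefix sum plus the splatted carry. */
        v = add_u16(prefix_sum(v), splat_u16(carry));

        for (int i = 0; i < LANES; i++)
            dst[x + i] = v.lane[i];
        carry = v.lane[LANES - 1];
    }
}
```

This covers only the running sum along a single row. A full integral image 
would also add the row above, which is left out here.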

The full SIMD integral was faster than the generic one, but still not as 
efficient as the hybrid of hardcoded SWAR and SIMD. I also needed SSSE3's 
align; SSE2 wasn't that fast. I added a description of how it works in the 
source file /compute/vector/rc_integral.c.

Best wishes
Willie

Attachment: full_simd_integral.patch
Description: full_simd_integral.patch
