
Re: [Qemu-devel] Why some ARM NEON helper functions need mask?


From: Max Filippov
Subject: Re: [Qemu-devel] Why some ARM NEON helper functions need mask?
Date: Sun, 30 Oct 2011 16:06:07 +0400
User-agent: KMail/1.13.7 (Linux/2.6.40.6-0.fc15.x86_64; KDE/4.6.5; x86_64; ; )

>   I am looking into QEMU's implementation for ARM NEON instructions
> (target-arm/neon_helper.c). Some helper functions will do mask
> operation, neon_add_u8, for example. I thought simply adding a and b
> is enough and can't figure out why the mask operation is needed.

These are SIMD instructions that operate on independent data 'lanes' packed
into a larger data item.
Operations on one lane must not interfere with the other lanes.
 
> ---
> uint32_t HELPER(neon_add_u8)(uint32_t a, uint32_t b)
> {
>     uint32_t mask;
>1:     mask = (a ^ b) & 0x80808080u;
>2:     a &= ~0x80808080u;
>3:     b &= ~0x80808080u;
>4:     return (a + b) ^ mask;
> }
> ---

In your example there are four 8-bit lanes packed into a 32-bit word.
If we add the whole 32-bit words, care must be taken to keep a carry out of
one lane from propagating into the next.
This is done by clearing the top bit of each 8-bit operand (steps 2 and 3), so
that the sum of the remaining 7-bit values fits in 8 bits and any carry lands
in the lane's own top bit.
The original top bits are summed modulo 2 separately (step 1) and then XORed
back into the result (step 4).
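
To make this concrete, here is a small sketch that puts the same four steps
next to a lane-by-lane reference. The function names add_u8x4_masked and
add_u8x4_by_lane are made up for this illustration and are not QEMU
identifiers; only the masking logic is taken from the helper you quoted.

```c
#include <stdint.h>

/* Packed add of four 8-bit lanes, following the steps of the quoted helper.
 * The name is hypothetical, chosen for this example only. */
static uint32_t add_u8x4_masked(uint32_t a, uint32_t b)
{
    uint32_t mask = (a ^ b) & 0x80808080u; /* step 1: top bits summed mod 2 */
    a &= ~0x80808080u;                     /* step 2: clear top bit of each lane */
    b &= ~0x80808080u;                     /* step 3: same for the other operand */
    return (a + b) ^ mask;                 /* step 4: fold the top bits back in */
}

/* Reference version: add each lane on its own, wrapping modulo 256. */
static uint32_t add_u8x4_by_lane(uint32_t a, uint32_t b)
{
    uint32_t r = 0;
    for (int i = 0; i < 4; i++) {
        uint8_t la = (uint8_t)(a >> (8 * i));
        uint8_t lb = (uint8_t)(b >> (8 * i));
        r |= (uint32_t)(uint8_t)(la + lb) << (8 * i);
    }
    return r;
}
```

With a = 0x01FF01FF and b = 0x01010101, a plain 32-bit add gives 0x03000300:
the carry out of each 0xFF + 0x01 lane leaks into its neighbour. Both
functions above return 0x02000200, which is the per-lane result.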

Thanks.
-- Max


