From: Peter Maydell
Subject: Re: [Qemu-devel] [RFC PATCH v2 1/3] target-arm: Use Neon for zero checking
Date: Thu, 7 Apr 2016 11:44:53 +0100

On 7 April 2016 at 10:58, <address@hidden> wrote:
> From: Vijay <address@hidden>
>
> Use Neon instructions to perform zero checking of
> buffer. This helps reduce downtime during
> live migration.
>
> Signed-off-by: Vijaya Kumar K <address@hidden>
> Signed-off-by: Suresh <address@hidden>
> ---
> util/cutils.c | 74 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 74 insertions(+)
>
> diff --git a/util/cutils.c b/util/cutils.c
> index 43d1afb..bb61c91 100644
> --- a/util/cutils.c
> +++ b/util/cutils.c
> @@ -352,6 +352,80 @@ static void *can_use_buffer_find_nonzero_offset_ifunc(void)
> return func;
> }
> #pragma GCC pop_options
> +
> +#elif defined __aarch64__
> +#include "arm_neon.h"
> +
> +#define NEON_VECTYPE uint64x2_t
> +#define NEON_LOAD_N_ORR(v1, v2) (vld1q_u64(&v1) | vld1q_u64(&v2))
> +#define NEON_ORR(v1, v2) ((v1) | (v2))
> +#define NEON_NOT_EQ_ZERO(v1) \
> + ((vgetq_lane_u64(v1, 0) != 0) || (vgetq_lane_u64(v1, 1) != 0))
> +
> +#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR_NEON 16
This says 16 lots of loads of uint64x2_t...
> + for (i = 0; i < len; i += BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR_NEON) {
> + qword0 = NEON_LOAD_N_ORR(data[i], data[i + 2]);
> + qword1 = NEON_LOAD_N_ORR(data[i + 4], data[i + 6]);
> + qword2 = NEON_LOAD_N_ORR(data[i + 8], data[i + 10]);
> + qword3 = NEON_LOAD_N_ORR(data[i + 12], data[i + 14]);
> + qword4 = NEON_ORR(qword0, qword1);
> + qword5 = NEON_ORR(qword2, qword3);
> + qword6 = NEON_ORR(qword4, qword5);
...but the loop is only loading 8 lots of uint64x2_t.
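
To make the arithmetic concrete: vld1q_u64() takes a pointer to uint64_t, so data[] is indexed in 64-bit elements and each load covers two of them. A stride of 16 elements per iteration therefore corresponds to 16 / 2 = 8 vector loads, not 16. The sketch below is a portable illustration only, not the patch's code: it uses a plain struct in place of uint64x2_t, and the names vec128, load_vec, find_nonzero_block, UNROLL_ELEMS and UNROLL_VECS are all made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for NEON's uint64x2_t: one 128-bit vector
 * holds two uint64_t lanes. */
typedef struct { uint64_t lane[2]; } vec128;

enum {
    LANES_PER_VEC = sizeof(vec128) / sizeof(uint64_t), /* 2 */
    UNROLL_ELEMS  = 16,  /* loop stride, counted in uint64_t elements */
    UNROLL_VECS   = UNROLL_ELEMS / LANES_PER_VEC       /* 8 vector loads */
};

/* Emulates a two-lane load from p (what vld1q_u64 does on AArch64). */
static vec128 load_vec(const uint64_t *p)
{
    vec128 v = { { p[0], p[1] } };
    return v;
}

/*
 * Scan data[] in blocks of UNROLL_ELEMS elements. Returns the index of the
 * first block that contains a nonzero element, or len if every block is
 * zero. *loads is incremented once per emulated vector load so the caller
 * can check how many loads each block really takes.
 */
static size_t find_nonzero_block(const uint64_t *data, size_t len,
                                 size_t *loads)
{
    assert(len % UNROLL_ELEMS == 0);

    for (size_t i = 0; i < len; i += UNROLL_ELEMS) {
        uint64_t acc = 0;

        /* Indices i, i+2, ..., i+14: eight loads of two lanes each. */
        for (size_t j = 0; j < UNROLL_ELEMS; j += LANES_PER_VEC) {
            vec128 v = load_vec(&data[i + j]);
            acc |= v.lane[0] | v.lane[1];
            (*loads)++;
        }
        if (acc != 0) {
            return i;
        }
    }
    return len;
}
```

Calling find_nonzero_block() on a zeroed 32-element buffer bumps the load counter by 16, i.e. 8 per block of 16 elements. So either the constant should be described as a count of uint64_t elements, or the loop body needs twice as many loads to match a factor of 16 vectors.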
thanks
-- PMM