From: Peter Lieven
Subject: Re: [Qemu-devel] [PATCHv4 2/9] cutils: add a function to find non-zero content in a buffer
Date: Mon, 25 Mar 2013 09:56:32 +0100

On 25.03.2013 at 09:53, Orit Wasserman <address@hidden> wrote:
> On 03/22/2013 02:46 PM, Peter Lieven wrote:
>> this adds buffer_find_nonzero_offset() which is a SSE2/Altivec
>> optimized function that searches for non-zero content in a
>> buffer.
>>
>> due to the optimizations used in the function there are restrictions
>> on buffer address and search length. the function
>> can_use_buffer_find_nonzero_offset() can be used to check if
>> the function can be used safely.
>>
>> Signed-off-by: Peter Lieven <address@hidden>
>> ---
>> include/qemu-common.h | 13 +++++++++++++
>> util/cutils.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 58 insertions(+)
>>
>> diff --git a/include/qemu-common.h b/include/qemu-common.h
>> index e76ade3..078e535 100644
>> --- a/include/qemu-common.h
>> +++ b/include/qemu-common.h
>> @@ -472,4 +472,17 @@ void hexdump(const char *buf, FILE *fp, const char *prefix, size_t size);
>> #define ALL_EQ(v1, v2) ((v1) == (v2))
>> #endif
>>
>> +#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8
>> +static inline bool
>> +can_use_buffer_find_nonzero_offset(const void *buf, size_t len)
>> +{
>> + if (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
>> + * sizeof(VECTYPE)) == 0
>> + && ((uintptr_t) buf) % sizeof(VECTYPE) == 0) {
>> + return true;
>> + }
>> + return false;
>> +}
>> +size_t buffer_find_nonzero_offset(const void *buf, size_t len);
>> +
>> #endif
>> diff --git a/util/cutils.c b/util/cutils.c
>> index 1439da4..41c627e 100644
>> --- a/util/cutils.c
>> +++ b/util/cutils.c
>> @@ -143,6 +143,51 @@ int qemu_fdatasync(int fd)
>> }
>>
>> /*
>> + * Searches for an area with non-zero content in a buffer
>> + *
>> + * Attention! The len must be a multiple of
>> + * BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR * sizeof(VECTYPE)
>> + * and addr must be a multiple of sizeof(VECTYPE) due to
>> + * restriction of optimizations in this function.
>> + *
>> + * can_use_buffer_find_nonzero_offset() can be used to check
>> + * these requirements.
>> + *
>> + * The return value is the offset of the non-zero area rounded
>> + * down to BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR * sizeof(VECTYPE).
>> + * If the buffer is all zero the return value is equal to len.
>> + */
>> +
>> +size_t buffer_find_nonzero_offset(const void *buf, size_t len)
>> +{
>> + VECTYPE *p = (VECTYPE *)buf;
>> + VECTYPE zero = ZERO_SPLAT;
>> + size_t i;
>> +
>> + assert(len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
>> + * sizeof(VECTYPE)) == 0);
>> + assert(((uintptr_t) buf) % sizeof(VECTYPE) == 0);
>> +
>> + if (*((const long *) buf)) {
>> + return 0;
>> + }
>> +
>> + for (i = 0; i < len / sizeof(VECTYPE);
> Why not put len/sizeof(VECTYPE) in a variable?
Are you afraid that there is a division on each iteration?
sizeof(VECTYPE) is a power of 2 and len is not modified inside the loop,
so I think the compiler will turn the division into a >> (and most likely
hoist it out of the loop as well).
I would also be OK with writing len /= sizeof(VECTYPE) before the loop.
Peter
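
For illustration, a minimal sketch of the two loop forms being discussed.
The names vec16, UNROLL, count_blocks_inline() and count_blocks_hoisted()
are hypothetical and not part of the patch; vec16 merely stands in for a
16-byte VECTYPE. Both variants visit the same number of unrolled blocks;
whether a given compiler emits identical code for them is an assumption
that would need to be checked in the generated assembly.

```c
#include <stddef.h>

/* Hypothetical stand-in for VECTYPE: any type whose size is a power of two. */
typedef struct { unsigned char b[16]; } vec16;

/* Mirrors BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR from the patch. */
#define UNROLL 8

/* Bound written inline, as in the patch. len is never modified in the
 * loop, so len / sizeof(vec16) is loop-invariant. */
size_t count_blocks_inline(size_t len)
{
    size_t i, blocks = 0;

    for (i = 0; i < len / sizeof(vec16); i += UNROLL) {
        blocks++;
    }
    return blocks;
}

/* Bound computed once before the loop, as suggested in the review. */
size_t count_blocks_hoisted(size_t len)
{
    const size_t nvec = len / sizeof(vec16);
    size_t i, blocks = 0;

    for (i = 0; i < nvec; i += UNROLL) {
        blocks++;
    }
    return blocks;
}
```

The hoisted form makes the invariance explicit to the reader and does not
rely on the optimizer, which is presumably the motivation for the review
comment.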
> Orit
>> + i += BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR) {
>> + VECTYPE tmp0 = p[i + 0] | p[i + 1];
>> + VECTYPE tmp1 = p[i + 2] | p[i + 3];
>> + VECTYPE tmp2 = p[i + 4] | p[i + 5];
>> + VECTYPE tmp3 = p[i + 6] | p[i + 7];
>> + VECTYPE tmp01 = tmp0 | tmp1;
>> + VECTYPE tmp23 = tmp2 | tmp3;
>> + if (!ALL_EQ(tmp01 | tmp23, zero)) {
>> + break;
>> + }
>> + }
>> + return i * sizeof(VECTYPE);
>> +}
>> +
>> +/*
>> * Checks if a buffer is all zeroes
>> *
>> * Attention! The len must be a multiple of 4 * sizeof(long) due to
>>
>
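
As a rough, portable illustration of the technique in the quoted patch
(not the patch itself), the sketch below uses a plain unsigned long in
place of the SSE2/Altivec VECTYPE and a loop bound computed once before
the loop. The names vec_t, UNROLL_FACTOR, can_use_find_nonzero(),
find_nonzero_offset() and demo_first_nonzero_block() are assumptions made
for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef unsigned long vec_t;   /* assumption: scalar stand-in for VECTYPE */
#define UNROLL_FACTOR 8        /* same unroll factor as in the patch */

/* True if buf/len satisfy the alignment and length restrictions. */
bool can_use_find_nonzero(const void *buf, size_t len)
{
    return len % (UNROLL_FACTOR * sizeof(vec_t)) == 0
        && (uintptr_t)buf % sizeof(vec_t) == 0;
}

/* Returns the offset of the first block of UNROLL_FACTOR vectors that
 * contains non-zero data, or len if the whole buffer is zero. */
size_t find_nonzero_offset(const void *buf, size_t len)
{
    const vec_t *p = buf;
    const size_t nvec = len / sizeof(vec_t);   /* computed once */
    size_t i;

    assert(can_use_find_nonzero(buf, len));

    for (i = 0; i < nvec; i += UNROLL_FACTOR) {
        /* OR eight vectors together; one compare per block. */
        vec_t acc = p[i + 0] | p[i + 1] | p[i + 2] | p[i + 3]
                  | p[i + 4] | p[i + 5] | p[i + 6] | p[i + 7];
        if (acc != 0) {
            break;
        }
    }
    return i * sizeof(vec_t);
}

/* Demo helper: zero a buffer of 128 vec_t, set one byte (if it is in
 * range), and report the offset found. */
size_t demo_first_nonzero_block(size_t nonzero_byte)
{
    static vec_t storage[128];                 /* suitably aligned */
    unsigned char *bytes = (unsigned char *)storage;

    memset(storage, 0, sizeof(storage));
    if (nonzero_byte < sizeof(storage)) {
        bytes[nonzero_byte] = 1;
    }
    return find_nonzero_offset(storage, sizeof(storage));
}
```

Note that the result is rounded down to a multiple of
UNROLL_FACTOR * sizeof(vec_t), so a caller that needs the exact byte
position has to scan the returned block itself.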