From: Orit Wasserman
Subject: Re: [Qemu-devel] [PATCHv4 2/9] cutils: add a function to find non-zero content in a buffer
Date: Mon, 25 Mar 2013 11:26:58 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

On 03/25/2013 10:56 AM, Peter Lieven wrote:
>
> On 25.03.2013 at 09:53, Orit Wasserman <address@hidden> wrote:
>
>> On 03/22/2013 02:46 PM, Peter Lieven wrote:
>>> This adds buffer_find_nonzero_offset(), an SSE2/Altivec
>>> optimized function that searches for non-zero content in a
>>> buffer.
>>>
>>> Due to the optimizations used in the function, there are restrictions
>>> on the buffer address and the search length. The function
>>> can_use_buffer_find_nonzero_offset() can be used to check whether
>>> buffer_find_nonzero_offset() can be used safely.
>>>
>>> Signed-off-by: Peter Lieven <address@hidden>
>>> ---
>>> include/qemu-common.h | 13 +++++++++++++
>>> util/cutils.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
>>> 2 files changed, 58 insertions(+)
>>>
>>> diff --git a/include/qemu-common.h b/include/qemu-common.h
>>> index e76ade3..078e535 100644
>>> --- a/include/qemu-common.h
>>> +++ b/include/qemu-common.h
>>> @@ -472,4 +472,17 @@ void hexdump(const char *buf, FILE *fp, const char *prefix, size_t size);
>>> #define ALL_EQ(v1, v2) ((v1) == (v2))
>>> #endif
>>>
>>> +#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8
>>> +static inline bool
>>> +can_use_buffer_find_nonzero_offset(const void *buf, size_t len)
>>> +{
>>> +    if (len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
>>> +               * sizeof(VECTYPE)) == 0
>>> +        && ((uintptr_t) buf) % sizeof(VECTYPE) == 0) {
>>> +        return true;
>>> +    }
>>> +    return false;
>>> +}
>>> +size_t buffer_find_nonzero_offset(const void *buf, size_t len);
>>> +
>>> #endif
>>> diff --git a/util/cutils.c b/util/cutils.c
>>> index 1439da4..41c627e 100644
>>> --- a/util/cutils.c
>>> +++ b/util/cutils.c
>>> @@ -143,6 +143,51 @@ int qemu_fdatasync(int fd)
>>> }
>>>
>>> /*
>>> + * Searches for an area with non-zero content in a buffer
>>> + *
>>> + * Attention! The len must be a multiple of
>>> + * BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR * sizeof(VECTYPE)
>>> + * and addr must be a multiple of sizeof(VECTYPE) due to
>>> + * restriction of optimizations in this function.
>>> + *
>>> + * can_use_buffer_find_nonzero_offset() can be used to check
>>> + * these requirements.
>>> + *
>>> + * The return value is the offset of the non-zero area rounded
>>> + * down to BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR * sizeof(VECTYPE).
>>> + * If the buffer is all zero the return value is equal to len.
>>> + */
>>> +
>>> +size_t buffer_find_nonzero_offset(const void *buf, size_t len)
>>> +{
>>> +    VECTYPE *p = (VECTYPE *)buf;
>>> +    VECTYPE zero = ZERO_SPLAT;
>>> +    size_t i;
>>> +
>>> +    assert(len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
>>> +                  * sizeof(VECTYPE)) == 0);
>>> +    assert(((uintptr_t) buf) % sizeof(VECTYPE) == 0);
>>> +
>>> +    if (*((const long *) buf)) {
>>> +        return 0;
>>> +    }
>>> +
>>> +    for (i = 0; i < len / sizeof(VECTYPE);
>> Why not put len/sizeof(VECTYPE) in a variable?
>
> Are you afraid that there is a division at each iteration?
>
> sizeof(VECTYPE) is a power of 2, so I think the compiler will optimize it
> to a >> at compile time.
True, but it is still done on every iteration.
>
> I would also be ok with writing len /= sizeof(VECTYPE) before the loop.
I would prefer it :)
Orit
>
> Peter
>
>> Orit
>>> +         i += BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR) {
>>> +        VECTYPE tmp0 = p[i + 0] | p[i + 1];
>>> +        VECTYPE tmp1 = p[i + 2] | p[i + 3];
>>> +        VECTYPE tmp2 = p[i + 4] | p[i + 5];
>>> +        VECTYPE tmp3 = p[i + 6] | p[i + 7];
>>> +        VECTYPE tmp01 = tmp0 | tmp1;
>>> +        VECTYPE tmp23 = tmp2 | tmp3;
>>> +        if (!ALL_EQ(tmp01 | tmp23, zero)) {
>>> +            break;
>>> +        }
>>> +    }
>>> +    return i * sizeof(VECTYPE);
>>> +}
>>> +
>>> +/*
>>> * Checks if a buffer is all zeroes
>>> *
>>> * Attention! The len must be a multiple of 4 * sizeof(long) due to
>>>
>>
>
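
For reference, the variant discussed above (loop bound computed once before the loop) could look roughly like the sketch below. It is an illustration only, not the code from the patch: vec_t, UNROLL_FACTOR, can_use_find_nonzero() and find_nonzero_offset() are hypothetical names, and a plain uint64_t stands in for VECTYPE so that it builds without SSE2/Altivec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: uint64_t is a hypothetical stand-in for the patch's VECTYPE. */
typedef uint64_t vec_t;

/* Mirrors BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR from the patch. */
#define UNROLL_FACTOR 8

/* True if buf/len satisfy the alignment and length restrictions. */
static bool can_use_find_nonzero(const void *buf, size_t len)
{
    return len % (UNROLL_FACTOR * sizeof(vec_t)) == 0
        && (uintptr_t)buf % sizeof(vec_t) == 0;
}

/*
 * Returns the offset of the first non-zero area, rounded down to
 * UNROLL_FACTOR * sizeof(vec_t), or len if the buffer is all zero.
 */
static size_t find_nonzero_offset(const void *buf, size_t len)
{
    const vec_t *p = buf;
    size_t n, i;

    assert(can_use_find_nonzero(buf, len));

    /* Element count is computed once, outside the loop. */
    n = len / sizeof(vec_t);

    for (i = 0; i < n; i += UNROLL_FACTOR) {
        /* OR eight elements together; any set bit means non-zero data. */
        vec_t acc = p[i + 0] | p[i + 1] | p[i + 2] | p[i + 3]
                  | p[i + 4] | p[i + 5] | p[i + 6] | p[i + 7];
        if (acc != 0) {
            break;
        }
    }
    return i * sizeof(vec_t);
}
```

A caller would check can_use_find_nonzero() first and fall back to a simpler scan when it returns false. Because i counts elements rather than bytes, the final multiplication by sizeof(vec_t) still yields a byte offset after the bound has been hoisted.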