From: Peter Lieven
Subject: Re: [Qemu-devel] [PATCHv2 4/9] bitops: use vector algorithm to optimize find_next_bit()
Date: Tue, 19 Mar 2013 20:40:50 +0100

On 19.03.2013 at 17:49, Eric Blake <address@hidden> wrote:

> On 03/15/2013 09:50 AM, Peter Lieven wrote:
>> this patch adds the usage of buffer_find_nonzero_offset()
>> to skip large areas of zeroes.
>> 
>> compared to loop unrolling presented in an earlier
>> patch this adds another 50% performance benefit for
>> skipping large areas of zeroes. loop unrolling alone
>> added close to 100% speedup.
>> 
>> Signed-off-by: Peter Lieven <address@hidden>
>> ---
>> util/bitops.c |   26 +++++++++++++++++++++++---
>> 1 file changed, 23 insertions(+), 3 deletions(-)
> 
>> +    while (size >= BITS_PER_LONG) {
>> +        if ((tmp = *p)) {
>> +             goto found_middle;
>> +        }
>> +        if (((uintptr_t) p) % sizeof(VECTYPE) == 0 
>> +                && size >= BITS_PER_BYTE * sizeof(VECTYPE)
>> +                   * BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR) {
> 
> Another instance where a helper function to check for alignment would be
> nice.  Except this time you have a BITS_PER_BYTE factor, so you would be
> calling something like buffer_can_use_vectors(buf, size / BITS_PER_BYTE)
> 
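A minimal sketch of what such a helper could look like, for illustration only. The name buffer_can_use_vectors() comes from the suggestion above; the 16-byte VECTYPE stand-in and the unroll factor of 8 are assumptions, not values taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions for illustration: a 16-byte vector type and an unroll factor of 8. */
typedef struct { uint64_t lane[2]; } VECTYPE;
#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8

/*
 * Hypothetical helper: true if buf is aligned to sizeof(VECTYPE) and
 * len (in bytes) covers at least one full unrolled chunk.
 */
static inline bool buffer_can_use_vectors(const void *buf, size_t len)
{
    return ((uintptr_t)buf % sizeof(VECTYPE)) == 0
        && len >= sizeof(VECTYPE) * BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR;
}
```

With this shape, the bitops caller would pass size / BITS_PER_BYTE as len, since size is counted in bits there.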
>> +            unsigned long tmp2 =
>> +                buffer_find_nonzero_offset(p, ((size / BITS_PER_BYTE) & 
>> +                           ~(BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR * 
>> +                             sizeof(VECTYPE) - 1)));
> 
> Type mismatch - buffer_find_nonzero_offset returns size_t, which isn't
> necessarily the same size as unsigned long.  I'm not sure if it can bite
> you.

I will look into it.

> 
>> +            result += tmp2 * BITS_PER_BYTE;
>> +            size -= tmp2 * BITS_PER_BYTE;
>> +            p += tmp2 / sizeof(unsigned long);
>> +            if (!size) {
>> +                return result;
>> +            }
>> +            if (tmp2) {
> 
> Do you really need this condition, or would it suffice to just
> 'continue;' the loop?  Once buffer_find_nonzero_offset returns anything
> that leaves size as non-zero, we are guaranteed that the loop will goto
> found_middle without any further calls to buffer_find_nonzero_offset.

Not in all cases. It will if the non-zero content is in the first
sizeof(unsigned long) bytes. If not, buffer_find_nonzero_offset() is called
again. It will return 0 because there is a non-zero byte in the first
sizeof(VECTYPE)*BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR bytes. I added this
check to avoid that.

Peter
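
The situation described above can be reproduced with a small scalar stand-in. This is only a sketch: find_nonzero_offset_sketch() and rescan_offset() are hypothetical names, the stand-in is not the vectorized buffer_find_nonzero_offset(), and the chunk size of 128 bytes is an assumed value for sizeof(VECTYPE) * BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR.

```c
#include <stddef.h>

/* Assumed chunk size: sizeof(VECTYPE) * BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR. */
#define CHUNK 128

/*
 * Scalar stand-in: returns the byte offset of the first CHUNK-sized block
 * that contains a non-zero byte, or len if every block is zero.
 * len must be a multiple of CHUNK.
 */
static size_t find_nonzero_offset_sketch(const unsigned char *buf, size_t len)
{
    for (size_t off = 0; off < len; off += CHUNK) {
        for (size_t i = 0; i < CHUNK; i++) {
            if (buf[off + i]) {
                return off;
            }
        }
    }
    return len;
}

/*
 * Runs a second scan starting at the block reported by the first scan.
 * The second scan begins inside a block known to hold a non-zero byte,
 * so it reports offset 0 and makes no progress.
 */
static size_t rescan_offset(const unsigned char *buf, size_t len)
{
    size_t first = find_nonzero_offset_sketch(buf, len);
    if (first == len) {
        return len;
    }
    return find_nonzero_offset_sketch(buf + first, len - first);
}
```

For a zeroed 1024-byte buffer with a single non-zero byte at offset 168, the first scan reports 128. The unsigned long at offset 128 is still zero, so a plain 'continue;' would trigger a second scan, and that scan reports 0.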


> 
>> +                if ((tmp = *p)) {
>> +                    goto found_middle;
>> +                }
>> +            }
>>         }
>> +        p++;
>>         result += BITS_PER_LONG;
>>         size -= BITS_PER_LONG;
>>     }
>> 
> 
> -- 
> Eric Blake   eblake redhat com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
> 