From: Eric Blake
Subject: Re: [Qemu-devel] [PATCHv2 4/9] bitops: use vector algorithm to optimize find_next_bit()
Date: Tue, 19 Mar 2013 10:49:25 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130311 Thunderbird/17.0.4

On 03/15/2013 09:50 AM, Peter Lieven wrote:
> this patch adds the usage of buffer_find_nonzero_offset()
> to skip large areas of zeroes.
> 
> compared to loop unrolling presented in an earlier
> patch this adds another 50% performance benefit for
> skipping large areas of zeroes. loop unrolling alone
> added close to 100% speedup.
> 
> Signed-off-by: Peter Lieven <address@hidden>
> ---
>  util/bitops.c |   26 +++++++++++++++++++++++---
>  1 file changed, 23 insertions(+), 3 deletions(-)

> +    while (size >= BITS_PER_LONG) {
> +        if ((tmp = *p)) {
> +             goto found_middle;
> +        }
> +        if (((uintptr_t) p) % sizeof(VECTYPE) == 0 
> +                && size >= BITS_PER_BYTE * sizeof(VECTYPE)
> +                   * BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR) {

Another instance where a helper function to check for alignment would be
nice.  Except this time the size is in bits, so there is a BITS_PER_BYTE
factor, and you would be calling something like
buffer_can_use_vectors(buf, size / BITS_PER_BYTE).
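
A minimal sketch of what such a helper could look like. The name
buffer_can_use_vectors is only the suggestion above, not an existing
function, and the VECTYPE stand-in and unroll factor value below are
assumptions for illustration; the real definitions live elsewhere in the
series and may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-ins, for illustration only. */
typedef struct { uint64_t lane[2]; } VECTYPE;       /* 16-byte "vector" */
#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8  /* assumed value */

/*
 * Hypothetical helper: true if buf is aligned for VECTYPE access and
 * len (in bytes) covers at least one fully unrolled iteration.
 */
static inline bool buffer_can_use_vectors(const void *buf, size_t len)
{
    return (uintptr_t)buf % sizeof(VECTYPE) == 0
        && len >= BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR * sizeof(VECTYPE);
}
```

With that, the condition in the hunk above would shrink to
buffer_can_use_vectors(p, size / BITS_PER_BYTE). Because size is an
unsigned integer, testing size / BITS_PER_BYTE against the byte threshold
is equivalent to testing size against BITS_PER_BYTE times that threshold.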

> +            unsigned long tmp2 =
> +                buffer_find_nonzero_offset(p, ((size / BITS_PER_BYTE) & 
> +                           ~(BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR * 
> +                             sizeof(VECTYPE) - 1)));

Type mismatch - buffer_find_nonzero_offset returns size_t, which isn't
necessarily the same size as unsigned long.  I'm not sure if it can bite
you.
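
For illustration: on LP64 targets (for example 64-bit Linux) both types
are 64 bits wide, while on LLP64 targets (64-bit Windows) unsigned long
is 32 bits and size_t is 64.  Assuming size in this function is an
unsigned long bit count, the returned byte offset cannot exceed
size / BITS_PER_BYTE, so it should still fit; declaring tmp2 as size_t
would sidestep the question.  fits_in_ulong below is a hypothetical name
used only to spell out the condition.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical check: does a size_t value survive conversion to
 * unsigned long without truncation?  Always true where both types
 * have the same width (LP64, ILP32); it can be false on LLP64.
 */
static inline bool fits_in_ulong(size_t v)
{
    return v <= ULONG_MAX;
}
```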

> +            result += tmp2 * BITS_PER_BYTE;
> +            size -= tmp2 * BITS_PER_BYTE;
> +            p += tmp2 / sizeof(unsigned long);
> +            if (!size) {
> +                return result;
> +            }
> +            if (tmp2) {

Do you really need this condition, or would it suffice to just
'continue;' the loop?  Once buffer_find_nonzero_offset returns anything
that leaves size as non-zero, we are guaranteed that the loop will goto
found_middle without any further calls to buffer_find_nonzero_offset.
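
A sketch of the loop with a plain 'continue', under stated assumptions:
find_nonzero_offset_stub is a hypothetical word-wise stand-in for
buffer_find_nonzero_offset, CHUNK_BYTES and VEC_ALIGN are assumed values,
and 'break' takes the place of 'goto found_middle'.  The stand-in returns
the byte offset of the first nonzero word.  If the real function only
reports the start of a larger block that contains the nonzero data, it
could return 0 here, and a bare 'continue' would then make no progress,
so this simplification depends on that detail.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define CHUNK_BYTES   128  /* assumed: unroll factor * sizeof(VECTYPE) */
#define VEC_ALIGN     16   /* assumed: sizeof(VECTYPE) */

/* Stand-in (assumption): byte offset of the first nonzero word, or len. */
static size_t find_nonzero_offset_stub(const unsigned long *buf, size_t len)
{
    size_t i, words = len / sizeof(unsigned long);

    for (i = 0; i < words; i++) {
        if (buf[i]) {
            break;
        }
    }
    return i * sizeof(unsigned long);
}

/*
 * Skip leading all-zero words.  Returns the number of bits skipped;
 * on return either the word at that position is nonzero or fewer
 * than BITS_PER_LONG bits remain.
 */
static unsigned long skip_zero_words(const unsigned long *p,
                                     unsigned long size)
{
    unsigned long result = 0;

    while (size >= BITS_PER_LONG) {
        if (*p) {
            break;              /* the patch does "goto found_middle" */
        }
        if ((uintptr_t)p % VEC_ALIGN == 0
            && size >= (unsigned long)CHAR_BIT * CHUNK_BYTES) {
            size_t skipped = find_nonzero_offset_stub(
                p, (size / CHAR_BIT) & ~(size_t)(CHUNK_BYTES - 1));

            result += skipped * CHAR_BIT;
            size -= skipped * CHAR_BIT;
            p += skipped / sizeof(unsigned long);
            continue;           /* loop head re-tests size and *p */
        }
        p++;
        result += BITS_PER_LONG;
        size -= BITS_PER_LONG;
    }
    return result;
}
```

The loop head already covers both outcomes of the call: a nonzero word
at the new p breaks out at once, and size == 0 ends the loop.  If the
whole masked region was zero, less than one chunk remains, so the vector
branch is not entered again and the tail is walked word by word.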

> +                if ((tmp = *p)) {
> +                    goto found_middle;
> +                }
> +            }
>          }
> +        p++;
>          result += BITS_PER_LONG;
>          size -= BITS_PER_LONG;
>      }
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
