Re: [Qemu-devel] [PATCHv4 4/9] bitops: use vector algorithm to optimize
From: Orit Wasserman
Subject: Re: [Qemu-devel] [PATCHv4 4/9] bitops: use vector algorithm to optimize find_next_bit()
Date: Mon, 25 Mar 2013 11:04:33 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

On 03/22/2013 02:46 PM, Peter Lieven wrote:
> This patch uses buffer_find_nonzero_offset()
> to skip large areas of zeroes.
>
> Compared to the loop unrolling presented in an earlier
> patch, this adds another 50% performance benefit for
> skipping large areas of zeroes. Loop unrolling alone
> added close to 100% speedup.
>
> Signed-off-by: Peter Lieven <address@hidden>
> Reviewed-by: Eric Blake <address@hidden>
> ---
> util/bitops.c | 24 +++++++++++++++++++++---
> 1 file changed, 21 insertions(+), 3 deletions(-)
>
> diff --git a/util/bitops.c b/util/bitops.c
> index e72237a..9bb61ff 100644
> --- a/util/bitops.c
> +++ b/util/bitops.c
> @@ -42,10 +42,28 @@ unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
>          size -= BITS_PER_LONG;
>          result += BITS_PER_LONG;
>      }
> -    while (size & ~(BITS_PER_LONG-1)) {
> -        if ((tmp = *(p++))) {
> -            goto found_middle;
> +    while (size >= BITS_PER_LONG) {
> +        tmp = *p;
> +        if (tmp) {
> +            goto found_middle;
> +        }
> +        if (can_use_buffer_find_nonzero_offset(p, size / BITS_PER_BYTE)) {
> +            size_t tmp2 =
> +                buffer_find_nonzero_offset(p, size / BITS_PER_BYTE);
> +            result += tmp2 * BITS_PER_BYTE;
> +            size -= tmp2 * BITS_PER_BYTE;
> +            p += tmp2 / sizeof(unsigned long);
> +            if (!size) {
> +                return result;
> +            }
> +            if (tmp2) {
> +                tmp = *p;
> +                if (tmp) {
> +                    goto found_middle;
> +                }
> +            }
>          }
> +        p++;
>          result += BITS_PER_LONG;
>          size -= BITS_PER_LONG;
>      }
>
Reviewed-by: Orit Wasserman <address@hidden>
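
For readers following the thread, here is a minimal, self-contained sketch of the idea behind the patch: test the current word, and if it is zero, hand the remaining words to a bulk "find first nonzero" helper and jump ahead by the offset it returns. This is not the code from util/bitops.c. The names sketch_find_nonzero_offset() and sketch_find_next_bit() are hypothetical, the helper is a plain scalar loop standing in for the vectorized buffer_find_nonzero_offset(), and the sketch assumes the bitmap is allocated in whole unsigned longs.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical scalar stand-in for buffer_find_nonzero_offset().
 * Returns the byte offset of the first nonzero word in buf, or len if
 * every word is zero. len must be a multiple of sizeof(unsigned long).
 */
static size_t sketch_find_nonzero_offset(const unsigned long *buf, size_t len)
{
    size_t nwords = len / sizeof(unsigned long);
    size_t i;

    for (i = 0; i < nwords; i++) {
        if (buf[i]) {
            break;
        }
    }
    return i * sizeof(unsigned long);
}

/*
 * Returns the index of the first set bit at or after offset,
 * or size if no such bit exists. size and offset are in bits.
 */
static unsigned long sketch_find_next_bit(const unsigned long *addr,
                                          unsigned long size,
                                          unsigned long offset)
{
    unsigned long nwords = (size + BITS_PER_LONG - 1) / BITS_PER_LONG;
    unsigned long word, tmp, bit;

    if (offset >= size) {
        return size;
    }

    /* Mask off the bits below offset in the first word. */
    word = offset / BITS_PER_LONG;
    tmp = addr[word] & (~0UL << (offset % BITS_PER_LONG));

    if (!tmp) {
        /* Skip the run of zero words in one call instead of word by word. */
        word++;
        word += sketch_find_nonzero_offset(addr + word,
                    (nwords - word) * sizeof(unsigned long))
                / sizeof(unsigned long);
        if (word >= nwords) {
            return size;
        }
        tmp = addr[word];
    }

    /* tmp is nonzero here; locate its lowest set bit. */
    bit = word * BITS_PER_LONG;
    while (!(tmp & 1UL)) {
        tmp >>= 1;
        bit++;
    }
    return bit < size ? bit : size;
}
```

In the actual patch the bulk helper is only called when can_use_buffer_find_nonzero_offset() accepts the pointer and length, and the word-by-word loop remains as the fallback; the sketch leaves that check out for brevity.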