[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations
From: |
Peter Lieven |
Subject: |
Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations |
Date: |
Fri, 22 Mar 2013 20:20:29 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:15.0) Gecko/20120827 Thunderbird/15.0 |
Am 22.03.2013 18:25, schrieb Paolo Bonzini:
> Il 22/03/2013 13:46, Peter Lieven ha scritto:
>> this is v4 of my patch series with various optimizations in
>> zero buffer checking and migration tweaks.
>>
>> thanks especially to Eric Blake for reviewing.
>>
>> v4:
>> - do not inline buffer_find_nonzero_offset()
>> - inline can_usebuffer_find_nonzero_offset() correctly
>> - readd asserts in buffer_find_nonzero_offset() as profiling
>> shows they do not hurt.
>> - change last occurences of scalar 8 by
>> BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
>> - avoid deferencing p already in patch 5 where we
>> know that the page (p) is zero
>> - explicitly set bytes_sent = 0 if we skip a zero page.
>> bytes_sent was 0 before, but it was not obvious.
>> - add accounting information for skipped zero pages
>> - fix errors reported by checkpatch.pl
>>
>> v3:
>> - remove asserts, inline functions and add a check
>> function if buffer_find_nonzero_offset() can be used.
>> - use above check function in buffer_is_zero() and
>> find_next_bit().
>> - use buffer_is_nonzero_offset() directly to find
>> zero pages. we know that all requirements are met
>> for memory pages.
>> - fix C89 violation in buffer_is_zero().
>> - avoid derefencing p in ram_save_block() if we already
>> know the page is zero.
>> - fix initialization of last_offset in reset_ram_globals().
>> - avoid skipping pages with offset == 0 in bulk stage in
>> migration_bitmap_find_and_reset_dirty().
>> - compared to v1 check for zero pages also after bulk
>> ram migration as there are guests (e.g. Windows) which
>> zero out large amount of memory while running.
>>
>> v2:
>> - fix description, add trivial zero check and add asserts
>> to buffer_find_nonzero_offset.
>> - add a constant for the unroll factor of buffer_find_nonzero_offset
>> - replace is_dup_page() by buffer_is_zero()
>> - added test results to xbzrle patch
>> - optimize descriptions
>>
>> Peter Lieven (9):
>> move vector definitions to qemu-common.h
>> cutils: add a function to find non-zero content in a buffer
>> buffer_is_zero: use vector optimizations if possible
>> bitops: use vector algorithm to optimize find_next_bit()
>> migration: search for zero instead of dup pages
>> migration: add an indicator for bulk state of ram migration
>> migration: do not sent zero pages in bulk stage
>> migration: do not search dirty pages in bulk stage
>> migration: use XBZRLE only after bulk stage
>>
>> arch_init.c | 74
>> +++++++++++++++++++----------------------
>> hmp.c | 2 ++
>> include/migration/migration.h | 2 ++
>> include/qemu-common.h | 37 +++++++++++++++++++++
>> migration.c | 3 +-
>> qapi-schema.json | 6 ++--
>> qmp-commands.hx | 3 +-
>> util/bitops.c | 24 +++++++++++--
>> util/cutils.c | 50 ++++++++++++++++++++++++++++
>> 9 files changed, 155 insertions(+), 46 deletions(-)
>>
> I think patch 4 is a bit overengineered. I would prefer the simple
> patch you had using three/four non-vectorized accesses. The setup cost
> of the vectorized buffer_is_zero is quite high, and 64 bits are just
> 256k RAM; if the host doesn't touch 256k RAM, it will incur the overhead.
I think you are right. I was a little to eager to utilize
buffer_find_nonzero_offset()
as much as possible. The performance gain by unrolling was impressive enough.
The gain by the vector functions is not that big that it would justify a
possible
slow down by the high setup costs. My testings revealed that in most cases
buffer_find_nonzero_offset()
returns 0 or a big offset. All the 0 return values would have increased setup
costs with
the vectorized version of patch 4.
>
> I would prefer some more benchmarking for patch 5, but it looks ok.
What would you like to see? Statistics how many pages of a real system
are not zero, but zero in the first sizeof(long) bytes?
>
> The rest are fine, thanks!
Thank you for reviewing. If we are done with this patches I will continue with
the block migration optimizations next week.
Peter
- Re: [Qemu-devel] indentation hints [was: [PATCHv4 2/9] cutils: add a function to find non-zero content in a buffer], (continued)
[Qemu-devel] [PATCHv4 7/9] migration: do not sent zero pages in bulk stage, Peter Lieven, 2013/03/22
Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Paolo Bonzini, 2013/03/22
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations,
Peter Lieven <=
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Paolo Bonzini, 2013/03/22
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Peter Lieven, 2013/03/23
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Peter Lieven, 2013/03/25
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Paolo Bonzini, 2013/03/25
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Peter Lieven, 2013/03/25
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Paolo Bonzini, 2013/03/25
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Peter Lieven, 2013/03/25
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Peter Lieven, 2013/03/25
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Paolo Bonzini, 2013/03/25
- Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations, Peter Lieven, 2013/03/25