Re: [Qemu-devel] [PATCH 10/10] cutils: Rewrite x86 buffer zero checking
From: Paolo Bonzini
Date: Tue, 13 Sep 2016 18:33:50 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0

On 13/09/2016 18:27, Richard Henderson wrote:
> On 09/13/2016 09:10 AM, Paolo Bonzini wrote:
>> @@ -177,16 +231,15 @@ bool test_buffer_is_zero_next_accel(void)
>>
>>  static bool select_accel_fn(const void *buf, size_t len)
>>  {
>> -    uintptr_t ibuf = (uintptr_t)buf;
>>  #ifdef CONFIG_AVX2_OPT
>> -    if (len % 128 == 0 && ibuf % 32 == 0 && (cpuid_cache & CACHE_AVX2)) {
>> +    if (len >= 128 && (cpuid_cache & CACHE_AVX2)) {
>>          return buffer_zero_avx2(buf, len);
>>      }
>> -    if (len % 64 == 0 && ibuf % 16 == 0 && (cpuid_cache & CACHE_SSE4)) {
>> +    if (len >= 64 && (cpuid_cache & CACHE_SSE4)) {
>>          return buffer_zero_sse4(buf, len);
>>      }
>>  #endif
>> -    if (len % 64 == 0 && ibuf % 16 == 0 && (cpuid_cache & CACHE_SSE2)) {
>> +    if (len >= 64 && (cpuid_cache & CACHE_SSE2)) {
>>          return buffer_zero_sse2(buf, len);
>>      }
>
> You've dropped a major change to select_accel_fn here.
>
> (1) The avx2 routine, as written, can support len >= 64, therefore a common
> test works for all of the vectorized functions.
>
> (2) I had saved the pointer to the routine, so that we didn't have to
> repeatedly test multiple cpuid_cache bits.
Can you send a replacement for this patch only?
Thanks,
Paolo
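
For readers following the thread, here is a rough sketch of what points (1) and (2) in the quoted review seem to describe: choose the routine once from the CPU feature bits, keep the pointer, and guard every vectorized routine with a single len >= 64 test. This is not the actual patch. The CACHE_* values, init_accel(), buffer_accel and buffer_zero_int() are assumptions made up for illustration, and the three "vectorized" functions are plain stand-ins that call the byte loop so the example builds anywhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Feature bits: names follow the quoted diff, values are assumed. */
#define CACHE_SSE2 1u
#define CACHE_SSE4 2u
#define CACHE_AVX2 4u

typedef bool (*accel_zero_fn)(const void *, size_t);

/* Portable byte-wise fallback (hypothetical name). */
static bool buffer_zero_int(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    for (size_t i = 0; i < len; i++) {
        if (p[i]) {
            return false;
        }
    }
    return true;
}

/* Stand-ins for the real vectorized routines; a real build would use
 * SSE2/SSE4/AVX2 code here. */
static bool buffer_zero_sse2(const void *buf, size_t len)
{
    return buffer_zero_int(buf, len);
}

static bool buffer_zero_sse4(const void *buf, size_t len)
{
    return buffer_zero_int(buf, len);
}

static bool buffer_zero_avx2(const void *buf, size_t len)
{
    return buffer_zero_int(buf, len);
}

/* Saved pointer: the feature bits are examined once, not per call. */
static accel_zero_fn buffer_accel = buffer_zero_int;

/* Hypothetical one-time setup; the best available routine wins. */
static void init_accel(unsigned cache)
{
    accel_zero_fn fn = buffer_zero_int;

    if (cache & CACHE_SSE2) {
        fn = buffer_zero_sse2;
    }
    if (cache & CACHE_SSE4) {
        fn = buffer_zero_sse4;
    }
    if (cache & CACHE_AVX2) {
        fn = buffer_zero_avx2;
    }
    buffer_accel = fn;
}

static bool select_accel_fn(const void *buf, size_t len)
{
    assert(buffer_accel != NULL);
    /* One common length test covers every vectorized routine. */
    if (len >= 64) {
        return buffer_accel(buf, len);
    }
    return buffer_zero_int(buf, len);
}
```

With this shape, init_accel() would run once at startup with the detected bits, and the per-call path costs one length compare plus one indirect call instead of up to three cpuid_cache tests.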