qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2] atomic: using memory_order_relaxed for refcn


From: liu ping fan
Subject: Re: [Qemu-devel] [PATCH v2] atomic: using memory_order_relaxed for refcnt inc/dec ops
Date: Sun, 14 Jul 2013 18:23:45 +0800

On Sun, Jul 14, 2013 at 1:57 PM, Paolo Bonzini <address@hidden> wrote:
> Il 14/07/2013 04:53, Liu Ping Fan ha scritto:
>> Refcnt's atomic inc/dec ops are frequent and its idiom need no seq_cst
>> order. So to get better performance, it worth to adopt _relaxed
>> other than _seq_cst memory model on them.
>>
>> We resort to gcc builtins. If gcc supports C11 memory model, __atomic_*
>> buitlins is used, otherwise __sync_* builtins.
>>
>> Signed-off-by: Liu Ping Fan <address@hidden>
>
> No, not at all. :(
>
> First of all, I'd like to understand how you benchmarked this.  For
> inc/dec, relaxed vs. seq_cst has no effect except on PPC and ARM.  And

Not based on benchmark. Just based on the code releated with barrier
implement, since PPC and ARM can just lock cacheline without flushing
write buffer.
> if the refcount ops are frequent enough, I strongly suspect cacheline
> bouncing has a bigger effect than the memory barriers.
>
When out of biglock, object_ref/unref to pin the Device will be quite
often, and can it be marked "frequent"? Or how can we say something is
frequent?

> Second, it is wrong because you need a further read memory barrier when
> you are removing the last reference
>
Oh, yes, the last one.

> Third, it is making the API completely unorthogonal, and "tend to be
> exceptions" is not a justification.
>
> The justification here could be, rather than the performance, having to
> remember how to use atomic_fetch_dec in the unref side.  I don't really
> buy that, but if you really care, do something like
>
> #define atomic_ref(ptr, field) \
>   __atomic_fetch_add(&((ptr)->field), 1, __ATOMIC_RELAXED)
> #define atomic_unref(ptr, field, releasefn) ( \
>   __atomic_fetch_add(&((ptr)->field), -1, __ATOMIC_RELEASE) == 1 \
>   ? (__atomic_thread_fence(__ATOMIC_ACQUIRE), (releasefn)(ptr)) : false)
>
> i.e. define a new interface similar to kref_get/kref_put and, since you
> are at it, optimize it.
>
Thanks, a abstract layer for refct is what I need.

Regards
Pingfan



reply via email to

[Prev in Thread] Current Thread [Next in Thread]