qemu-devel

From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH v1 3/5] include/qemu/atomic.h: default to __atomic functions
Date: Mon, 4 Apr 2016 10:14:59 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0


On 01/04/2016 22:35, Pranith Kumar wrote:
> I could not really understand why we need to wrap the fence with
> barrier()'s. There are three parts to my confusion. Let me ask one after the
> other.
> 
> On x86, __atomic_thread_fence(__ATOMIC_SEQ_CST) will generate an mfence
> instruction. On ARM, this will generate the dmb instruction. Both these
> serializing instructions also act as compiler barriers. Is there any
> architecture which does not generate such a serializing instruction?

(More on this later).

>> +#define smp_wmb()   ({ barrier(); __atomic_thread_fence(__ATOMIC_RELEASE); barrier(); })
>> +#define smp_rmb()   ({ barrier(); __atomic_thread_fence(__ATOMIC_ACQUIRE); barrier(); })
> 
> Second, why do you need barrier() on both sides? One barrier() seems to be
> sufficient to prevent the compiler from reordering across the macro. Am I
> missing something?

Yes, that's true.

> Finally, I tried looking at the gcc docs but could find nothing regarding
> __atomic_thread_fence() not being considered as a memory barrier. What I did
> find mentions about it being treated as a function call during the main
> optimization stages and not during later stages:
> 
> http://www.spinics.net/lists/gcchelp/msg39798.html
> 
> AFAIU, in these later stages, even adding a barrier() as we are doing will
> have no effect.
> 
> Can you point me to any docs which talk more about this?

The issue is that atomic_thread_fence() only orders other atomic
operations, while smp_rmb() and smp_wmb() must order normal loads and
stores as well.
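To make the distinction concrete, here is a minimal standalone sketch
(not QEMU's actual include/qemu/atomic.h; the publish/consume helpers
are illustrative) of the macros under discussion.  barrier() is the
classic compiler-only fence; wrapping __atomic_thread_fence() with it
forces the compiler to keep ordinary, non-atomic loads and stores on
their side of the macro:

```c
#include <assert.h>

/* Compiler-only fence: no instruction emitted, but the "memory"
 * clobber stops the compiler from moving memory accesses across it. */
#define barrier()   __asm__ __volatile__("" ::: "memory")

/* The patch's definitions: hardware fence plus compiler fences. */
#define smp_wmb()   ({ barrier(); __atomic_thread_fence(__ATOMIC_RELEASE); barrier(); })
#define smp_rmb()   ({ barrier(); __atomic_thread_fence(__ATOMIC_ACQUIRE); barrier(); })

static int data, flag;

/* Producer: the compiler barrier keeps the plain store to data from
 * being sunk below the store to flag, even if the hardware fence
 * compiles to nothing on a TSO target. */
static void publish(void)
{
    data = 42;
    smp_wmb();
    flag = 1;
}

/* Consumer: returns the published value once flag is observed. */
static int consume(void)
{
    if (flag) {
        smp_rmb();
        return data;
    }
    return -1;
}
```

Without the barrier()s, the only ordering constraint at the source
level would apply to atomic accesses, and data and flag are plain
stores.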

In the GCC implementation, atomic operations (even relaxed ones) access
memory as if the pointer were volatile.  By doing this, GCC can remove
the acquire and release fences altogether on TSO architectures.  We
actually observed a case where the compiler subsequently inverted the
order of two writes around an smp_wmb().  It was fixed in commit 3bbf572
("atomics: add explicit compiler fence in __atomic memory barriers",
2015-06-05).

In principle it could do the same on architectures that are sequentially
consistent; even if none exists in practice, keeping the barriers for
smp_mb() is consistent with the other barriers.
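Following the same pattern, a hedged sketch of what a matching smp_mb()
looks like (again illustrative, not QEMU's actual macro): even on a
hypothetical target where the SEQ_CST fence compiles to nothing, the
surrounding barrier()s still pin ordinary accesses in place, so all
three barriers behave uniformly at the source level:

```c
#include <assert.h>

#define barrier()   __asm__ __volatile__("" ::: "memory")

/* Full barrier, wrapped the same way as smp_rmb()/smp_wmb(). */
#define smp_mb()    ({ barrier(); __atomic_thread_fence(__ATOMIC_SEQ_CST); barrier(); })

static int x, y;

static void store_then_load(void)
{
    x = 1;
    smp_mb();       /* orders the plain store above against the load below */
    y = x + 1;      /* must observe x == 1 */
}
```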

Paolo


