Re: [Qemu-devel] [PATCH] virtio: Make memory barriers be memory barriers


From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH] virtio: Make memory barriers be memory barriers
Date: Mon, 05 Sep 2011 09:41:19 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:6.0) Gecko/20110816 Thunderbird/6.0

On 09/04/2011 11:16 AM, Michael S. Tsirkin wrote:
>> I mean argue for a richer set of barriers, with per-arch minimal
>> implementations instead of the large but portable hammer of
>> sync_synchronize, if you will.
>
> That's what I'm saying really. On x86 the richer set of barriers
> need not insert code at all for both wmb and rmb macros. All we
> might need is an 'optimization barrier'- e.g. linux does
>   __asm__ __volatile__("": : :"memory")
> ppc needs something like sync_synchronize there.

No, rmb and wmb need to generate code. You are right that in some places there will be some extra barriers.

If you want a richer set of barriers, that must be something like {rr,rw,wr,ww}_mb{_acq,_rel,} (again not counting the Alpha). On x86, then, all the rr/rw/ww barriers will be compiler barriers because the hardware already enforces ordering. The other three map to lfence/sfence/mfence:

   barrier     assembly  why?
   ---------------------------------------------------------------------
   wr_mb_acq   lfence    prevents the read from moving up -> acquire
   wr_mb_rel   sfence    prevents the write from moving down -> release
   wr_mb       mfence    (full barrier)
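As a rough illustration of the table above, here is a minimal sketch in C using GCC-style inline assembly. The macro names follow the {rr,rw,wr,ww}_mb naming proposed in this mail; they are hypothetical and not an existing QEMU API. The non-x86 fallback to __sync_synchronize() and the helper store_then_load() are assumptions added only to keep the example self-contained.

```c
/* Hypothetical barrier names taken from the table above; not QEMU's API. */
#define barrier()     __asm__ __volatile__("" : : : "memory")

#if defined(__x86_64__) || defined(__i386__)
/* x86 already orders read/read, read/write and write/write:
 * only the compiler has to be stopped from reordering. */
#define rr_mb()       barrier()
#define rw_mb()       barrier()
#define ww_mb()       barrier()
/* Write-then-read is the case that needs a real instruction. */
#define wr_mb_acq()   __asm__ __volatile__("lfence" : : : "memory")
#define wr_mb_rel()   __asm__ __volatile__("sfence" : : : "memory")
#define wr_mb()       __asm__ __volatile__("mfence" : : : "memory")
#else
/* Assumed portable fallback: the "large hammer" full barrier everywhere. */
#define rr_mb()       __sync_synchronize()
#define rw_mb()       __sync_synchronize()
#define ww_mb()       __sync_synchronize()
#define wr_mb_acq()   __sync_synchronize()
#define wr_mb_rel()   __sync_synchronize()
#define wr_mb()       __sync_synchronize()
#endif

/* Illustrative helper: store to *dst, then load *src, with a full
 * write-then-read barrier in between. */
static inline int store_then_load(volatile int *dst, int value,
                                  volatile int *src)
{
    *dst = value;
    wr_mb();
    return *src;
}
```

On x86 only the three wr_* variants emit an instruction; the others cost nothing at run time but still constrain the compiler.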

But if you stick to rmb/wmb/mb, then the correct definition of rmb is "the least strict barrier that provides all three of rr_mb(), rw_mb_rel() and wr_mb_acq()". This is, as expected, an lfence. Similarly, wmb must provide all three of ww_mb(), wr_mb_rel() and rw_mb_acq(), and this is an sfence.

So the right place to put an #ifdef is not "wmb()", but the _uses_ of wmb() where you know you need a barrier that is less strict. That's why I say David's patch is correct; on top of that you may change the particular uses of wmb() in virtio.c to compiler barriers, for example when you only care about ordering writes after writes.
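A minimal sketch of relaxing the barrier at the use site rather than in wmb() itself might look like the following. The names ww_barrier(), struct ring and ring_publish() are hypothetical and are not taken from virtio.c; the non-x86 fallback is an assumption.

```c
#define barrier()  __asm__ __volatile__("" : : : "memory")

#if defined(__x86_64__) || defined(__i386__)
/* wmb() keeps its strict meaning: an sfence on x86. */
#define wmb()         __asm__ __volatile__("sfence" : : : "memory")
/* Hypothetical weaker primitive for sites that only need
 * write-after-write ordering, which x86 hardware already provides. */
#define ww_barrier()  barrier()
#else
/* Assumed fallback on other architectures: full barrier for both. */
#define wmb()         __sync_synchronize()
#define ww_barrier()  wmb()
#endif

/* Hypothetical single-producer ring, for illustration only. */
struct ring {
    volatile unsigned data[8];
    volatile unsigned idx;
};

static void ring_publish(struct ring *r, unsigned value)
{
    unsigned slot = r->idx % 8;

    r->data[slot] = value;   /* 1. write the payload */
    ww_barrier();            /* 2. payload must be visible before the index */
    r->idx = r->idx + 1;     /* 3. publish the new index */
}
```

The definition of wmb() stays conservative; only the call site that is known to need nothing more than write/write ordering uses the cheaper primitive.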

Likewise, there may even be places in which you could #ifdef out a full memory barrier. For example, if you only care about ordering earlier reads before later writes (the rw case above), x86 hardware is already providing that and you could reduce the mb() to a compiler barrier.

I think in general it is premature optimization, though.

Regarding specific examples in virtio where lfence and sfence could be used, there may be one when using event signaling. In the backend you first write the index of your response, then you check whether to generate an event. (I think) the following requirements hold:

* if you read the event-index too early, you might skip an event and deadlock. So you need at least a read barrier.

* you can write the response-index after reading the event-index, as long as you write it before waking up the guest.

So, in that case an x86 lfence should be enough, though again without more consideration I would use a full barrier just to be sure.
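The backend sequence described above can be sketched roughly as follows. The structure layout and the names shared_ring, need_event() and backend_publish() are hypothetical; the wrap-safe comparison is modelled on the vring_need_event() helper from the virtio ring definitions. The sketch uses a full barrier, as suggested for the cautious case, and marks where a weaker barrier might be considered.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#define mb()  __asm__ __volatile__("mfence" : : : "memory")
#else
#define mb()  __sync_synchronize()   /* assumed portable fallback */
#endif

/* Hypothetical shared state between backend and guest. */
struct shared_ring {
    volatile uint16_t used_idx;    /* response index, written by the backend */
    volatile uint16_t used_event;  /* event index, written by the guest */
};

/* True if event_idx lies in the window (old_idx, new_idx], with 16-bit
 * wraparound; modelled on vring_need_event(). */
static bool need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}

/* Publish `count` responses and report whether the guest must be notified. */
static bool backend_publish(struct shared_ring *r, uint16_t count)
{
    uint16_t old_idx = r->used_idx;
    uint16_t new_idx = (uint16_t)(old_idx + count);

    r->used_idx = new_idx;  /* write the response index first */
    mb();                   /* full barrier to be safe; the mail argues an
                             * lfence might be enough here on x86 */
    return need_event(r->used_event, new_idx, old_idx);
}
```

The barrier sits between the index write and the event-index read, which is exactly the spot where reading too early could cause a skipped event.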

Paolo


