qemu-devel

Re: [Qemu-devel] [PATCHv2 3/3] virtio: order index/descriptor reads


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCHv2 3/3] virtio: order index/descriptor reads
Date: Tue, 24 Apr 2012 17:38:28 +0300

On Tue, Apr 24, 2012 at 03:48:27PM +0200, Paolo Bonzini wrote:
> On 23/04/2012 15:19, Michael S. Tsirkin wrote:
> > virtio has the equivalent of:
> > 
> >     if (vq->last_avail_index != vring_avail_idx(vq)) {
> >             read descriptor head at vq->last_avail_index;
> >     }
> > 
> > In theory, the processor can reorder the descriptor head
> > read to happen speculatively before the index read.
> > This would trigger the following race:
> > 
> >     host descriptor head read <- reads invalid head from ring
> >             guest writes valid descriptor head
> >             guest writes avail index
> >     host avail index read <- observes valid index
> > 
> > As a result, the host will use an invalid head value.
> > I have not observed this in the field, but after
> > the experience with the previous two races
> > I think it is prudent to address this theoretical race condition.
> 
> I think your fix is right, but it is not needed on x86.  lfence is only
> required for weakly-ordered memory types, so says Intel,


I see this in the spec:

The LFENCE instruction establishes a memory fence for loads. It
guarantees ordering between two loads and prevents speculative loads
from passing the load fence (that is, no speculative loads are allowed
until all loads specified before the load fence have been carried out).

A speculative load is exactly what we are handling here.
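
For illustration, here is a minimal, self-contained sketch (not QEMU code;
the struct layout and function name are simplified/hypothetical) of the
load ordering the patch enforces, using the lfence-based smp_rmb() as on
x86-64:

    #include <stdint.h>

    /* x86-64 only in this sketch; QEMU selects the barrier per host arch. */
    #define smp_rmb() asm volatile("lfence" ::: "memory")

    struct vring_avail {
        uint16_t flags;
        uint16_t idx;       /* written by the guest after the ring entry */
        uint16_t ring[];    /* descriptor heads, written before idx */
    };

    /* Return the next descriptor head, or -1 if the ring is empty. */
    static int next_head(volatile struct vring_avail *avail,
                         uint16_t last_avail_idx, unsigned int num)
    {
        if (last_avail_idx == avail->idx) {
            return -1;      /* no new entries */
        }
        /* Without this read barrier the CPU could speculatively load
         * ring[] before observing the updated idx, returning a stale
         * head even though idx looked valid. */
        smp_rmb();
        return avail->ring[last_avail_idx % num];
    }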

> and AMD is even
> clearer: "Loads from differing memory types may
> be performed out of order, in particular between WC/WC+ and other memory
> types".

This does not say that loads from the same memory type are ordered.
Does it say so anywhere else? Anyway, is it worth special-casing AMD?


> > Signed-off-by: Michael S. Tsirkin <address@hidden>
> > ---
> >  hw/virtio.c    |    5 +++++
> >  qemu-barrier.h |    7 ++++++-
> >  2 files changed, 11 insertions(+), 1 deletions(-)
> > 
> > diff --git a/hw/virtio.c b/hw/virtio.c
> > index def0bf1..c081e1b 100644
> > --- a/hw/virtio.c
> > +++ b/hw/virtio.c
> > @@ -287,6 +287,11 @@ static int virtqueue_num_heads(VirtQueue *vq, unsigned int idx)
> >                       idx, vring_avail_idx(vq));
> >          exit(1);
> >      }
> > +    /* On success, callers read a descriptor at vq->last_avail_idx.
> > +     * Make sure descriptor read does not bypass avail index read. */
> > +    if (num_heads) {
> > +        smp_rmb();
> > +    }
> >  
> >      return num_heads;
> >  }
> > diff --git a/qemu-barrier.h b/qemu-barrier.h
> > index f6722a8..39aa0b0 100644
> > --- a/qemu-barrier.h
> > +++ b/qemu-barrier.h
> > @@ -24,10 +24,13 @@
> >  #define smp_mb() asm volatile("lock; addl $0,0(%%esp) " ::: "memory")
> >  #endif
> >  
> > +#define smp_rmb() smp_mb()
> > +
> >  #elif defined(__x86_64__)
> >  
> >  #define smp_wmb()   barrier()
> >  #define smp_mb() asm volatile("mfence" ::: "memory")
> > +#define smp_rmb() asm volatile("lfence" ::: "memory")
> >  
> >  #elif defined(_ARCH_PPC)
> >  
> > @@ -38,6 +41,7 @@
> >   */
> >  #define smp_wmb()   asm volatile("eieio" ::: "memory")
> >  #define smp_mb()   asm volatile("eieio" ::: "memory")
> > +#define smp_rmb()   asm volatile("eieio" ::: "memory")
> 
> rmb() is lwsync on PPC, not eieio.
> 
> I would be grateful if, instead of fixing the qemu-barrier.h parts of
> the patches, you picked up the (sole) patch in the atomics branch of
> git://github.com/bonzini/qemu.git.  The constructs there are more
> complete than what we have in qemu-barrier.h,

Sorry, this is just a bugfix in virtio; I don't see a reason to make
it depend on a wholesale rework of atomics.
We can switch to qemu/atomics when it is ready.

> and the memory barriers
> are based on http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
> (written by people who know much more than me :)).
> 
> Paolo
> >  
> >  #else
> >  
> > @@ -45,10 +49,11 @@
> >   * For (host) platforms we don't have explicit barrier definitions
> >   * for, we use the gcc __sync_synchronize() primitive to generate a
> >   * full barrier.  This should be safe on all platforms, though it may
> > - * be overkill for wmb().
> > + * be overkill for wmb() and rmb().
> >   */
> >  #define smp_wmb()   __sync_synchronize()
> >  #define smp_mb()   __sync_synchronize()
> > +#define smp_rmb()   __sync_synchronize()
> >  
> >  #endif
> >  
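
As an aside on the lwsync point above, and purely as a sketch (this is not
part of the patch), an lwsync-based read barrier for the PPC section might
look like:

    /* lwsync orders earlier loads before later loads (and stores) to
     * cacheable memory, which is all smp_rmb() needs; eieio does not
     * give that load-load guarantee.  Older cores without lwsync would
     * need a full sync instead. */
    #define smp_rmb()   asm volatile("lwsync" ::: "memory")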


