qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] aio: reg. smp_read_barrier_depends() in aio_bh_poll()


From: Paolo Bonzini
Subject: Re: [Qemu-devel] aio: reg. smp_read_barrier_depends() in aio_bh_poll()
Date: Fri, 2 Sep 2016 17:38:22 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0


On 02/09/2016 16:33, Pranith Kumar wrote:
> 
> Hi Paolo,
> 
> This is in reference to the discussion we had yesterday on IRC. I am trying to
> understand the need for smp_read_barrier_depends() and how it prevents the
> following race condition. I think a regular barrier() should suffice instead
> of smp_read_barrier_depends(). Consider:
> 
>            P0                        P1
>      ----------------------------------------
>      bh = ctx->first_bh;        
>      smp_read_barrier_depends();  // barrier() should be sufficient since bh
>                                   // is local variable
>      next = bh->next;
>                                   lock(bh_lock);
>                                   new_bh->next = ctx->first_bh;
>                                   smp_wmb();
>                                   ctx->first_bh = new_bh;
>                                   unlock(bh_lock);
>                                   
>      if (bh) {
>         // do something
>      }
> 
> Why do you think smp_read_barrier_depends() is necessary here? If bh was a
> shared variable I would understand, but here bh is local and a regular
> barrier() would make sure that we are not optimizing the initial load into bh.

Honestly, I don't think you understand why memory barriers exist...
They are used to synchronize writes to shared *data*, not to shared
variables.

It doesn't matter whether bh is a shared variable.  The *data that it
points to* is shared with other threads.  ctx->first_bh and bh->next are
both shared by P0 and P1.

P1 must make sure that ctx->first_bh is written after all of its context
(which in aio_bh_new includes new_bh->next) is ready.  It uses smp_wmb
for that.  A "release store" for ctx->first_bh would be okay too.  This
is easy.

P0 must make sure that bh->next is read after ctx->first_bh.  The
simplest way to ensure this is an "acquire load" for ctx->first_bh and
bh->next place an smp_rmb where there is currently
smp_read_barrier_depends().  This is easy too, but a bit overkill
because bh->next is really ctx->first_bh->next and data dependent reads
do not need full-blown acquire semantics.

However, you still need to make sure that bh->next is read from
_exactly_ the ctx->first_bh that was assigned to bh, and not for example
a value that was changed in the meanwhile by another processor.  Most
processors promise this (except the Alpha!) but compilers might reload
values if they think it's useful.  For this reason Linux and QEMU have
smp_read_barrier_depends(), and for this reason C11/C++11 introduce the
"consume" memory order.  smp_read_barrier_depends() is the same as C11's
atomic_thread_fence(MEMORDER_CONSUME).  We didn't make it up.

So instead of smp_read_barrier_depends() you could load ctx->first_bh
and bh->next with the consume memory order, but you do need _something_.

Paolo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]