qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RESEND PATCH] hw/dma: fix crash caused by race condition


From: Stefan Hajnoczi
Subject: Re: [RESEND PATCH] hw/dma: fix crash caused by race condition
Date: Thu, 2 Jun 2022 06:29:12 +0100

On Thu, Jun 2, 2022, 02:04 Tong Zhang <ztong0001@gmail.com> wrote:
>
> Hi Stefan,
>
> On Wed, Jun 1, 2022 at 6:56 AM Stefan Hajnoczi <stefanha@gmail.com> wrote:
> >
> > > > This patch makes sense to me. Can you rephrase your concern?
> > >
> > > The locking is around dbs->io_func().
> > >
> > > aio_context_acquire(dbs->ctx);
> > > dbs->acb = dbs->io_func()
> > > aio_context_release(dbs->ctx);
> > >
> > >
> > > So where exactly would the lock that's now still held stop someone from
> > > modifying dbs->acb = NULL at the beginning of the function, which seems
> > > to be not protected by that lock?
> > >
> > > Maybe I'm missing some locking magic due to the lock being a recursive 
> > > lock.
> >
> > Tong Zhang: Can you share a backtrace of all threads when the
> > assertion failure occurs?
> >
> Sorry I couldn't get the trace now -- but I can tell that we have some
> internal code uses
> this dma related code and will grab dbs->ctx lock in another thread
> and could overwrite dbs->acb.
>
> From my understanding, one of the reasons that the lock is required
> here is to protect dbs->acb,
> we could not reliably test io_func()'s return value after releasing
> the lock here.
>
> Since this code affects our internal code base and I did not reproduce
> on master branch,
> feel free to ignore it.

If this patch is unnecessary on qemu.git/master it raises the question
whether aio_context_acquire/release() should be removed from
dma_blk_cb(). It was added by:

commit 1919631e6b5562e474690853eca3c35610201e16
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Mon Feb 13 14:52:31 2017 +0100

    block: explicitly acquire aiocontext in bottom halves that need it

Paolo: Is dma_blk_cb() called without the AioContext lock (by
virtio-scsi or any code path with IOThreads)? David pointed out that
if that's the case then the dbs->acb is accessed without the lock at
the start of dma_blk_cb().

Stefan



reply via email to

[Prev in Thread] Current Thread [Next in Thread]