[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-block] [PATCH 11/11] block/snapshot: do not take AioContext lo
From: |
Stefan Hajnoczi |
Subject: |
Re: [Qemu-block] [PATCH 11/11] block/snapshot: do not take AioContext lock |
Date: |
Tue, 11 Jul 2017 10:43:56 +0100 |
User-agent: |
Mutt/1.8.0 (2017-02-23) |
On Mon, Jul 10, 2017 at 06:27:27PM +0200, Paolo Bonzini wrote:
> On 10/07/2017 18:24, Stefan Hajnoczi wrote:
> > On Thu, Jul 06, 2017 at 06:38:28PM +0200, Paolo Bonzini wrote:
> >> Snapshots are only created/destroyed/loaded under the BQL, while no
> >> other I/O is happening. Snapshot information could be accessed while
> >> other I/O is happening, but also under the BQL so they cannot be
> >> modified concurrently. The AioContext lock is overkill. If needed,
> >> in the future the BQL could be split to a separate lock covering all
> >> snapshot operations, and the create/destroy/goto callbacks changed
> >> to run in a coroutine (so the driver can do mutual exclusion as usual).
> >>
> >> Signed-off-by: Paolo Bonzini <address@hidden>
> >> ---
> >> block/snapshot.c | 28 +---------------------------
> >> blockdev.c | 43 ++++++++++++-------------------------------
> >> hmp.c | 7 -------
> >> include/block/block_int.h | 5 +++++
> >> include/block/snapshot.h | 4 +---
> >> migration/savevm.c | 22 ----------------------
> >> monitor.c | 10 ++--------
> >> 7 files changed, 21 insertions(+), 98 deletions(-)
> >>
> >> diff --git a/block/snapshot.c b/block/snapshot.c
> >> index a46564e7b7..08c59d6166 100644
> >> --- a/block/snapshot.c
> >> +++ b/block/snapshot.c
> >> @@ -384,9 +384,7 @@ int
> >> bdrv_snapshot_load_tmp_by_id_or_name(BlockDriverState *bs,
> >> }
> >>
> >>
> >> -/* Group operations. All block drivers are involved.
> >> - * These functions will properly handle dataplane (take
> >> aio_context_acquire
> >> - * when appropriate for appropriate block drivers) */
> >> +/* Group operations. All block drivers are involved. */
> >
> > Perhaps "These functions must be called under the BQL"?
> >
> > My concern with this patch series in general is that it will lead to
> > bugs due to inconsistencies and lack of locking documentation:
> >
> > bdrv_all_delete_snapshot() is called by hmp_delvm() outside a
> > bdrv_drained_begin() region. That's okay because internally
> > bdrv_snapshot_delete() will call bdrv_drained_begin() for the crucial
> > operations that require a quiesced BDS.
> >
> > Compare that with bdrv_all_goto_snapshot(), which is called inside a
> > bdrv_drained_begin() region by load_snapshot(). Internally it doesn't
> > drain.
>
> I think generally we should move bdrv_drained_begin/end calls _out_ of
> block.c and into qmp_*. If you agree, I can add this before this patch.
Yes, I think drained regions should be in callers. That way callers can
perform multiple operations in sequence and the result is atomic.
> > Previously the bdrv_all_*() functions behaved consistently. We could
> > say that they will acquire AioContexts themselves. Now they behave
> > inconsistently and while the code currently happens to work, there is no
> > structure that will keep it working as it is modified.
> >
> > I think we're reaching a point where every BlockDriver callback and
> > every bdrv_*() function needs the following information:
> >
> > 1. Must be called under BQL?
> > 2. Can I/O requests be in flight?
> > 3. Is it thread-safe?
> >
> > Otherwise it will be a nightmare to modify the code since these
> > constraints are not enforced by the compiler and undocumented.
>
> Good point. Are (1) and (3) different ways to say the same thing or do
> you have other differences in mind?
There is a difference between (1) and (3). If a function is thread-safe
then callers can invoke it without any constraints. If a function
requires a specific lock to be held then that it puts a constraint on
the caller.
We do have thread-safe functions in the block layer: simple functions
that fetch a BDS field atomically. There is no need to hold the BQL for
them.
signature.asc
Description: PGP signature
- Re: [Qemu-block] [PATCH 06/11] block: add a few more notes on locking, (continued)
- [Qemu-block] [PATCH 07/11] block: do not acquire AioContext in check_to_replace_node, Paolo Bonzini, 2017/07/06
- [Qemu-block] [PATCH 09/11] block/replication: do not acquire AioContext, Paolo Bonzini, 2017/07/06
- [Qemu-block] [PATCH 08/11] block: drain I/O around key management, Paolo Bonzini, 2017/07/06
- [Qemu-block] [PATCH 10/11] block: do not take AioContext around reopen, Paolo Bonzini, 2017/07/06
- [Qemu-block] [PATCH 11/11] block/snapshot: do not take AioContext lock, Paolo Bonzini, 2017/07/06
Re: [Qemu-block] [Qemu-devel] [RFC PATCH 00/11] Block layer thread-safety, next part, no-reply, 2017/07/06
Re: [Qemu-block] [RFC PATCH 00/11] Block layer thread-safety, next part, Stefan Hajnoczi, 2017/07/10