[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 0/2] virtio: Keep notifications disabled during drain
|
From: |
Stefan Hajnoczi |
|
Subject: |
Re: [PATCH 0/2] virtio: Keep notifications disabled during drain |
|
Date: |
Thu, 25 Jan 2024 13:05:00 -0500 |
On Wed, Jan 24, 2024 at 06:38:28PM +0100, Hanna Czenczek wrote:
> Hi,
>
> When registering callbacks via aio_set_event_notifier_poll(), the
> io_poll_end() callback is only invoked when polling actually ends. If
> the notifiers are removed while in a polling section, it is not called.
> Therefore, io_poll_start() is not necessarily followed up by
> io_poll_end().
>
> It is not entirely clear whether this is good or bad behavior. On one
> hand, it may be unexpected to callers. On the other, it may be
> counterproductive to call io_poll_end() when the polling section has not
> ended yet.
>
> Right now, there is only one user of aio_set_event_notifier(), which is
> virtio_queue_aio_attach_host_notifier(). It does not expect this
> behavior, which leads to virtqueue notifiers remaining disabled if
> virtio_queue_aio_detach_host_notifier() is called while polling. That
> can happen e.g. through virtio_scsi_drained_begin() or
> virtio_blk_drained_begin() (through virtio_blk_data_plane_detach()).
> In such a case, the virtqueue may not be processed for a while, letting
> the guest driver hang. This can be reproduced by repeatedly
> hot-plugging and -unplugging a virtio-scsi device with a scsi-hd disk,
> because the guest will try to enumerate the virtio-scsi device while
> we’re attaching the scsi-hd disk, which causes a drain, which can cause
> the virtio-scsi virtqueue to stall as described.
>
> Stefan has suggested ensuring we always follow up io_poll_start() by
> io_poll_end():
>
> https://lists.nongnu.org/archive/html/qemu-block/2023-12/msg00163.html
>
> I prefer changing the caller instead, because I don’t think we actually
> want the virtqueue notifier to be force-enabled when removing our AIO
> notifiers. So I believe we actually only want to take care to
> force-enable it when we re-attach the AIO notifiers, and to kick
> virtqueue processing once, in case we missed any events while the AIO
> notifiers were not attached.
>
> That is done by patch 2. We have already discussed a prior version of
> it here:
>
> https://lists.nongnu.org/archive/html/qemu-block/2024-01/msg00001.html
>
> And compared to that, based on the discussion, there are some changes:
> 1. Used virtio_queue_notify() instead of virtio_queue_notify_vq(), as
> suggested by Paolo, because it’s thread-safe
> 2. Moved virtio_queue_notify() into
> virtio_queue_aio_attach_host_notifier*(), because we always want it
> 3. Dropped virtio_queue_set_notification(vq, 0) from
> virtio_queue_aio_detach_host_notifier(): Paolo wasn’t sure whether
> that was safe to do from any context. We don’t really need to call
> it anyway, so I just dropped it.
> 4. Added patch 1:
>
> Patch 1 fixes virtio_scsi_drained_end() so it won’t attach polling
> notifiers for the event virtqueue. That didn’t turn out to be an issue
> so far, but with patch 2, Fiona saw the virtqueue processing queue
> spinning in a loop as described in
> 38738f7dbbda90fbc161757b7f4be35b52205552 ("virtio-scsi: don't waste CPU
> polling the event virtqueue").
>
>
> Note that as of eaad0fe26050c227dc5dad63205835bac4912a51 ("scsi: only
> access SCSIDevice->requests from one thread") there’s a different
> problem when trying to reproduce the bug via hot-plugging and
> -unplugging a virtio-scsi device, specifically, when unplugging, qemu
> may crash with an assertion failure[1]. I don’t have a full fix for
> that yet, but in case you need a work-around for the specific case of
> virtio-scsi hot-plugging and -unplugging, you can use this patch:
>
> https://czenczek.de/0001-DONTMERGE-Fix-crash-on-scsi-unplug.patch
>
>
> [1] https://lists.nongnu.org/archive/html/qemu-block/2024-01/msg00317.html
>
>
> Hanna Czenczek (2):
> virtio-scsi: Attach event vq notifier with no_poll
> virtio: Keep notifications disabled during drain
>
> include/block/aio.h | 7 ++++++-
> hw/scsi/virtio-scsi.c | 7 ++++++-
> hw/virtio/virtio.c | 42 ++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 54 insertions(+), 2 deletions(-)
This patch series also fixes RHEL-7356.
Buglink: https://issues.redhat.com/browse/RHEL-7356.
Tested-by: Stefan Hajnoczi <stefanha@redhat.com>
signature.asc
Description: PGP signature
Re: [PATCH 0/2] virtio: Keep notifications disabled during drain,
Stefan Hajnoczi <=