qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 1/1] virtio: fallback from irqfd to non-irqfd no


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCH 1/1] virtio: fallback from irqfd to non-irqfd notify
Date: Wed, 1 Mar 2017 18:53:54 +0200

On Wed, Mar 01, 2017 at 05:08:39PM +0100, Halil Pasic wrote:
> 
> 
> On 03/01/2017 03:29 PM, Paolo Bonzini wrote:
> > 
> > 
> > On 01/03/2017 14:22, Halil Pasic wrote:
> >> Here a trace:
> >>
> >> address@hidden:virtio_blk_req_complete req 0x2aa6b117e10 status 0
> >> address@hidden:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
> >> address@hidden:virtio_blk_req_complete req 0x2aa6b118980 status 0
> >> address@hidden:virtio_blk_req_complete req 0x2aa6b119260 status 0
> >> address@hidden:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
> >> address@hidden:virtio_blk_req_complete req 0x2aa6b118980 status 0
> >> address@hidden:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
> >> address@hidden:virtio_guest_notifier_read vdev 0x2aa6b0e61c8 vq 
> >> 0x2aa6b4de880
> >> address@hidden:virtio_guest_notifier_read vdev 0x2aa6b0e61c8 vq 
> >> 0x2aa6b4de8f8
> >> address@hidden:virtio_blk_data_plane_stop dataplane 0x2aa6b0e5540
> >>                          ^== DATAPLANE STOP  
> >> address@hidden:virtio_blk_req_complete req 0x2aa6b117e10 status 0
> >> address@hidden:virtio_guest_notifier_read vdev 0x2aa6b0e19d8 vq 
> >> 0x2aa6b4c0870
> >>                          ^== comes from 
> >> k->set_guest_notifiers(qbus->parent, nvqs, false);
> >>                              in virtio_blk_data_plane_stop and done 
> >> immediately after
> >>                              irqfd is cleaned up by the transport
> >> address@hidden:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
> >> halil: error in event_notifier_set: Bad file descriptor
> >>                          ^== here we have the problem
> >>
> >> If you want a stacktrace that can be arranged to.
> >>
> >>> like a reset should cause it (the only call in virtio-blk is from
> >>> virtio_blk_data_plane_stop), and then the guest doesn't care anymore
> >>> about interrupts.
> >> I do not understand this with 'doesn't care anymore about interrupts'.
> >> I was debugging a virtio-blk device being stuck waiting for a host
> >> notification (interrupt) after migration.
> > 
> > Ok, this explains it better then.  The issue is that
> > virtio_blk_data_plane_stop doesn't flush the bottom half, which you want
> > to do when the caller is, for example, virtio_ccw_vmstate_change.
> > 
> > Does it work if you call to qemu_bh_cancel(s->bh) and notify_guest_bh(s)
> > after
> > 
> >     blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
> > 
> > ?
> > 
> 
> With
> 
> --- a/hw/block/dataplane/virtio-blk.c
> +++ b/hw/block/dataplane/virtio-blk.c
> @@ -260,6 +260,8 @@ void virtio_blk_data_plane_stop(VirtIODevice *vdev)
>  
>      /* Drain and switch bs back to the QEMU main loop */
>      blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
> +    qemu_bh_cancel(s->bh);
> +    notify_guest_bh(s);
>  
> applied I do not see the problem any more. I will most likely
> turn this into a patch tomorrow. I would like to give it some more testing and
> thinking (see questions below) until tomorrow.
> 
> I should probably cc stable, or?
> 
> I would also like to do some diagnostic stuff if virtio_notify_irqfd fails.
> Maybe assert success for event_notifier_set. Would that be OK with you?

Main reason it can't fail is because we don't close the fd.
Given no callers check the return status, I'd be inclined to go further
and stick that assert into event_notifier_set, convert that
function to int.

> I have a couple of questions about the ways of the dataplane code. If
> you are too busy, feel free to not answer -- I will keep thinking myself.
> 
> Q1. For this to work correctly, it seems to me, we need to be sure that
> virtio_blk_req_complete can not be happen between the newly added
> notify_guest_bh(s);
> and 
> vblk->dataplane_started = false; 
> becomes visible. How is this ensured?
> Q2. The virtio_blk_data_plane_stop should be from the thread/context
> associated with the main event loop, and with that
> vblk->dataplane_started = false too. But I think dataplane_started
> may end up being used form a different thread (e.g. req_complete).
> How does the sequencing work there and/or is it even important?
> 
> Regards,
> Halil



reply via email to

[Prev in Thread] Current Thread [Next in Thread]