qemu-block

From: Li Feng
Subject: Re: [PATCH v3 2/2] vhost-user: fix lost reconnect again
Date: Wed, 15 May 2024 13:47:02 +0800


> On May 14, 2024, at 21:58, Raphael Norwitz <raphael@enfabrica.net> wrote:
> 
> Code looks good. Just a question on the error case you're trying to fix.
> 
> On Tue, May 14, 2024 at 2:12 AM Li Feng <fengli@smartx.com> wrote:
>> 
>> When vhost-user is reconnecting to the backend and fails at get_features
>> in vhost_dev_init(), the reconnect will fail and will never be
>> retriggered.
>> 
>> The reason is:
>> When vhost-user fails at get_features, vhost_dev_cleanup is called
>> immediately.
>> 
>> vhost_dev_cleanup calls 'memset(hdev, 0, sizeof(struct vhost_dev))'.
>> 
>> The reconnect path is:
>> vhost_user_blk_event
>>   vhost_user_async_close(.. vhost_user_blk_disconnect ..)
>>     qemu_chr_fe_set_handlers <----- clear the notifier callback
>>       schedule vhost_user_async_close_bh
>> 
>> The vhost->vdev is NULL, so vhost_user_blk_disconnect will not be
>> called, and the event fd callback will not be reinstalled.
>> 
>> With this patch, vhost_user_blk_disconnect calls vhost_dev_cleanup()
>> again; this is safe.
>> 
>> In addition, the CLOSE event may occur while connected is false. In that
>> case the event handler would be cleared, so we need to ensure that it
>> remains installed.
> 
> Following on from the prior patch, why would "connected" be false when
> a CLOSE event happens?

In OPEN event handling, vhost_user_blk_connect calls vhost_dev_init and
encounters an error, so s->connected remains false. When the CLOSE event
then arrives, s->connected is found to be false and nothing is done, but
the event handler has already been cleared in vhost_user_async_close
before the CLOSE event is handled.

Thanks,
Li

> 
>> 
>> All vhost-user devices have this issue, including vhost-user-blk/scsi.
>> 
>> Fixes: 71e076a07d ("hw/virtio: generalise CHR_EVENT_CLOSED handling")
>> 
>> Signed-off-by: Li Feng <fengli@smartx.com>
>> ---
>> hw/block/vhost-user-blk.c   |  3 ++-
>> hw/scsi/vhost-user-scsi.c   |  3 ++-
>> hw/virtio/vhost-user-base.c |  3 ++-
>> hw/virtio/vhost-user.c      | 10 +---------
>> 4 files changed, 7 insertions(+), 12 deletions(-)
>> 
>> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
>> index 41d1ac3a5a..c6842ced48 100644
>> --- a/hw/block/vhost-user-blk.c
>> +++ b/hw/block/vhost-user-blk.c
>> @@ -353,7 +353,7 @@ static void vhost_user_blk_disconnect(DeviceState *dev)
>>     VHostUserBlk *s = VHOST_USER_BLK(vdev);
>> 
>>     if (!s->connected) {
>> -        return;
>> +        goto done;
>>     }
>>     s->connected = false;
>> 
>> @@ -361,6 +361,7 @@ static void vhost_user_blk_disconnect(DeviceState *dev)
>> 
>>     vhost_dev_cleanup(&s->dev);
>> 
>> +done:
>>     /* Re-instate the event handler for new connections */
>>     qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, vhost_user_blk_event,
>>                              NULL, dev, NULL, true);
>> diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
>> index 48a59e020e..b49a11d23b 100644
>> --- a/hw/scsi/vhost-user-scsi.c
>> +++ b/hw/scsi/vhost-user-scsi.c
>> @@ -181,7 +181,7 @@ static void vhost_user_scsi_disconnect(DeviceState *dev)
>>     VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(dev);
>> 
>>     if (!s->connected) {
>> -        return;
>> +        goto done;
>>     }
>>     s->connected = false;
>> 
>> @@ -189,6 +189,7 @@ static void vhost_user_scsi_disconnect(DeviceState *dev)
>> 
>>     vhost_dev_cleanup(&vsc->dev);
>> 
>> +done:
>>     /* Re-instate the event handler for new connections */
>>     qemu_chr_fe_set_handlers(&vs->conf.chardev, NULL, NULL,
>>                              vhost_user_scsi_event, NULL, dev, NULL, true);
>> diff --git a/hw/virtio/vhost-user-base.c b/hw/virtio/vhost-user-base.c
>> index 4b54255682..11e72b1e3b 100644
>> --- a/hw/virtio/vhost-user-base.c
>> +++ b/hw/virtio/vhost-user-base.c
>> @@ -225,13 +225,14 @@ static void vub_disconnect(DeviceState *dev)
>>     VHostUserBase *vub = VHOST_USER_BASE(vdev);
>> 
>>     if (!vub->connected) {
>> -        return;
>> +        goto done;
>>     }
>>     vub->connected = false;
>> 
>>     vub_stop(vdev);
>>     vhost_dev_cleanup(&vub->vhost_dev);
>> 
>> +done:
>>     /* Re-instate the event handler for new connections */
>>     qemu_chr_fe_set_handlers(&vub->chardev,
>>                              NULL, NULL, vub_event,
>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>> index c929097e87..c407ea8939 100644
>> --- a/hw/virtio/vhost-user.c
>> +++ b/hw/virtio/vhost-user.c
>> @@ -2781,16 +2781,8 @@ typedef struct {
>> static void vhost_user_async_close_bh(void *opaque)
>> {
>>     VhostAsyncCallback *data = opaque;
>> -    struct vhost_dev *vhost = data->vhost;
>> 
>> -    /*
>> -     * If the vhost_dev has been cleared in the meantime there is
>> -     * nothing left to do as some other path has completed the
>> -     * cleanup.
>> -     */
>> -    if (vhost->vdev) {
>> -        data->cb(data->dev);
>> -    }
>> +    data->cb(data->dev);
>> 
>>     g_free(data);
>> }
>> --
>> 2.45.0
>> 
