[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v3 4/5] qapi: introduce device-sync-config
From: |
Markus Armbruster |
Subject: |
Re: [PATCH v3 4/5] qapi: introduce device-sync-config |
Date: |
Mon, 29 Apr 2024 15:04:33 +0200 |
User-agent: |
Gnus/5.13 (Gnus v5.13) |
Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> writes:
> On 29.04.24 13:51, Markus Armbruster wrote:
>> Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> writes:
>>
>>> On 24.04.24 14:48, Markus Armbruster wrote:
>>>> Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> writes:
>>>>
>>>>> Add command to sync config from vhost-user backend to the device. It
>>>>> may be helpful when VHOST_USER_SLAVE_CONFIG_CHANGE_MSG failed or not
>>>>> triggered interrupt to the guest or just not available (not supported
>>>>> by vhost-user server).
>>>>>
>>>>> Command result is racy if allow it during migration. Let's allow the
>>>>> sync only in RUNNING state.
>>>>>
>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>>
>> [...]
>>
>>>>> diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
>>>>> index 0117d243c4..296af52322 100644
>>>>> --- a/include/sysemu/runstate.h
>>>>> +++ b/include/sysemu/runstate.h
>>>>> @@ -5,6 +5,7 @@
>>>>> #include "qemu/notify.h"
>>>>>
>>>>> bool runstate_check(RunState state);
>>>>> +const char *current_run_state_str(void);
>>>>> void runstate_set(RunState new_state);
>>>>> RunState runstate_get(void);
>>>>> bool runstate_is_running(void);
>>>>> diff --git a/qapi/qdev.json b/qapi/qdev.json
>>>>> index facaa0bc6a..e8be79c3d5 100644
>>>>> --- a/qapi/qdev.json
>>>>> +++ b/qapi/qdev.json
>>>>> @@ -161,3 +161,24 @@
>>>>> ##
>>>>> { 'event': 'DEVICE_UNPLUG_GUEST_ERROR',
>>>>> 'data': { '*device': 'str', 'path': 'str' } }
>>>>> +
>>>>> +##
>>>>> +# @device-sync-config:
>>>>> +#
>>>>> +# Synchronize config from backend to the guest. The command notifies
>>>>> +# re-read the device config from the backend and notifies the guest
>>>>> +# to re-read the config. The command may be used to notify the guest
>>>>> +# about block device capcity change. Currently only vhost-user-blk
>>>>> +# device supports this.
>>>>
>>>> I'm not sure I understand this. To work towards an understanding, I
>>>> rephrase it, and you point out the errors.
>>>>
>>>> Synchronize device configuration from host to guest part. First,
>>>> copy the configuration from the host part (backend) to the guest
>>>> part (frontend). Then notify guest software that device
>>>> configuration changed.
>>>
>>> Correct, thanks
>>
>> Perhaps
>>
>> Synchronize guest-visible device configuration with the backend's
>> configuration, and notify guest software that device configuration
>> changed.
>>
>> This may be useful to notify the guest of a block device capacity
>> change. Currenrly, only vhost-user-blk devices support this.
>
> Sounds good
Except I fat-fingered "Currently".
>>
>> Next question: what happens when the device *doesn't* support this?
>
> An error "device-sync-config is not supported ..."
Okay.
>>>> I wonder how configuration can get out of sync. Can you explain?
>>>>
>>>
>>> The example (and the original feature, which triggered developing this) is
>>> vhost disk resize. If vhost-server (backend) doesn't support
>>> VHOST_USER_SLAVE_CONFIG_CHANGE_MSG, neither QEMU nor guest will know that
>>> disk capacity changed.
>>
>> Sounds like we wouldn't need this command if we could make the
>> vhost-server support VHOST_USER_SLAVE_CONFIG_CHANGE_MSG. Is making it
>> support it impractical? Or are there other uses for this command?
>
> Qemu's internal vhost-server do support it. But that's not the only
> vhost-user server) So the command is useful for those servers which doesn't
> support VHOST_USER_SLAVE_CONFIG_CHANGE_MSG. Note, that this message requires
> setting up additional channel of server -> client communication. That was the
> reason, why the "change-msg" solution was rejected in our downstream: it's
> safer to reuse existing channel (QMP), than to add and support an additional
> channel.
>
> Also, the command may help to debug the system, when
> VHOST_USER_SLAVE_CONFIG_CHANGE_MSG doesn't work for some reason.
Suggest to work this into the commit message.
>>>>> +#
>>>>> +# @id: the device's ID or QOM path
>>>>> +#
>>>>> +# Features:
>>>>> +#
>>>>> +# @unstable: The command is experimental.
>>>>> +#
>>>>> +# Since: 9.1
>>>>> +##
>>>>> +{ 'command': 'device-sync-config',
>>>>> + 'features': [ 'unstable' ],
>>>>> + 'data': {'id': 'str'} }
>>>>> diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
>>>>> index 7e075d91c1..cb35ea0b86 100644
>>>>> --- a/system/qdev-monitor.c
>>>>> +++ b/system/qdev-monitor.c
>>>>> @@ -23,6 +23,7 @@
>>>>> #include "monitor/monitor.h"
>>>>> #include "monitor/qdev.h"
>>>>> #include "sysemu/arch_init.h"
>>>>> +#include "sysemu/runstate.h"
>>>>> #include "qapi/error.h"
>>>>> #include "qapi/qapi-commands-qdev.h"
>>>>> #include "qapi/qmp/dispatch.h"
>>>>> @@ -969,6 +970,52 @@ void qmp_device_del(const char *id, Error **errp)
>>>>> }
>>>>> }
>>>>>
>>>>> +int qdev_sync_config(DeviceState *dev, Error **errp)
>>>>> +{
>>>>> + DeviceClass *dc = DEVICE_GET_CLASS(dev);
>>>>> +
>>>>> + if (!dc->sync_config) {
>>>>> + error_setg(errp, "device-sync-config is not supported for '%s'",
>>>>> + object_get_typename(OBJECT(dev)));
>>>>> + return -ENOTSUP;
>>>>> + }
>>>>> +
>>>>> + return dc->sync_config(dev, errp);
>>>>> +}
>>>>> +
>>>>> +void qmp_device_sync_config(const char *id, Error **errp)
>>>>> +{
>>>>> + DeviceState *dev;
>>>>> +
>>>>> + /*
>>>>> + * During migration there is a race between syncing`config and
>>>>> + * migrating it, so let's just not allow it.
>>>>
>>>> Can you briefly explain the race?
>>>
>>> If at the moment of qmp command, corresponding config already migrated to
>>> the target, we'll change only the config on source, but on the target we'll
>>> still have outdated config.
>>
>> For RAM, dirty tracking ensures the change gets sent. But this is
>> device memory. Correct?
>
> Yes. It's stored in malloced buffer VirtIIODevice::config, and accessed
> through handlers virtio_pci_config_read()/virtio_pci_config_write(). As I
> understand, no kind of dirty tracking here..
>
> And I see, it's migrated in virtio_save():
> ...
> qemu_put_be32(f, vdev->config_len);
> qemu_put_buffer(f, vdev->config, vdev->config_len);
> ...
Suggest to explain the race in the comment. Perhaps like this:
Guest-visible configuration is stored in device memory. There is a
race between updating and migrating it: if we update it before we
migrate it, it's migrated fine, but if any later updates are lost.
>>>>> + *
>>>>> + * Moreover, let's not rely on setting up interrupts in paused
>>>>> + * state, which may be a part of migration process.
>>>>
>>>> What dependence exactly are you avoiding? Config synchronization
>>>> depending on guest interrupt delivery?
>>>
>>> Right, guest is notified by pci_set_irq.
>>
>> If we allowed it in paused state, the delivery of the interrupt would be
>> delayed until the guest resumes running. Correct?
>
> I think so. But this will not work, if we do offline migration like pause ->
> migrate -> resume on target. So I decided that better be more safe. The
> restrictions may be relaxed in future if needed.
Sounds like we'd make an interrupt pending on the source, but migration
failed to make it pending on the target as well, so it gets lost.
Is that the case?
If yes, question for migration experts: why isn't this a problem
elsewhere, too?
[...]
Thanks!