Re: [PATCH V7 19/29] vfio-pci: cpr part 1 (fd and dma)
From: Steven Sistare
Subject: Re: [PATCH V7 19/29] vfio-pci: cpr part 1 (fd and dma)
Date: Fri, 11 Mar 2022 11:22:01 -0500
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.6.2
On 3/10/2022 5:30 PM, Alex Williamson wrote:
> On Thu, 10 Mar 2022 14:55:50 -0500
> Steven Sistare <steven.sistare@oracle.com> wrote:
>
>> On 3/10/2022 1:35 PM, Alex Williamson wrote:
>>> On Thu, 10 Mar 2022 10:00:29 -0500
>>> Steven Sistare <steven.sistare@oracle.com> wrote:
>>>
>>>> On 3/7/2022 5:16 PM, Alex Williamson wrote:
>>>>> On Wed, 22 Dec 2021 11:05:24 -0800
>>>>> Steve Sistare <steven.sistare@oracle.com> wrote:
>>>>> [...]
>>>>>> diff --git a/migration/cpr.c b/migration/cpr.c
>>>>>> index 37eca66..cee82cf 100644
>>>>>> --- a/migration/cpr.c
>>>>>> +++ b/migration/cpr.c
>>>>>> @@ -7,6 +7,7 @@
>>>>>>
>>>>>> #include "qemu/osdep.h"
>>>>>> #include "exec/memory.h"
>>>>>> +#include "hw/vfio/vfio-common.h"
>>>>>> #include "io/channel-buffer.h"
>>>>>> #include "io/channel-file.h"
>>>>>> #include "migration.h"
>>>>>> @@ -101,7 +102,9 @@ void qmp_cpr_exec(strList *args, Error **errp)
>>>>>>          error_setg(errp, "cpr-exec requires cpr-save with restart mode");
>>>>>>          return;
>>>>>>      }
>>>>>> -
>>>>>> +    if (cpr_vfio_save(errp)) {
>>>>>> +        return;
>>>>>> +    }
>>>>>
>>>>> Why is vfio so unique that it needs separate handlers versus other
>>>>> devices? Thanks,
>>>>
>>>> In earlier patches these functions fiddled with more objects, but at
>>>> this point they are simple enough to convert to pre_save and post_load
>>>> vmstate handlers for the container and group objects. However, we would
>>>> still need to call special functions for vfio from qmp_cpr_exec:
>>>>
>>>> * validate all containers support CPR before we start blasting vaddrs.
>>>>   However, I could validate all, in every call to pre_save for each
>>>>   container. That would be less efficient, but fits the vmstate model.
>>>
>>> Would it be a better option to mirror the migration blocker support, i.e.,
>>> any device that doesn't support cpr registers a blocker and generic
>>> code only needs to keep track of whether any blockers are registered?
>>
>> We cannot specifically use migrate_add_blocker(), because it is checked in
>> the migration-specific function migrate_prepare(), in a layer of functions
>> above the simpler qemu_save_device_state() used in cpr. But yes, we could
>> do something similar for vfio: increment a global counter in vfio_realize
>> if the container does not support cpr, and decrement it when the container
>> is destroyed. pre_save could just check the counter.
>
> Right, I'm not suggesting to piggyback on migrate_add_blocker(), only to
> use a similar mechanism. Only drivers that can't support cpr need to
> register a blocker, but testing for blockers is done generically, not just
> for vfio devices.
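
For illustration, a counter-based cpr blocker could look something like the
sketch below; it is bumped by devices that cannot support cpr and checked
once, generically, before we start blasting vaddrs. All names here are
invented for the example, not taken from the series:

    #include "qemu/osdep.h"
    #include "qapi/error.h"

    static unsigned int cpr_blockers;

    /* e.g. called from vfio_realize() if the container cannot support cpr */
    void cpr_add_blocker(void)
    {
        cpr_blockers++;
    }

    /* called when the blocking container/device goes away */
    void cpr_del_blocker(void)
    {
        assert(cpr_blockers > 0);
        cpr_blockers--;
    }

    /* generic check from qmp_cpr_save()/qmp_cpr_exec() */
    bool cpr_is_blocked(Error **errp)
    {
        if (cpr_blockers) {
            error_setg(errp, "a device in this configuration does not support cpr");
            return true;
        }
        return false;
    }
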
>
>>>> * restore all vaddrs if qemu_save_device_state fails.
>>>>   However, I could recover for all containers inside pre_save when one
>>>>   container fails. It feels strange touching all objects in a function
>>>>   for one, but there is no real downside.
>>>
>>> I'm not as familiar as I should be with migration callbacks, thanks to
>>> mostly not supporting it for vfio devices, but it seems strange to me
>>> that there's no existing callback or notifier per device to propagate
>>> save failure. Do we not at least get some sort of resume callback in
>>> that case?
>>
>> We do not:
>>
>>     struct VMStateDescription {
>>         int (*pre_load)(void *opaque);
>>         int (*post_load)(void *opaque, int version_id);
>>         int (*pre_save)(void *opaque);
>>         int (*post_save)(void *opaque);
>>
>> The handler returns an error, which stops further saves and is propagated
>> back to the top-level caller qemu_save_device_state().
>>
>> The vast majority of handlers do not have side effects, with no need to
>> unwind anything on failure.
>>
>> This raises another point. If pre_save succeeds for all the containers,
>> but fails for some non-vfio object, then the overall operation is abandoned,
>> but we do not restore the vaddrs. To plug that hole, we need to call the
>> unwind code from qmp_cpr_save, or implement your alternative below.
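
To make the shape of that unwind concrete: something along these lines could
be called from qmp_cpr_save() when the overall save fails. The container
flag and the restore helper are invented names for the sketch, not code from
the series:

    /*
     * Walk every container and re-install vaddrs that pre_save invalidated.
     * Assumes the vfio address space list is reachable from here (in real
     * code this would live in hw/vfio and use the existing list).
     */
    static void cpr_unwind_vfio_vaddrs(void)
    {
        VFIOAddressSpace *space;
        VFIOContainer *container;

        QLIST_FOREACH(space, &vfio_address_spaces, list) {
            QLIST_FOREACH(container, &space->containers, next) {
                if (container->vaddr_unmapped) {             /* hypothetical flag */
                    vfio_container_restore_vaddr(container); /* hypothetical helper */
                    container->vaddr_unmapped = false;
                }
            }
        }
    }
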
>
> We're trying to reuse migration interfaces; are we also triggering
> migration state change notifiers? i.e.
> MIGRATION_STATUS_{CANCELLING,CANCELLED,FAILED}
No. That happens in the migration layer, which we do not use.
> We already hook vfio
> devices supporting migration into that notifier to tell the driver to
> move the device back to the running state on failure, which seems a bit
> unique to vfio devices. Containers could maybe register their own
> callbacks.
>
>>> As an alternative, maybe each container could register a vm change
>>> handler that would trigger reloading vaddrs if we move to a running
>>> state and a flag on the container indicates vaddrs were invalidated?
>>> Thanks,
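
Sketching that alternative for reference: qemu_add_vm_change_state_handler()
is the existing hook, but the container flag and restore helper below are
invented for the example:

    static void vfio_cpr_vm_state_change(void *opaque, bool running,
                                         RunState state)
    {
        VFIOContainer *container = opaque;

        /* re-install vaddrs only if a failed cpr-save left them invalid */
        if (running && container->vaddr_unmapped) {          /* hypothetical flag */
            vfio_container_restore_vaddr(container);         /* hypothetical helper */
            container->vaddr_unmapped = false;
        }
    }

    /* at container creation time: */
    container->vm_state = qemu_add_vm_change_state_handler(
                              vfio_cpr_vm_state_change, container);
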
>>
>> That works and is modular, but I dislike that it adds checks on the
>> happy path for a case that will rarely happen, and it pushes recovery from
>> failure further away from the original failure, which would make debugging
>> cascading failures more difficult.
>
> Would using the migration notifier move us sufficiently closer to the
> failure point? Otherwise I think you're talking about unwinding all
> the containers when any one fails, where you didn't like that object
> overreach, or maybe adding an optional callback... but I wonder if the
> above notifier essentially already does that.
>
> In any case, I think we have options to either implement new or use
> existing notifier-like functionality to avoid all these vfio specific
> callouts. Thanks,
Yes, defining a cpr notifier for failure and cleanup is a good solution.
I'll work on that and a cpr blocker. I'll use the latter for vfio and
the chardevs.
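
Roughly what I have in mind for the notifier, built on the existing Notifier
machinery; the names are provisional:

    #include "qemu/notify.h"

    static NotifierList cpr_fail_notifiers =
        NOTIFIER_LIST_INITIALIZER(cpr_fail_notifiers);

    /* vfio containers (and others) register a cleanup callback here */
    void cpr_add_fail_notifier(Notifier *notifier)
    {
        notifier_list_add(&cpr_fail_notifiers, notifier);
    }

    /* called from qmp_cpr_save() if qemu_save_device_state() fails */
    static void cpr_notify_fail(void)
    {
        notifier_list_notify(&cpr_fail_notifiers, NULL);
    }
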
- Steve