[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v7 09/12] hw/cxl/events: Add qmp interfaces to add/release dy
From: |
Markus Armbruster |
Subject: |
Re: [PATCH v7 09/12] hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extents |
Date: |
Tue, 04 Jun 2024 14:28:54 +0200 |
User-agent: |
Gnus/5.13 (Gnus v5.13) |
Jonathan Cameron <Jonathan.Cameron@Huawei.com> writes:
> On Tue, 04 Jun 2024 11:18:18 +0200
> Markus Armbruster <armbru@redhat.com> wrote:
>
>> I finally got around to read this slowly. Thank you, Fan and Jonathan!
>>
>> I'm getting some "incomplete" vibes: "if we ever implement", "patches
>> for this on list", "we aren't emulating this yet at all", and ...
>
> Absolutely. There is a bunch of stuff that we reject today but
> the interfaces allow you to specify it and align with the CXL specification
> Fabric Management API definition which is the spec used to control this
> stuff from a BMC etc. If that doesn't work we have a hardware errata
> problem, so hopefully that is fairly stable.
>
> I think I can publicly confirm there are some related errata in flight,
> seeing as we said we'd raise questions on some aspects in the kernel and
> QEMU series preceding this one (so no IP secrecy issues). These are as a
> result of this work from Fan, but we have carefully avoided implementing
> anything that 'may' change.
>
>
>>
>> Jonathan Cameron <Jonathan.Cameron@Huawei.com> writes:
>>
>> [...]
>>
>> > Only thing I'd add is that for now (because we don't need it for testing
>> > the kernel flows) is that this does not provide any way for external
>> > agents (e.g. our 'fabric manager' to find out what the state is - i.e.
>> > if the extents have been accepted by the host etc). That stuff is all
>> > defined by the spec, but not yet in the QMP interface. At somepoint
>> > we may want to add that as a state query type interface.
>>
>> ... this, too.
>>
>> In review of v5, I asked whether this interface needs to be stable.
>>
>> "Not stable" doesn't mean we change an interface without thought. It
>> merely means we can change it much, much faster, and with much less
>> overhead.
>>
>> I understand you want it chiefly for CXL development. Development aids
>> commonly don't need to be stable.
>
> Ok. If it makes sense to make this unstable for now I'm fine with that.
> I don't see why it would change beyond in backwards compatible fashion
> with new optional fields to cover future specification updates allowing
> for more information. However I've been wrong on such things before.
>
> So I'm fine adding a patch on top of v8 marking them unstable for now.
I'd squash it into v8, but follow-up patch works for me, too.
>> If you're aiming for stable, you need to convince us the interface can
>> support the foreseeable purposes without incompatible changes. In
>> particular, I'd like to see how you intend to support "finding out what
>> the state is". I suspect that's related to my question in review of v8:
>> how to detect completion, and maybe track progress.
>
> There is a need for completion information but given it's completely
> asynchronous to the commands defined here (Can be out of order, lots of
> ongoing capacity add / remove flows in flight etc) and for the hardware
> definition the feedback occurs via an event record log I'm not expecting it
> to affect the interfaces added so far.
>
>>
>> There's infrastructure for background jobs: job.json. We might be
>> better off using it, unless completion is trivial and progress tracking
>> unnecessary.
>
> I'll take a look, but these are not conventional background commands
> (We do have those as well, but that's a whole different set of future
> problems!)
>
> The actual command itself completes synchronously but not the flow
> it kicked off which is not background as such because it may never
> finish and involves lots of moving parts. This is similar to any
> form of error injection. We inject the error synchronously and that
> creates a bunch of records in various registers / firmware etc but
> the actions the host OS takes are asynchronous and may only happen
> when it decides to poll some register or a driver loads much later.
>
> So I'm not sure if job.json approach fits.
Neither am I, but I want you to be aware of it, so you can make an
informed decision :)
>> [...]