Re: [PATCH 0/4] QOM: Singleton interface
From: Peter Xu
Subject: Re: [PATCH 0/4] QOM: Singleton interface
Date: Wed, 11 Dec 2024 17:10:11 -0500

On Wed, Dec 11, 2024 at 09:19:32AM +0100, Markus Armbruster wrote:
> Looked at this thread again to refresh my memory on the proposed
> singleton interface, and found I have something to add.
>
> Peter Xu <peterx@redhat.com> writes:
>
> > On Tue, Oct 29, 2024 at 04:04:50PM +0000, Daniel P. Berrangé wrote:
> >> I tend to feel that having MigrationState exist for the whole lifetime
> >> of QEMU is a bug, forced on us by the unfortunate need to call
> >> migrate-set-parameters/capabilities separately from the migrate
> >> command, and by the need to query migrate info an arbitrary amount of
> >> time after it finishes.
> >>
> >> This puts libvirt in the awkward position of having to manually reset
> >> all migration parameters, just to ensure earlier settings don't
> >> accidentally affect a future migration operation :-( This is a design
> >> that encourages mistakes.
> >
> > I think it would still be easy to add "cap" & "params" arguments support
> > for the "migrate" QMP command without breaking the current API, iff that
> > helps in whatever form. When present, it simply applies the caps and/or
> > param list first before invoking the migrate command, fail the command if
> > cap / param check fails.
> >
> > But I'm not sure whether that's a concern at all for Libvirt, if what
> > Libvirt currently does is having separate "migrate-set-*" commands prior to
> > the "migrate". I may have overlooked the real issue behind on how that
> > could complicate Libvirt.
>
> I think Daniel's point is that the interface's reliance on global state
> makes it awkward to use.
>
> Migration configuration is global state. It's split into "capabilities"
> and "parameters", but that's detail. We have commands to query and
> change this state.
>
> When Libvirt connects to a QEMU process, it has no idea what the global
> migration configuration is. To get it into a known state, it has to set
> *everything*. It cannot rely on defaults.
>
> It even has to set things it doesn't know! When we add a new parameter
> to QEMU, libvirt needs to be updated to reset it to its default even
> when libvirt has no need for it. When you use a version of libvirt that
> lacks this update, it remains whatever it was. The migration interface
> becomes accidentally stateful at the libvirt level, which is
> undesirable.
>
> Compare this to the more modern interface we have for other long-running
> tasks: jobs.
>
> There is a job-specific command that creates the job: blockdev-backup,
> block-commit, blockdev-mirror, block-stream, blockdev-create,
> snapshot-save, snapshot-load, snapshot-delete, ... Each command takes
> the entire job configuration as arguments. Libvirt does not need
> updating for new parameters: these simply remain at their default
> values.
>
> Bonus: there are generic commands to control and monitor jobs:
> job-pause, job-resume, job-cancel, job-complete, job-dismiss,
> job-finalize, query-jobs.

Yes, migration is stateful in that regard.  IMHO that is still OK because,
unlike most jobs, there cannot be more than one migration task at a time.

Reusing the jobs interface may work, but migration has existed for so long
with its own APIs that I am not sure we'd get any real benefit from reusing
it.  At the same time, it may not map 100% to what migration wants (e.g.,
migrate_start_postcopy, postcopy recoveries, etc.).

OTOH, we definitely don't want to use the internal impl of jobs, because we
don't want to add either AIO or more coroutines into the migration core.  We
still need to use a coroutine on the dest QEMU, but that's mostly for legacy
reasons; besides that and some very corner use cases (e.g. channel setups),
migration is almost entirely thread-based now.  A mixture of threads and
coroutines is too error prone and too hard to debug, IMHO.

Going back to "allow the migrate QMP command to take caps / parameters": we
could still try to do that at some point.  But I recall we discussed this
offlist (between Dan, probably Peter Krempa, and myself), and I believe the
conclusion was that it'll make the API cleaner, but with no real benefit so
far.  Meanwhile, some parameters must still be stateful, like a few
max*-bandwidth parameters or downtime_limit.  They need to be changeable on
the fly, especially while a migration task is running.
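
To make the split concrete, here is a rough sketch as a toy Python model.
It is not QEMU code: the class, the method names, the defaults, the
validation rule and the "hypothetical-setup-only" parameter are all made up
for illustration.  It only shows the shape of the idea: "migrate" applies an
optional parameter list first and fails if the check fails, while a few
parameters stay adjustable during a running migration.

```python
from dataclasses import dataclass, field

# Toy model only; defaults and the validation rule are invented.
DEFAULTS = {
    "max-bandwidth": 128 * 1024 * 1024,
    "downtime-limit": 300,
    "hypothetical-setup-only": 0,  # made-up parameter, fixed once running
}
# Parameters that must remain changeable while a migration is running.
RUNTIME_TUNABLE = {"max-bandwidth", "downtime-limit"}


class MigrationError(Exception):
    pass


def check_params(params):
    """Reject unknown names and non-integer or negative values."""
    for name, value in params.items():
        if name not in DEFAULTS:
            raise MigrationError(f"unknown parameter: {name}")
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise MigrationError(f"invalid value for {name}: {value!r}")


@dataclass
class MigrationState:
    params: dict = field(default_factory=lambda: dict(DEFAULTS))
    running: bool = False

    def set_parameters(self, params):
        """Stands in for the global, stateful parameter-setting command."""
        check_params(params)
        if self.running:
            fixed = set(params) - RUNTIME_TUNABLE
            if fixed:
                raise MigrationError(f"cannot change while running: {sorted(fixed)}")
        self.params.update(params)

    def migrate(self, uri, params=None):
        """Stands in for 'migrate' taking an optional parameter list."""
        if self.running:
            raise MigrationError("migration already in progress")
        if params:
            check_params(params)  # fail the whole command before starting
            self.params.update(params)
        self.running = True


state = MigrationState()
state.migrate("tcp:dest:4444", {"downtime-limit": 500})
state.set_parameters({"max-bandwidth": 64 * 1024 * 1024})  # allowed on the fly
```

Note that in this sketch anything set by an earlier call still persists into
the next "migrate" unless the caller overrides it, so the global state
discussed above does not go away.
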
Thanks,
--
Peter Xu