Re: [PATCH v2 06/29] migration: Add auto-pause capability
From: Daniel P. Berrangé
Subject: Re: [PATCH v2 06/29] migration: Add auto-pause capability
Date: Wed, 25 Oct 2023 15:20:16 +0100
User-agent: Mutt/2.2.9 (2022-11-12)

On Wed, Oct 25, 2023 at 10:57:12AM -0300, Fabiano Rosas wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
>
> > On Mon, Oct 23, 2023 at 05:35:45PM -0300, Fabiano Rosas wrote:
> >> Add a capability that allows the management layer to delegate to QEMU
> >> the decision of whether to pause a VM and perform a non-live
> >> migration. Depending on the type of migration being performed, this
> >> could bring performance benefits.
> >
> > I'm not really seeing what problem this is solving.
> >
>
> Well, this is the fruit of your discussion with Peter Xu in the previous
> version of the patch.
>
> To recap: he thinks QEMU is doing useless work with file migrations
> because they are always asynchronous. He thinks we should always pause
> before doing fixed-ram migration. You said that libvirt would rather use
> fixed-ram for a broader set of savevm-style commands, so you'd rather
> not always pause. I'm trying to cater to both of your wishes. This new
> capability is the middle ground I came up with.
>
> So fixed-ram would always pause the VM, because that is the primary
> use-case, but libvirt would be allowed to say: don't pause this time.

If the VM is going to be powered off immediately after saving
a snapshot then yes, you might as well pause it, but we can't
assume that will be the case. An equally common use case
would be for saving periodic snapshots of a running VM. This
should be transparent such that the VM remains running the
whole time, except for a narrow window at completion of RAM/state
saving where we flip the disk snapshots, so they are in sync
with the RAM snapshot.

IOW, save/restore to disk can imply paused, but snapshotting
should not imply paused. So I don't see an unambiguous
rationale that we should diverge when fixed-ram is set and
auto-pause the VM.
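
As a rough illustration of the periodic-snapshot sequencing described
above, here is a toy model in Python. It is only a sketch: FakeVM,
periodic_snapshot and the event names are hypothetical and do not
correspond to any QEMU or libvirt API. It shows the guest running during
the bulk RAM save and being paused only for the final window in which the
disk snapshots are flipped.

```python
from dataclasses import dataclass, field


@dataclass
class FakeVM:
    """Hypothetical stand-in for a guest; not a QEMU or libvirt API."""
    running: bool = True
    events: list = field(default_factory=list)

    def pause(self):
        self.running = False
        self.events.append("pause")

    def resume(self):
        self.running = True
        self.events.append("resume")


def periodic_snapshot(vm, ram_passes=3):
    """Save RAM while the guest runs, pausing only for the final window."""
    for i in range(ram_passes):
        # The guest is still running here, so pages can be dirtied
        # and may need to be saved again in a later pass.
        vm.events.append(f"save-ram-pass-{i}(running={vm.running})")
    vm.pause()  # start of the narrow window
    vm.events.append("save-final-dirty-pages-and-device-state")
    vm.events.append("flip-disk-snapshots")  # disks now match the RAM image
    vm.resume()  # guest carries on as before
    return vm.events
```

Calling periodic_snapshot(FakeVM()) yields an event list in which every
RAM pass is recorded with running=True and the pause/resume pair brackets
only the final state save and the disk snapshot flip.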
> > Mgmt apps are perfectly capable of pausing the VM before issuing
> > the migrate operation.
> >
>
> Right. But would QEMU be allowed to just assume that if a VM is paused
> at the start of migration it can then go ahead and skip all dirty page
> mechanisms?

Skipping dirty page tracking would imply that the mgmt app cannot
resume CPUs without either letting the operation complete, or
aborting it.

That is probably a reasonable assumption, as I can't come up with
a use case for starting out paused and then later resuming, unless
there was a scenario where you needed to synchronize something
external with the start of migration. Synchronizing storage though
is something that happens at the end of migration instead.
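
The constraint above can be written down as a small state check. This is
a minimal sketch assuming a hypothetical SaveJob object; none of these
names exist in QEMU, and the real rule would live wherever the "cont"
request is validated.

```python
class SaveJob:
    """Hypothetical model of one save-to-file operation; not QEMU code."""

    def __init__(self, vm_paused_at_start):
        # Assumption under discussion: a guest that starts out paused
        # lets the save skip dirty page tracking entirely.
        self.dirty_tracking = not vm_paused_at_start
        self.state = "active"

    def complete(self):
        self.state = "completed"

    def abort(self):
        self.state = "aborted"

    def can_resume_cpus(self):
        # With dirty tracking enabled the guest may run at any time.
        # Without it, pages written by a running guest would go unnoticed,
        # so resuming is only valid once the job is no longer active.
        return self.dirty_tracking or self.state != "active"
```

In this model a job started with vm_paused_at_start=True refuses a resume
until complete() or abort() has been called, while a job started with a
running guest never restricts it.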
> Without pausing, we're basically doing *live* migration into a static
> file that will be kept on disk for who knows how long before being
> restored on the other side. We could release the src QEMU resources (a
> bit) earlier if we paused the VM beforehand.

Can we really release resources early? If the save operation fails
right at the end, we want to be able to resume execution of CPUs,
which assumes all resources are still available, otherwise we have
a failure scenario where we've not successfully saved to disk and
also don't still have the running QEMU.
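
The ordering concern can be sketched as follows. This is illustrative
only: save_to_file and its three callbacks are hypothetical hooks, and
OSError stands in for whatever failure the real write path would report.
The point is that resources are released only after the image is known
to be complete, so a late failure still leaves a guest that can resume.

```python
def save_to_file(write_image, resume_cpus, release_resources):
    """Return True on success. All three callbacks are hypothetical hooks."""
    try:
        write_image()
    except OSError:
        # The save failed, possibly at the very end. Nothing has been
        # released yet, so the guest can simply continue running.
        resume_cpus()
        return False
    # Only now is the on-disk image known to be complete.
    release_resources()
    return True
```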
> We're basically talking about whether we want the VM to be usable in the
> (hopefully) very short time between issuing the migration command and
> the migration being finished. We might be splitting hairs here, but we
> need some sort of consensus.

The time may not be very short for large VMs.

With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|