[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v1 3/4] migration: refactor ram_save_target_page functions
From: |
Prasad Pandit |
Subject: |
Re: [PATCH v1 3/4] migration: refactor ram_save_target_page functions |
Date: |
Mon, 2 Dec 2024 15:14:41 +0530 |
Hello Fabiano,
On Thu, 28 Nov 2024 at 18:50, Fabiano Rosas <farosas@suse.de> wrote:
>>> We currently have a mess of conditionals to achieve the correct
>>> combination of multifd local flushes, where we sync the local
>>> (send/recv) multifd threads between themselves, and multifd remote
>>> flushes, where we put a flag on the stream to inform the recv side to
>>> perform a local flush.
...
> >> + if (do_local_flush) {
> >> + ret = multifd_ram_flush_and_sync();
> >> + if (ret < 0) {
> >> + return ret;
> >> + }
> >> + }
> >> +
...
> >> + /* Put a flag on the stream to trigger a remote flush */
> >> + if (do_remote_flush) {
> >> + qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
> >> + qemu_fflush(f);
> >> + }
> >> +
> >> + if (do_local_flush) {
> >> + multifd_recv_sync_main();
> >> + }
...
>These syncs are all related and we'd like to be able to reason about
> them in a single place, not in several little pieces all over the place.
> When a new feature comes up, like it did when mapped-ram was introduced,
> we don't want to go around having to squint at conditionals to know whether
> it applies to the new case or not.
>
> Also, ram.c is not the place for open-coded multifd code. The same
> mapped-ram example applies: having to add if (migrate_mapped_ram())
> throughout the ram code was a pain and we had some iterations of
> flipping the logic until we got it right.
* I see. If it is the larger complexity of how multifd threads do
flush/sync, not just about long conditionals with 3-4 sub-expressions,
I think this can be done separately, instead of as part of this patch
series.
> There is no fflush() going on here. This code is about the multifd
> flush, which sends the remaining data from MultiFDPages_t and the
> multifd sync, which synchronizes the multifd threads. That qemu_fflush
> is just to make sure the destination sees flag on the stream.
* Yes, there is no fflush(3) call. I mentioned fflush(3) as indicative
of the operation performed. ie. the description above reads the same
as what fflush(3) does to streams.
"...fflush() forces a write of all user-space buffered data for
the given output or update stream via the stream's underlying write
function."
In the multifd case we are sending remaining data from MultiFDPages_t
buffers onto the respective channels IIUC. The
multifd_send_sync_main() function sets the 'p->pending_sync' field and
when it is set miltifd_send_thread() function calls
ret = qio_channel_write_all(p->c, (void *)p->packet,
p->packet_len, &local_err);
multifd_send_sync_main() also has 'flush_zero_copy', but that only
happens when using --zerocopy option is used -> $ virsh migrate
--zerocopy ...
> There is no flush on the receive side. The RAM_SAVE_FLAG_MULTIFD_FLUSH
> flag is there to indicate to the destination that at that point in the
> stream the source has done a flush + sync operation and the destination
> should sync it's threads as well.
* The comment around where 'RAM_SAVE_FLAG_MULTIFD_FLUSH' gets written
above, says -> "...trigger remote flush."
* We seem to use terms 'flush' and 'sync' quite freely and
interchangeably. ie. variables (ex: do_local_flush) and constants are
named with _FLUSH and functions and fields are named as _sync_main()
and &p->pending_sync.
if (do_local_flush) {
multifd_send/_recv_sync_main(); <= do the 'flush' and
'sync' mean the same thing here?
}
Even in multifd_ram_flush_and_sync() routine, it is named with _flush_
and eventually multifd_send() sets the '&p->pending_job' variable to
true. There is no field in MultiFDSendParams structure named with
'flush'. It defines 'pending_sync' and 'sem_sync'.
* Such free usage of these terms is bound to create confusion. Because
while reading code the reader may relate flush with fflush(3) and sync
with fsync(2) calls/operations. It will help if we use these terms
more judiciously.
* Coming back to the issue of simplifying multifd threads 'flush or
sync' operation:
1. I think it can be done separately, outside the scope of this patch series.
2. Must we tie the flush/sync operations with specific spots? Isn't
there any other way, where we call multifd_*_sync() unconditionally,
without any side-effects? As I see it, we have those conditions,
because of the side-effects.
Thank you.
---
- Prasad
- Re: [PATCH v1 3/4] migration: refactor ram_save_target_page functions,
Prasad Pandit <=