qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 7/7] migration/multifd: Document the reason to sync for sa


From: Fabiano Rosas
Subject: Re: [PATCH v2 7/7] migration/multifd: Document the reason to sync for save_setup()
Date: Fri, 06 Dec 2024 14:01:01 -0300

Peter Xu <peterx@redhat.com> writes:

> On Fri, Dec 06, 2024 at 11:40:27AM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > It's not straightforward to see why src QEMU needs to sync multifd during
>> > setup() phase.  After all, there's no page queued at that point.
>> >
>> > For old QEMUs, there's a solid reason: EOS requires it to work.  While it's
>> > clueless on the new QEMUs which do not take EOS message as sync requests.
>> >
>> > One will figure that out only when this is conditionally removed.  In fact,
>> > the author did try it out.  Logically we could still avoid doing this on
>> > new machine types, however that needs a separate compat field and that can
>> > be an overkill in some trivial overhead in setup() phase.
>> >
>> > Let's instead document it completely, to avoid someone else tries this
>> > again and do the debug one more time, or anyone confused on why this ever
>> > existed.
>> >
>> > Signed-off-by: Peter Xu <peterx@redhat.com>
>> > ---
>> >  migration/ram.c | 27 +++++++++++++++++++++++++++
>> >  1 file changed, 27 insertions(+)
>> >
>> > diff --git a/migration/ram.c b/migration/ram.c
>> > index 5d4bdefe69..ddee703585 100644
>> > --- a/migration/ram.c
>> > +++ b/migration/ram.c
>> > @@ -3036,6 +3036,33 @@ static int ram_save_setup(QEMUFile *f, void 
>> > *opaque, Error **errp)
>> >          migration_ops->ram_save_target_page = ram_save_target_page_legacy;
>> >      }
>> >  
>> > +    /*
>> > +     * This operation is unfortunate..
>> > +     *
>> > +     * For legacy QEMUs using per-section sync
>> > +     * =======================================
>> > +     *
>> > +     * This must exist because the EOS below requires the SYNC messages
>> > +     * per-channel to work.
>> > +     *
>> > +     * For modern QEMUs using per-round sync
>> > +     * =====================================
>> > +     *
>> > +     * Logically this sync is not needed (because we know there's nothing
>> > +     * in the multifd queue yet!).
>> 
>> This is a bit misleading because even today we could split the
>> multifd_ram_flush_and_sync() into _flush and _sync (haven't I seen a
>> patch doing this somewhere? Maybe from Maciej...) and call just the
>> _sync here, which is unrelated to any multifd queue.
>
> Yeah you have a point, maybe at least I shouldn't mention the queues, they
> can be irrelevant.
>
>> 
>> I think we shouldn't tie "sync" with "wait for multifd threads to finish
>> sending their data (a kind of flush)" as this implies. The sync is as
>> much making sure the threads are ready to receive as it is making sure
>> the data is received in order with relation to ram scanning rounds.
>> 
>> IOW, the local sync is what ensures multifd send threads are flushed
>> while this code deals with the sync of src&dst threads, which is "just"
>> a synchronization point between the two QEMUs.
>
> This is a remote sync, not a local sync.  But yes, it can be the
> synchronization point.
>
> As I mentioned below, such sync point has nothing to do with src, so it can
> be implemented in dest alone without such sync message.  It works, though.
>
>> 
>> > However as a side effect, this makes
>> > +     * sure the dest side won't receive any data before it properly 
>> > reaches
>> > +     * ram_load_precopy().
>> 
>> I'm not sure it's a side-effect. It seems deliberate to me, seeing that
>> multifd usually does its own synchronization. For instance, on the send
>> side we also need some sync to make sure ram.c doesn't send data to
>> multifd send threads that are not ready yet (i.e. the wait on
>> channels_ready at the start of multifd_send()).
>
> Yes, and that's exactly what I wanted to express.  If dest has that
> "channels_ready" thing, I'm pretty sure we don't need this remote sync.
> We're using a heavier sync to service the purpose for "local sync for
> dest".  It's ok, but it's very unclear on what it really does.
>
>> 
>> > +     *
>> > +     * Without this sync, src QEMU can send data too soon so that dest may
>> > +     * not have been ready to receive it (e.g., rb->receivedmap may be
>> > +     * uninitialized, for example).
>> > +     *
>> > +     * Logically "wait for recv setup ready" shouldn't need to involve src
>> > +     * QEMU at all, however to be compatible with old QEMUs, let's stick
>> 
>> I don't understand this statement, you're saying that QEMU src could
>> just start dumping data on the channel without a remote end? Certainly
>> for file migrations, but socket as well?
>
> Yes.
>
> When reaching here on sender side, all multifd channels are already there:
> multifd_send_setup() guaranteed it.  We can start dump things, AFAICT,
> irrelevant of what dest is doing.
>
> Maybe the recv threads are not even created, but that isn't relevant, IMO,
> as long as they'll be there at some point and start collecting socket
> buffers.
>
>> 
>> > +     * with this.  Fortunately the overhead is low to sync during setup
>> > +     * because the VM is running, so at least it's not accounted as part 
>> > of
>> > +     * downtime.
>> > +     */
>> >      bql_unlock();
>> >      ret = multifd_ram_flush_and_sync(f);
>> >      bql_lock();
>> 
>
> I removed ambiguous wordings on "queue is empty", and simplified it a bit.
> How's this one look?
>
>     /*
>      * This operation is unfortunate..
>      *
>      * For legacy QEMUs using per-section sync
>      * =======================================
>      *
>      * This must exist because the EOS below requires the SYNC messages
>      * per-channel to work.
>      *
>      * For modern QEMUs using per-round sync
>      * =====================================
>      *
>      * Logically such sync is not needed, and recv threads should not run
>      * until setup ready (using things like channels_ready on src).  Then
>      * we should be all fine.
>      *
>      * However even if we add channels_ready to recv side in new QEMUs, old
>      * QEMU won't have them so this sync will still be needed to make sure
>      * multifd recv threads won't start processing guest pages early before
>      * ram_load_setup() is properly done.
>      *
>      * Let's stick with this.  Fortunately the overhead is low to sync
>      * during setup because the VM is running, so at least it's not
>      * accounted as part of downtime.
>      */

Looks ok, thanks!



reply via email to

[Prev in Thread] Current Thread [Next in Thread]