From: Peter Xu
Subject: Re: [PATCH v1 00/13] Multifd 🔀 device state transfer support with VFIO consumer
Date: Wed, 17 Jul 2024 14:49:48 -0400

On Tue, Jul 16, 2024 at 10:10:12PM +0200, Maciej S. Szmigiero wrote:
> On 27.06.2024 16:56, Peter Xu wrote:
> > On Thu, Jun 27, 2024 at 11:14:28AM +0200, Maciej S. Szmigiero wrote:
> > > On 26.06.2024 18:23, Peter Xu wrote:
> > > > On Wed, Jun 26, 2024 at 05:47:34PM +0200, Maciej S. Szmigiero wrote:
> > > > > On 26.06.2024 03:51, Peter Xu wrote:
> > > > > > On Wed, Jun 26, 2024 at 12:44:29AM +0200, Maciej S. Szmigiero wrote:
> > > > > > > On 25.06.2024 19:25, Peter Xu wrote:
> > > > > > > > On Mon, Jun 24, 2024 at 09:51:18PM +0200, Maciej S. Szmigiero 
> > > > > > > > wrote:
> > > > > > > > > Hi Peter,
> > > > > > > > 
> > > > > > > > Hi, Maciej,
> > > > > > > > 
> > > > > > > > > 
> > > > > > > > > On 23.06.2024 22:27, Peter Xu wrote:
> > > > > > > > > > On Tue, Jun 18, 2024 at 06:12:18PM +0200, Maciej S. 
> > > > > > > > > > Szmigiero wrote:
> > > > > > > > > > > From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
> > > > > > > > > > > 
> > > > > > > > > > > This is an updated v1 patch series of the RFC (v0) series 
> > > > > > > > > > > located here:
> > > > > > > > > > > https://lore.kernel.org/qemu-devel/cover.1713269378.git.maciej.szmigiero@oracle.com/
> > > > > > > > > > 
> > > > > > > > > > OK I took some hours thinking about this today, and here's 
> > > > > > > > > > some high level
> > > > > > > > > > comments for this series.  I'll start with which are more 
> > > > > > > > > > relevant to what
> > > > > > > > > > Fabiano has already suggested in the other thread, then 
> > > > > > > > > > I'll add some more.
> > > > > > > > > > 
> > > > > > > > > > https://lore.kernel.org/r/20240620212111.29319-1-farosas@suse.de
> > > > > > > > > 
> > > > > > > > > That's a long list, thanks for these comments.
> > > > > > > > > 
> > > > > > > > > I have responded to them inline below.
> > > > > > > > > (..)
> > > > > > > 
> > > > > > > 2) Submit this operation to the thread pool and wait for it to 
> > > > > > > complete,
> > > > > > 
> > > > > > VFIO doesn't need to have its own code waiting.  If this pool is for
> > > > > > migration purpose in general, qemu migration framework will need to 
> > > > > > wait at
> > > > > > some point for all jobs to finish before moving on.  Perhaps it 
> > > > > > should be
> > > > > > at the end of the non-iterative session.
> > > > > 
> > > > > So essentially, instead of calling save_live_complete_precopy_end 
> > > > > handlers
> > > > > from the migration code you would like to hard-code its current VFIO
> > > > > implementation of calling 
> > > > > vfio_save_complete_precopy_async_thread_thread_terminate().
> > > > > 
> > > > > Only it wouldn't be then called VFIO precopy async thread terminate 
> > > > > but some
> > > > > generic device state async precopy thread terminate function.
> > > > 
> > > > I don't understand what you meant by "hard code".
> > > 
> > > "Hard code" wasn't maybe the best expression here.
> > > 
> > > I meant the move of the functionality that's provided by
> > > vfio_save_complete_precopy_async_thread_thread_terminate() in this patch 
> > > set
> > > to the common migration code.
> > 
> > I see.  That function only does a thread_join() so far.
> > 
> > So can I understand it as below [1] should work for us, and it'll be clean
> > too (with nothing to hard-code)?
> 
> It will need some signal to the worker threads pool to terminate before
> waiting for them to finish (as the code in [1] just waits).
> 
> In the case of current vfio_save_complete_precopy_async_thread() 
> implementation,
> this signal isn't necessary as this thread simply terminates when it has read
> all the data it needs from the device.
> 
> In a worker threads pool case there will be some threads waiting for
> jobs to be queued to them and so they will need to be somehow signaled
> to exit.

Right.  We may need something like multifd_send_should_exit() +
MultiFDSendParams.sem.  It'll be nicer if we can generalize that part so
multifd threads can also rebase to that thread model, but maybe I'm asking
too much.

> 
> > The time to join() the worker threads can be even later, until
> > migrate_fd_cleanup() on sender side.  You may have a better idea on when
> > would be the best place to do it when start working on it.
> > 
> > > 
> > > > What I was saying is if we target the worker thread pool to be used for
> > > > "concurrently dump vmstates", then it'll make sense to make sure all the
> > > > jobs there were flushed after qemu dumps all non-iterables (because this
> > > > should be the last step of the switchover).
> > > > 
> > > > I expect it looks like this:
> > > > 
> > > >     while (pool->active_threads) {
> > > >         qemu_sem_wait(&pool->job_done);
> > > >     }
> > 
> > [1]
> > 
> (..)
> > > I think that with this thread pool introduction we'll unfortunately 
> > > almost certainly
> > > need to target this patch set at 9.2, since these overall changes (and 
> > > Fabiano
> > > patches too) will need good testing, might uncover some performance 
> > > regressions
> > > (for example related to the number of buffers limit or Fabiano multifd 
> > > changes),
> > > bring some review comments from other people, etc.
> > > 
> > > In addition to that, we are in the middle of holiday season and a lot of 
> > > people
> > > aren't available - like Fabiano said he will be available only in a few 
> > > weeks.
> > 
> > Right, that's unfortunate.  Let's see, but still I really hope we can also
> > get some feedback from Fabiano before it lands, even with that we have
> > chance for 9.1 but it's just challenging, it's the same condition I
> > mentioned since the 1st email.  And before Fabiano's back (he's the active
> > maintainer for this release), I'm personally happy if you can propose
> > something that can partly land earlier in this release.  E.g., if you want
> > we can at least upstream Fabiano's idea first, or some more on top.
> > 
> > For that, also feel free to have a look at my comment today:
> > 
> > https://lore.kernel.org/r/Zn15y693g0AkDbYD@x1n
> > 
> > Feel free to comment there too.  There's a tiny uncertainty there so far on
> > specifying "max size for a device state" if do what I suggested, as multifd
> > setup will need to allocate an enum buffer suitable for both ram + device.
> > But I think that's not an issue and you'll tackle that properly when
> > working on it.  It's more about whether you agree on what I said as a
> > general concept.
> > 
> 
> Since it seems that the discussion on Fabiano's patch set has subsided, I think
> I will start by basing my updated patch set on top of his RFC and then if
> Fabiano wants to submit v1/v2 of his patch set then I will rebase mine on top
> of it.
> 
> Otherwise, you can wait until I have a v2 ready and then we can work with 
> that.

Oh I thought you already started modifying his patchset.

In this case, AFAIR Fabiano plans to rework that RFC series, so maybe
you want to double check with him, and can also wait for his new version if
that's easier, because I do expect there'll be major changes.

Fabiano?

-- 
Peter Xu



