From: Fabiano Rosas
Subject: Re: [PATCH v1 00/13] Multifd device state transfer support with VFIO consumer
Date: Wed, 17 Jul 2024 17:19:15 -0300

Peter Xu <peterx@redhat.com> writes:

> On Tue, Jul 16, 2024 at 10:10:12PM +0200, Maciej S. Szmigiero wrote:
>> On 27.06.2024 16:56, Peter Xu wrote:
>> > On Thu, Jun 27, 2024 at 11:14:28AM +0200, Maciej S. Szmigiero wrote:
>> > > On 26.06.2024 18:23, Peter Xu wrote:
>> > > > On Wed, Jun 26, 2024 at 05:47:34PM +0200, Maciej S. Szmigiero wrote:
>> > > > > On 26.06.2024 03:51, Peter Xu wrote:
>> > > > > > On Wed, Jun 26, 2024 at 12:44:29AM +0200, Maciej S. Szmigiero wrote:
>> > > > > > > On 25.06.2024 19:25, Peter Xu wrote:
>> > > > > > > > On Mon, Jun 24, 2024 at 09:51:18PM +0200, Maciej S. Szmigiero wrote:
>> > > > > > > > > Hi Peter,
>> > > > > > > > 
>> > > > > > > > Hi, Maciej,
>> > > > > > > > 
>> > > > > > > > > 
>> > > > > > > > > On 23.06.2024 22:27, Peter Xu wrote:
>> > > > > > > > > > On Tue, Jun 18, 2024 at 06:12:18PM +0200, Maciej S. Szmigiero wrote:
>> > > > > > > > > > > From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>> > > > > > > > > > > 
>> > > > > > > > > > > This is an updated v1 patch series of the RFC (v0) 
>> > > > > > > > > > > series located here:
>> > > > > > > > > > > https://lore.kernel.org/qemu-devel/cover.1713269378.git.maciej.szmigiero@oracle.com/
>> > > > > > > > > > 
>> > > > > > > > > > OK, I took some hours thinking about this today, and here
>> > > > > > > > > > are some high-level comments for this series.  I'll start
>> > > > > > > > > > with the ones most relevant to what Fabiano has already
>> > > > > > > > > > suggested in the other thread, then I'll add some more.
>> > > > > > > > > > 
>> > > > > > > > > > https://lore.kernel.org/r/20240620212111.29319-1-farosas@suse.de
>> > > > > > > > > 
>> > > > > > > > > That's a long list, thanks for these comments.
>> > > > > > > > > 
>> > > > > > > > > I have responded to them inline below.
>> > > > > > > > > (..)
>> > > > > > > 
>> > > > > > > 2) Submit this operation to the thread pool and wait for it
>> > > > > > >    to complete,
>> > > > > > 
>> > > > > > VFIO doesn't need to have its own code waiting.  If this pool is
>> > > > > > for migration purposes in general, the QEMU migration framework
>> > > > > > will need to wait at some point for all jobs to finish before
>> > > > > > moving on.  Perhaps it should be at the end of the non-iterative
>> > > > > > session.
>> > > > > 
>> > > > > So essentially, instead of calling save_live_complete_precopy_end
>> > > > > handlers from the migration code, you would like to hard-code its
>> > > > > current VFIO implementation of calling
>> > > > > vfio_save_complete_precopy_async_thread_thread_terminate().
>> > > > > 
>> > > > > Only then it wouldn't be called a VFIO precopy async thread
>> > > > > terminate function but some generic device state async precopy
>> > > > > thread terminate function.
>> > > > 
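
A rough sketch of what such a generic device state save-thread terminate
could look like, assuming a simple pool with a shutdown flag and a job
semaphore. Every name below is hypothetical, not existing QEMU code; only
the threading primitives (QemuThread, QemuSemaphore, qemu_sem_post(),
qemu_thread_join(), qatomic_set()) are real:

    typedef struct SaveThreadPool {
        QemuThread *threads;    /* worker threads saving device state */
        int num_threads;
        bool exiting;           /* set to request worker shutdown */
        QemuSemaphore job_sem;  /* workers block here waiting for jobs */
    } SaveThreadPool;

    /* Generic replacement for the VFIO-specific terminate: the migration
     * core, not the device code, signals and then joins the workers. */
    static void save_thread_pool_terminate(SaveThreadPool *pool)
    {
        qatomic_set(&pool->exiting, true);
        /* Wake every worker so it can observe 'exiting' and return */
        for (int i = 0; i < pool->num_threads; i++) {
            qemu_sem_post(&pool->job_sem);
        }
        for (int i = 0; i < pool->num_threads; i++) {
            qemu_thread_join(&pool->threads[i]);
        }
    }
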
>> > > > I don't understand what you meant by "hard code".
>> > > 
>> > > "Hard code" wasn't maybe the best expression here.
>> > > 
>> > > I meant moving the functionality that's provided by
>> > > vfio_save_complete_precopy_async_thread_thread_terminate() in this
>> > > patch set to the common migration code.
>> > 
>> > I see.  That function only does a thread_join() so far.
>> > 
>> > So can I take it that [1] below should work for us, and that it'll be
>> > clean too (with nothing to hard-code)?
>> 
>> The worker thread pool will need some signal to terminate before we wait
>> for the threads to finish (the code in [1] just waits).
>> 
>> In the case of the current vfio_save_complete_precopy_async_thread()
>> implementation, this signal isn't necessary, as that thread simply
>> terminates when it has read all the data it needs from the device.
>> 
>> In the worker thread pool case there will be threads waiting for jobs to
>> be queued to them, so they will need to be signaled somehow to exit.
>
> Right.  We may need something like multifd_send_should_exit() +
> MultiFDSendParams.sem.  It'll be nicer if we can generalize that part so
> multifd threads can also rebase to that thread model, but maybe I'm asking
> too much.
>
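
For the worker side of that model, a minimal sketch of a pooled worker
loop patterned after the multifd exit protocol mentioned above (a
should-exit check after every semaphore wake-up). The SaveThreadPool
fields and the pop-job helper are illustrative assumptions, not existing
QEMU code:

    /* Hypothetical job descriptor queued to the pool */
    typedef struct SaveJob {
        void (*func)(void *opaque);
        void *opaque;
    } SaveJob;

    static void *save_thread_pool_worker(void *opaque)
    {
        SaveThreadPool *pool = opaque;

        for (;;) {
            /* Sleep until a job is queued or termination is requested */
            qemu_sem_wait(&pool->job_sem);
            if (qatomic_read(&pool->exiting)) {
                break;
            }
            SaveJob *job = save_thread_pool_pop_job(pool); /* hypothetical */
            if (job) {
                job->func(job->opaque);
                qatomic_dec(&pool->pending_jobs);
                qemu_sem_post(&pool->job_done); /* wakes the flush below */
            }
        }
        return NULL;
    }
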
>> 
>> > The time to join() the worker threads can be even later, up until
>> > migrate_fd_cleanup() on the sender side.  You may have a better idea of
>> > the best place to do it once you start working on it.
>> > 
>> > > 
>> > > > What I was saying is that if we target the worker thread pool at
>> > > > "concurrently dumping vmstates", then it makes sense to ensure all
>> > > > its jobs are flushed after QEMU dumps all the non-iterables (because
>> > > > this should be the last step of the switchover).
>> > > > 
>> > > > I expect it looks like this:
>> > > > 
>> > > >     while (pool->active_threads) {
>> > > >         qemu_sem_wait(&pool->job_done);
>> > > >     }
>> > 
>> > [1]
>> > 
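
Expanding [1] slightly under the same assumptions: the flush at the end
of the non-iterative session can wait on a pending-job counter posted by
the workers, leaving the actual thread join for migrate_fd_cleanup() as
suggested above. The field names are again illustrative:

    /* Wait for all queued jobs to drain; each worker decrements
     * 'pending_jobs' and posts 'job_done' after finishing a job. */
    static void save_thread_pool_flush(SaveThreadPool *pool)
    {
        while (qatomic_read(&pool->pending_jobs) > 0) {
            qemu_sem_wait(&pool->job_done);
        }
    }
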
>> (..)
>> > > I think that with this thread pool introduction we'll unfortunately
>> > > almost certainly need to target this patch set at 9.2, since these
>> > > overall changes (and Fabiano's patches too) will need good testing,
>> > > might uncover some performance regressions (for example related to the
>> > > buffer count limit or Fabiano's multifd changes), bring some review
>> > > comments from other people, etc.
>> > > 
>> > > In addition to that, we are in the middle of the holiday season and a
>> > > lot of people aren't available - Fabiano, for example, said he will
>> > > only be available again in a few weeks.
>> > 
>> > Right, that's unfortunate.  Let's see; I still really hope we can get
>> > some feedback from Fabiano before it lands.  Even with that we have a
>> > chance for 9.1, but it's challenging - the same condition I've mentioned
>> > since the first email.  And before Fabiano is back (he's the active
>> > maintainer for this release), I'd personally be happy if you could
>> > propose something that can partly land earlier in this release.  E.g.,
>> > if you want, we can at least upstream Fabiano's idea first, or some more
>> > on top.
>> > 
>> > For that, also feel free to have a look at my comment today:
>> > 
>> > https://lore.kernel.org/r/Zn15y693g0AkDbYD@x1n
>> > 
>> > Feel free to comment there too.  There's a tiny uncertainty there so far
>> > on specifying a "max size for a device state" if we do what I suggested,
>> > as the multifd setup will need to allocate a buffer suitable for both
>> > ram + device.  But I think that's not an issue and you'll tackle it
>> > properly when working on it.  It's more about whether you agree with
>> > what I said as a general concept.
>> > 
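
Presumably the buffer concern above just means sizing each multifd send
buffer for the larger of the two packet payloads. A sketch under that
assumption (MULTIFD_PACKET_SIZE is the existing ram payload size from
migration/multifd.h; the device-state cap is a hypothetical placeholder):

    /* Hypothetical cap on a single device state packet */
    #define DEVICE_STATE_MAX_PACKET_SIZE (1 * MiB)

    static size_t multifd_packet_payload_size(void)
    {
        /* One buffer shape must fit either a ram or a device state packet */
        return MAX(MULTIFD_PACKET_SIZE, DEVICE_STATE_MAX_PACKET_SIZE);
    }
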
>> 
>> Since it seems that the discussion on Fabiano's patch set has subsided, I
>> think I will start by basing my updated patch set on top of his RFC; then,
>> if Fabiano wants to submit a v1/v2 of his patch set, I will rebase mine on
>> top of it.
>> 
>> Otherwise, you can wait until I have a v2 ready and then we can work with 
>> that.
>
> Oh I thought you already started modifying his patchset.
>
> In this case, AFAIR Fabiano has plans to rework that RFC series, so you may
> want to double check with him, and you can also wait for his new version if
> that's easier, because I do expect there'll be major changes.
>
> Fabiano?

Don't wait on me. I think I can make the changes Peter suggested without
affecting the interfaces used by this series too much. If it comes to it,
I can rebase this series "under" Maciej's.


