From: Fabiano Rosas
Subject: Re: [RFC PATCH 6/7] migration/multifd: Move payload storage out of the channel parameters
Date: Thu, 11 Jul 2024 18:12:44 -0300

Peter Xu <peterx@redhat.com> writes:

> On Thu, Jul 11, 2024 at 04:37:34PM -0300, Fabiano Rosas wrote:
>
> [...]
>
>> We also don't flush the iov at once, so f->buf seems redundant to
>> me. But of course, if we touch any of that we must ensure we're not
>> dropping any major optimization.
>
> Yes, some tests on that front would be more persuasive when the time comes.
>
> Per my limited experience over the past few years: memcpy on today's chips
> is pretty cheap.  You'll soon see one more example of that when you start
> looking at the qatzip series: that series decided to do one more memcpy of
> all guest pages, to build a larger buffer instead of submitting the
> compression tasks in 4k chunks (even though I didn't think 4k was all that
> small to begin with).
>
> That case may be more involved, so it may not be a great example (e.g. the
> compression algorithm may simply prefer larger buffers), but it's not
> uncommon to see people trade a memcpy for other benefits, especially with
> small buffers.
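As an illustration of that trade-off, the coalescing can be pictured
roughly like this (all names are hypothetical, not taken from the qatzip
series itself):

/*
 * Illustration only: coalesce 4 KiB guest pages into one larger buffer
 * with memcpy before handing them to a compressor, instead of submitting
 * each page as its own compression task.
 */
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE   4096
#define BATCH_PAGES 64              /* 256 KiB per compression task */

struct comp_batch {
    uint8_t buf[BATCH_PAGES * PAGE_SIZE];
    size_t used;
};

/* Assumed to exist for the sake of the sketch. */
void submit_compression_task(const uint8_t *buf, size_t len);

static void batch_add_page(struct comp_batch *b, const uint8_t *page)
{
    /*
     * The extra memcpy is the price paid for a larger, compressor-friendly
     * input; on current CPUs the copy is usually cheap next to the
     * compression work itself.
     */
    memcpy(b->buf + b->used, page, PAGE_SIZE);
    b->used += PAGE_SIZE;

    if (b->used == sizeof(b->buf)) {
        submit_compression_task(b->buf, b->used);
        b->used = 0;
    }
}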
>
> [...]
>
>> Any piece of code that fills an iov with data is in principle able to
>> send that data through multifd.  From this perspective, multifd is just a
>> way to give work to an iochannel.  We don't *need* to use it, but it
>> might be simple enough that the benefit of ditching QEMUFile can be had
>> without too much rework.
>> 
>> Say we provision multifd threads early and leave them waiting for any
>> part of the migration code to send some data. We could have n-1 threads
>> idle waiting for the bulk of the data and use a single thread for any
>> early traffic that does not need to be parallel.
>> 
>> I'm not suggesting we do any of this right away, or even that this is the
>> correct way to go; I'm just letting you know some of my ideas and why I
>> think ram + device state might not be the only data we put through
>> multifd.
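To illustrate that "multifd is just a way to give work to an iochannel"
view: any producer that can describe its data as an iovec could, in
principle, hand it off like this (multifd_queue_iov() is hypothetical, a
stand-in for "give this to an IO channel thread"):

#include <sys/uio.h>
#include <stddef.h>

/* Hypothetical: enqueue an iovec for one of the multifd channels to send. */
int multifd_queue_iov(const struct iovec *iov, unsigned int niov);

static int send_device_blob(const void *header, size_t hdr_len,
                            const void *payload, size_t payload_len)
{
    struct iovec iov[2] = {
        { .iov_base = (void *)header,  .iov_len = hdr_len },
        { .iov_base = (void *)payload, .iov_len = payload_len },
    };

    /*
     * The producer does not care which thread or channel performs the
     * actual write; it only describes the data to be sent.
     */
    return multifd_queue_iov(iov, 2);
}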
>
> We can wait and see whether that turns out to be useful in the future;
> even if so, we'll still have the chance to add more types to the union, I
> think.  But again, I don't expect it.
>
> My gut feeling: we shouldn't bother putting any (1) non-huge-chunk or (2)
> non-IO data onto multifd.  Otherwise I would again ask, "why not the main
> channel?"
>
> [...]
>
>> Just to be clear, do you want a thread pool to replace multifd? Or would
>> it only be used for concurrency on the producer side?
>
> Not to replace multifd.  It's just that I was imagining the multifd
> threads managing only IO, nothing else.
>
> I did wonder whether we could reuse the multifd threads, but then I found
> there's a risk of mangling the two concepts: when we do more than IO in
> the multifd threads (e.g. talking to the VFIO kernel driver to fetch data,
> which can block), we risk blocking the IO even when there is more data we
> could push, leaving the NICs idle.  There's also the complexity where a
> job fetches data from the VFIO kernel driver and then wants to enqueue
> again: a multifd task can enqueue to itself, and such circular enqueuing
> can be challenging.  Imagine 8 concurrent tasks (with a total of 8 multifd
> threads) all trying to enqueue at the same time; they starve each other to
> death.  Things like that.  Then I figured the remaining jobs are really
> fn(void *) type of things; they deserve their own pool of threads.
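That self-enqueue hazard can be sketched roughly as follows (the queue API
here is hypothetical, just to show the failure mode):

#include <stdbool.h>

struct job_queue;                       /* bounded, e.g. 8 slots */

/* Blocks while the queue is full. */
void job_queue_push(struct job_queue *q, void (*fn)(void *), void *opaque);

static struct job_queue *pool_queue;

static void vfio_fetch_job(void *opaque)
{
    /* ... fetch a chunk of device state from the kernel driver ... */

    /*
     * Re-enqueue the next chunk.  If all 8 pool threads sit here at the
     * same time, each waits for a free slot that can only be freed by a
     * pool thread -- none remain to drain the queue, so they all starve.
     * Keeping IO threads (multifd) and job threads (workers) separate
     * avoids this.
     */
    job_queue_push(pool_queue, vfio_fetch_job, opaque);
}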
>
> So the VFIO threads (which used to be per-device) become migration worker
> threads, and we need them on both src and dst: on dst there's still
> pending work to apply the continuously arriving VFIO data back to the
> kernel driver, and that can't be done by the multifd threads either, for a
> similar reason.  Those dest-side worker threads could then also do load()
> not only for VFIO but also for other device states if we add more.
>
> So to summarize, we'll have:
>
>   - 1 main thread (send / recv)
>   - N multifd threads (IOs only)
>   - M worker threads (jobs only)
>
> Of course, postcopy is not involved.  How does that sound?

Looks good. There's a better divide between producer and consumer this
way. I think it will help when designing new features.
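For reference, that split could be pictured roughly like this (names are
hypothetical, not existing QEMU structures; in-tree this would use
QemuThread rather than pthread_t):

#include <pthread.h>

struct migration_threads {
    pthread_t  main_thread;        /* 1: send/recv control flow            */
    pthread_t *multifd_threads;    /* N: IO only, push bytes on channels   */
    pthread_t *worker_threads;     /* M: jobs only, fn(void *) work such as
                                    *    VFIO fetch on src, apply on dst   */
    int n_multifd;
    int n_workers;
};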

One observation is that we'll still have two different entities doing IO
(the multifd threads and the migration thread), which I would prefer to
have share common code at a higher level than the iochannel.

One thing I tried to look into for mapped-ram was whether we could set up
io_uring in the migration code, but I got entirely discouraged by the
migration thread doing IO at random points. And of course, you've seen
what we had to do for direct-io. That was in part due to the migration
thread doing its small writes in parallel at undetermined points in time.
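What I have in mind by common code above the iochannel could look, very
roughly, like a single submission interface that both the migration thread
and the multifd threads go through, so details such as O_DIRECT alignment
or an io_uring backend live in one place.  None of these names exist in
QEMU today; this is only a sketch:

#include <sys/uio.h>
#include <stddef.h>

typedef struct MigIOBackend MigIOBackend;   /* iochannel, io_uring, ...   */

typedef struct MigIORequest {
    struct iovec *iov;
    unsigned int  niov;
    size_t        offset;          /* for mapped-ram style fixed offsets  */
    void (*complete)(void *opaque, int ret);
    void         *opaque;
} MigIORequest;

/*
 * All producers, including the migration thread's small control writes,
 * would submit through the same path instead of writing to the channel
 * directly at arbitrary points.
 */
int mig_io_submit(MigIOBackend *be, MigIORequest *req);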


