qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC PATCH 6/7] migration/multifd: Move payload storage out of the c


From: Fabiano Rosas
Subject: Re: [RFC PATCH 6/7] migration/multifd: Move payload storage out of the channel parameters
Date: Wed, 10 Jul 2024 13:10:37 -0300

Peter Xu <peterx@redhat.com> writes:

> On Thu, Jun 27, 2024 at 11:27:08AM +0800, Wang, Lei wrote:
>> > Or graphically:
>> > 
>> > 1) client fills the active slot with data. Channels point to nothing
>> >    at this point:
>> >   [a]      <-- active slot
>> >   [][][][] <-- free slots, one per-channel
>> > 
>> >   [][][][] <-- channels' p->data pointers
>> > 
>> > 2) multifd_send() swaps the pointers inside the client slot. Channels
>> >    still point to nothing:
>> >   []
>> >   [a][][][]
>> > 
>> >   [][][][]
>> > 
>> > 3) multifd_send() finds an idle channel and updates its pointer:
>> 
>> It seems the action "finds an idle channel" is in step 2 rather than step 3,
>> which means the free slot is selected based on the id of the channel found, 
>> am I
>> understanding correctly?
>
> I think you're right.
>
> Actually I also feel like the desription here is ambiguous, even though I
> think I get what Fabiano wanted to say.
>
> The free slot should be the first step of step 2+3, here what Fabiano
> really wanted to suggest is we move the free buffer array from multifd
> channels into the callers, then the caller can pass in whatever data to
> send.
>
> So I think maybe it's cleaner to write it as this in code (note: I didn't
> really change the code, just some ordering and comments):
>
> ===8<===
> @@ -710,15 +710,11 @@ static bool multifd_send(MultiFDSlots *slots)
>       */
>      active_slot = slots->active;
>      slots->active = slots->free[p->id];
> -    p->data = active_slot;
> -
> -    /*
> -     * By the next time we arrive here, the channel will certainly
> -     * have consumed the active slot. Put it back on the free list
> -     * now.
> -     */
>      slots->free[p->id] = active_slot;
>  
> +    /* Assign the current active slot to the chosen thread */
> +    p->data = active_slot;
> ===8<===
>
> The comment I removed is slightly misleading to me too, because right now 
> active_slot contains the data hasn't yet been delivered to multifd, so
> we're "putting it back to free list" not because of it's free, but because
> we know it won't get used until the multifd send thread consumes it
> (because before that the thread will be busy, and we won't use the buffer
> if so in upcoming send()s).
>
> And then when I'm looking at this again, I think maybe it's a slight
> overkill, and maybe we can still keep the "opaque data" managed by multifd.
> One reason might be that I don't expect the "opaque data" payload keep
> growing at all: it should really be either RAM or device state as I
> commented elsewhere in a relevant thread, after all it's a thread model
> only for migration purpose to move vmstates..

Some amount of flexibility needs to be baked in. For instance, what
about the handshake procedure? Don't we want to use multifd threads to
put some information on the wire for that as well?

> Putting it managed by multifd thread should involve less change than this
> series, but it could look like this:
>
> typedef enum {
>     MULTIFD_PAYLOAD_RAM = 0,
>     MULTIFD_PAYLOAD_DEVICE_STATE = 1,
> } MultifdPayloadType;
>
> typedef enum {
>     MultiFDPages_t ram_payload;
>     MultifdDeviceState_t device_payload;
> } MultifdPayload;
>
> struct MultiFDSendData {
>     MultifdPayloadType type;
>     MultifdPayload data;
> };

Is that an union up there? So you want to simply allocate in multifd the
max amount of memory between the two types of payload? But then we'll
need a memset(p->data, 0, ...) at every round of sending to avoid giving
stale data from one client to another. That doesn't work with the
current ram migration because it wants p->pages to remain active across
several calls of multifd_queue_page().

>
> Then the "enum" makes sure the payload only consumes only the max of both
> types; a side benefit to save some memory.
>
> I think we need to make sure MultifdDeviceState_t is generic enough so that
> it will work for mostly everything (especially normal VMSDs).  In this case
> the VFIO series should be good as that was currently defined as:
>
> typedef struct {
>     MultiFDPacketHdr_t hdr;
>
>     char idstr[256] QEMU_NONSTRING;
>     uint32_t instance_id;
>
>     /* size of the next packet that contains the actual data */
>     uint32_t next_packet_size;
> } __attribute__((packed)) MultiFDPacketDeviceState_t;

This is the packet, a different thing. Not sure if your paragraph above
means to talk about that or really MultifdDeviceState, which is what is
exchanged between the multifd threads and the client code.

>
> IIUC that was what we need exactly with idstr+instance_id, so as to nail
> exactly at where should the "opaque device state" go to, then load it with
> a buffer-based loader when it's ready (starting from VFIO, to get rid of
> qemufile).  For VMSDs in the future if ever possible, that should be a
> modified version of vmstate_load() where it may take buffers not qemufiles.
>
> To Maciej: please see whether above makes sense to you, and if you also
> agree please consider that with your VFIO work.
>
> Thanks,
>
>> 
>> >   []
>> >   [a][][][]
>> > 
>> >   [a][][][]
>> >   ^idle



reply via email to

[Prev in Thread] Current Thread [Next in Thread]