qemu-devel
From: Fabiano Rosas
Subject: Re: [RFC PATCH 6/7] migration/multifd: Move payload storage out of the channel parameters
Date: Fri, 19 Jul 2024 13:54:37 -0300

Peter Xu <peterx@redhat.com> writes:

> On Thu, Jul 18, 2024 at 07:32:05PM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > On Thu, Jul 18, 2024 at 06:27:32PM -0300, Fabiano Rosas wrote:
>> >> Peter Xu <peterx@redhat.com> writes:
>> >> 
>> >> > On Thu, Jul 18, 2024 at 04:39:00PM -0300, Fabiano Rosas wrote:
>> >> >> v2 is ready, but unfortunately this approach doesn't work. When client A
>> >> >> takes the payload, it fills it with its data, which may include
>> >> >> allocating memory. MultiFDPages_t does that for the offset. This means
>> >> >> we need a round of free/malloc at every packet sent. For every client
>> >> >> and every allocation they decide to do.
>> >> >
>> >> > Shouldn't be a blocker?  E.g. one option is:
>> >> >
>> >> >     /* Allocate both the pages + offset[] */
>> >> >     MultiFDPages_t *pages = g_malloc0(sizeof(MultiFDPages_t) +
>> >> >                                       sizeof(ram_addr_t) * n);
>> >> >     pages->allocated = n;
>> >> >     pages->offset = (ram_addr_t *)&pages[1];
>> >> >
>> >> > Or.. we can also make offset[] dynamic size, if that looks less tricky:
>> >> >
>> >> > typedef struct {
>> >> >     /* number of used pages */
>> >> >     uint32_t num;
>> >> >     /* number of normal pages */
>> >> >     uint32_t normal_num;
>> >> >     /* number of allocated pages */
>> >> >     uint32_t allocated;
>> >> >     RAMBlock *block;
>> >> >     /* offset of each page */
>> >> >     ram_addr_t offset[0];
>> >> > } MultiFDPages_t;
>> >> 
>> >> I think you missed the point. If we hold a pointer inside the payload,
>> >> we lose the reference when the other client takes the structure and puts
>> >> its own data there. So we'll need to alloc/free every time we send a
>> >> packet.
>> >
>> > For option 1: when the buffer switch happens, MultiFDPages_t will switch as
>> > a whole, including its offset[], because its offset[] always belongs to this
>> > MultiFDPages_t.  So yes, we want to lose that *offset reference together
>> > with MultiFDPages_t here, so the offset[] always belongs to one single
>> > MultiFDPages_t object for its lifetime.
>> 
>> MultiFDPages_t is part of MultiFDSendData, it doesn't get allocated
>> individually:
>> 
>> struct MultiFDSendData {
>>     MultiFDPayloadType type;
>>     union {
>>         MultiFDPages_t ram_payload;
>>     } u;
>> };
>> 
>> (and even if it did, then we'd lose the pointer to ram_payload anyway -
>> or require multiple free/alloc)
>
> IMHO it's the same.
>
> The core idea is we allocate a buffer to put MultiFDSendData which may
> contain either Pages_t or DeviceState_t, and the size of the buffer should
> be MAX(A, B).
>

Right, but with your zero-length array proposals we need to have a
separate allocation for MultiFDPages_t, because the size of the
allocation has to account for the number of pages.

Also, don't think only about MultiFDPages_t. With this approach we
cannot have pointers to memory allocated by the client at all anywhere
inside the union. Every pointer needs to have another reference
somewhere else to ensure we don't leak it. That's an unnecessary
restriction.
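
A minimal sketch of the problem, using stand-in types rather than the
real QEMU definitions. `PagesPayload`, `DevicePayload`, `SendData` and
the three helper functions are hypothetical names, and plain
`calloc`/`free` stand in for the glib allocators. Without the clear step,
the second fill would overwrite the first client's pointer and leak it;
with it, every hand-over costs a free plus a fresh allocation:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in payloads; NOT the real QEMU structs. */
typedef struct {
    uint32_t num;
    uint64_t *offset;           /* memory owned by the RAM client */
} PagesPayload;

typedef struct {
    size_t len;
    uint8_t *buf;               /* memory owned by the device client */
} DevicePayload;

typedef enum { PAYLOAD_NONE, PAYLOAD_RAM, PAYLOAD_DEVICE } PayloadType;

typedef struct {
    PayloadType type;
    union {
        PagesPayload ram;
        DevicePayload device;
    } u;
} SendData;

/* Release whatever the previous client allocated inside the union. */
static void send_data_clear(SendData *d)
{
    assert(d != NULL);
    if (d->type == PAYLOAD_RAM) {
        free(d->u.ram.offset);
    } else if (d->type == PAYLOAD_DEVICE) {
        free(d->u.device.buf);
    }
    memset(d, 0, sizeof(*d));   /* type becomes PAYLOAD_NONE */
}

/* RAM client takes the slot: one free (maybe) plus one allocation. */
static int send_data_fill_ram(SendData *d, uint32_t n)
{
    send_data_clear(d);
    d->u.ram.offset = calloc(n, sizeof(*d->u.ram.offset));
    if (!d->u.ram.offset) {
        return -1;
    }
    d->type = PAYLOAD_RAM;
    d->u.ram.num = n;
    return 0;
}

/* Device client takes the slot: same cost again. */
static int send_data_fill_device(SendData *d, size_t len)
{
    send_data_clear(d);
    d->u.device.buf = calloc(len, 1);
    if (!d->u.device.buf) {
        return -1;
    }
    d->type = PAYLOAD_DEVICE;
    d->u.device.len = len;
    return 0;
}
```

Note that `send_data_clear()` has to switch on the payload type, so the
code that owns the slot ends up knowing how each client lays out its
data.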

>> 
>> >
>> > For option 2: I meant MultiFDPages_t will have no offset[] pointer anymore,
>> > but make it part of the struct (MultiFDPages_t.offset[]).  Logically it's
>> > the same as option 1 but maybe slight cleaner.  We just need to make it
>> > sized 0 so as to be dynamic in size.
>> 
>> Seems like an undefined behavior magnet. If I sent this as the first
>> version, you'd NACK me right away.
>> 
>> Besides, it's an unnecessary restriction to impose on the client
>> code. And like above, we don't allocate the struct directly, it's part
>> of MultiFDSendData; that's an advantage of using the union.
>> 
>> I think we've reached the point where I'd like to hear more concrete
>> reasons for not going with the current proposal, except for the
>> simplicity argument you already put. I like the union idea, but OTOH we
>> already have a working solution right here.
>
> I think the issue with the current proposal is that each client will need
> to allocate (N+1)*buffer, so the more users there are, the more buffers
> we'll need (M users, then M*(N+1)*buffer).  Currently it seems to me we
> will have 3 users at least: RAM, VFIO, and some other VMSD devices TBD in
> the mid-to-long term; the latter two will share the same DeviceState_t.
> Maybe vDPA as well at some point?  Then 4.

You used the opposite argument earlier in this thread to argue in favor
of the union: We'll only have 2 clients. I'm confused.

Although, granted, this RFC does use more memory.

> I'd agree with this approach only if multifd is flexible enough to not even
> know what's the buffers, but it's not the case, and we seem to only care about
> two:
>
>   if (type==RAM)
>      ...
>   else
>      assert(type==DEVICE);
>      ...

I don't understand: "not even know what's the buffers" is exactly what
this series is about. It doesn't have any such conditional on "type".

>
> In this case I think it's easier we have multifd manage all the buffers
> (after all, it knows them well...).  Then the consumption is not
> M*(N+1)*buffer, but (M+N)*buffer.

Fine. As I said, I like the union approach. It's just that it doesn't
work if the client wants to have a pointer in there.
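
To put numbers on the two buffer counts quoted above, here is a small
sketch. It assumes N is the number of multifd channels and M the number
of clients, which is one reading of the figures in this thread; the
function names are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Per-client storage: each of the M clients keeps N+1 buffers. */
static size_t buffers_per_client_scheme(size_t m_clients, size_t n_channels)
{
    assert(m_clients > 0 && n_channels > 0);
    return m_clients * (n_channels + 1);
}

/* Shared storage managed by multifd: M+N buffers in total. */
static size_t buffers_shared_scheme(size_t m_clients, size_t n_channels)
{
    assert(m_clients > 0 && n_channels > 0);
    return m_clients + n_channels;
}
```

With 3 clients and 8 channels this gives 27 buffers for the per-client
scheme against 11 for the shared one.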

Again, this is client data that multifd holds, it's not multifd
data. MultiFDPages_t or DeviceState_t have nothing to do with
multifd. It should be ok to have:

DeviceState_t *devstate = &p->data->u.device;
devstate->foo = g_new0(...);
devstate->bar = g_new0(...);

just like we have:

MultiFDPages_t *pages = &p->data->u.ram;
pages->offset = g_new0(ram_addr_t, page_count);

>
> Perhaps push your tree somewhere so we can have a quick look?

https://gitlab.com/farosas/qemu/-/commits/multifd-pages-decouple

> I'm totally
> lost when you said I'll nack it.. so maybe I didn't really get what you
> meant.  Code may clarify that.

I'm conjecturing that any contributor adding a zero-length array (a[0])
would probably be given a hard time on the mailing list. There are 10
instances of it in the code base. The proper way to grow an array is to
use a flexible array member (a[]) instead.
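
For reference, a sketch of the flexible array member form. The
`PagesFlex` type, `page_offset_t` (a stand-in for `ram_addr_t`) and
`pages_flex_new()` are hypothetical, and `calloc` stands in for the glib
allocators:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t page_offset_t;     /* stand-in for ram_addr_t */

typedef struct {
    uint32_t num;                   /* number of used pages */
    uint32_t allocated;             /* number of allocated pages */
    page_offset_t offset[];         /* C99 flexible array member, must be last */
} PagesFlex;

/* One allocation covers the header and n offsets. */
static PagesFlex *pages_flex_new(uint32_t n)
{
    PagesFlex *p = calloc(1, sizeof(*p) + (size_t)n * sizeof(p->offset[0]));

    if (p) {
        p->allocated = n;
    }
    return p;
}
```

The array length has to be known when the struct is allocated, and ISO C
does not allow a struct that ends in a flexible array member (or a union
containing one) to be a member of another struct. Both points get in the
way of embedding such a type by value in something like MultiFDSendData.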
