[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH] RDMA: Reduce restriction on block length match
From: |
Juan Quintela |
Subject: |
Re: [Qemu-devel] [PATCH] RDMA: Reduce restriction on block length match |
Date: |
Wed, 08 Jul 2015 21:06:16 +0200 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) |
"Dr. David Alan Gilbert" <address@hidden> wrote:
> * Juan Quintela (address@hidden) wrote:
>> "Dr. David Alan Gilbert (git)" <address@hidden> wrote:
>> > From: "Dr. David Alan Gilbert" <address@hidden>
>> >
>> > My e4d633207 patch has an over zealous sanity check that checked
>> > the lengths of the RAM Blocks on source/destination were the same. This
>> > isn't true because of the 'used_length' trick for RAM blocks like the
>> > ACPI table that vary in size.
>> >
>> > Prior to that patch RDMA would also fail in this case, but it should
>> > now work with the changes in the set e4d633207 is in.
>> >
>> > Signed-off-by: Dr. David Alan Gilbert <address@hidden>
>> >
>> > Fixes: e4d633207c129dc5b7d145240ac4a1997ef3902f
>> > ---
>> > migration/rdma.c | 13 +++++++------
>> > 1 file changed, 7 insertions(+), 6 deletions(-)
>> >
>> > diff --git a/migration/rdma.c b/migration/rdma.c
>> > index f106b2a..1d094b0 100644
>> > --- a/migration/rdma.c
>> > +++ b/migration/rdma.c
>> > @@ -3338,14 +3338,15 @@ static int qemu_rdma_registration_stop(QEMUFile
>> > *f, void *opaque,
>> > for (i = 0; i < nb_dest_blocks; i++) {
>> > network_to_dest_block(&rdma->dest_blocks[i]);
>> >
>> > - /* We require that the blocks are in the same order */
>> > + /* We require that the blocks are in the same order,
>> > + * but the used_length trick for acpi blocks means that
>> > + * the destination can validly be larger than the source
>> > + */
>> > if (rdma->dest_blocks[i].length != local->block[i].length) {
>>
>> Should we change the check to be that destination is bigger or equal
>> than source?
>>
>> With your change, we only remove the check?
>
> I'm actually going to drop this change; so keep the error if they're
> different.
>
> My argument works like this (I've not yet found a good way to test it):
>
> 1) The source sends to the destination a list of RAM blocks in the
> qemu-file stream
> 2) The destination performs a resize on the RAM blocks to match the source
> so at this point the destination's block sizes should match.
Humm, I *thought* that what the resize does is getting it bigger, but if
destination is bigger, it does nothing, no?
> 3) The source sends a series of RDMA block registration requests for the
> RAM
> 4) The destination sends a list of RAM registrations back to the source
> 5) This check is checking that this destination list matches the local list
> 6) As long as (4) happens after (2) then the size that the destination sees
> should always match the source.
> 7) I think 4 is after 2 due to a qemu_fflush
>
> So keeping this check guards against 7 not really being true and/or
> the destination populating it's list of blocks prior to (2) - which I have
> a sneaky feeling might be happening, but am not sure yet.
>
> Dave
>
>>
>> Thanks, Juan.
>>
>>
>> > - ERROR(errp, "Block %s/%d has a different length %" PRIu64
>> > - "vs %" PRIu64, local->block[i].block_name, i,
>> > - local->block[i].length,
>> > + fprintf(stderr, "INFO: Block %s/%d has a different length
>> > %"
>> > + PRIu64 "vs %" PRIu64,
>> > local->block[i].block_name,
>> > + i, local->block[i].length,
>> > rdma->dest_blocks[i].length);
>> > - rdma->error_state = -EINVAL;
>> > - return -EINVAL;
>> > }
>> > local->block[i].remote_host_addr =
>> > rdma->dest_blocks[i].remote_host_addr;
> --
> Dr. David Alan Gilbert / address@hidden / Manchester, UK