From: Paolo Bonzini
Subject: Re: [Qemu-devel] [RFC PATCH RDMA support v2: 5/6] connection-setup code between client/server
Date: Mon, 18 Feb 2013 11:52:52 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

On 14/02/2013 20:29, Michael R. Hines wrote:
> 
>> Are you still using the tcp for transferring device state? If so you
>> can call the tcp functions from the migration rdma code as a first
>> step but I would prefer it to use RDMA too.
> 
> This is the crux of the problem of using RDMA for migration: currently
> all of the QEMU migration control logic and device state goes through
> the QEMUFile implementation. RDMA, however, is by nature a zero-copy
> solution and is incompatible with QEMUFile.

With the patches I sent yesterday, there is no more buffering involved
in migration.c.  All data goes straight from arch_init.c to a QEMUFile.

QEMUFile still does some buffering, but this should change with other
patches that Orit is working on.

> Using RDMA for transferring device state is not recommended: setting up
> an RDMA transfer requires registering the memory locations on both sides
> with the RDMA hardware. This is not ideal because it would require
> pinning the memory holding the device state and then issuing the RDMA
> transfer for *each* type of device - which would require changing the
> control path of every type of migrated device in QEMU.
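
(For reference, that registration step looks roughly like the sketch
below on the verbs API - not code from the patch, and pin_for_rdma() /
unpin_after_rdma() are made-up helper names:)

#include <stddef.h>
#include <infiniband/verbs.h>

/* Sketch only: every buffer an RDMA transfer touches must first be
 * registered (and thereby pinned) with the HCA.  'pd' is a previously
 * allocated protection domain. */
static struct ibv_mr *pin_for_rdma(struct ibv_pd *pd, void *buf, size_t len)
{
    return ibv_reg_mr(pd, buf, len,
                      IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE);
}

/* ...and every registration has to be torn down again afterwards. */
static void unpin_after_rdma(struct ibv_mr *mr)
{
    ibv_dereg_mr(mr);
}

Doing that once per migrated device's state buffer is what makes the
quoted approach unattractive.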

Yes, this would not work well.  However, you can (I think) define a
QEMUFileOps implementation for RDMA that:

1) registers the buffer of a QEMUFile with the RDMA hardware;

2) in its get_buffer (receive) and put_buffer (send) callbacks, issues a
synchronous RDMA transfer;

3) unregisters the buffer in the close callback.

As a proof of concept, this would also work (though it would make no
sense) for transferring the RAM; in the end, of course, it would be
used only for the device state.
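
Concretely, the three callbacks could look something like the sketch
below.  This is hypothetical, not the actual patch: the QEMUFileOps
callback signatures follow the current qemu-file code, while
RDMAContext, rdma_write_sync() and rdma_read_sync() are invented
stand-ins for the verbs plumbing:

#include <stdint.h>
#include <infiniband/verbs.h>
#include "migration/qemu-file.h"    /* QEMUFileOps */

typedef struct RDMAContext RDMAContext;  /* invented connection state */

/* invented synchronous transfer helpers; both return the byte count on
 * success and a negative value on error, as QEMUFile expects */
int rdma_write_sync(RDMAContext *rdma, const uint8_t *buf, int size);
int rdma_read_sync(RDMAContext *rdma, uint8_t *buf, int size);

typedef struct QEMUFileRDMA {
    RDMAContext *rdma;    /* connection state */
    struct ibv_mr *mr;    /* buffer registered at open time (step 1) */
} QEMUFileRDMA;

static int qemu_rdma_put_buffer(void *opaque, const uint8_t *buf,
                                int64_t pos, int size)
{
    QEMUFileRDMA *r = opaque;
    /* (2) send side: synchronous RDMA transfer of the registered
     * buffer; returns only once the completion has been reaped */
    return rdma_write_sync(r->rdma, buf, size);
}

static int qemu_rdma_get_buffer(void *opaque, uint8_t *buf,
                                int64_t pos, int size)
{
    QEMUFileRDMA *r = opaque;
    /* (2) receive side: block until the peer's transfer has landed */
    return rdma_read_sync(r->rdma, buf, size);
}

static int qemu_rdma_close(void *opaque)
{
    QEMUFileRDMA *r = opaque;
    /* (3) unregister (unpin) the buffer when the file is closed */
    ibv_dereg_mr(r->mr);
    return 0;
}

static const QEMUFileOps rdma_file_ops = {
    .put_buffer = qemu_rdma_put_buffer,
    .get_buffer = qemu_rdma_get_buffer,
    .close      = qemu_rdma_close,
};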

It's not a problem to add more operations to QEMUFileOps or similar.
It's also not a problem to change the way buf is allocated, if you need
it to be page-aligned or something like that.
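
For instance, getting a page-aligned buffer would just be something
like this (the helper name is made up):

#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* hypothetical: allocate the QEMUFile I/O buffer page-aligned, so that
 * registering it with the HCA pins whole pages and nothing else */
static uint8_t *alloc_aligned_io_buf(size_t size)
{
    void *buf = NULL;
    /* posix_memalign() places the buffer on a page boundary */
    if (posix_memalign(&buf, (size_t)sysconf(_SC_PAGESIZE), size) != 0) {
        return NULL;
    }
    return buf;
}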

It is much better than adding migration_use_rdma() everywhere.

Paolo

> 
> Currently the patch we submitted bypasses QEMUFile: it just issues
> the RDMA transfer for the memory that was dirtied and then continues
> along the rest of the migration call path normally.
> 
> In an ideal world, we would prefer a hybrid approach, something like:
> 
> *Begin Migration Iteration Round:*
> 1. stop VCPU
> 2. start iterative pass over memory
> 3. send control signals (if any) / device state to QEMUFile
> 4. When a dirty memory page is found, do:
>      a) Instruct the QEMUFile to block
>      b) Issue the RDMA transfer
>      c) Instruct the QEMUFile to unblock
> 5. resume VCPU
> 
> This would require a "smarter" QEMUFile implementation that understands
> when to block and for how long.
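
The blocking scheme quoted above would boil down to something like the
following sketch; every identifier here is hypothetical, and the point
is only the interleaving of the QEMUFile stream with the zero-copy
transfers:

typedef struct QEMUFile QEMUFile;
typedef struct RDMAContext RDMAContext;

void vcpu_stop(void);                             /* 1 */
void send_control_and_device_state(QEMUFile *f);  /* 3 */
void *next_dirty_page(void);                      /* 4: NULL when clean */
void qemu_file_block(QEMUFile *f);                /* 4a */
void rdma_write_page(RDMAContext *r, void *page); /* 4b */
void qemu_file_unblock(QEMUFile *f);              /* 4c */
void vcpu_resume(void);                           /* 5 */

static void hybrid_iteration_round(QEMUFile *f, RDMAContext *r)
{
    void *page;

    vcpu_stop();                          /* 1. stop VCPU */
    send_control_and_device_state(f);     /* 3. stream via QEMUFile */
    while ((page = next_dirty_page())) {  /* 2. pass over memory */
        qemu_file_block(f);               /* 4a. quiesce the byte stream */
        rdma_write_page(r, page);         /* 4b. zero-copy page transfer */
        qemu_file_unblock(f);             /* 4c. let the stream continue */
    }
    vcpu_resume();                        /* 5. resume VCPU */
}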



