
Re: [Qemu-devel] [PATCH v7 00/12] rdma: migration support


From: Michael R. Hines
Subject: Re: [Qemu-devel] [PATCH v7 00/12] rdma: migration support
Date: Thu, 13 Jun 2013 17:17:17 -0400
User-agent: Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130329 Thunderbird/17.0.5

On 06/13/2013 04:06 PM, Paolo Bonzini wrote:

> (CC-ing qemu-devel).

>> OK, that's good to know. This means that we need to bring up the mlock()
>> problem as a "larger" issue in the Linux community instead of the QEMU
>> community.
>>
>> In the meantime, how about I update the RDMA patch to do the following:
>>
>> 1. Solution #1: If the user requests "x-rdma-pin-all", then:
>>        if QEMU was started with "-realtime mlock=on", allow the
>>        capability; otherwise, disallow the capability.
>>
>> 2. Solution #2: Create a NEW QEMU monitor command which locks memory
>>    *in advance*, before the migrate command occurs, to clearly indicate
>>    to the user that the cost of locking memory must be paid before the
>>    migration starts.
>>
>> Which solution do you prefer? Or do you have an alternative idea?
> Let's just document it in the release notes.  There's time to fix it.

> Regarding the timestamp problem, it should be fixed in the RDMA code.
> You did find a bug, but xyz_start_outgoing_migration should be
> asynchronous and the pinning should happen in the setup phase.  This is
> because the setup phase is already running outside the big QEMU lock and
> the guest would not be frozen.

I think you misunderstood the symptom. The pinning is *already*
happening in the setup phase (xyz_start_outgoing_migration), not
inside the migration_thread().

The problem is in Linux: the guest appears to be frozen not because
of any locks, but because the pinning itself (allocating and clearing
memory) slows down the virtual machine so much that it looks like it
is not running.

> I think the patches are ready for merging, because incremental work
> makes it easier to discuss the changes (*), but you really need to do
> two things before 1.6, or I would rather revert them.

Yes, could someone go ahead and pull them? They have been thoroughly tested.


> (1) move the pinning to the setup phase
This is already done in the existing patchset.

> (2) add a debug mode where every pass unpins all the memory and
> restarts.  Speed doesn't matter; this is so that the protocol supports
> it from the beginning, and any caching heuristics need to be done on
> the source side.  As with all debug modes, it will be somewhat prone to
> bitrot, but at least there is a reference implementation for anyone who
> later wants to add caching.
>
> I think (2) is very important so that, for example, during fault
> tolerance you can reduce the pinned size a bit for smaller workloads,
> even without ballooning.

I agree that this is a necessary feature (dynamic source registration),
but it is a lot more complicated than a simple unpin of everything
before every pass. As you suggested, I would rather not introduce unused
code, but instead wait until someone in the future has a fully
functional, testable implementation.

Actually, I am already working on a fault-tolerance implementation as we
speak and will be posting it soon, so it's likely I will submit a patch
to do something like this at that time.

>     (*) for example, why the introduction of acct_update_position?  Is
>     it a fix for a bug that always existed, or driven by some other
>     changes?

This is important because RDMA writes do not happen synchronously. It is
impossible to update the accounting inside of save_live_iterate()
because the RDMA operations are still outstanding.

Only after they have completed later can we actually know what the
accounting statistics really are.

- Michael



