From: Anthony Liguori
Subject: Re: [Qemu-devel] [PATCH 7/10] Switch the memory savevm handler to be "live"
Date: Tue, 09 Sep 2008 17:49:15 -0500
User-agent: Thunderbird 2.0.0.16 (X11/20080723)

Jamie Lokier wrote:
> Anthony Liguori wrote:
>> This patch replaces the static memory savevm/loadvm handler with a
>> "live" one.  This handler is used even if performing a non-live
>> migration.
>
> Excellent.  One of the annoyances of savevm currently is that it pauses
> the VM for a significant time, so you can't use it to snapshot
> production systems while they're in use.

qcow2 needs some modification to allow this, but yeah, that's on my todo list.  When you do a savevm today, you write everything to one chunk of the qcow2 file (presumably at the end).  The only thing keeping others from allocating over you is that you're essentially holding the big QEMU lock (because we're single threaded).  With an asynchronous savevm, that no longer holds.  So what we really need to do is let snapshots chain within a qcow2 file.  We can then write savevm data a chunk at a time and chain the chunks together.

Shouldn't be that hard, and it should be reasonable to do in a backwards-compatible way.
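
Roughly, I'm imagining something like this for the chaining (purely a sketch; the struct and field names are illustrative, not the real qcow2 on-disk format):

    /* Rough sketch only -- the layout and names here are illustrative,
     * not the actual qcow2 on-disk format.  The idea is that savevm data
     * becomes a linked list of chunks inside the image, so the writer can
     * allocate a fresh chunk whenever it needs more space instead of
     * reserving one contiguous region up front under the global lock. */
    #include <stdint.h>

    typedef struct SavevmChunkHeader {
        uint64_t next_chunk_offset; /* file offset of next chunk; 0 ends the chain */
        uint32_t data_length;       /* bytes of savevm payload after this header */
        uint32_t flags;             /* reserved, to leave room for compatibility */
        /* data_length bytes of savevm payload follow */
    } SavevmChunkHeader;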

>> The key difference between this handler and the previous is that each
>> page is prefixed with the address of the page.  The QEMUFile rate
>> limiting code, in combination with the live migration dirty tracking
>> bits, is used to determine which pages should be sent and how many
>> should be sent.
>>
>> The live save code "converges" when the number of dirty pages
>> reaches a fixed amount.  Currently, this is 10 pages.  This is
>> something that should eventually be derived from whatever the
>> bandwidth limitation is.
>
> Does this mean that a snapshot could record the same page many times,
> perhaps even unbounded, while the guest is dirtying pages at a high
> rate?  Or is the guest dirtying rate limited too, to ensure the file
> writer will converge in bounded time?

With a synchronous savevm (non-live), it's all deterministic.  Everything starts out dirty, and nothing gets dirtied again because the guest isn't running.  With an asynchronous savevm, it's nondeterministic.

In general, you can't avoid the nondeterminism.  In practice, you usually converge quickly, so simply having a maximum number of iterations, where if you exceed it you stop the guest and revert to a synchronous savevm, is completely reasonable.

The other options would be to fail outright after a certain number of iterations, or to punt entirely to the management tools and provide a mechanism to cancel an existing live migration if it takes too long.  This functionality exists in KVM; I simply need to add it to this patch series.  It's quite simple, really.
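
To make the control flow concrete, the loop looks roughly like this (helper names are stand-ins for the real savevm/migration internals, and the iteration cap is made up for illustration):

    /* Rough sketch of the live-save loop described above, not the
     * actual patch.  All helpers below are hypothetical stand-ins. */
    #include <stdbool.h>

    typedef struct QEMUFile QEMUFile;

    void mark_all_ram_dirty(void);          /* stage 1: everything starts dirty */
    long dirty_page_count(void);            /* pages still needing (re)send */
    bool send_one_dirty_page(QEMUFile *f);  /* write one addr-prefixed page,
                                             * clear its dirty bit */
    bool rate_limit_exceeded(QEMUFile *f);  /* the QEMUFile rate limiting */
    void stop_guest(void);                  /* pause the VM for a final pass */

    enum {
        CONVERGE_THRESHOLD = 10,  /* the fixed threshold from the patch */
        MAX_ITERATIONS     = 30   /* illustrative cap on live passes */
    };

    void ram_save_live(QEMUFile *f)
    {
        int iterations = 0;

        mark_all_ram_dirty();

        /* Live phase: the guest keeps running, so pages can be dirtied
         * (and resent) while we iterate.  The rate limiter bounds how
         * much we send per pass. */
        while (dirty_page_count() > CONVERGE_THRESHOLD) {
            if (++iterations > MAX_ITERATIONS) {
                /* Guest is dirtying faster than we can drain: stop it
                 * and fall back to a synchronous final pass. */
                stop_guest();
                break;
            }
            while (!rate_limit_exceeded(f) && send_one_dirty_page(f))
                ;
        }

        /* Final pass: drain whatever dirty pages remain. */
        while (send_one_dirty_page(f))
            ;
    }

Cancelling from the management tools would just be another exit path out of that outer loop.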

Regards,

Anthony Liguori

> Thanks,
> -- Jamie




