qemu-devel

Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy


From: Anthony Liguori
Subject: Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
Date: Wed, 23 Feb 2011 10:01:55 -0600
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.15) Gecko/20101027 Lightning/1.0b1 Thunderbird/3.0.10

On 02/23/2011 09:31 AM, Avi Kivity wrote:
On 02/23/2011 04:35 PM, Anthony Liguori wrote:
On 02/23/2011 07:01 AM, Avi Kivity wrote:
On 02/23/2011 01:14 AM, Anthony Liguori wrote:

-drive already ties into the qemuopts infrastructure and we have readconfig and writeconfig. I don't think we're missing any major pieces to do this in a more proper fashion.

The problem with qemu config files is that it splits the authoritative source of where images are stored into two. Is it in the management tool's database or is it in qemu's config file?

I like to use the phrase "stateful config file". To me, it's just a database for QEMU to persist data about the VM. It's the only way for QEMU to make certain transactions atomic in the face of QEMU crashing.

The user-visible config file is a totally different concept. A management tool launches QEMU and tells it where to keep its state database. The management application may prepopulate the state database or it may just use an empty file.

In that case the word 'config' is misleading. To me, it implies that the user configures something, and qemu reads it, not something mostly internal to qemu.

Understood.


Qemu does keep state. Currently only images, but in theory also the on-board NVRAM.

Yeah, this is a good example of an area where a "stateful config file" would be useful. I like the idea of storing this sort of thing in a text file with a config structure because a user certainly wants to be able to specify the boot order. Being able to tweak this kind of stuff adds a lot of interesting capabilities.


QEMU uses the state database to store information that is created dynamically, for instance devices added through device_add. A device added via -device wouldn't necessarily get added to the state database.

Practically speaking, it lets you invoke QEMU with a fixed command line, while still using the monitor to make changes that would otherwise require the command line being updated.

Then the invoker quickly loses track of what the actual state is. It can't just remember which commands it issued (presumably in response to the user updating user visible state). It has to parse the stateful config file qemu outputs.

Well specifically, it has to ask QEMU and QEMU can tell it the current state via a nice structured data format over QMP. It's a hell of a lot easier than the management tool trying to do this outside of QEMU.

  But at which points should it parse it?

I was thinking that we should post events whenever we change the stateful config. That would let the management tool have a mechanism for determining when settings have been changed. Of course, if the management tool crashes, it should re-read at startup.
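To make the event idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `CONFIG_CHANGED` event name, the generation counter, and both classes are illustrations, not existing QEMU or QMP interfaces. It shows a management tool that re-reads the whole config at startup and again whenever an event reports a newer generation.

```python
class StatefulConfig:
    """Toy stand-in for QEMU's state database (hypothetical, not a QEMU API)."""

    def __init__(self):
        self.generation = 0      # bumped on every change
        self.settings = {}
        self.listeners = []      # callbacks that receive posted events

    def set(self, key, value):
        self.settings[key] = value
        self.generation += 1
        # Assumed event shape; QEMU defines no such event.
        event = {"event": "CONFIG_CHANGED", "generation": self.generation}
        for notify in self.listeners:
            notify(event)

    def read(self):
        return {"generation": self.generation, "settings": dict(self.settings)}


class ManagementTool:
    def __init__(self, config):
        self.config = config
        self.view = {}
        self.seen_generation = -1
        self.resync()                         # re-read at startup
        config.listeners.append(self.on_event)

    def resync(self):
        snapshot = self.config.read()
        self.view = snapshot["settings"]
        self.seen_generation = snapshot["generation"]

    def on_event(self, event):
        # The event only says that something changed; the config is the truth.
        if event["generation"] > self.seen_generation:
            self.resync()
```

Because the tool always resynchronizes from the config itself, a change made while the tool was down is picked up by the startup read rather than lost.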

I don't think it's reasonable to have three different ways to interact with qemu, all needed: the command line, reading and writing the stateful config file, and the monitor. I'd rather push for starting qemu with a blank guest and assembling (cold-plugging) all the hardware via the monitor before starting the guest.

Yes. I view the command line as optional. To me, this is the ideal interaction:

1) start qemu with an empty stateful config file

2) issue monitor commands to create all devices and backends

3) the stateful config file totally captures the state of all of the issued QMP commands. The management tool can relaunch the guest just by passing the stateful config file to QEMU.

4) when the management tool needs to "extract" a config file, it can read the stateful config (through the monitor) and generate its own config.

5) the management tool should treat the stateful config file as more or less opaque. It shouldn't be visible to the end user.
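A rough Python sketch of steps 1-3, under loud assumptions: `StateDatabase` and `ToyQemu` are made-up names, the one-JSON-command-per-line file format is invented for illustration, and only `device_add` and `device_del` are modelled. The point is just that recording each successful command is enough to relaunch the guest from the state file alone.

```python
import json


class StateDatabase:
    """Hypothetical stateful config: one JSON-encoded command per line."""

    def __init__(self, text=""):
        self.commands = [json.loads(line)
                         for line in text.splitlines() if line.strip()]

    def record(self, command):
        self.commands.append(command)

    def dump(self):
        return "\n".join(json.dumps(c, sort_keys=True) for c in self.commands)


class ToyQemu:
    """Toy model of a guest assembled purely from monitor commands."""

    def __init__(self, state_db):
        self.state_db = state_db
        self.devices = {}
        # Relaunch: replay everything recorded by an earlier run.
        for command in state_db.commands:
            self._apply(command)

    def _apply(self, command):
        args = command["arguments"]
        if command["execute"] == "device_add":
            self.devices[args["id"]] = args["driver"]
        elif command["execute"] == "device_del":
            self.devices.pop(args["id"], None)
        else:
            raise ValueError("unsupported command: " + command["execute"])

    def execute(self, command):
        self._apply(command)             # apply first so failures are not recorded
        self.state_db.record(command)


# Start with an empty state database and cold-plug a device via the "monitor".
first = ToyQemu(StateDatabase())
first.execute({"execute": "device_add",
               "arguments": {"driver": "virtio-net-pci", "id": "net0"}})

# Relaunch from nothing but the persisted state.
second = ToyQemu(StateDatabase(first.state_db.dump()))
```

A real implementation would more likely persist the resulting device tree than a command log, but the log keeps the sketch short.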

In the non-managed case, users should interact directly with the config file.

For the problem at hand, one solution is to make qemu stop after the copy, and then management can issue an additional command to rearrange the disk and resume the guest. A drawback here is that if management dies, the guest is stopped until it restarts. We also make management latency guest visible, even if it doesn't die at an inconvenient place.

An alternative approach is to have the copy be performed by a new layered block format driver:

- create a new image, type = live-copy, containing three pieces of information
   - source image
   - destination image
   - copy state (initially nothing is copied)
- tell qemu to switch to the new image
- qemu starts copying, updates copy state as needed
- copy finishes, event is emitted; reads and writes still serviced
- management receives event, switches qemu to destination image
- management removes live-copy image

If management dies while this is happening, it can simply query the state of the copy. Similarly, if qemu dies, the copy state is persistent (could be 0/1 or real range of blocks).
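The layered live-copy image described above could be modelled roughly as follows. This is a sketch only: the `LiveCopy` class, the list-of-blocks representation, and the single "blocks copied so far" counter are assumptions made for illustration, not a proposed on-disk format. Guest writes keep being serviced during the copy, and the copy state can be saved and resumed.

```python
import json


class LiveCopy:
    """Toy live-copy image: source, destination, and persistent copy state."""

    def __init__(self, source, destination, copied=0):
        self.source = source            # list of blocks
        self.destination = destination  # list of the same length
        self.copied = copied            # leading blocks already copied

    def done(self):
        return self.copied == len(self.source)

    def step(self):
        """Copy one more block; return True once the copy is complete."""
        if not self.done():
            self.destination[self.copied] = self.source[self.copied]
            self.copied += 1
        return self.done()

    def read(self, index):
        return self.source[index]

    def write(self, index, data):
        self.source[index] = data
        # Blocks already copied must be mirrored or the destination goes stale.
        if index < self.copied:
            self.destination[index] = data

    def save_state(self):
        return json.dumps({"copied": self.copied})

    @classmethod
    def resume(cls, source, destination, saved):
        return cls(source, destination, json.loads(saved)["copied"])
```

After a crash, `LiveCopy.resume` continues from the saved counter instead of starting over. A real driver would have to persist the state only after the corresponding data is safely on the destination.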

This is a more elegant solution to the problem than the commit problem but it's also a one-off. I think we have a generic problem here and we ought to try to solve it generically (within reason).

Can you give more examples?

I think I demonstrated that hot-plug can be solved via the existing interfaces.

Sure. CMOS settings right now are not persisted across reboot. Guest initiated activities like IDE or PCI eject are tricky to persist correctly within a management tool.

We could add events for all of these things but it's all racy since events are posted. If we have a stateful config file, we can make all of these things non-racy and post an event that the config has changed. If there's a crash, the management tool can read the config on startup to catch up on missed events.
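One way to get the "persist first, then post" ordering is the usual write-to-temp-and-rename pattern. The Python sketch below is an assumption-laden illustration: the JSON file format, the `persist_then_notify` and `load_state` helpers, and the `CONFIG_CHANGED` event are all hypothetical and say nothing about how QEMU would actually store its state.

```python
import json
import os
import tempfile


def persist_then_notify(path, state, post_event):
    """Atomically replace the state file, then post a change event."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())      # data reaches disk before the rename
        os.replace(tmp_path, path)      # readers see the old or new file, never half
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # The event is only a hint; a listener that misses it can re-read the file.
    post_event({"event": "CONFIG_CHANGED"})


def load_state(path):
    """What a management tool would call at startup to catch up."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}
```

If the process dies before the rename, the old state is still intact; if it dies after the rename but before the event, the startup read still finds the new state.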

I think the nature of a posted event management interface is such that we need a stateful config that persists across QEMU invocations.

Regards,

Anthony Liguori




