qemu-devel

Re: [Qemu-devel] QEMU interfaces for image streaming and post-copy block migration


From: Alexander Graf
Subject: Re: [Qemu-devel] QEMU interfaces for image streaming and post-copy block migration
Date: Tue, 7 Sep 2010 16:01:41 +0200

On 07.09.2010, at 15:41, Anthony Liguori wrote:

> Hi,
> 
> We've got copy-on-read and image streaming working in QED and before going 
> much further, I wanted to bounce some interfaces off of the libvirt folks to 
> make sure our final interface makes sense.
> 
> Here's the basic idea:
> 
> Today, you can create images based on base images that are copy on write.  
> With QED, we also support copy on read which forces a copy from the backing 
> image on read requests and write requests.
> 
> In addition to copy on read, we introduce a notion of streaming a block 
> device, which means that we search for an unallocated region of the leaf 
> image and force a copy-on-read operation.
> 
> The combination of copy-on-read and streaming means that you can start a 
> guest based on slow storage (like over the network) and bring in blocks on 
> demand while also having a deterministic mechanism to complete the transfer.
> 
> The interface for copy-on-read is just an option within qemu-img create.  
> Streaming, on the other hand, requires a bit more thought.  Today, I have a 
> monitor command that does the following:
> 
> stream <device> <sector offset>
> 
> This tries to stream the minimal amount of data for a single I/O 
> operation and then returns how many sectors were successfully streamed.
> 
> The idea is to drive this interface with a loop like:
> 
> offset = 0
> while offset < image_size:
>   wait_for_idle_time()
>   count = stream(device, offset)
>   offset += count
> 
> Obviously, the "wait_for_idle_time()" requires system-wide awareness.  The 
> thing I'm not sure about is whether libvirt would 1) want to expose a similar 
> stream interface and let management software determine idle time, or 2) 
> attempt to detect idle time on its own and provide a higher level interface.  
> If (2), the question then becomes whether we should try to do this within 
> qemu and provide libvirt a higher level interface.

I'm torn here too. Why not expose both? Have a qemu-internal daemon available 
that takes a sleep time as a parameter, plus an external "pull sectors" command. 
We'll see which one is more useful, but I don't think it's so much code that 
it would justify having only one of the two. The internal daemon could also be 
started using a command line parameter, which helps non-managed users.
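
A rough sketch of how the two could share one loop, in Python for brevity. 
Everything here is made up for illustration: ToyImage, pull_all and 
internal_daemon are not existing qemu interfaces, the chunk size is an 
assumption, and the toy image only records which sectors have been copied.

```python
import time


class ToyImage:
    """Stand-in for a copy-on-read leaf image; tracks allocated sectors only."""

    def __init__(self, size, chunk=8):
        self.size = size                  # image size in sectors
        self.chunk = chunk                # assumed max sectors per stream() call
        self.allocated = [False] * size

    def stream(self, offset):
        """Copy at most one chunk starting at offset; return sectors handled."""
        end = min(offset + self.chunk, self.size)
        for sector in range(offset, end):
            self.allocated[sector] = True  # pretend we copied from the backing image
        return end - offset


def pull_all(image, wait_for_idle):
    """External driver: the caller decides what 'idle' means."""
    offset = 0
    while offset < image.size:
        wait_for_idle()
        offset += image.stream(offset)
    return offset


def internal_daemon(image, sleep_time):
    """Internal variant: same loop, with a fixed sleep instead of idle detection."""
    return pull_all(image, lambda: time.sleep(sleep_time))
```

The only difference between the two is who supplies the waiting policy, which 
is why carrying both should not cost much code.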

> 
> A related topic is block migration.  Today we support pre-copy migration 
> which means we transfer the block device and then do a live migration.  
> Another approach is to do a live migration first, run a block server on the 
> source, and use image streaming on the destination to move the device.
> 
> With QED, to implement this one would:
> 
> 1) launch qemu-nbd on the source while the guest is running
> 2) create a qed file on the destination with copy-on-read enabled and a 
> backing file using nbd: to point to the source qemu-nbd
> 3) run qemu -incoming on the destination with the qed file
> 4) execute the migration
> 5) when migration completes, begin streaming on the destination to complete 
> the copy
> 6) when the streaming is complete, shut down the qemu-nbd instance on the 
> source
> 
> This is a bit involved and we could potentially automate some of this in qemu 
> by launching qemu-nbd and providing commands to do some of this.  Again 
> though, I think the question is what type of interfaces libvirt would prefer: 
> low level interfaces plus recipes on how to do high level things, or higher 
> level interfaces?
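
To make the ordering of the six steps above concrete, here is a toy model in 
Python. It is only a sketch under loud assumptions: Source stands in for 
qemu-nbd, Destination for the copy-on-read qed file, blocks are whole Python 
values, and the live migration itself is not modelled. None of these names 
are real qemu or libvirt interfaces.

```python
class Source:
    """Plays the role of qemu-nbd on the source: serves blocks read-only."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.running = True

    def read(self, index):
        if not self.running:
            raise RuntimeError("source block server already shut down")
        return self.blocks[index]


class Destination:
    """Plays the role of the copy-on-read image backed by the source."""

    def __init__(self, source):
        self.source = source
        self.local = {}                  # block index -> data held locally

    def read(self, index):
        if index not in self.local:      # copy-on-read: fetch once, keep it
            self.local[index] = self.source.read(index)
        return self.local[index]

    def write(self, index, data):
        self.local[index] = data         # guest writes land locally only

    def stream(self):
        """Pull every block not yet local; afterwards the source is unused."""
        for index in range(len(self.source.blocks)):
            self.read(index)


def post_copy(blocks):
    source = Source(blocks)              # step 1
    dest = Destination(source)           # steps 2-4 (migration not modelled)
    dest.write(0, "new")                 # guest now runs on the destination
    first = dest.read(1)                 # on-demand read before streaming
    dest.stream()                        # step 5
    source.running = False               # step 6
    return first, [dest.read(i) for i in range(len(blocks))]
```

Shutting the source down before streaming has finished would make the next 
on-demand read fail, which is the reason step 6 has to come last.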

Is there anything keeping us from making the QMP socket multiplexable? I was 
thinking of something like:

{ command = "nbd_server" ; block = "qemu_block_name" }
{ result = "done" }
<qmp socket turns into nbd socket>

This way we don't require yet another port, don't have to care about conflicts 
and get internal qemu block names for free.
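
A minimal sketch of the handoff, again in Python and again hypothetical. 
"nbd_server" is only the command proposed above, not an existing QMP command; 
the "execute" / "arguments" / "return" shape follows the usual QMP wire 
format, and the error replies are simplified. MultiplexedChannel is a made-up 
name and no real NBD traffic is handled.

```python
import json


class MultiplexedChannel:
    """Toy model of a QMP connection that can hand itself over to NBD."""

    def __init__(self, block_names):
        self.block_names = set(block_names)  # internal qemu block names
        self.mode = "qmp"
        self.export = None

    def feed(self, line):
        """Handle one line of QMP input and return the reply as a JSON string."""
        if self.mode != "qmp":
            raise RuntimeError("channel now carries NBD traffic, not QMP")
        request = json.loads(line)
        if request.get("execute") != "nbd_server":
            return json.dumps({"error": {"desc": "unknown command"}})
        block = request.get("arguments", {}).get("block")
        if block not in self.block_names:
            return json.dumps({"error": {"desc": "unknown block device"}})
        self.mode = "nbd"                    # from here on the socket speaks NBD
        self.export = block
        return json.dumps({"return": {}})
```

The point of the model is that the switch happens only after a successful 
reply, so a client that names a wrong block device still has a working QMP 
connection.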


Alex



