qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [RFC][PATCH] qemu-img: make convert async


From: Max Reitz
Subject: Re: [Qemu-block] [RFC][PATCH] qemu-img: make convert async
Date: Sun, 12 Feb 2017 03:06:05 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.7.0

On 02.02.2017 17:06, Peter Lieven wrote:
> this is something I have been thinking about for almost 2 years now.
> we heavily have the following two use cases when using qemu-img convert.
> 
> a) reading from NFS and writing to iSCSI for deploying templates
> b) reading from iSCSI and writing to NFS for backups
> 
> In both processes we use libiscsi and libnfs so we have no kernel pagecache.
> As qemu-img convert is implemented with sync operations that means we
> read one buffer and then write it. No parallelism and each sync request
> takes as long as it takes until it is completed.
> 
> What I put together is an approach to use aio routines for the conversion
> process to have ideally read and write happening in parallel.
> 
> The code is far from clean or complete, but I would appreaciate comments
> and thoughts from you.
> 
> So far I have the following runtimes when reading an uncompressed QCOW2 from
> NFS and writing it to iSCSI (raw):
> 
> qemu-img (master)
>  nfs -> iscsi 33 secs
>  nfs -> ram   19 secs
>  ram -> iscsi 14 secs
> 
> qemu-img-async
>  nfs -> iscsi 23 secs
>  nfs -> ram   17 secs
>  ram -> iscsi 14 secs
> 
> Its visible that on master the runtimes add up as expected. The async branch
> is faster, but not as fast as I would have expected. I would expect the 
> runtime
> to be as slow as the slowest of the two involved transfers.
> 
> Thank you,
> Peter
> 
> Signed-off-by: Peter Lieven <address@hidden>
> ---
>  qemu-img.c | 271 
> +++++++++++++++++++++++++++++++++++++++++++++----------------
>  1 file changed, 199 insertions(+), 72 deletions(-)

Asynchronous convert sounds good. But your implementation looks a bit
weird to me.

Your implementation has four "slots" which receive work from a central
work queue that they then process. You can do that, but it looks
counter-intuitive to me. (Or if you do that, I would do it using
coroutines: Start up four coroutines that simply submit blk_co_* requests.)

What I would have done (if using AIO) is the following: Seek through the
image, finding the next bit of work to do (without having a central work
queue). Then submit an AIO request with a newly allocated piece of data
(not using fixed slots). Continue until four requests are in flight,
then wait until one is settled.

I think this would simplify the design. Also, it's basically what the
mirror block job does.

Which brings me to a totally different point: At some point we intended
to convert as many qemu-img functions to block jobs as possible.
Unfortunately, we only ever did one and that's commit. If we were to
convert convert to mirror, then we'd get async for free.

But the effort of converting convert to mirror is probably much larger
than adding async to convert...

Max

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]