qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [PATCH v1 01/18] block/io: add bdrv_aio_{preadv, pwrite


From: Kevin Wolf
Subject: Re: [Qemu-block] [PATCH v1 01/18] block/io: add bdrv_aio_{preadv, pwritev}
Date: Thu, 24 Nov 2016 13:36:20 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

Am 24.11.2016 um 11:58 hat Pavel Butsykin geschrieben:
> On 23.11.2016 17:28, Kevin Wolf wrote:
> >Am 15.11.2016 um 07:36 hat Pavel Butsykin geschrieben:
> >>It's just byte-based wrappers over bdrv_co_aio_prw_vector(), which provide
> >>  a byte-based interface for AIO read/write.
> >>
> >>Signed-off-by: Pavel Butsykin <address@hidden>
> >
> >I'm in the process to phase out the last users of bdrv_aio_*() so that
> >this set of interfaces can be removed. I'm doing this because it's an
> >unnecessary redundancy, we have too many wrapper functions that expose
> >the same functionality with different syntax. So let's not add new
> >users.
> >
> >At first sight, you don't even seem to use bdrv_aio_preadv() for actual
> >parallelism, but you often have a pattern like this:
> >
> >     void foo_cb(void *opaque)
> >     {
> >         ...
> >         qemu_coroutine_enter(acb->co);
> >     }
> >
> >     void caller()
> >     {
> >         ...
> >         acb = bdrv_aio_preadv(...);
> >         qemu_coroutine_yield();
> >     }
> >
> >The code will actually become a lot simpler if you use bdrv_co_preadv()
> >instead because you don't have to have a callback, but you get pure
> >sequential code.
> >
> >The part that actually has some parallelism, pcache_readahead_request(),
> >already creates its own coroutine, so it runs in the background without
> >using callback-style interfaces.
> 
> I used bdrv_co_preadv(), because it conveniently solves the partial
> cache hit. To solve the partial cache hit, we need to split a request
> into smaller parts, make asynchronous requests and wait for all
> requests in one place.
> 
> Do you propose to create a coroutine for each part of request? It
> seemed to me that bdrv_co_preadv() is a wrapper that allows us to get
> rid of the same code.

It's actually the other way round, bdrv_co_preadv() is the "native"
block layer API, and bdrv_aio_*() are wrappers providing an alternative
interface.


I'm looking at pcache_co_readahead(), for example. It looks like this:

    bdrv_aio_preadv(bs->file, node->common.offset, &readahead_acb.qiov,
                    node->common.bytes, pcache_aio_readahead_cb,
                    &readahead_acb);
    qemu_coroutine_yield();

And then we have pcache_aio_readahead_cb(), which ends in:

    qemu_coroutine_enter(acb->co);

So here the callback style doesn't buy you anything, it just rips the
code apart in two function. There is no parallelism here anyway,
pcache_co_readahead() doesn't do anything until the callback reenters
it. This is a very obvious example where bdrv_co_preadv() will simplify
the code.


It's similar with the other bdrv_aio_preadv() calls, which are in
pcache_co_preadv():

        if (bytes > s->max_aio_size) {
            bdrv_aio_preadv(bs->file, offset, qiov, bytes,
                            pcache_aio_read_cb, &acb);
            goto out;
        }

        update_req_stats(s->req_stats, offset, bytes);

        status = pcache_lookup_data(&acb);
        if (status == CACHE_MISS) {
            bdrv_aio_preadv(bs->file, offset, qiov, bytes,
                            pcache_aio_read_cb, &acb);
        } else if (status == PARTIAL_CACHE_HIT) {
            assert(acb.part.qiov.niov != 0);
            bdrv_aio_preadv(bs->file, acb.part.offset, &acb.part.qiov,
                            acb.part.bytes, pcache_aio_read_cb, &acb);
        }

        pcache_readahead_request(&acb);

        if (status == CACHE_HIT && --acb.ref == 0) {
            return 0;
        }

    out:
        qemu_coroutine_yield();

Here you have mainly the pcache_readahead_request() call between
bdrv_aio_preadv() and the yield. It only spawns a new coroutine, which
works in the background, so I think you can move it to before the reads
and then the reads can trivially become bdrv_co_preadv() and the
callback can again be inlined instead of ripping the function in two
parts.


The bdrv_aio_pwritev() call in pcache_co_pwritev() is just the same
thing and using the coroutine version results in obvious code
improvements.


And I think this are all uses of bdrv_aio_*() in the pcache driver, so
converting it to use bdrv_co_*() instead isn't only possible, but will
improve the legibility of your code, too. It's a clear win in all three
places.

Kevin



reply via email to

[Prev in Thread] Current Thread [Next in Thread]