[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-block] [Qemu-devel] AIO error case
From: |
John Snow |
Subject: |
Re: [Qemu-block] [Qemu-devel] AIO error case |
Date: |
Wed, 23 May 2018 13:53:23 -0400 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 |
On 05/22/2018 06:01 PM, Nishanth Aravamudan via Qemu-devel wrote:
> Hi!
>
Hi! CCing address@hidden;
> I'm tracking an error case in the native AIO path, and was wondering if
> there was a latent (albeit possibly hard to hit) bug. Specifically
> util/async.c::aio_get_linux_aio:
>
> #ifdef CONFIG_LINUX_AIO
> LinuxAioState *aio_get_linux_aio(AioContext *ctx)
> {
> if (!ctx->linux_aio) {
> ctx->linux_aio = laio_init();
> laio_attach_aio_context(ctx->linux_aio, ctx);
> }
> return ctx->linux_aio;
> }
> #endif
>
> laio_init() can in certain conditions return NULL, but that's not checked
> here and then the NULL result is passed directly into
> laio_attach_aio_context, which dereferences it without checking that the
> pointer is valid.
>
Looks like a good old-fashioned bug to me:
``
if (event_notifier_init(&s->e, false) < 0) {
goto out_free_state;
}
if (io_setup(MAX_EVENTS, &s->ctx) != 0) {
goto out_close_efd;
}
...
out_close_efd:
event_notifier_cleanup(&s->e);
out_free_state:
g_free(s);
return NULL;
``
event_notifier_init looks like it is... usually(?) going to return
-errno on error, and the man page for io_setup suggests that the libaio
wrapper for io_setup is supposed to return -errno if it fails; we could
probably hold onto that error code.
We probably ought to forward those error codes up to the caller; maybe:
int laio_init(LinuxAioState **linux_aio);
> I'm not sure what is appropriate if laio_init() returns NULL, returning
> NULL back to the caller of aio_get_linux_aio() has its own issues, because
> those callers don't seem to check its return value either.
>
If laio_init can return NULL, aio_get_linux_aio needs to check for that.
Then callers of that ought to be amended to return some kind of error:
raw_co_prw can return whatever laio_init returned as an error code.
raw_aio_plug doesn't have a way to communicate errors, yet...
raw_aio_unplug doesn't have an error reporting mechanism either.
Those last two are defined here:
.bdrv_io_plug = raw_aio_plug,
.bdrv_io_unplug = raw_aio_unplug,
but it looks like bdrv_io_un|plug doesn't return errors either.
bdrv_io_plug is called by both blk_io_plug and itself, but has no other
callers.
blk_io_plug is called in two places:
virtio_scsi_handle_cmd_req_prepare, which can report -errno error codes
(so we can amend blk_io_plug and bdrv_io_plug to just bubble up errors), and
virtio_blk_handle_vq -- which doesn't appear to be capable of handling
an error, just indicating "progress".
unplug has three usages, one in virtio-blk and two in virtio-scsi, all
of which only appear to check for "progress" and don't seem to have
explicit error reporting.
Probably unplug can't actually fail like this though -- I'm assuming by
the name that it must occur after a call to plug and that will have
memoized the call to aio_get_linux_aio by then.
So... probably what we need to do here is take a look at the virtio-blk
usage of plug and do error-plumbing as necessary.
Wanna send a patch?
--js
> Thanks in advance!
> -Nish
>
- Re: [Qemu-block] [Qemu-devel] AIO error case,
John Snow <=