[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-block] [PATCH v5 04/12] block/io_uring: implements interfaces
From: |
Stefan Hajnoczi |
Subject: |
Re: [Qemu-block] [PATCH v5 04/12] block/io_uring: implements interfaces for io_uring |
Date: |
Sat, 22 Jun 2019 16:10:07 +0100 |
On Mon, Jun 17, 2019 at 1:27 PM Maxim Levitsky <address@hidden> wrote:
> On Mon, 2019-06-10 at 19:18 +0530, Aarushi Mehta wrote:
> > + if (!cqes) {
> > + break;
> > + }
> > + LuringAIOCB *luringcb = io_uring_cqe_get_data(cqes);
> > + ret = cqes->res;
> > +
> > + if (ret == luringcb->qiov->size) {
> > + ret = 0;
> > + } else if (ret >= 0) {
>
>
> You should very carefully check the allowed return values here.
>
> It looks like you can get '-EINTR' here, which would ask you to rerun the
> read operation, and otherwise
> you will get the number of bytes read, which might be less that what was
> asked for, which implies that you
> need to retry the read operation with the remainder of the buffer rather that
> zero the end of the buffer IMHO
>
> (0 is returned on EOF according to 'read' semantics, which I think are used
> here, thus a short read might not be an EOF)
>
>
> Looking at linux-aio.c though I do see that it just passes through the
> returned value with no special treatments.
> including lack of check for -EINTR.
>
> I assume that since aio is linux specific, and it only supports direct IO, it
> happens
> to have assumption of no short reads/-EINTR (but since libaio has very sparse
> documentation I can't verify this)
>
> On the other hand the aio=threads implementation actually does everything as
> specified on the 'write' manpage,
> retrying the reads on -EINTR, and doing additional reads if less that
> required number of bytes were read.
>
> Looking at io_uring implementation in the kernel I see that it does support
> synchronous (non O_DIRECT mode),
> and in this case, it goes through the same ->read_iter which is pretty much
> the same path that
> regular read() takes and so it might return short reads and or -EINTR.
Thanks, Maxim. I've confirmed that -EINTR needs to be handled based
on fs/io_uring.c. We need to get a new sqe (since the old one is no
longer usable) and submit the request again. There are no ordering
guarantees between pending requests so doing this is permissible.
Stefan
[Qemu-block] [PATCH v5 09/12] block: add trace events for io_uring, Aarushi Mehta, 2019/06/10
[Qemu-block] [PATCH v5 08/12] block/file-posix.c: extend to use io_uring, Aarushi Mehta, 2019/06/10