
Re: [Qemu-devel] [PATCH 2/3] v2 Fix Block Hotplug race with drive_unplug


From: Anthony Liguori
Subject: Re: [Qemu-devel] [PATCH 2/3] v2 Fix Block Hotplug race with drive_unplug()
Date: Fri, 29 Oct 2010 10:28:03 -0500
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.15) Gecko/20101027 Lightning/1.0b1 Thunderbird/3.0.10

On 10/29/2010 09:57 AM, Kevin Wolf wrote:
> On 29.10.2010 16:40, Anthony Liguori wrote:
>> On 10/29/2010 09:29 AM, Kevin Wolf wrote:
>>> On 29.10.2010 16:15, Anthony Liguori wrote:
>>>> I don't think it's a bad idea to do that, but to the extent that the
>>>> block API is designed after POSIX file I/O, close does not usually
>>>> imply flush.
>>>
>>> I don't think it really resembles POSIX. More or less the only thing
>>> they have in common is that both provide open, read, write and close,
>>> which is something that probably any API for file access provides.
>>>
>>> The operation you're talking about here is bdrv_flush/fsync, which is
>>> not implied by a POSIX close?
>>
>> Yes.  But I think for the purposes of this patch, a bdrv_cancel_all()
>> would be just as good.  The intention is to eliminate pending I/O
>> requests; the fsync is just a side effect.
>
> Well, if I'm not mistaken, bdrv_flush would provide only this side
> effect and not the semantics that you're really looking for. This is
> why I suggested adding both bdrv_flush and qemu_aio_flush. We could
> probably introduce a qemu_aio_flush variant that flushes only one
> BlockDriverState - this is what you really want.
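Concretely, the per-device variant Kevin has in mind might look something
like the sketch below. The name qemu_aio_flush_bs and the per-device
in-flight counter are hypothetical (neither exists in the tree);
qemu_aio_wait() is the existing helper that runs the event loop until at
least one pending AIO completion has been processed.

    /* Hypothetical: wait only for requests issued against one
     * BlockDriverState, instead of draining every device the way
     * qemu_aio_flush() does. */
    static void qemu_aio_flush_bs(BlockDriverState *bs)
    {
        while (bs->in_flight > 0) {    /* hypothetical per-device counter */
            qemu_aio_wait();           /* dispatch pending completions */
        }
    }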

>>>>> And why do we have to flush here, but not before other uses of
>>>>> bdrv_close(), such as eject_device()?


>>>> Good question.  Kevin should also confirm, but looking at the code, I
>>>> think flush() is needed before close.  If there's a pending I/O event
>>>> and you close before the I/O event is completed, you'll get a callback
>>>> for completion against a bogus BlockDriverState.
>>>>
>>>> I can't find anything in either raw-posix or the generic block layer
>>>> that would mitigate this.

>>> I'm not aware of anything either. This is what qemu_aio_flush would do.
>>>
>>> It seems reasonable to me to call both qemu_aio_flush and bdrv_flush
>>> in bdrv_close. We probably don't really need to call bdrv_flush to
>>> operate correctly, but it can't hurt, and bdrv_close shouldn't happen
>>> that often anyway.
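As a sketch, the ordering Kevin is proposing for bdrv_close() would be
roughly the following (simplified: the real bdrv_close() also tears down
the driver state, and error handling is omitted):

    void bdrv_close(BlockDriverState *bs)
    {
        qemu_aio_flush();   /* drain: wait for all in-flight AIO requests
                             * (across every device) to complete */
        bdrv_flush(bs);     /* then push completed writes down to
                             * stable storage */
        /* ... existing teardown ... */
    }

The drain is what actually fixes the hotplug race; the bdrv_flush is the
can't-hurt part.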

>> I agree.  Re: qemu_aio_flush, we have to wait for it to complete, which
>> gets a little complicated in bdrv_close().
>
> qemu_aio_flush is the function that waits for requests to complete.
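In other words, qemu_aio_flush() is essentially a loop around
qemu_aio_wait(). A simplified sketch of the idea, not the literal aio.c
code (the predicate name is made up):

    void qemu_aio_flush(void)
    {
        /* keep dispatching completions while any registered handler
         * still reports in-flight requests */
        while (aio_requests_pending()) {   /* hypothetical predicate */
            qemu_aio_wait();
        }
    }

This is also what closes the window described above: once the loop exits,
no request can complete later and fire its callback against a
BlockDriverState that bdrv_close() has already torn down.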

Please excuse me while my head explodes ;-)

I think we've got a bit of a problem.

We have:

1) bdrv_flush() - sends an fdatasync

2) bdrv_aio_flush() - sends an fdatasync using the thread pool

3) qemu_aio_flush() - waits for all pending aio requests to complete
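Abbreviated usage, to make the granularity difference explicit (signatures
simplified from memory of the tree; error handling and the callback body
omitted):

    bdrv_flush(bs);                        /* blocks: fdatasync for this
                                            * device, right now */

    acb = bdrv_aio_flush(bs, cb, opaque);  /* returns immediately; the
                                            * fdatasync is queued on the
                                            * same thread pool as the
                                            * outstanding writes */

    qemu_aio_flush();                      /* blocks: waits for ALL pending
                                            * AIO requests on every device;
                                            * note there is no per-device
                                            * variant */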

But we use bdrv_aio_flush() to implement a barrier, and we don't actually preserve barrier semantics in the thread pool.

That is:

If I do:

bdrv_aio_write() -> A
bdrv_aio_write() -> B
bdrv_aio_flush() -> C

This will get queued as three requests on the thread pool: (A) is a write, (B) is a write, and (C) is an fdatasync.

But if these get picked up by three separate threads, the ordering isn't guaranteed. It might be C, B, A. So semantically, is bdrv_aio_flush() supposed to flush any *pending* writes or any *completed* writes? If it's the latter, we're okay, but if it's the former, we're broken.

If it's supposed to flush any pending writes, then my patch series is correct in theory.
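To make the hazard concrete, a barrier-preserving thread pool would have to
hold the fdatasync back until every previously queued write has completed.
All of the names in this sketch (worker_dequeue, writes_in_flight, and so
on) are hypothetical, not the existing posix-aio-compat.c code:

    /* Hypothetical worker loop: don't start an fdatasync while older
     * writes are still in flight, so (C) really orders after (A), (B). */
    static void worker_dequeue(ThreadPool *pool)
    {
        Request *req = queue_pop(&pool->queue);

        if (req->type == REQ_FSYNC) {
            while (pool->writes_in_flight > 0) {
                wait_for_write_completion(pool);
            }
        }
        execute(req);
    }

Today nothing like this exists: any idle worker can pick up (C) while (A)
and (B) are still sitting in the queue.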

Regards,

Anthony Liguori

>> I think it would be better to make bdrv_flush() call bdrv_aio_flush()
>> if an explicit bdrv_flush method isn't provided.  Something like the
>> attached (still need to test).
>>
>> Does that seem reasonable?
> I'm not sure why you want to introduce this emulation. Are there any
> drivers that implement bdrv_aio_flush, but not bdrv_flush? They are
> definitely broken.
>
>> Today, bdrv_aio_flush is emulated using bdrv_flush if the driver
>> doesn't provide it explicitly.
>
> I think this also means that your first patch would kill any drivers
> implementing neither bdrv_flush nor bdrv_aio_flush, because they'd try
> to emulate the other function in an endless recursion.
>
> And apart from that, as said above, bdrv_flush doesn't do the right
> thing anyway. ;-)
>
> Kevin
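The recursion Kevin warns about, sketched with simplified signatures (the
fallback helper names are illustrative): a driver that sets neither hook
just bounces between the two emulation paths.

    int bdrv_flush(BlockDriverState *bs)
    {
        if (bs->drv->bdrv_flush) {
            return bs->drv->bdrv_flush(bs);
        }
        /* proposed fallback: emulate via the AIO version */
        return bdrv_flush_em(bs);                 /* ends up calling
                                                   * bdrv_aio_flush() */
    }

    BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs,
                                     BlockDriverCompletionFunc *cb,
                                     void *opaque)
    {
        if (bs->drv->bdrv_aio_flush) {
            return bs->drv->bdrv_aio_flush(bs, cb, opaque);
        }
        /* existing fallback: emulate via the synchronous version */
        return bdrv_aio_flush_em(bs, cb, opaque); /* ends up calling
                                                   * bdrv_flush() */
    }

    /* neither hook set: bdrv_flush -> bdrv_aio_flush -> bdrv_flush -> ... */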



