Re: [Qemu-block] [PATCH v2 0/2] block/file-posix: allow -drive cache.dir
From: Stefan Hajnoczi
Subject: Re: [Qemu-block] [PATCH v2 0/2] block/file-posix: allow -drive cache.direct=off live migration
Date: Fri, 11 May 2018 16:50:30 +0100
User-agent: Mutt/1.9.3 (2018-01-21)

On Fri, Apr 27, 2018 at 05:23:10PM +0100, Stefan Hajnoczi wrote:
> v2:
> * Add comment on !__linux__ situation [Fam]
> * Add file-posix.c x-check-cache-dropped=on|off option [DaveG, Kevin]
>
> file-posix.c only supports shared storage live migration with -drive
> cache.direct=on due to cache consistency issues. There are two main shared
> storage configurations: files on NFS and host block devices on SAN LUNs.
>
> The problem is that QEMU starts on the destination host before the source host
> has written everything out to the disk. The page cache on the destination host
> may contain stale data read when QEMU opened the image file (before migration
> handover). Using O_DIRECT avoids this problem but prevents users from taking
> advantage of the host page cache.
>
> Although cache=none is the recommended setting for virtualization use cases,
> there are scenarios where cache=writeback makes sense. If the guest has much
> less RAM than the host or many guests share the same backing file, then the
> host page cache can significantly improve disk I/O performance.
>
> This patch series implements .bdrv_co_invalidate_cache() for block/file-posix.c
> on Linux so that shared storage live migration works. I have sent it as an RFC
> because cache consistency is not binary, there are corner cases which I've
> described in the actual patch, and this may require more discussion.
>
> Regarding NFS, QEMU relies on O_DIRECT rather than the close-to-open
> consistency model (see nfs(5)), which is the basic guarantee provided by NFS.
> After this patch cache consistency is no longer provided by O_DIRECT.
>
> This patch series relies on fdatasync(2) (source) +
> posix_fadvise(POSIX_FADV_DONTNEED) (destination) instead. I believe it is safe
> for both NFS and SAN LUNs. Maybe we should use fsync(2) instead of
> fdatasync(2) so that NFS has up-to-date inode metadata?
>
> Stefan Hajnoczi (2):
> block/file-posix: implement bdrv_co_invalidate_cache() on Linux
> block/file-posix: add x-check-page-cache=on|off option
>
> qapi/block-core.json |   7 ++-
> block/file-posix.c   | 146 ++++++++++++++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 150 insertions(+), 3 deletions(-)
>
> --
> 2.14.3
>
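For readers following the thread, the fdatasync(2) (source) +
posix_fadvise(POSIX_FADV_DONTNEED) (destination) pairing described above can
be sketched roughly as below. This is only an illustration under stated
assumptions, not the code from the patches: the helper names
flush_to_storage() and drop_page_cache() are hypothetical, and the sketch
leaves out the corner cases discussed in the actual patch. Note that
POSIX_FADV_DONTNEED is an advisory hint, so the kernel may keep some pages
(dirty pages in particular).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Source host (hypothetical helper): write dirty file data out to shared
 * storage before migration handover. Returns 0 or a negative errno value.
 */
int flush_to_storage(int fd)
{
    if (fdatasync(fd) < 0) {
        return -errno;
    }
    return 0;
}

/*
 * Destination host (hypothetical helper): ask the kernel to drop cached
 * pages for the whole file so later reads go back to shared storage.
 * offset 0 with len 0 covers the file from the start to the end.
 * posix_fadvise() returns the error number directly and does not set errno.
 */
int drop_page_cache(int fd)
{
    int ret = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

    return ret ? -ret : 0;
}
```

In this sketch the source would call flush_to_storage() on the image file
descriptor before handover, and the destination would call drop_page_cache()
on its own descriptor before resuming guest I/O.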
Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block
Stefan