qemu-devel

From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [RFC 0/2] block/file-posix: allow -drive cache.direct=off live migration
Date: Fri, 20 Apr 2018 11:05:42 +0800
User-agent: Mutt/1.9.2 (2017-12-15)

On Thu, Apr 19, 2018 at 11:09:53AM -0500, Eric Blake wrote:
> On 04/19/2018 02:52 AM, Stefan Hajnoczi wrote:
> > file-posix.c only supports shared storage live migration with -drive
> > cache.direct=off due to cache consistency issues.  There are two main shared
> > storage configurations: files on NFS and host block devices on SAN LUNs.
> > 
> > The problem is that QEMU starts on the destination host before the
> > source host has written everything out to the disk.  The page cache
> > on the destination host may contain stale data read when QEMU opened
> > the image file (before migration handover).  Using O_DIRECT avoids
> > this problem but prevents users from taking advantage of the host
> > page cache.
> > 
> > Although cache=none is the recommended setting for virtualization use
> > cases, there are scenarios where cache=writeback makes sense.  If the
> > guest has much less RAM than the host or many guests share the same
> > backing file, then the host page cache can significantly improve disk
> > I/O performance.
> > 
> > This patch series implements .bdrv_co_invalidate_cache() for
> > block/file-posix.c on Linux so that shared storage live migration
> > works.  I have sent it as an RFC because cache consistency is not
> > binary, there are corner cases which I've described in the actual
> > patch, and this may require more discussion.
> 
> Interesting, in that the NBD list is also discussing the possible
> standardization of a NBD_CMD_CACHE command (based on existing practice
> in the xNBD implementation), and covering whether that MIGHT be worth
> doing as a thin wrapper that corresponds to posix_fadvise() semantics.
> Thus, if NBD_CMD_CACHE learns flags, we could support
> .bdrv_co_invalidate_cache() through the NBD protocol driver, in addition
> to the POSIX file driver.  Obviously, your usage invalidates the cache
> of the entire file; but does it also make sense to expose a start/length
> subset invalidation, for better exposure to posix_fadvise() semantics?

bdrv_co_invalidate_cache() is currently only used by migration before
using a file that may have been touched by the other host.  I don't
think start/length will be needed for that use case.

Can you describe how NBD will use cache invalidation?  Maybe that will
help me understand other use cases.

Stefan


