qemu-devel



From: Kevin Wolf
Subject: Re: [Qemu-devel] [ceph-users] qemu + rbd block driver with cache=writeback, is live migration safe ?
Date: Tue, 22 Apr 2014 11:08:08 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On 19.04.2014 at 00:33, Josh Durgin wrote:
> On 04/18/2014 10:47 AM, Alexandre DERUMIER wrote:
> >Thanks Kevin for the full explanation!
> >
> >>>cache.writeback=on,cache.direct=off,cache.no-flush=off
> >
> >I didn't know about the cache options split, thanks.
> >
> >
> >>>rbd does, to my knowledge, not use the kernel page cache, so we're safe
> >>>from that part. It does however honour the cache.direct flag when it
> >>>decides whether to use its own cache. rbd doesn't implement
> >>>bdrv_invalidate_cache() in order to clear that cache when migration
> >>>completes.
> >
> >Maybe some ceph devs could comment on this?
> 
> That's correct, librbd uses its own in-memory cache instead of
> the kernel page cache, and it honors flush requests. Furthermore,
> librbd keeps its own metadata synchronized among different
> clients via the ceph cluster (this is information like image
> size, which rbd snapshots exist, and rbd parent image).
> 
> So as I understand it live migration with raw format images on
> rbd is safe even with cache.writeback=true and cache.direct=false
> (i.e. cache=writeback) because:
> 
> 1) rbd metadata is synchronized internally
> 
> 2) the source vm has any rbd caches flushed by vm_stop() before
>    the destination starts
> 
> 3) rbd does not read anything into its cache before the
>    destination starts
> 
> 4) raw format images have no extra metadata that needs invalidation
> 
> If librbd populated its cache when the disk was opened, the rbd driver
> would need to implement bdrv_invalidate_cache(), but since it does not,
> it's unnecessary.
> 
> Is this correct Kevin?

I'm not sure about 3). The rbd block driver itself may not be reading
anything into its cache during bdrv_open (though, what about things like
the image size?), but qemu doesn't guarantee that it doesn't read
anything from the image before migration completes.

I think you may indeed be lucky for raw images, even though I wouldn't
bet money on it, but if your cache isn't internally kept coherent by
librbd, then without a bdrv_invalidate_cache() implementation you're
almost certainly unsafe with non-raw image formats.

My recommendation would be to add that bdrv_invalidate_cache()
implementation, then we can be sure for raw, and get the rest fixed as well.
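
To make the concern about 3) concrete, here is a toy model in Python. It
is only a sketch under assumptions: Cluster, CachedClient and
run_migration are hypothetical names, not librbd or qemu APIs, and the
single flush() stands in for what vm_stop() does on the source.

```python
class Cluster:
    """Hypothetical stand-in for the shared storage both hosts talk to."""
    def __init__(self):
        self.objects = {}


class CachedClient:
    """Hypothetical client with a private in-memory cache (not librbd's API)."""
    def __init__(self, cluster):
        self.cluster = cluster
        self.clean = {}   # data read from the cluster
        self.dirty = {}   # writeback data not yet flushed

    def write(self, key, value):
        self.dirty[key] = value

    def flush(self):
        self.cluster.objects.update(self.dirty)
        self.clean.update(self.dirty)
        self.dirty.clear()

    def read(self, key):
        if key in self.dirty:
            return self.dirty[key]
        if key not in self.clean:
            self.clean[key] = self.cluster.objects[key]
        return self.clean[key]

    def invalidate(self):
        # Drop only clean entries so the next read goes back to the cluster.
        self.clean.clear()


def run_migration(dest_reads_early, dest_invalidates):
    cluster = Cluster()
    cluster.objects["sector0"] = "old"
    source = CachedClient(cluster)
    dest = CachedClient(cluster)
    if dest_reads_early:
        dest.read("sector0")        # something reads before migration completes
    source.write("sector0", "new")  # source guest keeps running
    source.flush()                  # assumed effect of vm_stop() on the source
    if dest_invalidates:
        dest.invalidate()           # the hook the rbd driver is missing
    return dest.read("sector0")
```

In this model run_migration(False, False) sees the new data, which is the
case where assumption 3) holds. run_migration(True, False) returns the
stale value, and run_migration(True, True) is fresh again.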

Kevin

> >>>No, such a QMP command doesn't exist, though it would be possible to
> >>>implement (for toggling cache.direct, that is; cache.writeback is guest
> >>>visible and can therefore only be toggled by the guest)
> >
> >Yes, that's what I have in mind: toggling cache.direct=on before migration,
> >then turning it off again after the migration.
> >
> >
> >
> >----- Original Message -----
> >
> >From: "Kevin Wolf" <address@hidden>
> >To: "Alexandre DERUMIER" <address@hidden>
> >Cc: "qemu-devel" <address@hidden>, address@hidden
> >Sent: Tuesday, 15 April 2014 11:36:22
> >Subject: Re: [Qemu-devel] qemu + rbd block driver with cache=writeback, is
> >live migration safe ?
> >
> >On 12.04.2014 at 17:01, Alexandre DERUMIER wrote:
> >>Hello,
> >>
> >>I know that qemu live migration of disks with cache=writeback is not
> >>safe with storage like nfs, iscsi...
> >>
> >>Is that also true with rbd?
> >
> >First of all, in order to avoid misunderstandings, let's be clear that
> >there are three dimensions for the cache configuration of qemu block
> >devices. In current versions, they are separately configurable and
> >cache=writeback really expands to:
> >
> >cache.writeback=on,cache.direct=off,cache.no-flush=off
> >
> >The problematic part of this for live migration is generally not
> >cache.writeback being enabled, but cache.direct being disabled.
> >
> >The reason for that is that the destination host will open the image
> >file immediately, because it needs things like the image size to
> >correctly initialise the emulated disk devices. Now during the migration
> >the source keeps working on the image, so if qemu read some metadata on
> >the destination host, that metadata may be stale by the time that the
> >migration actually completes.
> >
> >In order to solve this problem, qemu calls bdrv_invalidate_cache(),
> >which throws away everything that is cached in qemu so that it is reread
> >from the image. However, this is ineffective if there are other caches
> >having stale data, such as the kernel page cache. cache.direct bypasses
> >the kernel page cache, so this is why it's important in many cases.
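
The layering described above can be sketched as a toy Python model. All
names here (Image, LowerCache, QemuBlockLayer, migrate) are hypothetical
and only illustrate the idea; this is not qemu code.

```python
class Image:
    """Shared storage seen by both hosts (hypothetical stand-in)."""
    def __init__(self):
        self.blocks = {}


class LowerCache:
    """A cache below qemu, such as the kernel page cache; qemu cannot clear it."""
    def __init__(self, image):
        self.image = image
        self.cached = {}

    def read(self, key):
        if key not in self.cached:
            self.cached[key] = self.image.blocks[key]
        return self.cached[key]


class QemuBlockLayer:
    """What qemu itself caches; direct=True bypasses the lower cache."""
    def __init__(self, image, direct):
        self.image = image
        self.lower = None if direct else LowerCache(image)
        self.cached = {}

    def _fetch(self, key):
        if self.lower is None:
            return self.image.blocks[key]
        return self.lower.read(key)

    def read(self, key):
        if key not in self.cached:
            self.cached[key] = self._fetch(key)
        return self.cached[key]

    def invalidate_cache(self):
        # Clears only what this layer holds, not the cache underneath.
        self.cached.clear()


def migrate(direct):
    image = Image()
    image.blocks["metadata"] = "v1"
    dest = QemuBlockLayer(image, direct)
    dest.read("metadata")             # destination opens the image early
    image.blocks["metadata"] = "v2"   # source keeps working on the image
    dest.invalidate_cache()           # migration completes
    return dest.read("metadata")
```

With direct=True the destination rereads "v2" after invalidation. With
direct=False the lower cache still answers with "v1", so the invalidation
has no visible effect.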
> >
> >rbd does, to my knowledge, not use the kernel page cache, so we're safe
> >from that part. It does however honour the cache.direct flag when it
> >decides whether to use its own cache. rbd doesn't implement
> >bdrv_invalidate_cache() in order to clear that cache when migration
> >completes.
> >
> >So the answer to your original question is that it's probably _not_ safe
> >to use live migration with rbd and cache.direct=off.
> >
> >>If yes, is it possible to manually disable writeback online with qmp?
> >
> >No, such a QMP command doesn't exist, though it would be possible to
> >implement (for toggling cache.direct, that is; cache.writeback is guest
> >visible and can therefore only be toggled by the guest).
> >
> >Kevin
> 
> 


