
From: Alexandre DERUMIER
Subject: Re: [Qemu-devel] [ceph-users] qemu + rbd block driver with cache=writeback, is live migration safe ?
Date: Sun, 20 Apr 2014 17:47:25 +0200 (CEST)

Hi Josh,

>>2) the source vm has any rbd caches flushed by vm_stop() before
>>the destination starts
>>
>>3) rbd does not read anything into its cache before the
>>destination starts


I'm not sure about this; when a live migration occurs, the workflow is:

1) source vm running

2) target vm starts (opens the rbd disk image, but doesn't do any read/write, as if in pause mode)

3) memory migration begins (and during the migration, reads/writes occur on the source vm)

4) migration ends: the target vm becomes active, and the source vm is stopped.
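
As a rough illustration, the steps above map onto qemu invocations like this (a sketch only; the rbd spec, host name and port are placeholders):

```shell
# 1) source vm already running, e.g.:
#    qemu-system-x86_64 -drive file=rbd:pool/image,format=raw,cache=writeback ... -monitor stdio

# 2) start the target vm on the destination host in "incoming" mode;
#    it opens the rbd image but stays paused until migration completes:
qemu-system-x86_64 -drive file=rbd:pool/image,format=raw,cache=writeback ... -incoming tcp:0:4444

# 3) kick off the memory migration from the source vm's monitor
#    (the source keeps reading/writing the image while it runs):
(qemu) migrate -d tcp:desthost:4444

# 4) when migration completes, the target vm becomes active and the
#    source vm is stopped.
```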


----- Original Message -----

From: "Josh Durgin" <address@hidden>
To: "Alexandre DERUMIER" <address@hidden>, "Kevin Wolf" <address@hidden>
Cc: address@hidden, "qemu-devel" <address@hidden>
Sent: Saturday 19 April 2014 00:33:12
Subject: Re: [ceph-users] [Qemu-devel] qemu + rbd block driver with
cache=writeback, is live migration safe ?

On 04/18/2014 10:47 AM, Alexandre DERUMIER wrote:
> Thanks Kevin for the full explanation!
>
>>> cache.writeback=on,cache.direct=off,cache.no-flush=off
>
> I didn't know about the cache options split, thanks.
>
>
>>> rbd does, to my knowledge, not use the kernel page cache, so we're safe
>>> from that part. It does however honour the cache.direct flag when it
>>> decides whether to use its own cache. rbd doesn't implement
>>> bdrv_invalidate_cache() in order to clear that cache when migration
>>> completes.
>
> Maybe some ceph devs could comment about this ?

That's correct, librbd uses its own in-memory cache instead of
the kernel page cache, and it honors flush requests. Furthermore,
librbd keeps its own metadata synchronized among different
clients via the ceph cluster (this is information like image
size, which rbd snapshots exist, and the rbd parent image).
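
For context, the librbd cache Josh describes is controlled from the ceph client configuration; a minimal sketch (values are illustrative, not recommendations):

```shell
# ceph.conf on the qemu/librbd client host
[client]
    rbd cache = true
    # stay in writethrough mode until the guest sends its first flush,
    # so a guest OS that never flushes is not silently in writeback:
    rbd cache writethrough until flush = true
    rbd cache size = 33554432        # cache size in bytes (32 MB)
```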

So as I understand it live migration with raw format images on
rbd is safe even with cache.writeback=true and cache.direct=false
(i.e. cache=writeback) because:

1) rbd metadata is synchronized internally

2) the source vm has any rbd caches flushed by vm_stop() before
the destination starts

3) rbd does not read anything into its cache before the
destination starts

4) raw format images have no extra metadata that needs invalidation

If librbd populated its cache when the disk was opened, the rbd driver
would need to implement bdrv_invalidate_cache(), but since it does not,
it's unnecessary.
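
To make the cache split concrete, the cache=writeback shorthand and its expanded form can be written on the -drive option like this (pool/image is a placeholder):

```shell
# shorthand:
-drive file=rbd:pool/image,format=raw,cache=writeback

# equivalent split form:
-drive file=rbd:pool/image,format=raw,cache.writeback=on,cache.direct=off,cache.no-flush=off
```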

Is this correct Kevin?

Josh

>>> No, such a QMP command doesn't exist, though it would be possible to
>>> implement (for toggling cache.direct, that is; cache.writeback is guest
>>> visible and can therefore only be toggled by the guest)
>
> yes, that's what I have in mind: toggling cache.direct=on before migration,
> then disabling it after the migration.
>
>
>
> ----- Original Message -----
>
> From: "Kevin Wolf" <address@hidden>
> To: "Alexandre DERUMIER" <address@hidden>
> Cc: "qemu-devel" <address@hidden>, address@hidden
> Sent: Tuesday 15 April 2014 11:36:22
> Subject: Re: [Qemu-devel] qemu + rbd block driver with cache=writeback, is live
> migration safe ?
>
> On 12.04.2014 at 17:01, Alexandre DERUMIER wrote:
>> Hello,
>>
>> I know that qemu live migration with a disk with cache=writeback is not safe
>> with storage like nfs, iscsi...
>>
>> Is it also true with rbd ?
>
> First of all, in order to avoid misunderstandings, let's be clear that
> there are three dimensions for the cache configuration of qemu block
> devices. In current versions, they are separately configurable and
> cache=writeback really expands to:
>
> cache.writeback=on,cache.direct=off,cache.no-flush=off
>
> The problematic part of this for live migration is generally not
> cache.writeback being enabled, but cache.direct being disabled.
>
> The reason for that is that the destination host will open the image
> file immediately, because it needs things like the image size to
> correctly initialise the emulated disk devices. Now during the migration 
> the source keeps working on the image, so if qemu read some metadata on
> the destination host, that metadata may be stale by the time that the
> migration actually completes.
>
> In order to solve this problem, qemu calls bdrv_invalidate_cache(),
> which throws away everything that is cached in qemu so that it is reread 
> from the image. However, this is ineffective if there are other caches
> having stale data, such as the kernel page cache. cache.direct bypasses
> the kernel page cache, so this is why it's important in many cases.
>
> rbd does, to my knowledge, not use the kernel page cache, so we're safe
> from that part. It does however honour the cache.direct flag when it
> decides whether to use its own cache. rbd doesn't implement
> bdrv_invalidate_cache() in order to clear that cache when migration
> completes.
>
> So the answer to your original question is that it's probably _not_ safe 
> to use live migration with rbd and cache.direct=off.
>
>> If yes, is it possible to manually disable writeback online with qmp?
>
> No, such a QMP command doesn't exist, though it would be possible to
> implement (for toggling cache.direct, that is; cache.writeback is guest
> visible and can therefore only be toggled by the guest).
>
> Kevin


