Re: [Qemu-devel] [PATCH] migration: avoid copying ignore-shared ramblock


From: Catherine Ho
Subject: Re: [Qemu-devel] [PATCH] migration: avoid copying ignore-shared ramblock when in incoming migration
Date: Tue, 2 Apr 2019 22:33:30 +0800

On Tue, 2 Apr 2019 at 22:17, Catherine Ho <address@hidden> wrote:

>
>
> On Tue, 2 Apr 2019 at 20:37, Peter Xu <address@hidden> wrote:
>
>> On Tue, Apr 02, 2019 at 05:06:15PM +0800, Catherine Ho wrote:
>> > On Tue, 2 Apr 2019 at 15:58, Peter Xu <address@hidden> wrote:
>> >
>> > > On Tue, Apr 02, 2019 at 03:47:16PM +0800, Catherine Ho wrote:
>> > > > Hi Peter Maydell,
>> > > >
>> > > > On Tue, 2 Apr 2019 at 11:05, Peter Maydell <address@hidden> wrote:
>> > > >
>> > > > > On Tue, 2 Apr 2019 at 09:57, Catherine Ho <address@hidden> wrote:
>> > > > > > The root cause is that the used idx is moved forward after the
>> > > > > > 1st incoming migration, and on the 2nd incoming migration the
>> > > > > > last_avail_idx is incorrectly restored from the saved device
>> > > > > > state file (not from the RAM).
>> > > > > >
>> > > > > > I observed this even on x86 with a virtio-scsi disk.
>> > > > > >
>> > > > > > Any ideas for supporting a 2nd, 3rd... incoming restore?
>> > > > >
>> > > > > Does the destination end go through reset between the 1st and 2nd
>> > > > >
>> > > > It seems not; please see my steps below.
>> > > >
>> > > > > incoming attempts? I'm not a migration expert, but I thought that
>> > > > > devices were allowed to assume that their state is "state of the
>> > > > > device following QEMU reset" before the start of an incoming
>> > > > > migration attempt.
>> > > > >
>> > > >
>> > > > Here are my steps:
>> > > > 1. start the guest normally with qemu, using a shared
>> > > > memory-backend file
>> > > > 2. stop the vm, then save the device state to another file via the
>> > > > monitor: migrate "exec: cat>..."
>> > > > 3. quit the vm
>> > > > 4. restore the vm with qemu -incoming "exec:cat ..."
>> > > > 5. continue the vm via the monitor; the 1st incoming works fine
>> > > > 6. quit the vm
>> > > > 7. restore the vm with qemu -incoming "exec:cat ..." a 2nd time
>> > > > 8. continue -> an error occurs
>> > > > Actually, this can be fixed by forcibly restoring the idx with
>> > > > virtio_queue_restore_last_avail_idx(), but I am not sure whether
>> > > > it is reasonable.
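[For reference, a minimal sketch of these steps. The machine type, paths, and sizes are hypothetical; the x-ignore-shared capability name matches the QEMU 4.0-era patches under discussion. Adjust to your setup.]

    # 1. Start the guest with a shared memory-backend file.
    qemu-system-x86_64 -m 4G \
        -object memory-backend-file,id=mem0,size=4G,mem-path=/tmp/mem.img,share=on \
        -numa node,memdev=mem0 \
        -drive file=disk.img,if=virtio \
        -monitor stdio

    # 2./3. In the monitor: stop, save the device state, then quit.
    #   (qemu) stop
    #   (qemu) migrate_set_capability x-ignore-shared on
    #   (qemu) migrate "exec:cat > /tmp/state.bin"
    #   (qemu) quit

    # 4./5. Restore from the saved state; the 1st incoming run works.
    qemu-system-x86_64 -m 4G \
        -object memory-backend-file,id=mem0,size=4G,mem-path=/tmp/mem.img,share=on \
        -numa node,memdev=mem0 \
        -drive file=disk.img,if=virtio \
        -monitor stdio \
        -incoming "exec:cat /tmp/state.bin"
    #   (qemu) cont

    # 6.-8. Quit and repeat the -incoming run: the 2nd "cont" fails,
    # because the shared RAM file has moved on while state.bin has not.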
>> > >
>> > > Yeah I really suspect its validity.
>> > >
>> > > IMHO normal migration streams keep the device state and RAM data
>> > > together in the dumped file, so they always match.
>> > >
>> > > In your shared case, the device states are in the dumped file, but
>> > > the RAM data is located somewhere else.  After you quit the VM from
>> > > the 1st incoming migration the RAM is new (because that's a shared
>> > > memory file) while the device data is still old.  They no longer
>> > > match, so I'd say you can't migrate with that any more.
>> > >
>> > > If you want to do that, you'd better take a snapshot of the RAM
>> > > backend file if your filesystem supports it (or, even simpler, back
>> > > it up beforehand) before you start any incoming migration.  Then
>> > > with the dumped file (which contains the device states) and that
>> > > snapshot file (which contains the exact RAM data that matches the
>> > > device states) you'll always be able to migrate as many times as
>> > > you want.
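[Concretely, that workflow might look like this; a minimal sketch with hypothetical file names, continuing the example above.]

    # After saving the device state (step 2), back up the shared RAM
    # file, so the RAM/device-state pair always matches:
    cp /tmp/mem.img /tmp/mem.img.orig

    # Before every incoming migration, roll the RAM file back first:
    cp /tmp/mem.img.orig /tmp/mem.img
    qemu-system-x86_64 ... \
        -object memory-backend-file,id=mem0,size=4G,mem-path=/tmp/mem.img,share=on \
        -incoming "exec:cat /tmp/state.bin"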
>> > >
>> >
>> > Understood, thanks Peter Xu.
>> > Is there any feasible way to indicate that the snapshot of the RAM
>> > backend file matches the device data?
>> > >VQ 2 size 0x400 < last_avail_idx 0x1639 - used_idx 0x2688
>> > >Failed to load virtio-scsi:virtio
>> >
>> > Because I thought reporting the above error is not so friendly. Could
>> > we add a version id to both the RAM backend file and the device state
>> > file?
>>
>> It would be non-trivial I'd say - AFAIK we don't have an existing way
>> to tag the memory-backend-file content (IIUC that's what you use).
>>
>> And since you mentioned versioning of these states, I just remembered
>> that even with this you may not be able to get a completely matched
>> state of the VM, because AFAICT, besides the RAM state and device
>> state, you probably also need to consider the disk state.  After you
>> start the VM from the 1st incoming migration, data could be flushed
>> to the VM's backend disk, so that state changes as well.  Even if you
>> snapshot the RAM file you'll still lose the disk state IIUC, so it
>> could still be broken.  In other words, to make a migration/snapshot
>> work you'll need all three of these states to match.
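[A sketch of saving all three states together, extending the backup idea above. File names are hypothetical, and a qcow2 disk is assumed so that qemu-img internal snapshots can be used; a plain cp of the disk image would also do.]

    # At save time, while the VM is stopped and quit:
    cp /tmp/mem.img   /tmp/mem.img.orig     # RAM state
    cp /tmp/state.bin /tmp/state.bin.orig   # device state
    qemu-img snapshot -c saved disk.qcow2   # disk state

    # Before each incoming run, roll all three back:
    cp /tmp/mem.img.orig   /tmp/mem.img
    cp /tmp/state.bin.orig /tmp/state.bin
    qemu-img snapshot -a saved disk.qcow2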
>>
>
> Yes, thanks
>
>>
>> Before we discuss further on the topic... could you share your
>> requirement with me first?  I'm starting to get a bit confused now,
>> since when I thought about shared mem I was thinking about migrating
>> within the same host to e.g. upgrade the hypervisor, but that
>> obviously does not require doing incoming migration multiple times.
>> So what do you finally want to achieve?
>>
> Actually, I am investigating support for the ignore-shared capability on
> arm64. This feature is used in the Kata Containers project as the "vm
> template". The rom reset failure is the first bug.
> Ok, now I can confirm that doing incoming migration for multiple times is

s/is /isn't, sorry ~

> supported. Thanks for the detailed explanation :)

B.R.
Catherine

