Re: [Qemu-devel] [PATCH] migration: Fix possible bug for migrate cancel
From: Gonglei (Arei)
Subject: Re: [Qemu-devel] [PATCH] migration: Fix possible bug for migrate cancel
Date: Tue, 25 Mar 2014 11:15:51 +0000
> -----Original Message-----
> From: Eric Blake [mailto:address@hidden]
> Sent: Tuesday, March 25, 2014 12:01 AM
> To: Paolo Bonzini; Gonglei (Arei); address@hidden
> Cc: address@hidden; address@hidden; Yanqiangjun; Zhaoyanbin
> (A); Zengjunliang; address@hidden
> Subject: Re: [PATCH] migration: Fix possible bug for migrate cancel
>
> [adding libvirt]
>
> On 03/24/2014 09:47 AM, Paolo Bonzini wrote:
> > Il 24/03/2014 14:04, address@hidden ha scritto:
> >> From: zengjunliang <address@hidden>
> >>
> >> Return error for migrate cancel, when migration status is not
> >> MIG_STATE_SETUP or MIG_STATE_ACTIVE. Thus, libvirt can
> >> perceive that the operation fails.
> >>
> >> Signed-off-by: zengjunliang <address@hidden>
> >> Signed-off-by: Gonglei <address@hidden>
> >
> > I think this is done on purpose, because canceling migration is racy.
> > Instead, libvirt should do "query-migrate" and check if the migration
> > was completed or canceled.
>
> Can you please give more details at how you are triggering the problem
> with libvirt? I think Paolo is probably right - the bug is more likely
> to be in libvirt not expecting the race and not recovering correctly
> when the race occurs, than it is to be in changing qemu's state algorithm.
>
When the migration progress reaches 100%, the migration status in QEMU
becomes MIG_STATE_COMPLETED.
There is then a window between the status reaching MIG_STATE_COMPLETED and
the migration thread's resources being reclaimed.
If we cancel the migration during this window, the migrate_fd_cancel function
breaks out directly without reporting an error code. libvirt then considers
the cancel operation a success, contrary to the facts.
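
To make the window concrete, here is a minimal sketch in Python. It is a toy
model, not QEMU or libvirt code: the names MigState, Migration, cancel_silent,
cancel_reporting and cancel_and_verify are made up for illustration, and the
model switches straight to a cancelled state, which is an assumption that
skips any intermediate steps. It contrasts a cancel that returns silently,
the behaviour the patch proposes, and Paolo's suggestion of checking the
state after the cancel.

```python
from enum import Enum, auto


class MigState(Enum):
    """Hypothetical stand-ins for the MIG_STATE_* values discussed above."""
    SETUP = auto()
    ACTIVE = auto()
    CANCELLED = auto()
    COMPLETED = auto()


CANCELLABLE = (MigState.SETUP, MigState.ACTIVE)


class Migration:
    """Toy model of a migration; not QEMU's actual data structure."""

    def __init__(self, state):
        self.state = state

    def cancel_silent(self):
        """Cancel if possible, otherwise return without any error."""
        if self.state in CANCELLABLE:
            self.state = MigState.CANCELLED
        # Any other state: fall through, so the caller sees "success".

    def cancel_reporting(self):
        """Behaviour proposed by the patch: fail if nothing can be cancelled."""
        if self.state not in CANCELLABLE:
            raise RuntimeError("no cancellable migration: %s" % self.state.name)
        self.state = MigState.CANCELLED

    def query(self):
        """Stand-in for asking the hypervisor for the migration status."""
        return self.state


def cancel_and_verify(mig):
    """Client-side alternative: cancel, then check what actually happened."""
    mig.cancel_silent()
    return mig.query() is MigState.CANCELLED


if __name__ == "__main__":
    late = Migration(MigState.COMPLETED)
    late.cancel_silent()                  # returns normally
    print(late.query().name)              # still COMPLETED
    print(cancel_and_verify(Migration(MigState.COMPLETED)))
    print(cancel_and_verify(Migration(MigState.ACTIVE)))
```

In this model a caller that trusts the return of cancel_silent alone cannot
tell a completed migration from a cancelled one, while cancel_and_verify can,
without any change to the cancel path itself.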
Best regards,
-Gonglei