Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem


From: Xulei (Stone)
Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
Date: Fri, 6 Nov 2015 09:12:34 +0000

>On Wed, Nov 04, 2015 at 08:48:20AM +0800, Gonglei wrote:
>> On 2015/11/3 14:58, Xulei (Stone, Euler) wrote:
>> > On the qemu-kvm platform, when I reset a VM through "virsh reset" and the
>> > VM happens to be in the middle of an internal reboot at the same time, the
>> > VM can no longer be reset successfully because of the reset reentrancy.
>> > I found:
>> > (1) SeaBIOS tries to shut down the VM via apm_shutdown() after the reset
>> > fails. However, apm_shutdown() does not work on the qemu-kvm platform;
>> > (2) If I add a 1s sleep in qemu_prep_reset() and then reset the VM twice
>> > in a row, the case above always happens.
>
>So, the problem occurs when issuing a second reset before the first
>reset completes?

Yes. Specifically, the second reset is issued after "HaveAttemptedReboot = 1"
is set and before the memcpy in qemu_prep_reset() completes.
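
To make the window concrete, here is a rough user-space model of it. This is
only a sketch, not the SeaBIOS code: the struct, the function names and the
idea that the flag sits inside the region restored by the copy are simplifying
assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the BIOS image: the flag plus some payload. */
struct bios_image {
    int have_attempted_reboot;   /* models HaveAttemptedReboot */
    char code[16];
};

static const struct bios_image pristine = { 0, "bios-code" };
static struct bios_image live;

enum reset_outcome { RESET_REBOOTED, RESET_FELL_BACK_TO_SHUTDOWN };

/* Bring the model to its power-on state. */
static void model_power_on(void)
{
    memcpy(&live, &pristine, sizeof live);
}

/*
 * Models the reset entry path. When second_reset_in_window is nonzero,
 * a second reset re-enters after the flag is set but before the copy
 * that restores the pristine image (and clears the flag) has run.
 */
static enum reset_outcome try_reboot(int second_reset_in_window)
{
    if (live.have_attempted_reboot)
        return RESET_FELL_BACK_TO_SHUTDOWN;  /* the apm_shutdown() path */

    live.have_attempted_reboot = 1;

    if (second_reset_in_window)
        return try_reboot(0);                /* re-entry inside the window */

    memcpy(&live, &pristine, sizeof live);   /* models the memcpy in qemu_prep_reset() */
    assert(live.have_attempted_reboot == 0); /* the copy also cleared the flag */
    return RESET_REBOOTED;
}
```

In this model, back-to-back calls to try_reboot(0) always reboot, while
try_reboot(1) on a freshly powered-on model ends up in the shutdown fallback,
which is the behaviour I see.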

>> > This patch fixes this issue by letting the VM always execute the reboot
>> > routine when a reentrancy happens, instead of attempting apm_shutdown on
>> > the qemu-kvm platform.
>
>The reason for the HaveAttemptedReboot check is to work around old
>versions of KVM that unexpectedly map the same memory to both 0xf0000
>and 0xffff0000.  So, it does not make sense to wrap the check in a
>!runningOnKVM() block as that disables the only reason for the check.
>
>I'm surprised you would see the above on a recent qemu/kvm though - as
>on a newer KVM I think the second reset would have to happen after
>HaveAttemptedReboot is set and prior to the memcpy in
>qemu_prep_reset() completing.  Can you verify your KVM version?
>
>-Kevin

I've tested on KVM-3.6 and KVM-4.1.3. I can see this problem on both
versions.
To reproduce it, I put an HA mechanism and a watchdog mechanism in a VM, then
deliberately let the VM lose its heartbeat and stop feeding the watchdog.
After 2 minutes (a self-defined timeout), the HA mechanism issues an internal
reboot command to the VM and the watchdog mechanism issues a "virsh reset"
from the host. The problem described above then occurs with high probability.

-Leixu
