Re: [Qemu-devel] [Bug 1297218] Re: guest hangs after live migration due


From: Alex Bligh
Subject: Re: [Qemu-devel] [Bug 1297218] Re: guest hangs after live migration due to tsc jump
Date: Wed, 30 Jul 2014 09:51:48 +0100
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Thunderbird/24.6.0

On 29/07/2014 17:45, Paul Boven wrote:
> https://bugs.launchpad.net/bugs/1297218
...
> Bug description:
>   We have two identical Ubuntu servers running libvirt/kvm/qemu, sharing
>   a Gluster filesystem. Guests can be live migrated between them.
>   However, live migration often leads to the guest being stuck at 100%
>   for a while. In that case, the dmesg output for such a guest will show
>   (once it recovers): Clocksource tsc unstable (delta = 662463064082
>   ns). In this particular example, a guest was migrated and only after
>   11 minutes (662 seconds) did it become responsive again.
> 
>   It seems that newly booted guests do not suffer from this problem;
>   these can be migrated back and forth at will. After a day or so, the
>   problem becomes apparent. It also seems that migrating from server A
>   to server B causes many more problems than going from B back to A. If
>   necessary, I can do more measurements to qualify these observations.
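
For anyone comparing their own dmesg output against the report, the
arithmetic behind "662 seconds, about 11 minutes" can be checked with a
small sketch. The helper name parse_delta_ns is hypothetical, and the
regular expression assumes the message has exactly the shape quoted in
the bug description; other kernel versions may word it differently.

```python
import re

# Sample line copied from the bug description quoted above.
line = "Clocksource tsc unstable (delta = 662463064082 ns)"


def parse_delta_ns(text):
    """Return the delta in nanoseconds from a 'Clocksource ... unstable'
    message, or None if the text does not contain one.

    Hypothetical helper: the pattern only matches the message format
    shown in the bug report.
    """
    match = re.search(r"delta = (-?\d+) ns", text)
    return int(match.group(1)) if match else None


delta_ns = parse_delta_ns(line)
delta_s = delta_ns / 1e9                  # nanoseconds -> seconds
minutes, seconds = divmod(delta_s, 60)    # split into whole minutes + remainder

print(f"{delta_s:.1f} s = {int(minutes)} min {seconds:.1f} s")
```

With the delta from the report this gives roughly 662.5 seconds, that is
11 minutes and a couple of seconds, which matches the stall described.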

I've seen this a couple of times, though under odd circumstances.
If it's useful to repeat it, here's what I was doing.

* Run a host with ntpd (in my case the host was 12.04 running in
  a VMware fusion (sorry) VM on my Mac with VMware tools installed,
  which also syncs time), with nested virtualisation permitted
* Run 14.04 in two lxc containers which are fully privileged and also
  running ntp
* Run qemu in the two containers (with kvm), and migrate from one
  to the other and back repeatedly. I was using a cirros image.

Given that there are three instances of ntp plus VMware tools all
fighting over synchronizing one virtual clock, this configuration is
clearly broken timewise, so problems were not unexpected. But it might
help someone debug it.

On one occasion the guest eventually broke and printed an oops-like
trace, which I didn't bother to copy down: repeated lines showing it
trying to sync up clocks by increasing numbers of microseconds, and
finally giving up.

-- 
Alex Bligh
