Re: [Qemu-devel] Rethinking missed tick catchup
From: Anthony Liguori
Subject: Re: [Qemu-devel] Rethinking missed tick catchup
Date: Wed, 12 Sep 2012 09:44:10 -0500
User-agent: Notmuch/0.13.2+93~ged93d79 (http://notmuchmail.org) Emacs/23.3.1 (x86_64-pc-linux-gnu)
Jan Kiszka <address@hidden> writes:
> On 2012-09-12 15:54, Anthony Liguori wrote:
>>
>> Hi,
>>
>> We've been running into a lot of problems lately with Windows guests and
>> I think they all ultimately could be addressed by revisiting the missed
>> tick catchup algorithms that we use. Mike and I spent a while talking
>> about it yesterday and I wanted to take the discussion to the list to
>> get some additional input.
>>
>> Here are the problems we're seeing:
>>
>> 1) Rapid reinjection can lead to time moving faster for short bursts of
>> time. We've seen a number of RTC watchdog BSoDs and it's possible
>> that at least one cause is reinjection speed.
>>
>> 2) When hibernating a host system, the guest is essentially paused
>> for a long period of time. This results in a very large tick catchup
>> while also resulting in a large skew in guest time.
>>
>> I've gotten reports of the tick catchup consuming a lot of CPU time
>> from rapid delivery of interrupts (although I haven't reproduced this
>> yet).
>>
>> 3) Windows appears to have a service that periodically syncs the guest
>> time with the hardware clock. I've been told the resync period is an
>> hour. For large clock skews, this can compete with reinjection
>> resulting in a positive skew in time (the guest can be ahead of the
>> host).
>>
>> I've been thinking about an algorithm like this to address these
>> problems:
>>
>> A) Limit the number of interrupts that we reinject to the equivalent of
>> a small period of wallclock time. Something like 60 seconds.
>>
>> B) In the event of (A), trigger a notification in QEMU. This is easy
>> for the RTC but harder for the in-kernel PIT. Maybe it's a good time to
>> revisit usage of the in-kernel PIT?
>>
>> C) On accumulated tick overflow, rely on using a qemu-ga command to
>> force a resync of the guest's time to the hardware wallclock time.
>>
>> D) Whenever the guest reads the wallclock time from the RTC, reset all
>> accumulated ticks.
>>
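[Editorial sketch, not part of the original message.] Steps (A) through (D) can be modelled roughly as below. This is a minimal illustration under stated assumptions, not QEMU code: the class name `TickCatchup`, its methods, and the `on_overflow` hook are all hypothetical. The hook stands in for the notification in (B) and the qemu-ga driven resync in (C). The sketch also assumes that overflowing ticks are simply dropped once the backlog reaches the cap, which the proposal does not spell out.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TickCatchup:
    """Hypothetical model of the proposed catchup policy (not QEMU code)."""

    period_s: float                  # timer period in seconds, e.g. 1/64 for a 64 Hz tick
    max_catchup_s: float = 60.0      # (A) cap on the wallclock time we try to reinject
    on_overflow: Optional[Callable[[int], None]] = None  # (B)/(C) hypothetical hook
    pending: int = 0                 # ticks accumulated but not yet reinjected

    @property
    def max_pending(self) -> int:
        # Number of ticks that corresponds to max_catchup_s of wallclock time.
        return round(self.max_catchup_s / self.period_s)

    def ticks_missed(self, count: int) -> None:
        # Accumulate missed ticks, but never hold more than the cap (A).
        self.pending += count
        excess = self.pending - self.max_pending
        if excess > 0:
            # Assumption: ticks beyond the cap are dropped, and the caller is
            # told how many so it can request a guest time resync (B)/(C).
            self.pending = self.max_pending
            if self.on_overflow is not None:
                self.on_overflow(excess)

    def reinject_one(self) -> bool:
        # Deliver one pending tick to the guest; False if nothing is pending.
        if self.pending == 0:
            return False
        self.pending -= 1
        return True

    def guest_read_wallclock(self) -> None:
        # (D) The guest has just read the real time, so the backlog is stale.
        self.pending = 0
```

With a 64 Hz tick, the 60 second cap corresponds to 3840 pending ticks. A host hibernation of one hour would therefore accumulate at most 3840 ticks in this model and report the rest through `on_overflow`; a later wallclock read by the guest clears the remainder.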
>> In order to do (C), we'll need to plumb qemu-ga through QMP. Mike and I
>> discussed a low-impact way of doing this (having a separate dispatch
>> path for guest agent commands) and I'm confident we could do this for
>> 1.3.
>>
>> This would mean that management tools would need to consume qemu-ga
>> through QMP. Not sure if this is a problem for anyone.
>>
>> I'm not sure whether it's worth trying to support this with the
>> in-kernel PIT or not either.
>
> As with our current discussion around fixing the PIC and its impact on
> the PIT, we should try on the userspace model first and then check if
> the design can be adapted to support in-kernel as well.
>
> For which guests is the PIT important again? Old Linux kernels? Windows
> should be mostly happy with the RTC - or the HPET.
I thought that only 64-bit Win2k8+ used the RTC.
I thought win2k3 and even 32-bit win2k8 still used the PIT.
>> Are there other issues with reinjection that people are aware of? Does
>> anything seem obviously wrong with the above?
>
> We should take the chance and design everything in a way that the HPET
> can finally be (left) enabled.
I thought the issue with the HPET was access frequency and the cost of
heavy weight exits.

I don't have concrete data here. I've only heard it second hand. Can
anyone comment more?

Regards,

Anthony Liguori
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
> Corporate Competence Center Embedded Linux