qemu-devel

Re: [Qemu-devel] timer issue on 1.7.0 and later


From: Alex Bligh
Subject: Re: [Qemu-devel] timer issue on 1.7.0 and later
Date: Sat, 8 Feb 2014 11:48:38 +0000

Rob,

On 7 Feb 2014, at 18:15, Rob Herring wrote:

> I've bisected a problem with system emulation and SMP kernels using
> per cpu timers to this commit. I can reproduce this problem on ARM
> emulation with both ARM generic timers (only in 1.7.0) and ARM MPCore
> timers. Using a single broadcast timer in the guest kernel works fine.
> My host is ubuntu 13.10.

I don't know the ARM emulation well, but from the description this looks
like a problem where timers are firing too often. The timer changes have
tended to uncover bugs elsewhere in QEMU, for instance setting timer
expiry times to something very near zero. So far (touch wood) the timer
changes themselves have been relatively clean.

What I'd suggest you do is run qemu within gdb, and when you have
seen the sluggish behaviour, set a breakpoint in timerlist_run_timers
just before the line saying cb(opaque). If I'm right your breakpoint
should immediately hit. Step into the callback and note which it is.
It should always (or nearly always) be the same timer (note the process
of debugging will cause other timers to expire).

You then want to find where that timer is set and see if the expiry
value looks silly. Normally the issue is a misuse of timer_mod, whose
expiry argument (i) is an absolute time, not an interval, and (ii) is
measured in nanoseconds unless the timer's scale has been set
otherwise. Quite often I've seen code read the current time in
nanoseconds and then add an offset in milliseconds.
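
To make the two mistakes concrete, here is a minimal sketch in plain C.
It models the arithmetic only and does not call QEMU: every function
name below is hypothetical, and the one assumption is a timer whose
expiry is an absolute time in nanoseconds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; none of these are QEMU functions. */
#define NS_PER_MS INT64_C(1000000)

/* Mistake (i): the interval itself is used as the absolute expiry time. */
int64_t deadline_interval_as_absolute(int64_t period_ms)
{
    return period_ms * NS_PER_MS;
}

/* Mistake (ii): current time in nanoseconds plus an offset in milliseconds. */
int64_t deadline_mixed_units(int64_t now_ns, int64_t period_ms)
{
    return now_ns + period_ms;
}

/* Intended behaviour: absolute time, with both terms in nanoseconds. */
int64_t deadline_correct(int64_t now_ns, int64_t period_ms)
{
    assert(period_ms >= 0);
    return now_ns + period_ms * NS_PER_MS;
}

/* Nanoseconds left before the deadline, or 0 if it is already due. */
int64_t ns_until_expiry(int64_t deadline_ns, int64_t now_ns)
{
    return deadline_ns > now_ns ? deadline_ns - now_ns : 0;
}
```

With the clock at 5 s and a 10 ms period, the first variant is already
in the past and the second is due 10 ns from now, so a periodic timer
re-armed either way fires almost continuously. Only the third waits the
intended 10 ms.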

If you find which timer it is but can't work out why it's doing it,
I can take another look.

Alex


> 
> commit b1bbfe72ec1ebf302d97f886cc646466c0abd679
> Author: Alex Bligh <address@hidden>
> Date:   Wed Aug 21 16:02:55 2013 +0100
> 
>    aio / timers: On timer modification, qemu_notify or aio_notify
> 
>    On qemu_mod_timer_ns, ensure qemu_notify or aio_notify is called to
>    end the appropriate poll(), irrespective of use_icount value.
> 
>    On qemu_clock_enable, ensure qemu_notify or aio_notify is called for
>    all QEMUTimerLists attached to the QEMUClock.
> 
>    Signed-off-by: Alex Bligh <address@hidden>
>    Signed-off-by: Stefan Hajnoczi <address@hidden>
> 
> 
> This can be reproduced with a simple busybox initramfs and spawning
> several instances of a simple shell script to load the cores:
> 
> while [ 1 ]; do echo rob > /dev/null; done &
> 
> The symptom is that user interaction becomes sluggish and jerky,
> followed by kernel messages about soft lockups, RCU stalls and/or
> messages like this:
> 
> hrtimer: interrupt took 3030033000 ns
> [sched_delayed] sched: RT throttling activated
> 
> I also intermittently hang on boot hitting this warning:
> 
> [    0.640204] WARNING: CPU: 0 PID: 0 at
> /home/rob/proj/git/linux-2.6/kernel/time/clockevents.c:212
> clockevents_program_event+0x50/0x138()
> 
> which is from here:
> 
> if (unlikely(expires.tv64 < 0)) {
>         WARN_ON_ONCE(1);
>         return -ETIME;
> }
> 
> I'm not sure if this warning is caused by the same commit or not, but
> it seems like I'm getting wrong timer values from qemu.
> 
> 
> It appears to me that this bug report may also be related:
> 
> https://bugs.launchpad.net/qemu/+bug/1222034
> 
> Rob
> 
> 

-- 
Alex Bligh






