From: Peter Maydell
Subject: Re: [Qemu-devel] QEMU ARM SMP: IPI delivery delayed until next main loop event // how to improve IPI latency?
Date: Tue, 16 Jun 2015 12:53:36 +0100

On 16 June 2015 at 12:11, Alex Züpke <address@hidden> wrote:
> But the startup is not my problem, it's the later parts.

But it was my problem because it meant your test case wasn't
functional :-)

> I added the WFE to the initial lock. Here are two new tests, both now
> 3178 bytes in size:
> http://www.cs.hs-rm.de/~zuepke/qemu/ipi.elf
> http://www.cs.hs-rm.de/~zuepke/qemu/ipi_yield.elf
>
> Both start on my machine. The IPI ping-pong starts with the first
> timer interrupt, after 1s. The problem is that IPIs are then
> delivered only once a second, when the timer interrupt causes
> QEMU to return to its main loop.
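
For concreteness, the initial WFE wait described above might look
roughly like the following minimal sketch. It is reconstructed from
the description rather than taken from the ipi.elf binaries;
start_lock and both function names are hypothetical. Under QEMU's
single-threaded TCG, WFE also ends the vCPU's timeslice, which is
why this version boots:

    /* Secondary CPUs park here until the primary opens the gate. */
    extern volatile unsigned int start_lock;  /* hypothetical: 0 = released */

    static void secondary_wait(void)
    {
        while (start_lock != 0) {
            __asm__ volatile("wfe" ::: "memory");  /* sleep until an event */
        }
    }

    static void primary_release(void)
    {
        start_lock = 0;                              /* open the gate */
        __asm__ volatile("dsb\n\tsev" ::: "memory"); /* publish store, wake waiters */
    }
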

Thanks. These test cases work for me, and I can repro the
same behaviour you see.

I intend to investigate why we're not at least timeslicing
between the two CPUs at a faster rate than "when there's
another timer interrupt".

> Something else: existing ARM CPUs so far do not use hyper-threading,
> but have real physical cores. In contrast, QEMU is an extremely
> coarse-grained hyper-threading architecture, so existing legacy
> code that was written with physical cores in mind can trigger
> timing bugs in synchronization primitives, especially code
> originally written for the ARM11 MPCore like mine, which lacks
> WFE/SEV. If we consider QEMU as a platform to run legacy code,
> doesn't it make sense to address these issues?
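
To illustrate the kind of primitive at issue, here is a sketch of a
classic LDREX/STREX spinlock with no WFE, as an ARM11 MPCore-era port
might have it (names are illustrative, memory barriers omitted for
brevity). On hardware the lock holder makes progress in parallel on
another core; under QEMU's round-robin TCG the spinning vCPU just
burns the rest of its timeslice before the holder ever runs:

    typedef volatile unsigned int spinlock_t;   /* illustrative */

    static void spin_lock(spinlock_t *lock)
    {
        unsigned int status, one = 1;

        __asm__ volatile(
            "1: ldrex   %0, [%2]      \n\t"  /* exclusive load of lock word */
            "   teq     %0, #0        \n\t"  /* held by someone else?       */
            "   bne     1b            \n\t"  /* yes: spin -- no WFE yield   */
            "   strex   %0, %1, [%2]  \n\t"  /* try to store 1              */
            "   teq     %0, #0        \n\t"  /* lost the exclusive monitor? */
            "   bne     1b            \n\t"  /* retry                       */
            : "=&r" (status)
            : "r" (one), "r" (lock)
            : "cc", "memory");
    }

    static void spin_unlock(spinlock_t *lock)
    {
        *lock = 0;   /* release (barriers omitted for brevity) */
    }
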

In general QEMU's approach is more "run correct code reasonably
fast" rather than "run buggy code the same way the hardware
would" or "identify bugs in buggy code". There's certainly
scope for heuristics for making our timeslicing approach less
obtrusive, but we need to understand the underlying behaviour
first (and check it doesn't accidentally slow down other
common workloads in the process). In particular I think the
'do cpu_exit if one CPU triggers an interrupt on another'
approach is probably good, but I need to investigate why
it isn't working on your test programs without that extra
'level &&' condition first...
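
A minimal sketch of that idea, shaped after the IRQ handler in
target-arm/cpu.c; the cpu_exit() kick and the 'level &&' guard are
the experimental parts under discussion, and this is not an actual
patch:

    static void arm_cpu_set_irq(void *opaque, int irq, int level)
    {
        ARMCPU *cpu = opaque;
        CPUState *cs = CPU(cpu);

        switch (irq) {
        case ARM_CPU_IRQ:
            if (level) {
                cpu_interrupt(cs, CPU_INTERRUPT_HARD);
            } else {
                cpu_reset_interrupt(cs, CPU_INTERRUPT_HARD);
            }
            /* If the running vCPU just raised a line on a *different*
             * vCPU, end the current timeslice so the target gets
             * scheduled promptly instead of waiting for the next
             * main-loop event. */
            if (level && current_cpu && current_cpu != cs) {
                cpu_exit(current_cpu);
            }
            break;
        default:
            break;
        }
    }
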

thanks
-- PMM


