qemu-devel

From: Alex Bennée
Subject: Re: [Qemu-devel] qemu_system_reset_request() broken w.r.t BQL locking regime
Date: Thu, 06 Jul 2017 09:37:07 +0100
User-agent: mu4e 0.9.19; emacs 25.2.50.3

Paolo Bonzini <address@hidden> writes:

> On 05/07/2017 18:14, Peter Maydell wrote:
>>>   - Guest resets board, writing to some hw address (e.g.
>>>     arm_sysctl_write)
>>>   - This triggers qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET)
>>>   - We exit iowrite and drop the BQL
>>>   - vl.c schedules qemu_system_reset->qemu_devices_reset...arm_cpu_reset
>>>   - we start writing new values to CPU env while still in TCG code
>>>   - CHAOS!
>>>
>>> The general solution for this is to ensure these sort of tasks are done
>>> with safe work in the CPUs context when we know nothing else is running.
>>> It seems this is probably best done by modifying
>>> qemu_system_reset_request to queue work up on current_cpu and execute it
>>> as safe work - I don't think the vl.c thread should ever be messing
>>> about with calling cpu_reset directly.
>> My first thought is that qemu_system_reset() should absolutely
>> stop every CPU (or other runnable thing like a DMA agent) in the
>> system. The semantics are basically "like a power cycle", so
>> that should include a complete stop of the world. (Is this
>> what vm_stop() does? Dunno...)
>
> I agree, it should do vm_stop() as the first thing and, if applicable,
> vm_start() as the last thing, similar to e.g. savevm.

OK, I did some more digging and the basic problem is that
cpu_stop_current() does the wrong thing. It can set cpu->stopped while
we are still executing in the vCPU thread, so when the vl.c thread calls
pause_all_vcpus() it thinks the thread is paused when in fact it isn't,
which leads to the chaos. I think the fix is to tighten up our usage of
cpu_stop_current() and pause_all_vcpus(). So my current plan is:

* pause_all_vcpus() should never be called from vCPU/HW emulation

One case in kvm_apic has been fixed by Pranith. The other case in s390
should be converted to use async safe work (async_safe_run_on_cpu()).
Once this is done we can assert that pause_all_vcpus() is never called
from a vCPU thread and keep it for QMP, HMP and gdb type operations.
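
To illustrate the shape of that first point, here is a toy model in plain
C. It is a sketch only, not QEMU code: ToyCPU, toy_current_cpu,
toy_pause_all_vcpus() and the one-slot work queue are all hypothetical
names invented for the example. The idea is that the pause entry point
refuses to run in vCPU context (real code would assert instead), and the
vCPU side queues its request to be run later, once it has left guest
execution.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in QEMU. */
typedef void (*toy_work_fn)(void *opaque);

typedef struct ToyCPU {
    toy_work_fn pending_fn;   /* one-slot stand-in for a safe work queue */
    void *pending_opaque;
} ToyCPU;

/* Non-NULL only while we are "inside" a vCPU thread. */
static ToyCPU *toy_current_cpu;

/* Counts how many times the pause path actually ran. */
static int toy_pause_count;

/*
 * Stand-in for the pause entry point. It returns false when called from
 * vCPU context so the sketch can be exercised; the plan in the mail is
 * to make this an assertion.
 */
static bool toy_pause_all_vcpus(void)
{
    if (toy_current_cpu != NULL) {
        return false;
    }
    toy_pause_count++;
    return true;
}

/* vCPU/HW emulation code queues the request instead of acting on it. */
static void toy_queue_safe_work(ToyCPU *cpu, toy_work_fn fn, void *opaque)
{
    assert(cpu->pending_fn == NULL);  /* the toy queue holds one item */
    cpu->pending_fn = fn;
    cpu->pending_opaque = opaque;
}

/* Runs the queued item; meant to be called once the vCPU has stopped. */
static void toy_run_safe_work(ToyCPU *cpu)
{
    toy_work_fn fn = cpu->pending_fn;

    if (fn != NULL) {
        cpu->pending_fn = NULL;
        fn(cpu->pending_opaque);
    }
}

/* Example work item: counts how many resets were performed. */
static void toy_do_reset(void *opaque)
{
    int *resets = opaque;

    (*resets)++;
}
```

In this model a device write handler running with toy_current_cpu set
would call toy_queue_safe_work(cpu, toy_do_reset, &counter) and return;
the reset only happens when toy_run_safe_work() is called after the vCPU
has stopped executing.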

* vm_stop() is probably being misused by vCPU threads

There are more callers of vm_stop() than of pause_all_vcpus(), but they
all seem to be error-handling, bail-out type things.

* cpu_stop_current() is probably superfluous now

It certainly shouldn't be called directly from the vCPU code
(rtas_power_off), and once we know pause_all_vcpus() can't be called
from a vCPU thread at least one caller is gone. I think the current_cpu
handling is a relic of the days of single-threaded handling when it was
a global.
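
For anyone who wants the cpu->stopped problem spelled out, here is a
deliberately simplified, single-threaded model of it in C. Again this is
a sketch under assumptions, not the QEMU implementation: the struct, the
"executing" field and every toy_* function are hypothetical and exist
only to show why a flag set from inside the vCPU thread can't be trusted
by the thread doing the pausing.

```c
#include <stdbool.h>

/* Toy model only: these names are invented for the illustration. */
typedef struct ToyCPU {
    bool stopped;    /* what the pausing thread looks at */
    bool executing;  /* ground truth: still running guest/TCG code */
} ToyCPU;

/*
 * Models the problematic pattern: the flag is set from within the vCPU
 * thread, which carries on executing afterwards.
 */
static void toy_stop_current(ToyCPU *cpu)
{
    cpu->stopped = true;
}

/* Models the vCPU thread actually leaving execution and going idle. */
static void toy_vcpu_reaches_idle(ToyCPU *cpu)
{
    cpu->executing = false;
    cpu->stopped = true;
}

/* The check the pausing thread performs: it only sees the flags. */
static bool toy_all_vcpus_look_paused(const ToyCPU *cpus, int n)
{
    for (int i = 0; i < n; i++) {
        if (!cpus[i].stopped) {
            return false;
        }
    }
    return true;
}

/* What the pausing thread actually needs to be true before a reset. */
static bool toy_all_vcpus_really_quiescent(const ToyCPU *cpus, int n)
{
    for (int i = 0; i < n; i++) {
        if (cpus[i].executing) {
            return false;
        }
    }
    return true;
}
```

After toy_stop_current() on a vCPU that is still executing, the "look
paused" check passes while the "really quiescent" check fails, which is
the window in which the reset would write to CPU state under a running
vCPU.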

Does that sound reasonable?

--
Alex Bennée


