qemu-devel

Re: [Qemu-devel] MTTCG External Halt


From: Alistair Francis
Subject: Re: [Qemu-devel] MTTCG External Halt
Date: Tue, 30 Jan 2018 15:56:47 -0800

On Fri, Jan 5, 2018 at 6:23 PM, Alistair Francis <address@hidden> wrote:
> On Thu, Jan 4, 2018 at 3:08 AM, Alex Bennée <address@hidden> wrote:
>>
>> Alistair Francis <address@hidden> writes:
>>
>>> Hey guys, I'm super stuck with an ugly MTTCG issue and was wondering
>>> if anyone had any ideas.
>>>
>>> In the Xilinx fork of QEMU (based on 2.11) we have a way for CPUs to
>>> halt other CPUs. This is used for example when the power control unit
>>> halts the ARM A53s. To do this we have internal GPIO signals that end
>>> up calling a function that basically does this:
>>>
>>> To halt:
>>>     cpu->halted = true;
>>>     cpu_interrupt(cpu, CPU_INTERRUPT_HALT);
>>
>> Hmm I don't think you should be setting cpu->halted unless you know it
>> is safe to do so. As the other CPUs free-run during BQL this isn't
>> enough for a cross vCPU interaction. However you can schedule work to
>> run in the target vCPUs context safely.
>
> We actually pretty much only ever set it on reset.
>
>>
>> That said isn't the cpu_interrupt enough to trigger the target vCPU to
>> halt?
>>
>>>
>>> To un-halt
>>>     cpu->halted = false;
>>>     cpu_reset_interrupt(cpu, CPU_INTERRUPT_HALT);
>>
>> Again if cross vCPU context this needs to be scheduled against the
>> target vCPU.
>>
>>>
>>> We also have the standard ARM WFI (Wait For Interrupt) implementation
>>> in op_helper.c:
>>>     cs->halted = 1;
>>>     cs->exception_index = EXCP_HLT;
>>>     cpu_loop_exit(cs);
>>>
>>> Before MTTCG this used to work great, but now either we end up with
>>> the guest Linux complaining about CPU stalls or we hit:
>>> ERROR:/scratch/alistai/master-qemu/cpus.c:1516:qemu_tcg_cpu_thread_fn:
>>> assertion failed: (cpu->halted)
>>>
>>> If I remove the instances of manually setting cpu->halted then I don't
>>> see the asserts(), but the WFI instruction doesn't work correctly.
>>> So it seems like setting the halted status externally from the CPU
>>> causes the issue.
>>
>>   /* during start-up the vCPU is reset and the thread is
>>    * kicked several times. If we don't ensure we go back
>>    * to sleep in the halted state we won't cleanly
>>    * start-up when the vCPU is enabled.
>>    *
>>    * cpu->halted should ensure we sleep in wait_io_event
>>    */
>>
>> I think what I'm trying to say is we should never be halted without
>> having gone via wait_io_event where we can sleep.
>>
>>
>>> I have tried setting it inside a lock, using atomic
>>> operations and running the setter async on the CPU, but nothing works.
>>>
>>> Any chance anyone has some insight into a way to externally set a
>>> vCPU as halted/un-halted?
>>
>> See the PSCI code which uses the async interface for exactly this.

Grr... It's back.

I narrowed it down: a reset (triggered by an external GPIO) is
causing the problem. Apparently QEMU doesn't like halted CPUs being
reset while spinning around qemu_tcg_cpu_thread_fn().

I don't have a good solution though, as setting CPU_INTERRUPT_RESET
doesn't help (that isn't handled while we are halted) and
async_run_on_cpu()/run_on_cpu() don't reliably reset the CPU when we
want.

I've even tried pausing all CPUs before resetting the CPU and then
resuming them all, but that doesn't seem to work either. Is there
anything I'm missing? Is there no reliable way to reset a CPU?
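
For reference, here is a minimal standalone model of the pattern Alex
suggests above: the halt/un-halt request is queued and then applied by
the target vCPU's own thread, instead of being written from another
thread. This is plain pthreads, not QEMU code. VCpu, run_on_vcpu(),
do_halt(), do_unhalt() and demo_halt_unhalt() are hypothetical names
made up for illustration, and run_on_vcpu() only loosely mirrors the
idea behind run_on_cpu(). The model does not execute guest code; the
thread simply sleeps until it is kicked.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vCPU; this is NOT QEMU's CPUState. */
typedef struct VCpu VCpu;
typedef void (*work_fn)(VCpu *cpu);

struct VCpu {
    work_fn pending;        /* single-slot work queue */
    bool halted;            /* only ever written by the vCPU thread */
    bool quit;
    int work_done;          /* number of completed work items */
    pthread_mutex_t lock;
    pthread_cond_t cond;
};

/* Work items: these run in the target vCPU thread's context. */
static void do_halt(VCpu *cpu)   { cpu->halted = true; }
static void do_unhalt(VCpu *cpu) { cpu->halted = false; }

/* Callable from any thread: queue fn, kick the target, wait until done. */
static void run_on_vcpu(VCpu *cpu, work_fn fn)
{
    pthread_mutex_lock(&cpu->lock);
    while (cpu->pending) {                  /* wait for a free slot */
        pthread_cond_wait(&cpu->cond, &cpu->lock);
    }
    int ticket = cpu->work_done + 1;        /* our item completes next */
    cpu->pending = fn;
    pthread_cond_broadcast(&cpu->cond);     /* kick the vCPU thread */
    while (cpu->work_done < ticket) {
        pthread_cond_wait(&cpu->cond, &cpu->lock);
    }
    pthread_mutex_unlock(&cpu->lock);
}

/* vCPU thread: drain queued work, otherwise sleep until kicked. */
static void *vcpu_thread(void *opaque)
{
    VCpu *cpu = opaque;

    pthread_mutex_lock(&cpu->lock);
    while (!cpu->quit) {
        if (cpu->pending) {
            cpu->pending(cpu);              /* state changes happen here */
            cpu->pending = NULL;
            cpu->work_done++;
            pthread_cond_broadcast(&cpu->cond);
            continue;
        }
        pthread_cond_wait(&cpu->cond, &cpu->lock);
    }
    pthread_mutex_unlock(&cpu->lock);
    return NULL;
}

/* Halt, then un-halt, a vCPU from the calling thread. Returns 0 on success. */
int demo_halt_unhalt(void)
{
    VCpu cpu = { .pending = NULL };         /* remaining fields are zeroed */
    pthread_t tid;

    pthread_mutex_init(&cpu.lock, NULL);
    pthread_cond_init(&cpu.cond, NULL);
    int rc = pthread_create(&tid, NULL, vcpu_thread, &cpu);
    assert(rc == 0);

    run_on_vcpu(&cpu, do_halt);
    bool after_halt = cpu.halted;           /* safe: the work has completed */
    run_on_vcpu(&cpu, do_unhalt);
    bool after_unhalt = cpu.halted;

    pthread_mutex_lock(&cpu.lock);
    cpu.quit = true;
    pthread_cond_broadcast(&cpu.cond);
    pthread_mutex_unlock(&cpu.lock);
    pthread_join(tid, NULL);
    pthread_cond_destroy(&cpu.cond);
    pthread_mutex_destroy(&cpu.lock);

    return (after_halt && !after_unhalt) ? 0 : 1;
}
```

Calling demo_halt_unhalt() from a main() and building with -pthread
should be enough to try it. The point of the model is only that
"halted" has a single writer (the vCPU thread), so the thread can never
observe the flag changing underneath it between its own checks.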

Alistair

>
> Yeah, that and a fix to our weird double reset fixed it.
>
> What I don't get is how a double reset would cause the assert() to be hit.
>
> Alistair
>
>>
>>>
>>> Thanks,
>>> Alistair
>>
>>
>> --
>> Alex Bennée


