Re: [Qemu-arm] [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC


From: Peter Maydell
Subject: Re: [Qemu-arm] [PATCH v8 23/25] target-arm: introduce ARM_CP_EXIT_PC
Date: Thu, 2 Feb 2017 11:31:01 +0000

On 2 February 2017 at 11:03, Alex Bennée <address@hidden> wrote:
>
> Peter Maydell <address@hidden> writes:
>> Does single-stepping (of the emulated architectural
>> debug step, and gdbstub singlestep) work across one of
>> these instructions?
>
> I'll have to test, but I don't see why not. The instruction is fully
> executed; we just ensure we have exited the run loop to process the flush
> before we get to the next instruction.

The reason I ask is that the single-stepping code path involves
doing some work at the tail end of translation:

    if (unlikely(cs->singlestep_enabled || dc->ss_active)
        && dc->is_jmp != DISAS_EXC) {
        /* do some stuff */
    }

The other things that jump out of the normal code flow are:
 * exceptions (where we don't want to do finished-the-step
   work anyway as the insn hasn't executed)
 * SWI (hopefully we single step SWI right but maybe not)
 * YIELD, WFE (which are special cased so that they do the
   actual work only at the end of the gen_intermediate_code
   function and only if not single-stepping, so they're
   no-ops on singlestep)

You've introduced a new item to this list which isn't
handled by the singlestep code.
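
As a rough illustration of the list above, here is a toy model of which
tail-end work runs for each way of leaving the normal code flow. This is
a sketch, not QEMU code: the enum names and tail_for() are hypothetical
stand-ins, SWI is left out because its behaviour is uncertain above, and
the "helper longjmp" case assumes the helper leaves the run loop before
any code emitted after it can execute.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the ways an instruction can end a TB. */
enum exit_kind {
    EXIT_NEXT,           /* ordinary fall-through to the next insn */
    EXIT_EXCEPTION,      /* insn raised an exception and did not complete */
    EXIT_YIELD_WFE,      /* YIELD/WFE: real work is deferred to the tail */
    EXIT_HELPER_LONGJMP  /* a helper leaves the run loop by itself */
};

/* What the tail end of the TB ends up doing in this toy model. */
enum tail_action {
    TAIL_NONE,           /* nothing extra to do */
    TAIL_STEP_COMPLETE,  /* the finished-the-step work runs */
    TAIL_DEFERRED_WORK,  /* the deferred YIELD/WFE work runs */
    TAIL_NOT_REACHED     /* the tail is skipped at run time */
};

enum tail_action tail_for(enum exit_kind kind, bool singlestep)
{
    switch (kind) {
    case EXIT_EXCEPTION:
        /* The insn has not executed, so no step has finished. */
        return TAIL_NONE;
    case EXIT_YIELD_WFE:
        /* Treated as a no-op when single-stepping. */
        return singlestep ? TAIL_STEP_COMPLETE : TAIL_DEFERRED_WORK;
    case EXIT_HELPER_LONGJMP:
        /* Nothing emitted after the helper call gets a chance to run. */
        return TAIL_NOT_REACHED;
    case EXIT_NEXT:
        return singlestep ? TAIL_STEP_COMPLETE : TAIL_NONE;
    }
    assert(!"unknown exit kind");
    return TAIL_NONE;
}
```

In this model the longjmp case is the only one where single-stepping is
enabled and the insn completed, yet the step-complete work is not run.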

>> This is probably a question answered in the rest of the series,
>> but why do we need the helper to be able to longjump out to the
>> top level? Can't we just have the helper do its work and then
>> end the TB with tcg_gen_exit_tb(0) so we return to the top level
>> loop in the normal way?
>
> Well I guess this is a philosophical question. The cputlb API is
> offering the guarantee that when an *_all_cpus_synced() flush is done
> everything will be complete with respect to all vCPUs. This is reliant
> on the source vCPU executing an exclusive safe work which ensures all
> other vCPUs have halted and therefore will have run their safe work
> before returning to execution.
>
> If ARM wanted to it could call the *_all_cpus() variant, schedule its
> own exclusive safe work (a null function - as cputlb will have scheduled
> the flush) and exit the TB in the usual way. In fact this is the
> mechanism ARM could use if it wanted to defer the sync point to a later
> DMB instruction.
>
> I haven't implemented it yet as the flush stuff only comes up high in
> the perf runs with my aggressive TLB flush microbenchmarks.
>
> However I'm wary of having a _synced() variant which will only work
> correctly if the guest also does a bunch of other steps.

Well, with the implementation as it is you need to do a bunch
of extra steps to handle all the corner cases (condexec,
single stepping) that would be handled for you if you exited
the TB in the normal way rather than longjumping out of it...
IME longjumping out should be reserved for "we don't want to
continue executing whatever other generated code we have after
this" situations. Here we know definitely what we're going to
want to do, so it would be better to generate code that
arranged to leave the TB in the usual way.

>>>      default:
>>>          break;
>>>      }
>>> diff --git a/target/arm/translate.c b/target/arm/translate.c
>>> index 444a24c2b6..7bd18cd25d 100644
>>> --- a/target/arm/translate.c
>>> +++ b/target/arm/translate.c
>>> @@ -7508,6 +7508,10 @@ static int disas_coproc_insn(DisasContext *s, uint32_t insn)
>>>              gen_set_pc_im(s, s->pc);
>>>              s->is_jmp = DISAS_WFI;
>>>              return 0;
>>> +        case ARM_CP_EXIT_PC:
>>> +            /* The helper may exit the cpu_loop so ensure PC is correct */
>>> +            gen_set_pc_im(s, s->pc);
>>> +            break;
>>
>> Do we also need to gen_set_condexec() ?
>
> Do we? This isn't an exception so we don't need to resolve the condition
> flags as long as there is enough information preserved so the next TB
> can resolve if it needs to.

Your longjump is effectively skipping the normal "end of the TB" code,
which is what usually does the set_condexec for you. At the end of a
TB the expectation is that everything's been sync'd back to the CPU
state structure.
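
To make the point concrete, here is a small self-contained sketch using
plain setjmp/longjmp. It is a toy model under stated assumptions, not
QEMU's implementation: struct toy_cpu, struct toy_tb, helper_flush() and
run_tb() are all hypothetical names, and the "epilogue" is reduced to a
single write-back of the condexec bits.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the CPU state structure. */
struct toy_cpu {
    uint32_t pc;
    uint32_t condexec_bits;
};

/* Hypothetical values the generated code tracks until they are synced. */
struct toy_tb {
    uint32_t next_pc;
    uint32_t condexec_bits;
};

static jmp_buf run_loop;    /* models the top-level run loop */
static struct toy_cpu cpu;  /* file scope so it is well defined after longjmp */

/* Models a helper that may leave the run loop directly. */
static void helper_flush(bool exit_by_longjmp)
{
    if (exit_by_longjmp) {
        longjmp(run_loop, 1);
    }
}

static void run_tb(const struct toy_tb *tb, bool exit_by_longjmp)
{
    cpu.pc = tb->next_pc;                  /* PC is set before the helper */
    helper_flush(exit_by_longjmp);
    cpu.condexec_bits = tb->condexec_bits; /* "end of the TB" write-back */
}

/* Returns true if both PC and condexec bits reached the CPU state. */
bool state_synced_after_run(bool exit_by_longjmp)
{
    const struct toy_tb tb = { .next_pc = 0x1004, .condexec_bits = 0x5 };

    cpu.pc = 0x1000;
    cpu.condexec_bits = 0;
    if (setjmp(run_loop) == 0) {
        run_tb(&tb, exit_by_longjmp);
    }
    return cpu.pc == tb.next_pc && cpu.condexec_bits == tb.condexec_bits;
}
```

With a normal return the write-back runs and the state matches; with the
longjmp the PC is correct but the condexec bits are left stale, which is
the gap the set_condexec question is about.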

thanks
-- PMM


