[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Instruction virtual address in TCG Plugins
|
From: |
Alex Bennée |
|
Subject: |
Re: Instruction virtual address in TCG Plugins |
|
Date: |
Tue, 14 Nov 2023 10:57:06 +0000 |
|
User-agent: |
mu4e 1.11.24; emacs 29.1 |
Mikhail Tyutin <m.tyutin@yadro.com> writes:
>> > What is the right way to get virtual address of either translation block
>> > or instruction inside of TCG plugin? Does
>> > plugin API allow that or it needs some extension?
>> >
>> > So far I use qemu_plugin_tb_vaddr() inside of my block translation
>> > callback to get block virtual address and then
>> > pass it as 'userdata' argument into
>> > qemu_plugin_register_vcpu_tb_exec_cb(). I use it later during code
>> > execution.
>> > It works well for user-mode emulation, but sometimes leads to
>> > incorrect addresses in system-mode emulation.
>>
>> You can use qemu_plugin_insn_vaddr and qemu_plugin_insn_haddr. But your
>> right something under one vaddr and be executed under another with
>> overlapping mappings. The haddr should be stable though I think.
>
> As far as I see haddr is ok and can be used to identify blocks. However, if I
> have haddr at block execution phase and
> I want to know vaddr, there is no API to get such mapping. Maybe it is
> possible to extract from software MMU, but I
> have no clue where to start with.
The translator doesn't know (at least since CF_PCREL) because the whole
point of that change was to avoid re-translating the same code from
multiple mappings. However we do have the ability to resolve a PC at
fault time so we could expose that to a execution callback.
>> > I suspect it is because of memory mappings by guest OS that changes
>> > virtual addresses for that block.
>> >
>> > I also looked at gen_empty_udata_cb() function and considered to extend
>> > plugin API to pass a program counter
>> > value as additional callback argument. I thought it would always give me
>> > valid virtual address of an instruction.
>> > Unfortunately, I didn't find a way to get value of that register in
>> > architecture agnostic way (it is 'pc' member in
>> > CPUArchState structure).
>>
>> When we merge the register api you should be able to do that. Although
>> during testing I realised that PC acted funny compared to everything
>> else because we don't actually update the shadow register every
>> instruction.
>
> We implemented similar API to read registers (by coincidence, I posted this
> patch at the same time as the API you
> mentioned) and I observe similar behavior. As far as I see, CPU state is only
> updated in between of executed translation
> blocks. Switching to 'singlestep' mode helps to fix that, but execution
> overhead is huge.
>
> There is also blocks 'chaining' mechanism which is likely contributes to
> corrupted blocks vaddr inside of callbacks.
> My guess is that 'pc' value for those chained blocks points to the first
> block of entire chain. Unfortunately, It is very
> hard to debug, because I can only see block chains when I run whole Linux
> guest OS. Does Qemu has small test
> application to trigger long enough chain of translation blocks?
No all registers should be resolved by the end of any block. There is
currently no optimisation of register usage between TBs. If you are
seeing PC corruption that would be a bug - but fundamentally things
would break pretty quick if tb_lookup() and friends didn't have an
accurate PC.
As for block chains any moderately complex loop should trigger chaining.
> Having those complexities makes me think to inject appropriate code into
> translation blocks to compute actual block
> vaddr at execution stage. The problem here is to find a variable where
> I can load 'pc' at start of translation block.
I think it would be pretty easy to ensure there is a rectified PC value
written before calling any callback - arguably the QEMU_PLUGIN_CB_R_REGS
flag should do this.
--
Alex Bennée
Virtualisation Tech Lead @ Linaro