Hi Pierrick,
December 6, 2024 at 12:03 AM, "Pierrick Bouvier" wrote:
On 12/5/24 13:50, Julian Ganz wrote:
If you cannot rely on an input being a sensible value, doesn't that
render the input useless?
I agree. If for a specific event it's impossible to provide a value (i.e. the
value has no meaning for a real cpu), it just shows that we need several
types of data per event, and the compromise of having a single callback won't
be possible.
We should differentiate "it's hard to find this value in QEMU" vs "this value does
not exist in real life". The first can be solved if we put effort into it. And every time a
cpu changes its flow of execution, it makes sense to find where it was just before.
One of the end goals is to be able to build a full control flow graph, with
edges labeled by transition type (exceptions, traps, interrupts, jump,
fallback), which we can do with the triple {event,from,to}.
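
To make the {event,from,to} idea concrete, here is a minimal sketch of how such triples could be accumulated into labeled control-flow edges. It is plain C and does not use the QEMU plugin API; the names `enum transition`, `struct cfg` and `cfg_record` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transition kinds, mirroring the labels named above.
 * These are NOT identifiers from the QEMU plugin API. */
enum transition { T_EXCEPTION, T_TRAP, T_INTERRUPT, T_JUMP, T_FALLBACK };

/* One labeled edge of the control flow graph. */
struct edge {
    enum transition type;
    uint64_t from;
    uint64_t to;
    uint64_t count; /* how often this exact triple was observed */
};

#define MAX_EDGES 64 /* arbitrary small bound for the sketch */

struct cfg {
    struct edge edges[MAX_EDGES];
    size_t n;
};

/* Record one {event, from, to} triple.
 * Returns the updated hit count of the edge, or 0 if the table is full. */
static uint64_t cfg_record(struct cfg *g, enum transition type,
                           uint64_t from, uint64_t to)
{
    assert(g != NULL);

    /* An edge is identified by the whole triple, so the same pair of
     * addresses may appear under several transition types. */
    for (size_t i = 0; i < g->n; i++) {
        struct edge *e = &g->edges[i];
        if (e->type == type && e->from == from && e->to == to) {
            return ++e->count;
        }
    }
    if (g->n == MAX_EDGES) {
        return 0;
    }
    g->edges[g->n] = (struct edge){ type, from, to, 1 };
    g->n++;
    return 1;
}
```

A callback receiving the triple would only need to call `cfg_record()` once per event. A linear scan is used here for brevity; a real plugin would more likely use a hash table.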
I agree that that triple is sensible for any event type and likely
useful to plugin authors. At least if the semantics are sufficiently
uniform among event types. However, I also feel that given the actual
implementation (hooks sprinkled over target specific code) this is not
easily achievable reliably. At least testability should be a hard
requirement. Otherwise the API's reliability will inevitably deteriorate
over time without any way to tell how bad the situation got.
Let's try to move forward, and solve the problems we have with from_pc. The
testing part can be solved already (as explained in a previous message). In
which cases can't you identify from_pc?
I'll have to check, but problems that I discussed with a colleague
included jumps to an unmapped page resulting in the appropriate
exception. We ultimately agreed that in such a situation from_pc should
point to the jump target inside the unmapped page, instead of, say, the
jump. We assume that most targets should already behave this way without
further changes. However, in order to compute the correct from_pc, we
need to know the jump target before the exception is raised (i.e. right
after the jump instruction is executed), and that's not necessarily
straightforward to do in a plugin.
It's an interesting conversation. For the scope of this series, I agree you
should use the jump target that triggered the trap.
Ultimately, transitions should simply follow what the cpu does.
- orig_insn: jump to A
- jump_target: execute A traps
- page_fault: load page
- jump_target: come back to A
event(JUMP, orig_insn, jump_target) // not covered by this series
event(EXCEPTION, jump_target, page_fault)
... execute page_fault (with potential other transitions)
event(JUMP, end_page_fault, jump_target)
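
As a rough illustration, the sequence above can be written down as data together with the two properties it implies: the exception is raised at the destination of the preceding jump, and a later transition comes back to that same address. This is only a sketch in plain C; the addresses, `struct ev` and both helper functions are hypothetical and not part of this series or of the QEMU plugin API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum ev_kind { EV_JUMP, EV_EXCEPTION };

struct ev {
    enum ev_kind kind;
    uint64_t from;
    uint64_t to;
};

/* Made-up addresses standing in for the symbolic names used above. */
enum {
    ORIG_INSN      = 0x1000, /* the jump instruction */
    JUMP_TARGET    = 0x8000, /* A, inside the unmapped page */
    PAGE_FAULT     = 0x0200, /* first instruction of the handler */
    END_PAGE_FAULT = 0x02fc, /* last instruction of the handler */
};

/* The three events from the sequence above, in order. */
static const struct ev expected_trace[] = {
    { EV_JUMP,      ORIG_INSN,      JUMP_TARGET },
    { EV_EXCEPTION, JUMP_TARGET,    PAGE_FAULT  },
    { EV_JUMP,      END_PAGE_FAULT, JUMP_TARGET },
};

/* True if event i is an exception whose from_pc equals the destination
 * of the event immediately before it. */
static bool faults_at_previous_target(const struct ev *t, size_t i)
{
    return i > 0 && t[i].kind == EV_EXCEPTION && t[i].from == t[i - 1].to;
}

/* True if some event after i transfers control back to the address
 * where event i was raised. */
static bool handler_returns_to_fault(const struct ev *t, size_t n, size_t i)
{
    assert(i < n);
    for (size_t j = i + 1; j < n; j++) {
        if (t[j].to == t[i].from) {
            return true;
        }
    }
    return false;
}
```

Note that the first check only makes sense when no instruction executed between the two events, which is exactly the jump-into-unmapped-page case discussed here.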
In the case of a double trap, we could follow the same logic, and represent the
original transition that led to the trap, and the two consecutive traps.
Does it make sense?
Yes, those transitions are correct imo. And if a jump event should be
introduced at some point, the call sequence would look like that. My
issue is that testing this (in a plugin) will not be straightforward,
and may even be impossible. And overly complex tests don't exactly
inspire confidence.