[PATCH] docs/devel: update tcg-plugins page
From: Pierrick Bouvier
Subject: [PATCH] docs/devel: update tcg-plugins page
Date: Tue, 23 Jul 2024 13:54:51 -0700
Reflect recent changes to the API (inline ops) and document new plugins.
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
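Note for readers: the scoreboard paragraph added below is easier to follow
with a toy model. The sketch that follows is NOT the QEMU plugin API and not
how QEMU implements it; every name (model_scoreboard, model_vcpu_init,
model_inline_add, model_cond_cb_ge, model_sum) is hypothetical. It only
illustrates the idea stated in the text: one slot per vCPU, grown when a new
vCPU appears, updated without locks because each vCPU touches only its own
slot, with a callback fired only when an inline comparison succeeds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model of a scoreboard: one 64-bit slot per vCPU. */
typedef struct {
    uint64_t *slots;
    size_t n_vcpus;
} model_scoreboard;

/* Hypothetical callback type, invoked only when a condition holds. */
typedef void (*model_cb)(size_t vcpu_index, void *userdata);

/* Grow the scoreboard so that vcpu_index owns a zero-initialised slot. */
static void model_vcpu_init(model_scoreboard *sb, size_t vcpu_index)
{
    if (vcpu_index < sb->n_vcpus) {
        return; /* slot already exists */
    }
    size_t new_n = vcpu_index + 1;
    uint64_t *p = realloc(sb->slots, new_n * sizeof(*p));
    assert(p != NULL);
    for (size_t i = sb->n_vcpus; i < new_n; i++) {
        p[i] = 0;
    }
    sb->slots = p;
    sb->n_vcpus = new_n;
}

/* Model of an inline "add immediate": only the caller's own slot changes. */
static void model_inline_add(model_scoreboard *sb, size_t vcpu_index,
                             uint64_t imm)
{
    assert(vcpu_index < sb->n_vcpus);
    sb->slots[vcpu_index] += imm;
}

/* Model of a conditional callback: the comparison is cheap and "inline",
 * the callback runs only when slot >= threshold. */
static void model_cond_cb_ge(model_scoreboard *sb, size_t vcpu_index,
                             uint64_t threshold, model_cb cb, void *userdata)
{
    assert(vcpu_index < sb->n_vcpus);
    if (sb->slots[vcpu_index] >= threshold) {
        cb(vcpu_index, userdata);
    }
}

/* Sum all per-vCPU slots, e.g. for a final report at exit. */
static uint64_t model_sum(const model_scoreboard *sb)
{
    uint64_t total = 0;
    for (size_t i = 0; i < sb->n_vcpus; i++) {
        total += sb->slots[i];
    }
    return total;
}
```

In this model a caller starts from a zeroed model_scoreboard, calls
model_vcpu_init() once per vCPU, model_inline_add() on each counted event,
and model_sum() when reporting. The model is single-threaded and does not
demonstrate real concurrency; real plugins should use the scoreboard
functions declared in qemu-plugin.h.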
docs/devel/tcg-plugins.rst | 101 +++++++++++++++++++++++--------------
1 file changed, 63 insertions(+), 38 deletions(-)
diff --git a/docs/devel/tcg-plugins.rst b/docs/devel/tcg-plugins.rst
index 954623f9bf1..5d2ebb92977 100644
--- a/docs/devel/tcg-plugins.rst
+++ b/docs/devel/tcg-plugins.rst
@@ -29,8 +29,8 @@ Once built a program can be run with multiple plugins loaded each with
their own arguments::
$QEMU $OTHER_QEMU_ARGS \
- -plugin contrib/plugin/libhowvec.so,inline=on,count=hint \
- -plugin contrib/plugin/libhotblocks.so
+ -plugin contrib/plugins/libhowvec.so,inline=on,count=hint \
+ -plugin contrib/plugins/libhotblocks.so
Arguments are plugin specific and can be used to modify their
behaviour. In this case the howvec plugin is being asked to use inline
@@ -41,6 +41,14 @@ Linux user-mode emulation also evaluates the environment variable
QEMU_PLUGIN="file=contrib/plugins/libhowvec.so,inline=on,count=hint" $QEMU
+QEMU plugins avoid writing directly to stdout/stderr, and use the log provided
+by the API (see function ``qemu_plugin_outs``).
+To show output, you may use this additional parameter::
+
+ $QEMU $OTHER_QEMU_ARGS \
+ -d plugin \
+ -plugin contrib/plugins/libhowvec.so,inline=on,count=hint
+
Writing plugins
---------------
@@ -93,11 +101,14 @@ translation event the plugin has an option to enumerate the
instructions in a block of instructions and optionally register
callbacks to some or all instructions when they are executed.
-There is also a facility to add an inline event where code to
-increment a counter can be directly inlined with the translation.
-Currently only a simple increment is supported. This is not atomic so
-can miss counts. If you want absolute precision you should use a
-callback which can then ensure atomicity itself.
+There is also a facility to add inline instructions doing various operations,
+like adding or storing an immediate value. It is also possible to execute a
+callback conditionally, with the condition being evaluated inline. All those
+inline operations are associated with a ``scoreboard``, which is thread-local
+storage automatically expanded when new cores/threads are created and that can
+be accessed/modified in a thread-safe way without any lock needed. Combining
+inline operations and conditional callbacks offers a more efficient way to
+instrument binaries, compared to classic callbacks.
Finally when QEMU exits all the registered *atexit* callbacks are
invoked.
@@ -117,9 +128,9 @@ However the following assumptions can be made:
Translation Blocks
++++++++++++++++++
-All code will go through a translation phase although not all
-translations will be necessarily be executed. You need to instrument
-actual executions to track what is happening.
+All code will go through a translation phase although not all translations will
+necessarily be executed. You need to instrument actual executions to track what
+is happening.
It is quite normal to see the same address translated multiple times.
If you want to track the code in system emulation you should examine
@@ -135,13 +146,12 @@ change control flow mid-block.
Instructions
++++++++++++
-Instruction instrumentation runs before the instruction executes. You
-can be can be sure the instruction will be dispatched, but you can't
-be sure it will complete. Generally this will be because of a
-synchronous exception (e.g. SIGILL) triggered by the instruction
-attempting to execute. If you want to be sure you will need to
-instrument the next instruction as well. See the ``execlog.c`` plugin
-for examples of how to track this and finalise details after execution.
+Instruction instrumentation runs before the instruction executes. You can be
+sure the instruction will be dispatched, but you can't be sure it will complete.
+Generally this will be because of a synchronous exception (e.g. SIGILL)
+triggered by the instruction attempting to execute. If you want to be sure you
+will need to instrument the next instruction as well. See the ``execlog.c``
+plugin for examples of how to track this and finalise details after execution.
Memory Accesses
+++++++++++++++
@@ -200,12 +210,12 @@ encouraged to contribute your own plugins plugins upstream. There is a
basic plugins that are used to test and exercise the API during the
``make check-tcg`` target in ``tests\plugins``.
-- tests/plugins/empty.c
+- tests/plugin/empty.c
Purely a test plugin for measuring the overhead of the plugins system
itself. Does no instrumentation.
-- tests/plugins/bb.c
+- tests/plugin/bb.c
A very basic plugin which will measure execution in course terms as
each basic block is executed. By default the results are shown once
@@ -220,14 +230,13 @@ Behaviour can be tweaked with the following arguments:
* inline=true|false
- Use faster inline addition of a single counter. Not per-cpu and not
- thread safe.
+ Use faster inline addition of a single counter.
* idle=true|false
Dump the current execution stats whenever the guest vCPU idles
-- tests/plugins/insn.c
+- tests/plugin/insn.c
This is a basic instruction level instrumentation which can count the
number of instructions executed on each core/thread::
@@ -250,8 +259,7 @@ Behaviour can be tweaked with the following arguments:
* inline=true|false
- Use faster inline addition of a single counter. Not per-cpu and not
- thread safe.
+ Use faster inline addition of a single counter.
* sizes=true|false
@@ -267,18 +275,18 @@ Behaviour can be tweaked with the following arguments:
-d plugin ./tests/tcg/aarch64-linux-user/sha512-vector
...
0x40069c, 'bl #0x4002b0', 10 hits, 1093 match hits, Δ+1257 since last match, 98 avg insns/match
- 0x4006ac, 'bl #0x403690', 10 hits, 1094 match hits, Δ+47 since last match, 98 avg insns/match
- 0x4037fc, 'bl #0x4002b0', 18 hits, 1095 match hits, Δ+22 since last match, 98 avg insns/match
- 0x400720, 'bl #0x403690', 10 hits, 1096 match hits, Δ+58 since last match, 98 avg insns/match
- 0x4037fc, 'bl #0x4002b0', 19 hits, 1097 match hits, Δ+22 since last match, 98 avg insns/match
- 0x400730, 'bl #0x403690', 10 hits, 1098 match hits, Δ+33 since last match, 98 avg insns/match
- 0x4037ac, 'bl #0x4002b0', 12 hits, 1099 match hits, Δ+20 since last match, 98 avg insns/match
+ 0x4006ac, 'bl #0x403690', 10 hits, 1094 match hits, Δ+47 since last match, 98 avg insns/match
+ 0x4037fc, 'bl #0x4002b0', 18 hits, 1095 match hits, Δ+22 since last match, 98 avg insns/match
+ 0x400720, 'bl #0x403690', 10 hits, 1096 match hits, Δ+58 since last match, 98 avg insns/match
+ 0x4037fc, 'bl #0x4002b0', 19 hits, 1097 match hits, Δ+22 since last match, 98 avg insns/match
+ 0x400730, 'bl #0x403690', 10 hits, 1098 match hits, Δ+33 since last match, 98 avg insns/match
+ 0x4037ac, 'bl #0x4002b0', 12 hits, 1099 match hits, Δ+20 since last match, 98 avg insns/match
...
For more detailed execution tracing see the ``execlog`` plugin for
other options.
-- tests/plugins/mem.c
+- tests/plugin/mem.c
Basic instruction level memory instrumentation::
@@ -291,8 +299,7 @@ Behaviour can be tweaked with the following arguments:
* inline=true|false
- Use faster inline addition of a single counter. Not per-cpu and not
- thread safe.
+ Use faster inline addition of a single counter.
* callback=true|false
@@ -302,7 +309,7 @@ Behaviour can be tweaked with the following arguments:
Count IO accesses (only for system emulation)
-- tests/plugins/syscall.c
+- tests/plugin/syscall.c
A basic syscall tracing plugin. This only works for user-mode. By
default it will give a summary of syscall stats at the end of the
@@ -332,6 +339,11 @@ run::
160 1 0
135 1 0
+- tests/plugin/inline.c
+
+This plugin is used for testing all inline operations, conditional callbacks and
+scoreboard. It prints a per-cpu summary of all events.
+
- contrib/plugins/hotblocks.c
The hotblocks plugin allows you to examine the where hot paths of
@@ -342,9 +354,6 @@ with linux-user execution as system emulation tends to generate
re-translations as blocks from different programs get swapped in and
out of system memory.
-If your program is single-threaded you can use the ``inline`` option for
-slightly faster (but not thread safe) counters.
-
Example::
$ qemu-aarch64 \
@@ -462,7 +471,6 @@ people at roughly where execution diverges. The only argument you need
for the plugin is a path for the socket the two instances will
communicate over::
-
$ qemu-system-sparc -monitor none -parallel none \
-net none -M SS-20 -m 256 -kernel day11/zImage.elf \
-plugin ./contrib/plugins/liblockstep.so,sockpath=lockstep-sparc.sock \
@@ -664,6 +672,23 @@ The plugin will log the reason of exit, for example::
0xd4 reached, exiting
+- contrib/plugins/ips.c
+
+This plugin can limit the number of Instructions Per Second that are executed::
+
+ # get number of instructions
+ $ num_insn=$(./build/qemu-x86_64 -plugin ./build/tests/plugin/libinsn.so -d plugin /bin/true |& grep total | sed -e 's/.*: //')
+ # limit speed to execute in 10 seconds
+ $ time ./build/qemu-x86_64 -plugin ./build/contrib/plugins/libips.so,ips=$(($num_insn/10)) /bin/true
+ real 10.000s
+
+Options:
+
+ * ips=N
+
+ Maximum number of instructions per cpu that can be executed in one second.
+ The plugin will sleep when the given number of instructions is reached.
+
Plugin API
==========
--
2.39.2
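
Note on the ips example above: the arithmetic behind ips=$(($num_insn/10))
can be sketched as follows. This is a minimal model under stated
assumptions, not the code in contrib/plugins/ips.c: it assumes a fixed
one-second accounting window and that a vCPU sleeps until the window ends
once it reaches the limit. The names model_throttle_sleep_ns and
model_approx_runtime_s are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NSEC_PER_SEC 1000000000ULL

/* How long (in ns) a vCPU would sleep in this model: nothing while it is
 * under the limit, otherwise the remainder of the one-second window. */
static uint64_t model_throttle_sleep_ns(uint64_t insns_in_window,
                                        uint64_t ips_limit,
                                        uint64_t elapsed_ns)
{
    if (insns_in_window < ips_limit) {
        return 0; /* budget not exhausted yet */
    }
    if (elapsed_ns >= MODEL_NSEC_PER_SEC) {
        return 0; /* window already over, nothing to wait for */
    }
    return MODEL_NSEC_PER_SEC - elapsed_ns;
}

/* Approximate run time in whole seconds for total_insns at ips_limit
 * (integer division, same rounding as the shell arithmetic above). */
static uint64_t model_approx_runtime_s(uint64_t total_insns,
                                       uint64_t ips_limit)
{
    assert(ips_limit > 0);
    return total_insns / ips_limit;
}
```

With ips set to one tenth of the measured instruction count, the model
gives an approximate run time of 10 seconds, which matches the intent of
the example; the real plugin's timing may differ in detail.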