qemu-arm

Re: [PATCH v2] tests/functional/aarch64: add tests for FEAT_RME


From: Pierrick Bouvier
Subject: Re: [PATCH v2] tests/functional/aarch64: add tests for FEAT_RME
Date: Tue, 3 Dec 2024 10:24:11 -0800
User-agent: Mozilla Thunderbird

On 12/3/24 06:56, Alex Bennée wrote:
Pierrick Bouvier <pierrick.bouvier@linaro.org> writes:

This boots an OP-TEE environment and launches a nested guest VM inside it
using the Realms feature. We do it for virt and sbsa-ref platforms.

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>

<snip>
+
+        self.vm.add_args('-accel', 'tcg')
+        self.vm.add_args('-cpu', 'max,x-rme=on')

With debug on, the PAC functions are certainly very high in the perf
report, so pauth-impdef=on seems worthwhile here.

+        self.vm.add_args('-m', '2G')
+        self.vm.add_args('-M', 'sbsa-ref')
+        self.vm.add_args('-drive', f'file={pflash0},format=raw,if=pflash')
+        self.vm.add_args('-drive', f'file={pflash1},format=raw,if=pflash')
+        self.vm.add_args('-drive', f'file=fat:rw:{virtual},format=raw')
+        self.vm.add_args('-drive', f'format=raw,if=none,file={drive},id=hd0')
+        self.vm.add_args('-device', 'virtio-blk-pci,drive=hd0')
+        self.vm.add_args('-device', 'virtio-9p-pci,fsdev=shr0,mount_tag=shr0')
+        self.vm.add_args('-fsdev', f'local,security_model=none,path={rme_stack},id=shr0')
+        self.vm.add_args('-device', 'virtio-net-pci,netdev=net0')
+        self.vm.add_args('-netdev', 'user,id=net0')
<snip>
+
+        self.vm.add_args('-accel', 'tcg')
+        self.vm.add_args('-cpu', 'max,x-rme=on')

And here.

<snip>

With that, both tests pass with --enable-debug (312s, 352s) and the
profile looks like:

    6.33%  qemu-system-aar  qemu-system-aarch64                       [.] arm_feature
    5.66%  qemu-system-aar  qemu-system-aarch64                       [.] tcg_flush_jmp_cache
    3.44%  qemu-system-aar  qemu-system-aarch64                       [.] rebuild_hflags_a64

This I suspect is triggered by assert_hflags_rebuild_correctly() which
is validating we've not skipped rebuilding the flags when we need to.
It's a lot easier than debugging why your execution trace looks weird.

    2.95%  qemu-system-aar  qemu-system-aarch64                       [.] extract64
    2.52%  qemu-system-aar  qemu-system-aarch64                       [.] extract64

This is usually triggered by translation code which uses extract64
heavily during instruction decode.

It might be useful to see if we can get functional tests run under TCG
to dump "info jit" at the end, to make sure we are not over-generating
code and exhausting the translation cache.
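
A minimal sketch of what such a dump could look like, assuming the test's
vm object behaves like python/qemu's QEMUMachine (whose cmd() method sends a
QMP command and maps command_line to the "command-line" argument). The
helper name dump_jit_info is hypothetical and is not part of the patch:

```python
def dump_jit_info(vm):
    """Return the output of the HMP command 'info jit' as a string.

    Assumption: `vm` exposes cmd(name, **args) the way QEMUMachine does.
    'human-monitor-command' is the QMP command that runs an HMP command
    line and returns its text output.
    """
    return vm.cmd('human-monitor-command', command_line='info jit')


# Hypothetical use at the end of a TCG functional test:
#     self.log.info(dump_jit_info(self.vm))
```

"info jit" is only present in TCG-enabled builds, so a real helper would
probably need to skip or tolerate a failure on other accelerators.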

    2.12%  qemu-system-aar  qemu-system-aarch64                       [.] arm_el_is_aa64
    2.11%  qemu-system-aar  qemu-system-aarch64                       [.] arm_security_space_below_el3
    2.11%  qemu-system-aar  qemu-system-aarch64                       [.] deposit64
    1.49%  qemu-system-aar  qemu-system-aarch64                       [.] arm_hcr_el2_eff_secstate
    1.46%  qemu-system-aar  qemu-system-aarch64                       [.] arm_is_el2_enabled_secstate
    1.38%  qemu-system-aar  qemu-system-aarch64                       [.] extract32
    1.34%  qemu-system-aar  qemu-system-aarch64                       [.] extract64
    1.30%  qemu-system-aar  qemu-system-aarch64                       [.] get_phys_addr_lpae
    1.23%  qemu-system-aar  qemu-system-aarch64                       [.] aa64_va_parameters
    1.09%  qemu-system-aar  qemu-system-aarch64                       [.] rol32
    1.07%  qemu-system-aar  qemu-system-aarch64                       [.] probe_access_internal
    1.02%  qemu-system-aar  qemu-system-aarch64                       [.] deposit32
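
Since some symbols (extract64, for instance) show up as several separate
entries in the flat profile above, it can help to sum them per symbol before
comparing. A rough sketch, assuming the report lines have the layout
"percent, command, shared object, [.] symbol" shown in this mail; the
function name and the sample subset are illustrative only:

```python
import re
from collections import defaultdict

# Matches e.g. "    6.33%  qemu-system-aar  qemu-system-aarch64   [.] arm_feature"
PERF_LINE = re.compile(r'^\s*(\d+(?:\.\d+)?)%\s+\S+\s+\S+\s+\[.\]\s+(\S+)\s*$')


def aggregate_by_symbol(report_text):
    """Sum the overhead percentages of entries that share a symbol name."""
    totals = defaultdict(float)
    for line in report_text.splitlines():
        match = PERF_LINE.match(line)
        if match:
            totals[match.group(2)] += float(match.group(1))
    # Highest total overhead first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Subset of the profile quoted in this mail.
SAMPLE = """\
    6.33%  qemu-system-aar  qemu-system-aarch64  [.] arm_feature
    5.66%  qemu-system-aar  qemu-system-aarch64  [.] tcg_flush_jmp_cache
    2.95%  qemu-system-aar  qemu-system-aarch64  [.] extract64
    2.52%  qemu-system-aar  qemu-system-aarch64  [.] extract64
    1.34%  qemu-system-aar  qemu-system-aarch64  [.] extract64
"""

for symbol, percent in aggregate_by_symbol(SAMPLE):
    print(f'{percent:6.2f}%  {symbol}')
```

With the three extract64 entries added together (2.95 + 2.52 + 1.34 = 6.81),
extract64 ends up ahead of arm_feature in this listing.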



Thanks Alex.

I did the same investigation, and switching to pauth-impdef brings the time down from 1500s to a more "acceptable" 450s on my machine. In my profile (using call graphs, which I'm not sure you used), I observe that 26% of the time is spent in assert_hflags_rebuild_correctly, which is enabled by --enable-debug-tcg.
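
For reference, the change amounts to adding one more property to the -cpu
argument used in the test. A small sketch, where cpu_arg is a hypothetical
helper (the test itself passes a literal string to add_args):

```python
def cpu_arg(model, *props):
    """Join a CPU model and its properties into a '-cpu' argument value."""
    return ','.join((model,) + props)


# x-rme=on comes from the patch; pauth-impdef=on is the suggested addition.
CPU = cpu_arg('max', 'x-rme=on', 'pauth-impdef=on')

# In the test this would be passed as:
#     self.vm.add_args('-cpu', CPU)
```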

I'll send a v3 that switches to pauth-impdef and increases the timeout, which should be enough for now.

Pierrick
