From: Pierrick Bouvier
Subject: Re: [PATCH 0/2] Change default pointer authentication algorithm on aarch64 to impdef
Date: Wed, 18 Dec 2024 11:08:03 -0800
User-agent: Mozilla Thunderbird

On 12/18/24 05:51, Peter Maydell wrote:
On Tue, 17 Dec 2024 at 21:08, Pierrick Bouvier
<pierrick.bouvier@linaro.org> wrote:

On 12/17/24 02:38, Peter Maydell wrote:
On Tue, 17 Dec 2024 at 07:40, Alex Bennée <alex.bennee@linaro.org> wrote:

Pierrick Bouvier <pierrick.bouvier@linaro.org> writes:
I think this is still a change worth making, because people can get a
100% speedup with this simple change, and it's a better default than
the previous value.
Moreover, in the case of this migration scenario, QEMU will immediately
abort upon accessing memory through a pointer.
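
(For context, the faster impdef algorithm can already be selected
explicitly today via a CPU property; a typical invocation, assuming the
usual virt machine setup, looks like:

  qemu-system-aarch64 -machine virt -cpu max,pauth-impdef=on ...

so the question here is only about what users get by default when they
don't pass this property.)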

I'm not sure what would be the best way to make this change as smooth
as possible for QEMU users.

Surely we can only honour and apply the new default to -cpu max?


With all due respect, I think the current default is wrong, and it
would be a shame to keep it when people don't specify -cpu max, or for
other CPUs that enable pointer authentication.

In all our conversations, there seems to be a focus on choosing the
"fastest" emulation solution that satisfies the guest (behaviour-wise).
And, for a reason unknown to me, pointer authentication escaped this
rule.

I think the reason is just that we didn't understand how much
of a performance hit the architected algorithm for pointer auth
is in emulation. So we took our default approach of "implement
what the architecture says". Then later when we realised how
bad the effect was we added in a faster impdef authentication
algorithm, but we put it in as not-the-default because of our
usual bias towards "don't change existing behaviour".


I understand the reasoning behind the current choice.
For my own knowledge, is there a QEMU policy on "breaking changes"?

I understand the concern regarding backward compatibility, but it would
be better to politely ask people (with an error message) to restart
their virtual machines when they try to migrate, instead of being stuck
with a slow default forever.
Moreover, we are talking about a TCG scenario, for which I'm not sure
people use the migration feature (save/restore) heavily, but I may be
wrong on this.

Between the risk of breaking migration (with a polite error message)
and having a default that is 100% faster, I think it would be better to
favor the second one. If it were a 5% speedup, I would not argue, but
slowing down execution by a factor of 2 is really a lot.

The point here about "breaking migration" is that we have a strong
set of rules:
  * if you say "-machine virt-8.2" you get "exactly the behaviour
    that the 'virt' machine type had in QEMU 8.2", and it is
    migration compatible
  * we can make changes that are not migration compatible only if we
    ensure that they are not applied to older versioned machine types
    (or if they're to devices that are only used in machines which
    do not have versioned machine types at all)
  * TCG '-cpu max' is a special case: it is not a fixed thing, and so
    it may acquire new non-migration-compatible changes between versions
    (and so if you care about VM migration compat you don't use it);
    but this is not true of the named CPU types that match real
    hardware implementations

This patch as it stands will not preserve the migration
guarantees that we make. So we need to fix it by either:
  * only making the default change on -cpu max
  * making the default change be bound to versioned types


I'm not sure I follow you on this second approach. The CPU is not
versioned, and if someone uses -machine virt (non-versioned), is there
a guarantee that it should stay possible to migrate?

In other words, can we break migration with "-machine virt -cpu model"?

As I say, I don't have a strong view on which of these we go for
(and I'm actually kind of leaning to the second, given the discussion).


After looking more closely, compared to backcompat-cntfrq, the CPU
registers will be different, and migration fails when calling
"write_list_to_cpustate" from "cpu_post_load" for register
ID_AA64ISAR1_EL1, which contains the pauth configuration.

If we can break migration for the (non-versioned) virt machine, then
I'll make the change for all CPUs using the backcompat strategy; if
that's not possible, I'll only make the change for -cpu max.

That was what I thought we were aiming for, yes. We *could* have
a property on the CPU to say "use the old back-compatible default,
not the new one", which we then list in the appropriate hw_compat
array. (Grep for the "backcompat-cntfrq" property for an example of
this.) But I'm not sure if that is worth the effort compared to
just changing 'max'.
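
(For illustration, a minimal sketch of what such a per-CPU property
could look like, modeled on the existing "backcompat-cntfrq" one; the
property and field names here are hypothetical:

  /* target/arm/cpu.c: sketch only, names are hypothetical */
  static Property arm_cpu_properties[] = {
      /* when true, keep the old QARMA5 default instead of impdef */
      DEFINE_PROP_BOOL("backcompat-pauth-default", ARMCPU,
                       backcompat_pauth_default, false),
      DEFINE_PROP_END_OF_LIST(),
  };

The matching hw_compat entry would then set it back to "true" for older
versioned machine types.)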

When we define hw_compat_10_0 and hw_compat_11_0, do we have to carry
this forward forever? (Same question for "backcompat-cntfrq".)

The machinery for how this works means that you only need to
put the property in the appropriate hw_compat array for the
machine version before where it was introduced. The 'virt-9.2'
machine type applies the properties listed in hw_compat_9_2
(you can think of the properties listed there as having the
meaning "downgrade the default behaviour back to what it was
in 9.2 and earlier".) The virt-9.1 machine type applies the
properties listed in hw_compat_9_1 and hw_compat_9_2. The
virt-9.0 machine type applies the properties listed in hw_compat_9_0,
_9_1 and _9_2.
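
(Concretely, those arrays are lists of (driver, property, value)
triples in hw/core/machine.c; a sketch, reusing the hypothetical
property name from above, of how it would be pinned for virt-9.2 and
older:

  GlobalProperty hw_compat_9_2[] = {
      /* sketch: restore the old pauth default on 9.2-and-older types */
      { "arm-cpu", "backcompat-pauth-default", "true" },
  };

much as "backcompat-cntfrq" is listed in one of the existing arrays
today.)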

This is all implemented by the boilerplate DEFINE_VIRT_MACHINE() and
virt_machine_*_options functions at the bottom of hw/arm/virt.c
(plus the common code that invokes). We have to carry all this
machinery around anyway to handle other migration-breaking changes
in other parts of QEMU, so it's pretty free to add another property
like backcompat-cntfrq here.
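
(The shape of that boilerplate, for reference; this mirrors the pattern
at the bottom of hw/arm/virt.c:

  static void virt_machine_9_1_options(MachineClass *mc)
  {
      virt_machine_9_2_options(mc);
      compat_props_add(mc->compat_props, hw_compat_9_1, hw_compat_9_1_len);
  }
  DEFINE_VIRT_MACHINE(9, 1)

Each versioned machine chains to the next newer version's options and
then layers its own compat properties on top, which is what produces
the cumulative behaviour described above.)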

The very oldest versioned machine types are deprecated after
3 years and dropped after another 3 years, so eventually the
older hw_compat arrays will go away.

(It's not that much extra code to add the property, so I could
easily be persuaded the other way. Possible arguments include
preferring consistency across all CPUs. If we already make the
default be not "what the real CPU of this type uses" then that's
also an argument that we can set it to whatever is convenient;
if we do honour the CPU ID register values for the implementation
default then that's an argument that we should continue to do
so and not change the default to our impdef one.)


For the TCG use case, is there any guest-visible side effect of using
one specific pointer authentication algorithm rather than another?
In other words, is there a scenario where pointer authentication would
work with impdef, but not with qarma{3,5}?
If not, I don't see any reason for a CPU to favor an expensive
emulation.

The guest can look at the value that the pointer auth instruction
produces if it likes, so it can certainly tell whether there's
a difference. But the only reason to do that is in test code
that's checking that the pauth instructions do what they're
supposed to do. Architecturally, because multiple authentication
options are permitted, no well-behaved guest is going to depend
on exactly which one is being used.
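
(A guest can observe the difference with something like this sketch,
assuming it runs on Linux with FEAT_PAuth enabled and is built with
-march=armv8.3-a:

  #include <inttypes.h>
  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
      uint64_t ptr = 0x0000aaaabbbbcc00ULL, mod = 42, pac;
      /* PACGA puts a 32-bit authentication code in the upper half of
       * the result; the exact value depends on which algorithm
       * (QARMA5, QARMA3 or impdef) the implementation uses. */
      asm volatile("pacga %0, %1, %2" : "=r"(pac) : "r"(ptr), "r"(mod));
      printf("PAC = %016" PRIx64 "\n", pac);
      return 0;
  }

A well-behaved guest only feeds such values back to the corresponding
authentication instructions and checks that authentication succeeds, so
the exact bit pattern doesn't matter to it.)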

As I say, I do think it would be good to check whether our
current implementation is "default to qarma5 everywhere", or
whether it is "default to what the real CPU says it has in its
ID registers". If we are already defaulting to something that's
not what the real implementation does it's another piece of
evidence on the side of "we can just default to a different
not-matching-the-hardware choice".


We default to qarma5 (for TCG), or to whatever the host CPU provides
(for other accelerators).

thanks
-- PMM

