[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-arm] [Qemu-devel] [PATCH v2 07/14] target/arm/cpu64: max cpu:
From: |
Auger Eric |
Subject: |
Re: [Qemu-arm] [Qemu-devel] [PATCH v2 07/14] target/arm/cpu64: max cpu: Introduce sve<vl-bits> properties |
Date: |
Thu, 27 Jun 2019 12:51:10 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0 |
Hi Drew,
On 6/27/19 11:40 AM, Andrew Jones wrote:
> On Wed, Jun 26, 2019 at 04:58:05PM +0200, Auger Eric wrote:
>> Hi Drew,
>>
>> On 6/21/19 6:34 PM, Andrew Jones wrote:
>>> Introduce cpu properties to give fine control over SVE vector lengths.
>>> We introduce a property for each valid length up to the current
>>> maximum supported, which is 2048-bits. The properties are named, e.g.
>>> sve128, sve256, sve512, ..., where the number is the number of bits.
>> sve384 then in the natural order, otherwise it gives the impression you
>> can only specify * 128bit pow of 2 sizes at this stage of the reading.
>
> OK
>
>>
>>>
>>> It's now possible to do something like -cpu max,sve-max-vq=4,sve384=off
>> Document the fact sve512 cannot be turned to off, which sounds fully
>> sensible (by reading the code). By the way, I think an actual
>> documentation should be provided in qemu. Maybe as spec to agree on.
>
> Actually, maybe I could just allow it to be disabled. It would be
> a strange command line to do '-cpu max,sve-max-vq=4,sve512=off', but if
> that's what the user wants, then there's not really any good reason to
> block it. As I say below, mixing these types of properties on the command
> line isn't really a good idea, but it's not completely blocked because we
> want a user that prefers launching their (most likely TCG) guest with,
> e.g. '-cpu max,sve-max-vq=4', to also have a functional QMP CPU model
> expansion, but the CPU model expansion for SVE vector lengths depends on
> the sve<vl-bits> properties, thus mixing sve-max-vq with sve<vl-bits>,
> where sve-max-vq comes first. They're just not mixed on the command line.
OK
>
>>> to provide a guest vector lengths 128, 256, and 512 bits. The resulting
>>> set must conform to the architectural constraint of having all power-of-2
>>> lengths smaller than the maximum length present. It's also possible to
>>> only provide sve<vl-bits> properties, e.g. -cpu max,sve512=on> That example
>>> provides the machine with 128, 256, and 512 bit vector
>> lengths.
>>> It doesn't hurt to explicitly ask for all expected vector lengths,
>>> which is what, for example, libvirt should do.>
>>> Note1, it is not possible to use sve<vl-bits> properties before
>>> sve-max-vq, e.g. -cpu max,sve384=off,sve-max-vq=4, as supporting
>>> that overly complicates the user input validation.
>>>
>>> Note2, while one might expect -cpu max,sve-max-vq=4,sve512=on to be the
>>> same as -cpu max,sve512=on, they are not.
>> yep it is a bit weird
>>
>> Didn't you consider -cpu max,sve-max-vq=4,req_only=true removing non
>> power of 2 values and sve<vl-bits> setting a single VLbit?
>
> Nope. I mostly considered making sure sve-max-vq was supported to the
> level necessary to work with the new properties, and to not change its
> current semantics, but I don't want to extend it in any way, nor to
> require it to be used with the new properties at all. sve-max-vq isn't
> even going to be supported by '-cpu host', so we can't rely on it being
> part of the SVE vector length selection for normal KVM guests.
OK, as long as it is documented somewhere.
>
> Keep in mind that the current semantics of sve-max-vq are to enable all
> vector lengths up to and including that specified maximum. That's not
> a safe way to select vector lengths on KVM, as it may include vector
> lengths that the KVM host does not support, thus it's not something KVM
> users should use. It's already there for the 'max' cpu type though, so
> we work with it in this series the best we can for 'max', but not at all
> for 'host'.
Do you expect hosts to expose only a few of the permitted values? I
imagined it would generally expose none or all of them actually. If you
discourage people to use sve-max-vq and sve<>=on only sets 2^n values,
userspace will need to set manually all intermediate !2^n values
>
> Re: documentation for this. I suppose I could write something up that
> clarifies the properties and also suggests best practices regarding
> sve-max-vq.
To me this is really helpful to have a global understanding
>
>>> The former enables all vector lengths 512 bits and smaller
>> ( required and permitted)
>>> while the latter only sets the 512-bit
>>> length and its smaller power-of-2 lengths. It's probably best not to use
>>> sve-max-vq with sve<vl-bits> properties, but it can't be completely
>>> forbidden as we want qmp_query_cpu_model_expansion to work with guests
>>> launched with e.g. -cpu max,sve-max-vq=8 on their command line.
>> what does happen if you specify -cpu max,sve384=on? (seems to be allowed
>> in the code?)
>
> That's perfectly valid with tcg, you'll get 128,256,384. It's also valid
> with KVM if you're host supports it.
OK so it exposes the maximum configurable vector length + all the
corresponding required values.
>
>>
>>>
>>> Signed-off-by: Andrew Jones <address@hidden>
>>> ---
>>> target/arm/cpu.c | 6 +
>>> target/arm/cpu.h | 14 ++
>>> target/arm/cpu64.c | 360 ++++++++++++++++++++++++++++++++++++++-
>>> target/arm/helper.c | 11 +-
>>> target/arm/monitor.c | 16 ++
>>> tests/arm-cpu-features.c | 217 +++++++++++++++++++++++
>>> 6 files changed, 620 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
>>> index f08e178fc84b..e060a0d9df0e 100644
>>> --- a/target/arm/cpu.c
>>> +++ b/target/arm/cpu.c
>>> @@ -1019,6 +1019,12 @@ static void arm_cpu_realizefn(DeviceState *dev,
>>> Error **errp)
>>> return;
>>> }
>>>
>>> + arm_cpu_sve_finalize(cpu, &local_err);
>>> + if (local_err) {
>>> + error_propagate(errp, local_err);
>>> + return;
>>> + }
>>> +
>>> if (arm_feature(env, ARM_FEATURE_AARCH64) &&
>>> cpu->has_vfp != cpu->has_neon) {
>>> /*
>>> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
>>> index f9da672be575..cbb155cf72a5 100644
>>> --- a/target/arm/cpu.h
>>> +++ b/target/arm/cpu.h
>>> @@ -184,8 +184,13 @@ typedef struct {
>>>
>>> #ifdef TARGET_AARCH64
>>> # define ARM_MAX_VQ 16
>>> +void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp);
>>> +uint32_t arm_cpu_vq_map_next_smaller(ARMCPU *cpu, uint32_t vq);
>>> #else
>>> # define ARM_MAX_VQ 1
>>> +static inline void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp) { }
>>> +static inline uint32_t arm_cpu_vq_map_next_smaller(ARMCPU *cpu, uint32_t
>>> vq)
>>> +{ return 0; }
>>> #endif
>>>
>>> typedef struct ARMVectorReg {
>>> @@ -915,6 +920,15 @@ struct ARMCPU {
>>>
>>> /* Used to set the maximum vector length the cpu will support. */
>>> uint32_t sve_max_vq;
>>> +
>>> + /*
>>> + * In sve_vq_map each set bit is a supported vector length of
>>> + * (bit-number + 1) * 16 bytes, i.e. each bit number + 1 is the
>>> vector> + * length in quadwords. We need a map size twice the maximum
>>> + * quadword length though because we use two bits for each vector
>>> + * length in order to track three states: uninitialized, off, and on.
>>> + */
>>> + DECLARE_BITMAP(sve_vq_map, ARM_MAX_VQ * 2);
>>> };
>>>
>>> void arm_cpu_post_init(Object *obj);
>>> diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
>>> index 02ada65f240c..5def82111dee 100644
>>> --- a/target/arm/cpu64.c
>>> +++ b/target/arm/cpu64.c
>>> @@ -257,6 +257,149 @@ static void aarch64_a72_initfn(Object *obj)
>>> define_arm_cp_regs(cpu, cortex_a72_a57_a53_cp_reginfo);
>>> }
>>>
>>> +/*
>>> + * While we eventually use cpu->sve_vq_map as a typical bitmap, where each
>>> vq
>>> + * has only two states (off/on), until we've finalized the map at realize
>>> time
>>> + * we use an extra bit, at the vq - 1 + ARM_MAX_VQ bit number, to also
>>> allow
>>> + * tracking of the uninitialized state. The arm_vq_state typedef and
>>> following
>>> + * functions allow us to more easily work with the bitmap. Also, while the
>>> map
>>> + * is still initializing, sve-max-vq has an additional three states,
>>> bringing
>>> + * the number of its states to five, which are the following:
>>> + *
>>> + * sve-max-vq:
>>> + * 0: SVE is disabled. The default value for a vq in the map is 'OFF'.
>>> + * -1: SVE is enabled, but neither sve-max-vq nor sve<vl-bits>
>>> properties
>>> + * have yet been specified by the user. The default value for a vq
>>> in
>>> + * the map is 'ON'.
>>> + * -2: SVE is enabled and one or more sve<vl-bits> properties have been
>>> + * set to 'OFF' by the user, but no sve<vl-bits> properties have
>>> yet
>>> + * been set to 'ON'. The user is now blocked from setting
>>> sve-max-vq
>>> + * and the default value for a vq in the map is 'ON'.
>>> + * -3: SVE is enabled and one or more sve<vl-bits> properties have been
>>> + * set to 'ON' by the user. The user is blocked from setting
>>> sve-max-vq
>>> + * and the default value for a vq in the map is 'OFF'. sve-max-vq
>>> never
>>> + * transitions back to -2, even if later inputs disable the vector
>>> + * lengths that initially transitioned sve-max-vq to this state.
>>> This
>>> + * avoids the default values from flip-flopping.
>>> + * [1-ARM_MAX_VQ]: SVE is enabled and the user has specified a valid
>>> + * sve-max-vq. The sve-max-vq specified vq and all smaller
>>> + * vq's will be initially enabled. All larger vq's will
>>> have
>>> + * a default of 'OFF'.
>>> + */
>>> +#define ARM_SVE_INIT -1
>>> +#define ARM_VQ_DEFAULT_ON -2
>>> +#define ARM_VQ_DEFAULT_OFF -3
>>> +
>>> +#define arm_sve_have_max_vq(cpu) ((int32_t)(cpu)->sve_max_vq > 0)
>>> +
>>> +typedef enum arm_vq_state {
>>> + ARM_VQ_OFF,
>>> + ARM_VQ_ON,
>>> + ARM_VQ_UNINITIALIZED,
>>> +} arm_vq_state;
>>> +
>>> +static arm_vq_state arm_cpu_vq_map_get(ARMCPU *cpu, int vq)> +{
>>> + assert(vq <= ARM_MAX_VQ);
>>> +
>>> + return test_bit(vq - 1, cpu->sve_vq_map) |
>>> + test_bit(vq - 1 + ARM_MAX_VQ, cpu->sve_vq_map) << 1;
>>> +}> +
>>> +static void arm_cpu_vq_map_set(ARMCPU *cpu, int vq, arm_vq_state state)
>>> +{
>>> + assert(state == ARM_VQ_OFF || state == ARM_VQ_ON);
>>> + assert(vq <= ARM_MAX_VQ);
>>> +
>>> + clear_bit(vq - 1 + ARM_MAX_VQ, cpu->sve_vq_map);
>>> +
>>> + if (state == ARM_VQ_ON) {
>>> + set_bit(vq - 1, cpu->sve_vq_map);
>>> + } else {
>>> + clear_bit(vq - 1, cpu->sve_vq_map);
>>> + }
>>> +}
>>> +
>>> +static void arm_cpu_vq_map_init(ARMCPU *cpu)
>>> +{
>>> + bitmap_zero(cpu->sve_vq_map, ARM_MAX_VQ * 2);
>> nit: bitmap_clear(map, 0, ARM_MAX_VQ);
>
> bitmap_clear will only clear the specified bits, leaving the upper bits
> uninitialized, which could cause problems. It's not a problem here
> because we're using a zero initialized cpu state member, but if it was
> a bitmap like below, declared on the stack, then we need the zeroing at
> least once. I'd prefer to use zero here too, to keep it consistent.
Sorry I don't get it. I would have expected the above bitmap_clear would
0 initialize 0 - ARM_MAX_VQ-1 bits and the bitmap_set below would
initialize ARM_MAX_VQ - 2 * ARM_MAX_VQ -1 with 1's?
>
>> /* all VLs OFF */
>>> + bitmap_set(cpu->sve_vq_map, ARM_MAX_VQ, ARM_MAX_VQ);
>> /* all VLs uninitialized */
>
> I can add one comment that says
>
> "Set all VQ's to ARM_VQ_UNINITIALIZED" above both lines.
>
>>> +}
>>> +
>>> +static bool arm_cpu_vq_map_is_finalized(ARMCPU *cpu)
>>> +{
>>> + DECLARE_BITMAP(map, ARM_MAX_VQ * 2);
>>> +
>>> + bitmap_zero(map, ARM_MAX_VQ * 2);
>> same
>
> Here me must use bitmap_zero.
>
>>> + bitmap_set(map, ARM_MAX_VQ, ARM_MAX_VQ);
>>> + bitmap_and(map, map, cpu->sve_vq_map, ARM_MAX_VQ * 2);
>>> +
>>> + return bitmap_empty(map, ARM_MAX_VQ * 2);
>>> +}
>>> +
>>> +static void arm_cpu_vq_map_finalize(ARMCPU *cpu)
>>> +{
>>> + Error *err = NULL;
>>> + char name[8];
>>> + uint32_t vq;
>>> + bool value;
>>> +
>>> + /*
>>> + * We use the property get accessor because it knows what default
>>> + * values to return for uninitialized vector lengths.
>>> + */
>>> + for (vq = 1; vq <= ARM_MAX_VQ; ++vq) {
>>> + sprintf(name, "sve%d", vq * 128);
>>> + value = object_property_get_bool(OBJECT(cpu), name, &err);
>>> + assert(!err);
>>> + if (value) {
>>> + arm_cpu_vq_map_set(cpu, vq, ARM_VQ_ON);
>>> + } else {
>>> + arm_cpu_vq_map_set(cpu, vq, ARM_VQ_OFF);
>>> + }
>>> + }
>>> +
>>> + assert(arm_cpu_vq_map_is_finalized(cpu));
>> this can be removed
>
> Yes, it can be today, as the code works fine. The idea was that if
> somebody came in here and modified this in someway that broke the
> finalized state, then we'd catch it here before going off and doing
> weird things. This isn't a hot path, so I'd prefer we keep it, but
> if QEMU maintainers prefer to limit defensive code, then I can
> certainly remove it.
Well I understand this was useful was developing and to check some
tricky states but some of the asserts correspond to some checks that
look rather obvious, like this one. But well I let others give their
opinions and it is not a big deal either. Then we can wonder when one
needs a trace mesg before asserting ...
>
>>> +}
>>> +
>>> +void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp)
>>> +{
>>> + Error *err = NULL;
>>> +
>>> + if (!cpu->sve_max_vq) {
>>> + bitmap_zero(cpu->sve_vq_map, ARM_MAX_VQ * 2);
>>> + return;
>>> + }
>>> +
>>> + if (cpu->sve_max_vq == ARM_SVE_INIT) {
>>> + object_property_set_uint(OBJECT(cpu), ARM_MAX_VQ, "sve-max-vq",
>>> &err);
>>> + if (err) {
>>> + error_propagate(errp, err);
>>> + return;
>>> + }
>>> + assert(cpu->sve_max_vq == ARM_MAX_VQ);
>> I guess those asserts can be removed now?
>
> I'd prefer to keep it. It's critical that we get this right, so if
> somebody changes something somewhere in the property set/get code
> that would break this, then it's best we know right away. Again,
> though, if there's some reason to limit defensive code, even on the
> init path, then I can.
Here I would expect that if the set did not fail, sve-max-vq effectively
if set to the expected value. But maybe I don't remember the cases where
sve_max-vq can be set to some other value without reprting an error.
>
>>> + arm_cpu_vq_map_finalize(cpu);
>> move the arm_cpu_vq_map_finalize out of the if, at the end.
>
> That wouldn't work. In this branch arm_cpu_vq_map_finalize() comes
> after setting cpu->sve_max_vq. In the else branch it must come first.
That's right sorry
>
>>> + } else {
>>> + arm_cpu_vq_map_finalize(cpu);
>>> + if (!arm_sve_have_max_vq(cpu)) {
>>> + cpu->sve_max_vq = arm_cpu_vq_map_next_smaller(cpu, ARM_MAX_VQ
>>> + 1);
>>> + }
>>> + }
>>> +
>>> + assert(cpu->sve_max_vq == arm_cpu_vq_map_next_smaller(cpu,
>>> ARM_MAX_VQ));>
> What happened to my '+ 1' here? I see it's in the patch, but somehow got
> removed in your quote of the patch? For a second there I thought something
> was really wrong, since that assert should have fired for every full map.
Yep sorry for the cut :-(
>
>> same here
>
> Anyway my same argument that we don't want to leave arm_cpu_sve_finalize()
> without knowing we got this right applies.
>
>>> +}
>>> +
>>> +uint32_t arm_cpu_vq_map_next_smaller(ARMCPU *cpu, uint32_t vq)
>>> +{
>>> + uint32_t bitnum;
>>> +
>>> + assert(vq <= ARM_MAX_VQ + 1);
>>> + assert(arm_cpu_vq_map_is_finalized(cpu));
>>> +
>>> + bitnum = find_last_bit(cpu->sve_vq_map, vq - 1);
>> why do you pass ARM_MAX_VQ + 1 and then do vq -1? doesn't
>> find_last_bit() take the size which is ARM_MAX_VQ in this case?
>
> The size is ARM_MAX_VQ + 1, when you want to also check the bitnum
> corresponding to ARM_MAX_VQ. The bitnum for a VQ is always VQ - 1.
> Recall VQ=1, which is 128-bits. It takes the bit position zero.
> VQ=ARM_MAX_VQ=16 takes the bit position 15. As for the 'vq - 1' here
> in the find_last_bit() call, that's required because that's what the
> function says it does. It finds the next smaller VQ. So if you pass
> in, for example, vq=2, it shouldn't search anything but bit position
> zero:
>
> vq=2 => max next smaller is vq=1, bitnum = 0
> => search the bitmap of size 1 for the last set bit
>
> Or, if you want to search the whole map, including the maximum
> possibly VQ, as is done above, then you need to pass ARM_MAX_VQ + 1,
> since the max next smaller VQ is ARM_MAX_VQ.
OK I get it now. Maybe renaming the function into something like
arm_cpu_vq_map_last_bit() would have avoided that. the first assert
looks strange typically.
>
>>> + return bitnum == vq - 1 ? 0 : bitnum + 1;
>>> +}
>>> +
>>> static void cpu_max_get_sve_max_vq(Object *obj, Visitor *v, const char
>>> *name,
>>> void *opaque, Error **errp)
>>> {
>>> @@ -283,12 +426,203 @@ static void cpu_max_set_sve_max_vq(Object *obj,
>>> Visitor *v, const char *name,
>>> return;
>>> }
>>>
>>> + /*
>>> + * It gets complicated trying to support both sve-max-vq and
>>> + * sve<vl-bits> properties together, so we mostly don't. We
>>> + * do allow both if sve-max-vq is specified first and only once
>>> + * though.
>>> + */
>>> + if (cpu->sve_max_vq != ARM_SVE_INIT) {
>>> + error_setg(errp, "sve<vl-bits> in use or sve-max-vq already "
>>> + "specified");
>>> + error_append_hint(errp, "sve-max-vq must come before all "
>>> + "sve<vl-bits> properties and it must only "
>>> + "be specified once.\n");
>>> + return;
>>> + }
>>> +
>>> cpu->sve_max_vq = value;
>>>
>>> if (cpu->sve_max_vq == 0 || cpu->sve_max_vq > ARM_MAX_VQ) {
>>> error_setg(errp, "unsupported SVE vector length");
>>> error_append_hint(errp, "Valid sve-max-vq in range [1-%d]\n",
>>> ARM_MAX_VQ);
>>> + } else {
>>> + uint32_t vq;
>>> +
>>> + for (vq = 1; vq <= cpu->sve_max_vq; ++vq) {
>>> + char name[8];
>>> + sprintf(name, "sve%d", vq * 128);
>>> + object_property_set_bool(obj, true, name, &err);
>>> + if (err) {
>>> + error_propagate(errp, err);
>>> + return;
>>> + }
>>> + }
>>> + }
>>> +}
>>> +
>>> +static void cpu_arm_get_sve_vq(Object *obj, Visitor *v, const char *name,
>>> + void *opaque, Error **errp)
>>> +{
>>> + ARMCPU *cpu = ARM_CPU(obj);
>>> + int vq = atoi(&name[3]) / 128;
>>> + arm_vq_state vq_state;
>>> + bool value;
>>> +
>>> + vq_state = arm_cpu_vq_map_get(cpu, vq);
>>> +
>>> + if (!cpu->sve_max_vq) {
>>> + /* All vector lengths are disabled when SVE is off. */
>>> + value = false;
>>> + } else if (vq_state == ARM_VQ_ON) {
>>> + value = true;
>>> + } else if (vq_state == ARM_VQ_OFF) {
>>> + value = false;
>>> + } else {
>>> + /*
>>> + * vq is uninitialized. We pick a default here based on the
>>> + * the state of sve-max-vq and other sve<vl-bits> properties.
>>> + */
>>> + if (arm_sve_have_max_vq(cpu)) {
>>> + /*
>>> + * If we have sve-max-vq, then all remaining uninitialized
>>> + * vq's are 'OFF'.
>>> + */
>>> + value = false;
>>> + } else {
>>> + switch (cpu->sve_max_vq) {
>>> + case ARM_SVE_INIT:
>>> + case ARM_VQ_DEFAULT_ON:
>>> + value = true;
>>> + break;
>>> + case ARM_VQ_DEFAULT_OFF:
>>> + value = false;
>>> + break;
>>> + }
>>> + }
>>> + }
>>> +
>>> + visit_type_bool(v, name, &value, errp);
>>> +}
>>> +
>>> +static void cpu_arm_set_sve_vq(Object *obj, Visitor *v, const char *name,
>>> + void *opaque, Error **errp)
>>> +{
>>> + ARMCPU *cpu = ARM_CPU(obj);
>>> + int vq = atoi(&name[3]) / 128;
>>> + arm_vq_state vq_state;
>>> + Error *err = NULL;
>>> + uint32_t max_vq = 0;
>>> + bool value;
>>> +
>>> + visit_type_bool(v, name, &value, &err);
>>> + if (err) {
>>> + error_propagate(errp, err);
>>> + return;
>>> + }
>>> +
>>> + if (value && !cpu->sve_max_vq) {
>>> + error_setg(errp, "cannot enable %s", name);
>>> + error_append_hint(errp, "SVE has been disabled with sve=off\n");
>>> + return;
>>> + } else if (!cpu->sve_max_vq) {
>>> + /*
>>> + * We don't complain about disabling vector lengths when SVE
>>> + * is off, but we don't do anything either.
>>> + */
>>> + return;
>>> + }
>>> +
>>> + if (arm_sve_have_max_vq(cpu)) {
>>> + max_vq = cpu->sve_max_vq;
>>> + } else {
>>> + if (value) {
>>> + cpu->sve_max_vq = ARM_VQ_DEFAULT_OFF;
>>> + } else if (cpu->sve_max_vq != ARM_VQ_DEFAULT_OFF) {
>>> + cpu->sve_max_vq = ARM_VQ_DEFAULT_ON;
>>> + }
>>> + }
>>> +
>>> + /*
>>> + * We need to know the maximum vector length, which may just currently
>>> + * be the maximum length, in order to validate the enabling/disabling
>>> + * of this vector length. We use the property get accessor in order to
>>> + * get the appropriate default value for any uninitialized lengths.
>>> + */
>>> + if (!max_vq) {
>>> + char tmp[8];
>>> + bool s;
>>> +
>>> + for (max_vq = ARM_MAX_VQ; max_vq >= 1; --max_vq) {
>>> + sprintf(tmp, "sve%d", max_vq * 128);
>>> + s = object_property_get_bool(OBJECT(cpu), tmp, &err);
>>> + assert(!err);
>>> + if (s) {
>>> + break;
>>> + }
>>> + }
>>> + }
>>> +
>>> + if (arm_sve_have_max_vq(cpu) && value && vq > cpu->sve_max_vq) {
>>> + error_setg(errp, "cannot enable %s", name);
>>> + error_append_hint(errp, "vq=%d (%d bits) is larger than the "
>>> + "maximum vector length, sve-max-vq=%d "
>>> + "(%d bits)\n", vq, vq * 128,
>>> + cpu->sve_max_vq, cpu->sve_max_vq * 128);
>>> + } else if (arm_sve_have_max_vq(cpu) && !value && vq ==
>>> cpu->sve_max_vq) {
>>> + error_setg(errp, "cannot disable %s", name);
>>> + error_append_hint(errp, "The maximum vector length must be "
>>> + "enabled, sve-max-vq=%d (%d bits)\n",
>>> + cpu->sve_max_vq, cpu->sve_max_vq * 128);
>>> + } else if (arm_sve_have_max_vq(cpu) && !value && vq < cpu->sve_max_vq
>>> &&
>>> + is_power_of_2(vq)) {
>>> + error_setg(errp, "cannot disable %s", name);
>>> + error_append_hint(errp, "vq=%d (%d bits) is required as it is a "
>>> + "power-of-2 length smaller than the maximum, "
>>> + "sve-max-vq=%d (%d bits)\n", vq, vq * 128,
>>> + cpu->sve_max_vq, cpu->sve_max_vq * 128);
>>> + } else if (!value && vq < max_vq && is_power_of_2(vq)) {
>>> + error_setg(errp, "cannot disable %s", name);
>>> + error_append_hint(errp, "Vector length %d-bits is required as it "
>>> + "is a power-of-2 length smaller than another "
>>> + "enabled vector length. Disable all larger
>>> vector "
>>> + "lengths first.\n", vq * 128);
>>> + } else {
>> adding return in each if/elsif would allow to avoid this indent.
>
> Yeah, but I didn't really mind the indent :-)
>
>>> + if (value) {
>>> + bool fail = false;
>>> + uint32_t s;
>>> +
>>> + /*
>>> + * Enabling a vector length automatically enables all
>>> + * uninitialized power-of-2 lengths smaller than it, as
>>> + * per the architecture.
>>> + */
>> Test we are not attempting to enable a !is_power_of_2
>
> I'm not sure what this means. In this context 'vq' can be a !power-of-2
> if it wants to be, and there's no way for 's' to be a !power-of-2 because
> we filter the vq's with is_power_of_2(s).
Yep got it now
>
>>> + for (s = 1; s < vq; ++s) {
>>> + if (is_power_of_2(s)) {
>>> + vq_state = arm_cpu_vq_map_get(cpu, s);
>>> + if (vq_state == ARM_VQ_UNINITIALIZED) {
>>> + arm_cpu_vq_map_set(cpu, s, ARM_VQ_ON);
>>> + } else if (vq_state == ARM_VQ_OFF) {
>>> + fail = true;
>>> + break;
>>> + }
>>> + }
>>> + }
>>> +
>>> + if (fail) {
>>> + error_setg(errp, "cannot enable %s", name);
>>> + error_append_hint(errp, "Vector length %d-bits is disabled
>>> "
>>> + "and is a power-of-2 length smaller than
>>> "
>>> + "%s. All power-of-2 vector lengths
>>> smaller "
>>> + "than the maximum length are
>>> required.\n",
>>> + s * 128, name);
>>> + } else {
>>> + arm_cpu_vq_map_set(cpu, vq, ARM_VQ_ON);
>>> + }
>>> + } else {
>>> + arm_cpu_vq_map_set(cpu, vq, ARM_VQ_OFF);
>>> + }
>>> }
>>> }
>>>
>>> @@ -318,10 +652,11 @@ static void cpu_arm_set_sve(Object *obj, Visitor *v,
>>> const char *name,
>>> /*
>>> * We handle the -cpu <cpu>,sve=off,sve=on case by reinitializing,
>>> * but otherwise we don't do anything as an sve=on could come after
>>> - * a sve-max-vq setting.
>>> + * a sve-max-vq or sve<vl-bits> setting.
>>> */
>>> if (!cpu->sve_max_vq) {
>>> - cpu->sve_max_vq = ARM_MAX_VQ;
>>> + cpu->sve_max_vq = ARM_SVE_INIT;
>>> + arm_cpu_vq_map_init(cpu);
>>> }
>>> } else {
>>> cpu->sve_max_vq = 0;
>>> @@ -336,6 +671,7 @@ static void cpu_arm_set_sve(Object *obj, Visitor *v,
>>> const char *name,
>>> static void aarch64_max_initfn(Object *obj)
>>> {
>>> ARMCPU *cpu = ARM_CPU(obj);
>>> + uint32_t vq;
>>>
>>> if (kvm_enabled()) {
>>> kvm_arm_set_cpu_features_from_host(cpu);
>>> @@ -420,11 +756,29 @@ static void aarch64_max_initfn(Object *obj)
>>> cpu->dcz_blocksize = 7; /* 512 bytes */
>>> #endif
>>>
>>> - cpu->sve_max_vq = ARM_MAX_VQ;
>>> + /*
>>> + * sve_max_vq is initially unspecified, but must be initialized to
>>> a
>>> + * non-zero value (ARM_SVE_INIT) to indicate that this cpu type has
>>> + * SVE. It will be finalized in arm_cpu_realizefn().
>>> + */
>>> + cpu->sve_max_vq = ARM_SVE_INIT;
>>> object_property_add(obj, "sve-max-vq", "uint32",
>>> cpu_max_get_sve_max_vq,
>>> cpu_max_set_sve_max_vq, NULL, NULL,
>>> &error_fatal);
>>> object_property_add(obj, "sve", "bool", cpu_arm_get_sve,
>>> cpu_arm_set_sve, NULL, NULL, &error_fatal);
>>> +
>>> + /*
>>> + * sve_vq_map uses a special state while setting properties, so
>>> + * we initialize it here with its init function and finalize it
>>> + * in arm_cpu_realizefn().
>>> + */
>>> + arm_cpu_vq_map_init(cpu);
>>> + for (vq = 1; vq <= ARM_MAX_VQ; ++vq) {
>>> + char name[8];
>>> + sprintf(name, "sve%d", vq * 128);
>>> + object_property_add(obj, name, "bool", cpu_arm_get_sve_vq,
>>> + cpu_arm_set_sve_vq, NULL, NULL,
>>> &error_fatal);
>>> + }
>>> }
>>> }
>>>
>>> diff --git a/target/arm/helper.c b/target/arm/helper.c
>>> index f500ccb6d31b..b7b719dba57f 100644
>>> --- a/target/arm/helper.c
>>> +++ b/target/arm/helper.c
>>> @@ -5324,7 +5324,16 @@ static void zcr_write(CPUARMState *env, const
>>> ARMCPRegInfo *ri,
>>>
>>> /* Bits other than [3:0] are RAZ/WI. */
>>> QEMU_BUILD_BUG_ON(ARM_MAX_VQ > 16);
>>> - raw_write(env, ri, value & 0xf);
>>> + value &= 0xf;
>>> +
>>> + if (value) {> + /* get next vq that is smaller than or equal to
>>> value's vq */
>>> + uint32_t vq = value + 1;
>> ditto
>
> What does this 'ditto' map to? Oh, probably the discussion about vq's
> getting +1's and -1's. In this context 'value' is the ZCR_ELx
> representation of a VQ, which is VQ - 1, just like in the bitmap. To
> translate that to a VQ we need to do a '+ 1'.
Until here I follow. By the way in which doc did you find the
description of ZCR_ELx?
As I wrote in the comment
> here, we want to find the next smaller or equal VQ here, because this is
> the input VQ from the guest which may itself be a valid VQ, but if it's
> not, we need to step down to the next valid one. So we ask
> arm_cpu_vq_map_next_smaller() for 'vq + 1' in order to ensure 'vq' will
> be in the searched range. After we get a valid vq from
> arm_cpu_vq_map_next_smaller(), which must be at least '1', since that's
> required by the architecture and thus will always be present, we need to
> translate it back to the ZCR_ELx representation with the subtraction.
> Phew... We programmers do love adding and subtracting by one, right :-)
Got it now. but well I scratched my head
>
>>> + vq = arm_cpu_vq_map_next_smaller(cpu, vq + 1);
>>> + value = vq - 1;
>>> + }
>>> +
>>> + raw_write(env, ri, value);
>>>
>>> /*
>>> * Because we arrived here, we know both FP and SVE are enabled;
>>> diff --git a/target/arm/monitor.c b/target/arm/monitor.c
>>> index 157c487a1551..1e213906fd8f 100644
>>> --- a/target/arm/monitor.c
>>> +++ b/target/arm/monitor.c
>>> @@ -89,8 +89,24 @@ GICCapabilityList *qmp_query_gic_capabilities(Error
>>> **errp)
>>> return head;
>>> }
>>>
>>> +QEMU_BUILD_BUG_ON(ARM_MAX_VQ > 16);
>>> +
>>> +/*
>>> + * These are cpu model features we want to advertise. The order here
>>> + * matters as this is the order in which qmp_query_cpu_model_expansion
>>> + * will attempt to set them. If there are dependencies between features,
>>> + * as there are with the sve<vl-bits> features, then the order that
>>> + * considers those dependencies must be used.
>>> + *
>>> + * The sve<vl-bits> features need to be in reverse order in order to
>>> + * enable/disable the largest vector lengths first, ensuring all
>>> + * power-of-2 vector lengths smaller can also be enabled/disabled.
>>> + */
>>> static const char *cpu_model_advertised_features[] = {
>>> "aarch64", "pmu", "sve",
>>> + "sve2048", "sve1920", "sve1792", "sve1664", "sve1536", "sve1408",
>>> + "sve1280", "sve1152", "sve1024", "sve896", "sve768", "sve640",
>>> + "sve512", "sve384", "sve256", "sve128",
>>> NULL
>>> };
>>>
>>> diff --git a/tests/arm-cpu-features.c b/tests/arm-cpu-features.c
>> Please move all the tests in a separate patch.
>
> I'd prefer not to. I like keeping the test cases that test the new code in
> the same patch. It self documents what the test cases map to what code and
> allows immediate testing of the patch when bisecting. Also I'm not really
> sure how it makes review worse, as another patch would look the same, just
> in a different email.
A reviewier might not be familiar with both standard code and test and
feel reluctant to invest reading boths, hence reducing the chances to
collect R-b's. But that's my humble opinion.
>
>> Each day has enough trouble of its own ;-) sweat...
>
> I really appreciate the review! I realize it's generating sweat,
> especially with the European heat wave! You don't have to finish
> a patch before sending comments. I'm fine with multiple replies to
> the same patch if you want to break your review up into smaller
> bites.
Thanks. Yes breathing times are needed due to the heat and digestion of
the code ;-)
Thanks
Eric
>
> Thanks,
> drew
>
>>
>> Thanks
>>
>> Eric
>>> index 509e458e9c2f..a4bf6aec00df 100644
>>> --- a/tests/arm-cpu-features.c
>>> +++ b/tests/arm-cpu-features.c
>>> @@ -13,6 +13,18 @@
>>> #include "qapi/qmp/qdict.h"
>>> #include "qapi/qmp/qjson.h"
>>>
>>> +#if __SIZEOF_LONG__ == 8
>>> +#define BIT(n) (1UL << (n))
>>> +#else
>>> +#define BIT(n) (1ULL << (n))
>>> +#endif
>>> +
>>> +/*
>>> + * We expect the SVE max-vq to be 16. Also it must be <= 64
>>> + * for our test code, otherwise 'vls' can't just be a uint64_t.
>>> + */
>>> +#define SVE_MAX_VQ 16
>>> +
>>> #define MACHINE "-machine virt,gic-version=max "
>>> #define QUERY_HEAD "{ 'execute': 'query-cpu-model-expansion', " \
>>> "'arguments': { 'type': 'full', "
>>> @@ -137,6 +149,201 @@ static void assert_bad_props(QTestState *qts, const
>>> char *cpu_type)
>>> qobject_unref(resp);
>>> }
>>>
>>> +static void resp_get_sve_vls(QDict *resp, uint64_t *vls, uint32_t *max_vq)
>>> +{
>>> + const QDictEntry *e;
>>> + QDict *qdict;
>>> + int n = 0;
>>> +
>>> + *vls = 0;
>>> +
>>> + qdict = resp_get_props(resp);
>>> +
>>> + for (e = qdict_first(qdict); e; e = qdict_next(qdict, e)) {
>>> + if (strlen(e->key) > 3 && !strncmp(e->key, "sve", 3) &&
>>> + g_ascii_isdigit(e->key[3])) {
>>> + char *endptr;
>>> + int bits;
>>> +
>>> + bits = g_ascii_strtoll(&e->key[3], &endptr, 10);
>>> + if (!bits || *endptr != '\0') {
>>> + continue;
>>> + }
>>> +
>>> + if (qdict_get_bool(qdict, e->key)) {
>>> + *vls |= BIT((bits / 128) - 1);
>>> + }
>>> + ++n;
>>> + }
>>> + }
>>> +
>>> + g_assert(n == SVE_MAX_VQ);
>>> +
>>> + *max_vq = !*vls ? 0 : 64 - __builtin_clzll(*vls);
>>> +}
>>> +
>>> +static uint64_t sve_get_vls(QTestState *qts, const char *cpu_type,
>>> + const char *fmt, ...)
>>> +{
>>> + QDict *resp;
>>> + uint64_t vls;
>>> + uint32_t max_vq;
>>> +
>>> + if (fmt) {
>>> + QDict *args;
>>> + va_list ap;
>>> +
>>> + va_start(ap, fmt);
>>> + args = qdict_from_vjsonf_nofail(fmt, ap);
>>> + va_end(ap);
>>> +
>>> + resp = qtest_qmp(qts, QUERY_HEAD "'model': { 'name': %s, "
>>> + "'props': %p }"
>>> + QUERY_TAIL, cpu_type, args);
>>> + } else {
>>> + resp = do_query_no_props(qts, cpu_type);
>>> + }
>>> +
>>> + g_assert(resp);
>>> + resp_get_sve_vls(resp, &vls, &max_vq);
>>> + qobject_unref(resp);
>>> +
>>> + return vls;
>>> +}
>>> +
>>> +#define assert_sve_vls(qts, cpu_type, expected_vls, fmt, ...) \
>>> + g_assert(sve_get_vls(qts, cpu_type, fmt, ##__VA_ARGS__) ==
>>> expected_vls)
>>> +
>>> +static void sve_tests_default(QTestState *qts, const char *cpu_type)
>>> +{
>>> + /*
>>> + * With no sve-max-vq or sve<vl-bits> properties on the command line
>>> + * the default is to have all vector lengths enabled.
>>> + */
>>> + assert_sve_vls(qts, cpu_type, BIT(SVE_MAX_VQ) - 1, NULL);
>>> +
>>> + /*
>>> + * -------------------------------------------------------------------
>>> + * power-of-2(vq) all-power- can can
>>> + * of-2(< vq) enable disable
>>> + * -------------------------------------------------------------------
>>> + * vq < max_vq no MUST* yes yes
>>> + * vq < max_vq yes MUST* yes no
>>> + * -------------------------------------------------------------------
>>> + * vq == max_vq n/a MUST* yes** yes**
>>> + * -------------------------------------------------------------------
>>> + * vq > max_vq n/a no no yes
>>> + * vq > max_vq n/a yes yes yes
>>> + * -------------------------------------------------------------------
>>> + *
>>> + * [*] "MUST" means this requirement must already be satisfied,
>>> + * otherwise 'max_vq' couldn't itself be enabled.
>>> + *
>>> + * [**] Not testable with the QMP interface, only with the command
>>> line.
>>> + */
>>> +
>>> + /* max_vq := 8 */
>>> + assert_sve_vls(qts, cpu_type, 0x8b, "{ 'sve1024': true }");
>>> +
>>> + /* max_vq := 8, vq < max_vq, !power-of-2(vq) */
>>> + assert_sve_vls(qts, cpu_type, 0x8f,
>>> + "{ 'sve1024': true, 'sve384': true }");
>>> + assert_sve_vls(qts, cpu_type, 0x8b,
>>> + "{ 'sve1024': true, 'sve384': false }");
>>> +
>>> + /* max_vq := 8, vq < max_vq, power-of-2(vq) */
>>> + assert_sve_vls(qts, cpu_type, 0x8b,
>>> + "{ 'sve1024': true, 'sve256': true }");
>>> + assert_error(qts, cpu_type, "cannot disable sve256",
>>> + "{ 'sve1024': true, 'sve256': false }");
>>> +
>>> + /*
>>> + * max_vq := 3, vq > max_vq, !all-power-of-2(< vq)
>>> + *
>>> + * If given sve384=on,sve512=off,sve640=on the command line error
>>> would be
>>> + * "cannot enable sve640", but QMP visits the vector lengths in reverse
>>> + * order, so we get "cannot disable sve512" instead. The command line
>>> would
>>> + * also give that error if given sve384=on,sve640=on,sve512=off, so
>>> this is
>>> + * all fine. The important thing is that we get an error.
>>> + */
>>> + assert_error(qts, cpu_type, "cannot disable sve512",
>>> + "{ 'sve384': true, 'sve512': false, 'sve640': true }");
>>> +
>>> + /*
>>> + * We can disable power-of-2 vector lengths when all larger lengths
>>> + * are also disabled. The shorter, sve384=on,sve512=off,sve640=off
>>> + * works on the command line, but QMP doesn't know that all the
>>> + * vector lengths larger than 384-bits will be disabled until it
>>> + * sees the enabling of sve384, which comes near the end since it
>>> + * visits the lengths in reverse order. So we just have to explicitly
>>> + * disable them all.
>>> + */
>>> + assert_sve_vls(qts, cpu_type, 0x7,
>>> + "{ 'sve384': true, 'sve512': false, 'sve640': false, "
>>> + " 'sve768': false, 'sve896': false, 'sve1024': false, "
>>> + " 'sve1152': false, 'sve1280': false, 'sve1408':
>>> false, "
>>> + " 'sve1536': false, 'sve1664': false, 'sve1792':
>>> false, "
>>> + " 'sve1920': false, 'sve2048': false }");
>>> +
>>> + /* max_vq := 3, vq > max_vq, all-power-of-2(< vq) */
>>> + assert_sve_vls(qts, cpu_type, 0x1f,
>>> + "{ 'sve384': true, 'sve512': true, 'sve640': true }");
>>> + assert_sve_vls(qts, cpu_type, 0xf,
>>> + "{ 'sve384': true, 'sve512': true, 'sve640': false }");
>>> +}
>>> +
>>> +static void sve_tests_sve_max_vq_8(const void *data)
>>> +{
>>> + QTestState *qts;
>>> +
>>> + qts = qtest_init(MACHINE "-cpu max,sve-max-vq=8");
>>> +
>>> + assert_sve_vls(qts, "max", BIT(8) - 1, NULL);
>>> +
>>> + /*
>>> + * Disabling the max-vq set by sve-max-vq is not allowed, but
>>> + * of course enabling it is OK.
>>> + */
>>> + assert_error(qts, "max", "cannot disable sve1024", "{ 'sve1024': false
>>> }");
>>> + assert_sve_vls(qts, "max", 0xff, "{ 'sve1024': true }");
>>> +
>>> + /*
>>> + * Enabling anything larger than max-vq set by sve-max-vq is not
>>> + * allowed, but of course disabling everything larger is OK.
>>> + */
>>> + assert_error(qts, "max", "cannot enable sve1152", "{ 'sve1152': true
>>> }");
>>> + assert_sve_vls(qts, "max", 0xff, "{ 'sve1152': false }");
>>> +
>>> + /*
>>> + * We can disable non power-of-2 lengths smaller than the max-vq
>>> + * set by sve-max-vq, but not power-of-2 lengths.
>>> + */
>>> + assert_sve_vls(qts, "max", 0xfb, "{ 'sve384': false }");
>>> + assert_error(qts, "max", "cannot disable sve256", "{ 'sve256': false
>>> }");
>>> +
>>> + qtest_quit(qts);
>>> +}
>>> +
>>> +static void sve_tests_sve_off(const void *data)
>>> +{
>>> + QTestState *qts;
>>> +
>>> + qts = qtest_init(MACHINE "-cpu max,sve=off");
>>> +
>>> + /*
>>> + * SVE is off, so the map should be empty.
>>> + */
>>> + assert_sve_vls(qts, "max", 0, NULL);
>>> +
>>> + /*
>>> + * We can't turn anything on, but off is OK.
>>> + */
>>> + assert_error(qts, "max", "cannot enable sve128", "{ 'sve128': true }");
>>> + assert_sve_vls(qts, "max", 0, "{ 'sve128': false }");
>>> +
>>> + qtest_quit(qts);
>>> +}
>>> +
>>> static void test_query_cpu_model_expansion(const void *data)
>>> {
>>> QTestState *qts;
>>> @@ -159,9 +366,12 @@ static void test_query_cpu_model_expansion(const void
>>> *data)
>>> if (g_str_equal(qtest_get_arch(), "aarch64")) {
>>> assert_has_feature(qts, "max", "aarch64");
>>> assert_has_feature(qts, "max", "sve");
>>> + assert_has_feature(qts, "max", "sve128");
>>> assert_has_feature(qts, "cortex-a57", "pmu");
>>> assert_has_feature(qts, "cortex-a57", "aarch64");
>>>
>>> + sve_tests_default(qts, "max");
>>> +
>>> /* Test that features that depend on KVM generate errors without.
>>> */
>>> assert_error(qts, "max",
>>> "'aarch64' feature cannot be disabled "
>>> @@ -213,6 +423,13 @@ int main(int argc, char **argv)
>>> qtest_add_data_func("/arm/query-cpu-model-expansion",
>>> NULL, test_query_cpu_model_expansion);
>>>
>>> + if (g_str_equal(qtest_get_arch(), "aarch64")) {
>>> +
>>> qtest_add_data_func("/arm/max/query-cpu-model-expansion/sve-max-vq-8",
>>> + NULL, sve_tests_sve_max_vq_8);
>>> + qtest_add_data_func("/arm/max/query-cpu-model-expansion/sve-off",
>>> + NULL, sve_tests_sve_off);
>>> + }
>>> +
>>> if (kvm_available) {
>>> qtest_add_data_func("/arm/kvm/query-cpu-model-expansion",
>>> NULL, test_query_cpu_model_expansion_kvm);
>>>
>>
Re: [Qemu-arm] [PATCH v2 07/14] target/arm/cpu64: max cpu: Introduce sve<vl-bits> properties, Auger Eric, 2019/06/26
- Re: [Qemu-arm] [Qemu-devel] [PATCH v2 07/14] target/arm/cpu64: max cpu: Introduce sve<vl-bits> properties, Andrew Jones, 2019/06/27
- Re: [Qemu-arm] [Qemu-devel] [PATCH v2 07/14] target/arm/cpu64: max cpu: Introduce sve<vl-bits> properties, Auger Eric, 2019/06/27
- Re: [Qemu-arm] [Qemu-devel] [PATCH v2 07/14] target/arm/cpu64: max cpu: Introduce sve<vl-bits> properties, Andrew Jones, 2019/06/27
- Re: [Qemu-arm] [Qemu-devel] [PATCH v2 07/14] target/arm/cpu64: max cpu: Introduce sve<vl-bits> properties, Dave Martin, 2019/06/27
- Re: [Qemu-arm] [Qemu-devel] [PATCH v2 07/14] target/arm/cpu64: max cpu: Introduce sve<vl-bits> properties, Richard Henderson, 2019/06/27
Re: [Qemu-arm] [PATCH v2 07/14] target/arm/cpu64: max cpu: Introduce sve<vl-bits> properties, Richard Henderson, 2019/06/27