|
From: | Lucas Mateus Martins Araujo e Castro |
Subject: | Re: [RFC PATCH v2 5/7] target/ppc: Implemented xvf16ger* |
Date: | Tue, 10 May 2022 11:47:40 -0300 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.8.1 |
On 5/6/22 07:18, Lucas Mateus Castro(alqotel) wrote:
+static inline float32 float32_neg(float32 a)
+{
+ if (((a & 0x7f800000) == 0x7f800000) && (a & 0x007fffff)) {
+ return a;
+ } else {
+ return float32_chs(a);
+ }
+}
This is wrong -- even NaNs get their signs changed.
Negation and absolute value are non-arithmetic operations.
The PowerISA 3.1 (page 589) defines bfp_negate as:
bfp_NEGATE(x)
x is a binary floating-point value that is represented in the
binary floating-point working format.
If x is not a NaN, return x with its sign complemented. Otherwise,
return x
So this is what I based on to create this function
If you're matching hardware results, this suggests...
+ if (neg_mul) {
+ msum = float32_neg(msum);
+ }
+ if (neg_acc) {
+ aux_acc = float32_neg(at[i].VsrSF(j));
+ } else {
+ aux_acc = at[i].VsrSF(j);
+ }
+ at[i].VsrSF(j) = float32_add(msum, aux_acc, excp_ptr);
This "add" should be "sub" instead of using a separate negation, when required.
I do wonder about the double-negation vs nans.
But in this case some way to negate msum would still be necessary, so maybe move float32_neg to target/ppc/fpu_helper.c and change the name, I used 2 negations as a way to keep closer to the description, it is in the ISA as:
if “[pm]xvf16ger2pp” then v2 ← bfp_ADD(r1, acc)
if “[pm]xvf16ger2pn” then v2 ← bfp_ADD(r1, bfp_NEGATE(acc))
if “[pm]xvf16ger2np” then v2 ← bfp_ADD(bfp_NEGATE(r1), acc)
if “[pm]xvf16ger2nn” then v2 ← bfp_ADD(bfp_NEGATE(r1),
bfp_NEGATE(acc))
But it could easily be change to an add/sub instead like you said
It looks like this could be
float32_muladd(float32_one, msum, aux_acc, flags, &status)
with flags set to float_muladd_negate_* for neg_mul and neg_acc. Any NaNs would go
through pick_nan_muladd and fail to be altered.
It would have to be float32_muladd(musm, float32_one, aux_acc, ...) to match the hardware result (it looks like qemu preference in a target PPC is to return A over C and C over B if all are NaN in a muladd, but A over B in a add/sub if both are NaN, so the equivalent of add(A,B) is muladd(A, 1, B))
That aside, having a muladd would bring it closer to vsxger over
negate + add/sub but personally I think I prefer the latter to not
add an unnecessary muladd, any opinions?
I'm not sure if I'm suggesting actual use of muladd, for the simplicity, or if you should
have an inline check for nans. I might need to think about this in the morning.
r~
[Prev in Thread] | Current Thread | [Next in Thread] |