[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [RFC PATCH v1 03/43] accel/tcg: Add gvec size changing operations
From: |
Anton Johansson |
Subject: |
Re: [RFC PATCH v1 03/43] accel/tcg: Add gvec size changing operations |
Date: |
Tue, 3 Dec 2024 19:08:18 +0100 |
On 22/11/24, Richard Henderson wrote:
> On 11/20/24 19:49, Anton Johansson wrote:
> > Adds new functions to the gvec API for truncating, sign- or zero
> > extending vector elements. Currently implemented as helper functions,
> > these may be mapped onto host vector instructions in the future.
> >
> > For the time being, allows translation of more complicated vector
> > instructions by helper-to-tcg.
> >
> > Signed-off-by: Anton Johansson <anjo@rev.ng>
> > ---
> > accel/tcg/tcg-runtime-gvec.c | 41 +++++++++++++++++
> > accel/tcg/tcg-runtime.h | 22 +++++++++
> > include/tcg/tcg-op-gvec-common.h | 18 ++++++++
> > tcg/tcg-op-gvec.c | 78 ++++++++++++++++++++++++++++++++
> > 4 files changed, 159 insertions(+)
> >
> > diff --git a/accel/tcg/tcg-runtime-gvec.c b/accel/tcg/tcg-runtime-gvec.c
> > index afca89baa1..685c991e6a 100644
> > --- a/accel/tcg/tcg-runtime-gvec.c
> > +++ b/accel/tcg/tcg-runtime-gvec.c
> > @@ -1569,3 +1569,44 @@ void HELPER(gvec_bitsel)(void *d, void *a, void *b,
> > void *c, uint32_t desc)
> > }
> > clear_high(d, oprsz, desc);
> > }
> > +
> > +#define DO_SZ_OP1(NAME, DSTTY, SRCTY)
> > \
> > +void HELPER(NAME)(void *d, void *a, uint32_t desc)
> > \
> > +{
> > \
> > + intptr_t oprsz = simd_oprsz(desc);
> > \
> > + intptr_t elsz = oprsz/sizeof(DSTTY);
> > \
> > + intptr_t i;
> > \
> > +
> > \
> > + for (i = 0; i < elsz; ++i) {
> > \
> > + SRCTY aa = *((SRCTY *) a + i);
> > \
> > + *((DSTTY *) d + i) = aa;
> > \
> > + }
> > \
> > + clear_high(d, oprsz, desc);
> > \
>
> This formulation is not valid.
>
> (1) Generic forms must *always* operate strictly on columns. This
> formulation is either expanding a narrow vector to a wider vector or
> compressing a wider vector to a narrow vector.
>
> (2) This takes no care for byte ordering of the data between columns. This
> is where sticking strictly to columns helps, in that we can assume that data
> is host-endian *within the column*, but we cannot assume anything about the
> element indexing of ptr + i.
Concerning (1) and (2), is this a limitation imposed on generic vector
ops. to simplify mapping to host vector instructions where
padding/alignment of elements might differ? From my understanding, the
helper above should be fine since we can assume contiguous elements?
But maybe it doesn't make sense to add a gvec op. that is only
implemented via helper, I'm not sure.
> (3) This takes no care for element overlap if A == D.
Ah, good point!
> The only form of sign/zero-extract that you may add generically is an alias
> for
>
> d[i] = a[i] & mask
>
> or
>
> d[i] = (a[i] << shift) >> shift
>
> where A and D use the same element type. We could add new tcg opcodes for
> these (particularly the second, for sign-extension), though x86_64 does not
> support it, afaics.
I see, I don't think we can make this work for Hexagon vector ops., as
an example consider V6_vadduwsat which performs an unsigned saturated
add of 32-bit elements, currently we emit
void emit_V6_vadduwsat(intptr_t vec2, intptr_t vec7, intptr_t vec6) {
VectorMem mem = {0};
intptr_t vec5 = temp_new_gvec(&mem, 256);
tcg_gen_gvec_zext(MO_64, MO_32, vec5, vec7, 256, 128, 256);
intptr_t vec1 = temp_new_gvec(&mem, 256);
tcg_gen_gvec_zext(MO_64, MO_32, vec1, vec6, 256, 128, 256);
tcg_gen_gvec_add(MO_64, vec1, vec1, vec5, 256, 256);
intptr_t vec3 = temp_new_gvec(&mem, 256);
tcg_gen_gvec_dup_imm(MO_64, vec3, 256, 256, 4294967295ull);
tcg_gen_gvec_umin(MO_64, vec1, vec1, vec3, 256, 256);
tcg_gen_gvec_trunc(MO_32, MO_64, vec2, vec1, 128, 256, 128);
}
so we really do rely on the size-changing property of zext here, the
input vectors are 128-byte and we expand them to 256-byte. We could
expand vector operations within the instruction to the largest vector
size, but would need to zext and trunc to destination and source
registers anyway.
//Anton
- Re: [RFC PATCH v1 03/43] accel/tcg: Add gvec size changing operations,
Anton Johansson <=