qemu-devel
Re: [PATCH 31/67] target/arm: Convert handle_fpfpcvt to decodetree


From: Richard Henderson
Subject: Re: [PATCH 31/67] target/arm: Convert handle_fpfpcvt to decodetree
Date: Fri, 6 Dec 2024 09:10:28 -0600
User-agent: Mozilla Thunderbird

On 12/6/24 07:48, Peter Maydell wrote:
>> +static bool do_fcvt_g(DisasContext *s, arg_fcvt *a,
>> +                      ARMFPRounding rmode, bool is_signed)
>> +{
>> +    TCGv_i64 tcg_int;
>> +    int check = fp_access_check_scalar_hsd(s, a->esz);
>> +
>> +    if (check <= 0) {
>> +        return check == 0;
>> +    }
>> +
>> +    tcg_int = cpu_reg(s, a->rd);
>> +    do_fcvt_scalar(s, (a->sf ? MO_64 : MO_32) | (is_signed ? MO_SIGN : 0),
>> +                   a->esz, tcg_int, a->shift, a->rn, rmode);
>> +
>> +    if (!a->sf) {
>> +        tcg_gen_ext32u_i64(tcg_int, tcg_int);
>
> For the MO_16 and MO_32 input cases we already did a
> zero-extend-to-64-bits inside do_fcvt_scalar().
> Maybe we should put the tcg_gen_ext32u_i64() also
> inside do_fcvt_scalar() in the cases of MO_64 input
> MO_32 output which are the only ones that actually need it?

I thought about that.

(0) In that case the duplicate zero-extend will be optimized away.

(1) I thought it was clearer to retain the !sf test here rather
    than rely on a zero-extend elsewhere.

(2) In the scalar vector case, the best method for Vd.H is to clear
    the entire vector and only then store the 0th element directly
    from the bottom bits of either TCGv_{i32,i64}.

    Otherwise we wind up with two zero-extends which cannot be folded:
    tcg_gen_ext16u_i32 + tcg_gen_extu_i32_i64 or
    tcg_gen_extu_i32_i64 + tcg_gen_ext16u_i64.
    Fixing this duplication would require new tcg ops to
    extend-and-change-type.

    While Vd.S does not suffer the same fate, it's easiest to use
    the same method as Vd.H.  See patch 55.
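
To illustrate point (2), here is a rough model in plain C, not TCG code.
It assumes a 128-bit vector register represented as 16 little-endian
bytes; VecReg, write_h_by_extending() and write_h_by_clearing() are
hypothetical names invented for this sketch, and the three small helpers
only mimic the arithmetic of the TCG ops they are named after.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 128-bit vector register, byte 0 = lowest byte. */
typedef struct { uint8_t b[16]; } VecReg;

/* Plain-C stand-ins for the arithmetic of the TCG ops of the same name. */
static uint32_t ext16u_i32(uint32_t x)   { return x & 0xffffu; }
static uint64_t extu_i32_i64(uint32_t x) { return (uint64_t)x; }
static uint64_t ext16u_i64(uint64_t x)   { return x & 0xffffu; }

/* Method A: zero-extend twice, then write the low 64 bits, zero the rest. */
static VecReg write_h_by_extending(uint32_t result)
{
    VecReg v;
    uint64_t wide = extu_i32_i64(ext16u_i32(result));

    memset(&v, 0, sizeof(v));
    for (int i = 0; i < 8; i++) {
        v.b[i] = (uint8_t)(wide >> (8 * i));
    }
    return v;
}

/* Method B: clear the whole register, then store only the low 16 bits. */
static VecReg write_h_by_clearing(uint32_t result)
{
    VecReg v;

    memset(&v, 0, sizeof(v));
    v.b[0] = (uint8_t)(result & 0xff);
    v.b[1] = (uint8_t)((result >> 8) & 0xff);
    return v;
}

/* Both methods, and both orders of extension, give the same bits. */
static void self_check(uint32_t result)
{
    VecReg a = write_h_by_extending(result);
    VecReg c = write_h_by_clearing(result);

    assert(memcmp(&a, &c, sizeof(a)) == 0);
    assert(extu_i32_i64(ext16u_i32(result)) ==
           ext16u_i64(extu_i32_i64(result)));
}
```

Calling self_check() with any 32-bit value should pass.  Method B needs
no explicit extension at all, because the element store narrows the
value by itself.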


r~


