From: Richard Henderson
Subject: Re: [PATCH 31/67] target/arm: Convert handle_fpfpcvt to decodetree
Date: Fri, 6 Dec 2024 09:10:28 -0600
User-agent: Mozilla Thunderbird

On 12/6/24 07:48, Peter Maydell wrote:
>> +static bool do_fcvt_g(DisasContext *s, arg_fcvt *a,
>> +                      ARMFPRounding rmode, bool is_signed)
>> +{
>> +    TCGv_i64 tcg_int;
>> +    int check = fp_access_check_scalar_hsd(s, a->esz);
>> +
>> +    if (check <= 0) {
>> +        return check == 0;
>> +    }
>> +
>> +    tcg_int = cpu_reg(s, a->rd);
>> +    do_fcvt_scalar(s, (a->sf ? MO_64 : MO_32) | (is_signed ? MO_SIGN : 0),
>> +                   a->esz, tcg_int, a->shift, a->rn, rmode);
>> +
>> +    if (!a->sf) {
>> +        tcg_gen_ext32u_i64(tcg_int, tcg_int);
>
> For the MO_16 and MO_32 input cases we already did a
> zero-extend-to-64-bits inside do_fcvt_scalar(). Maybe we should put the
> tcg_gen_ext32u_i64() also inside do_fcvt_scalar() in the cases of
> MO_64 input MO_32 output which are the only ones that actually need it?

I thought about that.

(0) In that case the duplicate zero-extend will be optimized away.

(1) I thought it was clearer to retain the !sf test here rather than rely on
    a zero-extend elsewhere.

(2) In the scalar vector case, the best method for Vd.H is to clear the
    entire vector and only then store the 0th element directly from the
    bottom bits of either TCGv_{i32,i64}. Otherwise we wind up with two
    zero-extends which cannot be folded:

        tcg_gen_ext16u_i32 + tcg_gen_extu_i32_i64
    or
        tcg_gen_extu_i32_i64 + tcg_gen_ext16u_i64.

    Fixing this duplication would require new tcg ops to
    extend-and-change-type. While Vd.S does not suffer the same fate, it's
    easiest to use the same method as Vd.H. See patch 55.

r~
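
For illustration only, here is a minimal plain-C sketch of the arithmetic behind points (0) and (2). It is not taken from the patch and does not use the TCG API: the `model_*` helpers are hypothetical stand-ins that only mimic the value each named op produces. It shows that a repeated 32-bit zero-extend on the same 64-bit type is a no-op (which is what makes the duplicate removable), and that both two-step chains for a 16-bit result end up at the same 64-bit value even though each chain crosses a type boundary.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical plain-C models of the ops named in the mail.
 * They mimic only the resulting value, not how TCG emits or folds ops.
 */
static uint32_t model_ext16u_i32(uint32_t x) { return x & 0xffffu; }
static uint64_t model_ext16u_i64(uint64_t x) { return x & 0xffffull; }
static uint64_t model_ext32u_i64(uint64_t x) { return x & 0xffffffffull; }
static uint64_t model_extu_i32_i64(uint32_t x) { return (uint64_t)x; }

/* Point (0): extending twice on the same type equals extending once. */
static int ext32u_is_idempotent(uint64_t x)
{
    return model_ext32u_i64(model_ext32u_i64(x)) == model_ext32u_i64(x);
}

/* Point (2), first chain: ext16u on i32, then widen i32 to i64. */
static uint64_t half_via_i32(uint32_t x)
{
    return model_extu_i32_i64(model_ext16u_i32(x));
}

/* Point (2), second chain: widen i32 to i64, then ext16u on i64. */
static uint64_t half_via_i64(uint32_t x)
{
    return model_ext16u_i64(model_extu_i32_i64(x));
}
```

In this model each chain is two separate steps of different types; a single combined step would correspond to the "extend-and-change-type" op that the mail says does not currently exist.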