[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2 13/22] target/loongarch: Add floating point arithmetic ins
|
From: |
Song Gao |
|
Subject: |
Re: [PATCH v2 13/22] target/loongarch: Add floating point arithmetic instruction translation |
|
Date: |
Tue, 27 Jul 2021 15:17:03 +0800 |
|
User-agent: |
Mozilla/5.0 (X11; Linux mips64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 |
Hi, Richard.
On 07/23/2021 01:44 PM, Richard Henderson wrote:
> On 7/20/21 11:53 PM, Song Gao wrote:
>> +uint64_t helper_fp_sqrt_d(CPULoongArchState *env, uint64_t fp)
>> +{
>> + fp = float64_sqrt(fp, &env->active_fpu.fp_status);
>> + update_fcsr0(env, GETPC());
>> + return fp;
>> +}
>> +
>> +uint32_t helper_fp_sqrt_s(CPULoongArchState *env, uint32_t fp)
>> +{
>> + fp = float32_sqrt(fp, &env->active_fpu.fp_status);
>> + update_fcsr0(env, GETPC());
>> + return fp;
>> +}
>
> I believe you will find it easier to take and return uint64_t, even for
> 32-bit operations. The manual says that the high bits may contain any value,
> so in my opinion you should not work hard to preserve the high bits, as you
> currently do with
>
>> + gen_load_fpr32(fp0, a->fj);
>> + gen_load_fpr32(fp1, a->fk);
>> + gen_helper_fp_add_s(fp0, cpu_env, fp0, fp1);
>> + gen_store_fpr32(fp0, a->fd);
>
> I think this should be as simple as
>
> gen_helper_fp_add_s(cpu_fpu[a->fd], cpu_env,
> cpu_fpu[a->fj], cpu_fpu[a->fk]);
>
> I also think that loongarch should learn from risc-v and change the
> architecture to "nan-box" single-precision results -- fill the high 32-bits
> with 1s. This is an SNaN representation for double-precision and will
> immediately fail when incorrectly using a single-precision value as a
> double-precision input.
>
> Thankfully the current architecture is backward compatible with nan-boxing.
>
by this method, the trans_fadd_s is
static bool trans_fadd_s(DisasContext *ctx, arg_fadd_s * a)
{
TCGv_i64 fp0, fp1;
fp0 = tcg_temp_new_i64();
fp1 = tcg_temp_new_i64();
check_fpu_enabled(ctx);
gen_load_fpr64(fp0, a->fj);
gen_load_fpr64(fp1, a->fk);
gen_helper_fp_add_s(fp0, cpu_env, fp0, fp1);
gen_check_nanbox_s(fp0, fp0); /* from riscv */
gen_store_fpr64(fp0, a->fd);
tcg_temp_free_i64(fp0);
tcg_temp_free_i64(fp1);
return true;
}
uint64_t helper_fp_add_s(CPULoongArchState *env, uint64_t fp, uint64_t fp1)
{
uint32_t fp2;
fp2 = float32_add((uint32_t)fp, (uint32_t)fp1, &env->active_fpu.fp_status);
update_fcsr0(env, GETPC());
return (uint64_t)fp2;
}
is this right?
>> +/* Floating point arithmetic operation instruction translation */
>> +static bool trans_fadd_s(DisasContext *ctx, arg_fadd_s * a)
>> +{
>> + TCGv_i32 fp0, fp1;
>> +
>> + fp0 = tcg_temp_new_i32();
>> + fp1 = tcg_temp_new_i32();
>> +
>> + check_fpu_enabled(ctx);
>> + gen_load_fpr32(fp0, a->fj);
>> + gen_load_fpr32(fp1, a->fk);
>> + gen_helper_fp_add_s(fp0, cpu_env, fp0, fp1);
>> + gen_store_fpr32(fp0, a->fd);
>> +
>> + tcg_temp_free_i32(fp0);
>> + tcg_temp_free_i32(fp1);
>> +
>> + return true;
>> +}
>
> Again, you should use some helper functions to reduce the repetition.
>
OK>> +static bool trans_fmadd_d(DisasContext *ctx, arg_fmadd_d *a)
>> +{
>> + TCGv_i64 fp0, fp1, fp2, fp3;
>> +
>> + fp0 = tcg_temp_new_i64();
>> + fp1 = tcg_temp_new_i64();
>> + fp2 = tcg_temp_new_i64();
>> + fp3 = tcg_temp_new_i64();
>> +
>> + check_fpu_enabled(ctx);
>> + gen_load_fpr64(fp0, a->fj);
>> + gen_load_fpr64(fp1, a->fk);
>> + gen_load_fpr64(fp2, a->fa);
>> + check_fpu_enabled(ctx);
>
> Repeating check_fpu_enabled.
>
OK.
>> + gen_helper_fp_madd_d(fp3, cpu_env, fp0, fp1, fp2);
>> + gen_store_fpr64(fp3, a->fd);
>
> I think you might as well pass in the float_muladd_* constant to a single
> helper rather than having 4 different helpers.
>
OK
>
> r~
Again. thank you kindly help.
Thanks
Song Gao.
- [PATCH v2 11/22] target/loongarch: Add fixed point atomic instruction translation, (continued)
[PATCH v2 13/22] target/loongarch: Add floating point arithmetic instruction translation, Song Gao, 2021/07/21
[PATCH v2 15/22] target/loongarch: Add floating point conversion instruction translation, Song Gao, 2021/07/21
[PATCH v2 14/22] target/loongarch: Add floating point comparison instruction translation, Song Gao, 2021/07/21
[PATCH v2 16/22] target/loongarch: Add floating point move instruction translation, Song Gao, 2021/07/21