[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] tcg/optimize: Fix constant folding of setcond
From: |
wannacu |
Subject: |
Re: [PATCH] tcg/optimize: Fix constant folding of setcond |
Date: |
Sat, 7 Dec 2024 16:29:25 +0800 |
> The problem is stale data in tmp's arg_info, correct? Better would be adding
> fold_and(op2) a few lines later, after finishing setup of op2's arguments.
You're absolutely right, stale data in tmp's arg_info is the root issue here.
> (1) The inline assembly is incorrect, and (2) the test does not fail without
> your patch.
> So it is difficult for me to tell exactly what you're trying to test.
> Problems with the asm:
> - using edi directly instead of having a as an input,
> - passing uint8_t a, but using all 32-bits of edi; upper 24 are logically
> garbage.
> - not adding edx, ecx as clobbers, or better as temp arguments.
> - initializing res, but not adding as an input, or in-out argument.
Thank you for the detailed feedback and for pointing out the issues with the
inline assembly.
These points are all valid concerns, and I'll work on addressing them in the
patch.
After reading your response, I tested it again on another riscv
machine(licheepi 4a), and the
issue still persists.
In the error scenario, the `sete` instruction corresponds to the following op
and op_opt:
op:
setcond_i64 loc0,cc_dst,$0xffffffff,tsteq
op_opt:
and_i64 tmp14,cc_dst,$0xffffffff dead: 1 2 pref=0xffffffff
xor_i64 tmp0,tmp14,$0x1 dead: 1 2 pref=0xffffffff
> I *think* you want
>
> unsigned char test(unsigned a)
> {
> unsigned char res = 0xff;
> unsigned t1, t2;
>
> asm("lea -0x1160(%3), %1\n\t"
> "lea -0xd7b0(%3), %2\n\t"
> "cmp $0x9f, %1\n\t"
> "setbe %b1\n\t"
> "cmp $0x4f, %2\n\t"
> "setbe %b2\n\t"
> "or %2, %1\n\t"
> "cmp $0x200b, %3\n\t"
> "sete %0\n\t"
> : "+r"(res), "=&r"(t1), "=&r"(t2) : "r"(a));
> return res;
> }
>
> But as the test doesn't ever fail for me, I can't tell.
I also tested the updated inline assembly, and the test failed as expected.
I'm not sure why you couldn't reproduce the issue on your side; perhaps it's
easier to
reproduce on an RV64GC machine.
Thanks,
w