Re: chacha20-s390 broken in 8.2.0 in TCG on s390x
From: Philippe Mathieu-Daudé
Subject: Re: chacha20-s390 broken in 8.2.0 in TCG on s390x
Date: Wed, 3 Jan 2024 15:37:08 +0100
User-agent: Mozilla Thunderbird
On 3/1/24 15:01, Philippe Mathieu-Daudé wrote:
On 3/1/24 12:53, Philippe Mathieu-Daudé wrote:
Hi Richard,
On 3/1/24 09:54, Michael Tokarev wrote:
03.01.2024 03:22, Richard Henderson wrote:
On 12/22/23 01:51, Michael Tokarev wrote:
...
git bisect points to this commit:
commit ab84dc398b3b702b0c692538b947ef65dbbdf52f
Author: Richard Henderson <richard.henderson@linaro.org>
Date: Wed Aug 23 23:04:24 2023 -0700
tcg/optimize: Optimize env memory operations
So far, this seems to work on amd64 host, but fails on s390x host -
where this has been observed so far. Maybe it also fails in some
other combinations too, I don't yet know. Just finished bisecting
it on s390x.
I haven't been able to build a reproducer for this.
Have you an image or kernel you can share?
Sure.
Here's my actual testing "image":
http://www.corpit.ru/mjt/tmp/s390x-chacha.tar.gz
It contains vmlinuz and initrd, generated on a Debian s390x system
using standard Debian tools.
Actual command line I used when doing bisection:
~/qemu/b/qemu-system-s390x -append "root=/dev/vda rw" -nographic
-smp 2 -drive format=raw,file=vmlinuz,if=virtio -no-user-config -m 1G
-kernel vmlinuz -initrd initrd -snapshot
Reducing a bit further, it works when disabling rotli_vec opcode
(commit 22cb37b417 "tcg/s390x: Implement vector shift operations"):
---
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index fbee43d3b0..5f147661e8 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -2918,3 +2918,5 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece)
     case INDEX_op_orc_vec:
+        return 1;
     case INDEX_op_rotli_vec:
+        return TCG_TARGET_HAS_roti_vec;
     case INDEX_op_rotls_vec:
diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
index e69b0d2ddd..5c18146a40 100644
--- a/tcg/s390x/tcg-target.h
+++ b/tcg/s390x/tcg-target.h
@@ -152,3 +152,3 @@ extern uint64_t s390_facilities[3];
 #define TCG_TARGET_HAS_abs_vec 1
-#define TCG_TARGET_HAS_roti_vec 1
+#define TCG_TARGET_HAS_roti_vec 0
 #define TCG_TARGET_HAS_rots_vec 1
---
Finally, changing the constraints on INDEX_op_rotli_vec seems to fix it:
---
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index fbee43d3b0..b3456fe857 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -3264,13 +3264,13 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_ld_vec:
     case INDEX_op_dupm_vec:
+    case INDEX_op_rotli_vec:
         return C_O1_I1(v, r);
     case INDEX_op_dup_vec:
         return C_O1_I1(v, vr);
     case INDEX_op_abs_vec:
     case INDEX_op_neg_vec:
     case INDEX_op_not_vec:
-    case INDEX_op_rotli_vec:
     case INDEX_op_sari_vec:
     case INDEX_op_shli_vec:
     case INDEX_op_shri_vec:
     case INDEX_op_s390_vuph_vec:
     case INDEX_op_s390_vupl_vec:
         return C_O1_I1(v, v);
---
But I'm outside of my comfort zone, so I'm not really sure what I'm doing...
(I was inspired by the comment "the instruction verll only allows
immediates up to 32 bits." from
https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg317099.html)
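
For anyone who wants to compare guest results against known-good values,
here is a minimal scalar reference sketch in plain C. It is not QEMU or
TCG code: the names rotl32, rotl_v4x32_ref and
chacha_quarter_round_selftest are hypothetical helpers made up for
illustration. It only models what a per-lane rotate-left by an immediate
is expected to compute on 32-bit lanes, and checks the ChaCha quarter
round (which uses rotations by 16, 12, 8 and 7) against the test vector
from RFC 7539, section 2.1.1. The assumption that a miscompiled vector
rotate would show up as a chacha20 failure is mine.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate one 32-bit lane left by n bits (n is taken modulo 32). */
static inline uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/*
 * Hypothetical reference model (not a TCG function): rotate each of
 * four 32-bit lanes left by the same immediate.
 */
void rotl_v4x32_ref(uint32_t dst[4], const uint32_t src[4], unsigned imm)
{
    for (int i = 0; i < 4; i++) {
        dst[i] = rotl32(src[i], imm);
    }
}

/* ChaCha quarter round as described in RFC 7539, section 2.1. */
void chacha_quarter_round(uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
{
    *a += *b; *d ^= *a; *d = rotl32(*d, 16);
    *c += *d; *b ^= *c; *b = rotl32(*b, 12);
    *a += *b; *d ^= *a; *d = rotl32(*d, 8);
    *c += *d; *b ^= *c; *b = rotl32(*b, 7);
}

/* Check the quarter round against the RFC 7539, section 2.1.1 vector. */
void chacha_quarter_round_selftest(void)
{
    uint32_t a = 0x11111111, b = 0x01020304;
    uint32_t c = 0x9b8d6f43, d = 0x01234567;

    chacha_quarter_round(&a, &b, &c, &d);

    assert(a == 0xea2a92f4);
    assert(b == 0xcb1cf8ce);
    assert(c == 0x4581472e);
    assert(d == 0x5881c4bb);
}
```

Calling chacha_quarter_round_selftest() from a small test program on a
known-good host confirms the model itself; the same values can then be
compared with what the guest computes under TCG.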