Re: [RFC PATCH] target/arm: use x86 intrinsics to implement AES instruct

qemu-arm

From:	Richard Henderson
Subject:	Re: [RFC PATCH] target/arm: use x86 intrinsics to implement AES instructions
Date:	Tue, 30 May 2023 10:21:47 -0700
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.11.0

On 5/30/23 09:58, Ard Biesheuvel wrote:

On Tue, 30 May 2023 at 18:43, Richard Henderson
<richard.henderson@linaro.org> wrote:


On 5/30/23 06:52, Ard Biesheuvel wrote:

+#ifdef __x86_64__
+    if (have_aes()) {
+        __m128i *d = (__m128i *)rd;
+
+        *d = decrypt ? _mm_aesdeclast_si128(rk.vec ^ st.vec, (__m128i){})
+                     : _mm_aesenclast_si128(rk.vec ^ st.vec, (__m128i){});


Do I correctly understand that the ARM xor is pre-shift

+        return;
+    }
+#endif
+
       /* xor state vector with round key */
       rk.l[0] ^= st.l[0];
       rk.l[1] ^= st.l[1];


(like so)

whereas the x86 xor is post-shift

void glue(helper_aesenclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
{
     int i;
     Reg st = *v;
     Reg rk = *s;

     for (i = 0; i < 8 << SHIFT; i++) {
         d->B(i) = rk.B(i) ^ (AES_sbox[st.B(AES_shifts[i & 15] + (i & ~15))]);
     }


(like so, from target/i386/ops_sse.h)?


Indeed. Using the primitive operations defined in the AES paper, we
basically have the following for n rounds of AES (for n in {10, 12,
14})

for (n-1 rounds) {
   AddRoundKey
   ShiftRows
   SubBytes
   MixColumns
}
AddRoundKey
ShiftRows
SubBytes
AddRoundKey

AddRoundKey is just XOR, but it is incorporated into the instructions
that combine a couple of these steps.

So on x86, we have

aesenc:
   ShiftRows
   SubBytes
   MixColumns
   AddRoundKey

aesenclast:
   ShiftRows
   SubBytes
   AddRoundKey

and on ARM we have

aese:
   AddRoundKey
   ShiftRows
   SubBytes

aesmc:
   MixColumns

What might help: could we do the reverse -- emulate the x86 aesdeclast
instruction with the aarch64 aesd instruction?


Help in what sense? To emulate the x86 instructions on an ARM host?


Well that too.  I meant help me understand the two primitives.

But yes, aesenclast can be implemented using aese in a similar way,
i.e., by passing a {0} vector as the round key into the instruction,
and performing the XOR explicitly using the real round key afterwards.
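
For reference, the two identities discussed above can be checked with a small portable C model. This is only a sketch, not QEMU code: it assumes a column-major 16-byte state, computes the S-box at startup instead of using a table, and covers the encrypt side only. All function names here (x86_aesenclast, arm_aese, aese_via_aesenclast, aesenclast_via_aese, aes_identities_hold) are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint8_t sbox[256];

/* Rotate an 8-bit value left by n bits (0 < n < 8). */
static uint8_t rotl8(uint8_t x, int n)
{
    return (uint8_t)((x << n) | (x >> (8 - n)));
}

/* Build the AES S-box: inverse in GF(2^8) followed by the affine map. */
static void init_sbox(void)
{
    uint8_t p = 1, q = 1;

    do {
        /* p *= 3 in GF(2^8) */
        p = (uint8_t)(p ^ (p << 1) ^ ((p & 0x80) ? 0x1b : 0));
        /* q /= 3, which keeps p * q == 1 */
        q ^= (uint8_t)(q << 1);
        q ^= (uint8_t)(q << 2);
        q ^= (uint8_t)(q << 4);
        q ^= (q & 0x80) ? 0x09 : 0;
        sbox[p] = (uint8_t)(q ^ rotl8(q, 1) ^ rotl8(q, 2) ^
                            rotl8(q, 3) ^ rotl8(q, 4) ^ 0x63);
    } while (p != 1);
    sbox[0] = 0x63; /* 0 has no inverse */
}

/* ShiftRows: source index for output byte i (column-major state). */
static int shift_src(int i)
{
    int row = i % 4, col = i / 4;
    return row + 4 * ((col + row) % 4);
}

/* Model of x86 AESENCLAST: ShiftRows, SubBytes, then AddRoundKey. */
static void x86_aesenclast(uint8_t d[16], const uint8_t st[16],
                           const uint8_t rk[16])
{
    uint8_t t[16];
    for (int i = 0; i < 16; i++) {
        t[i] = rk[i] ^ sbox[st[shift_src(i)]];
    }
    memcpy(d, t, 16);
}

/* Model of ARM AESE: AddRoundKey first, then ShiftRows and SubBytes. */
static void arm_aese(uint8_t d[16], const uint8_t st[16],
                     const uint8_t rk[16])
{
    uint8_t t[16];
    for (int i = 0; i < 16; i++) {
        int j = shift_src(i);
        t[i] = sbox[st[j] ^ rk[j]];
    }
    memcpy(d, t, 16);
}

/* AESE via AESENCLAST: XOR the key in up front, pass a zero round key. */
static void aese_via_aesenclast(uint8_t d[16], const uint8_t st[16],
                                const uint8_t rk[16])
{
    static const uint8_t zero[16] = { 0 };
    uint8_t x[16];
    for (int i = 0; i < 16; i++) {
        x[i] = st[i] ^ rk[i];
    }
    x86_aesenclast(d, x, zero);
}

/* AESENCLAST via AESE: zero round key, XOR the real key afterwards. */
static void aesenclast_via_aese(uint8_t d[16], const uint8_t st[16],
                                const uint8_t rk[16])
{
    static const uint8_t zero[16] = { 0 };
    uint8_t t[16];
    arm_aese(t, st, zero);
    for (int i = 0; i < 16; i++) {
        d[i] = t[i] ^ rk[i];
    }
}

/* Returns 1 if both identities hold for one sample state and key. */
static int aes_identities_hold(void)
{
    uint8_t st[16], rk[16], a[16], b[16];

    init_sbox();
    assert(sbox[0x00] == 0x63 && sbox[0x53] == 0xed);
    for (int i = 0; i < 16; i++) {
        st[i] = (uint8_t)(i * 17 + 3);    /* arbitrary sample data */
        rk[i] = (uint8_t)(i * 29 + 101);
    }
    arm_aese(a, st, rk);
    aese_via_aesenclast(b, st, rk);
    if (memcmp(a, b, 16) != 0) {
        return 0;
    }
    x86_aesenclast(a, st, rk);
    aesenclast_via_aese(b, st, rk);
    return memcmp(a, b, 16) == 0;
}
```

Calling aes_identities_hold() from a test harness compares each instruction model against its emulation through the other one. The index computed by shift_src() matches the AES_shifts[] lookup in the ops_sse.h loop quoted earlier, e.g. output byte 1 reads input byte 5.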


Excellent, thanks.


r~
