
From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH v2 2/3] tcg/aarch64: Use ADRP+ADD to compute target address
Date: Thu, 29 Jun 2017 09:36:47 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.0

On 06/29/2017 12:52 AM, Pranith Kumar wrote:
We use ADRP+ADD to compute the target address for goto_tb. This patch
introduces the NOP instruction which is used to align the above
instruction pair so that we can use one atomic instruction to patch
the destination offsets.

CC: Richard Henderson <address@hidden>
CC: Alex Bennée <address@hidden>
Signed-off-by: Pranith Kumar <address@hidden>
---
  accel/tcg/translate-all.c    |  2 +-
  tcg/aarch64/tcg-target.inc.c | 26 +++++++++++++++++++++-----
  2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index f6ad46b613..b6d122e087 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -522,7 +522,7 @@ static inline PageDesc *page_find(tb_page_addr_t index)
  #elif defined(__powerpc__)
  # define MAX_CODE_GEN_BUFFER_SIZE  (32u * 1024 * 1024)
  #elif defined(__aarch64__)
-# define MAX_CODE_GEN_BUFFER_SIZE  (128ul * 1024 * 1024)
+# define MAX_CODE_GEN_BUFFER_SIZE  (3ul * 1024 * 1024 * 1024)

The max is 2GB, because the 4GB range of ADRP is signed.
The end of the buffer must be able to address the beginning of the buffer.

  void aarch64_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr)
  {
      tcg_insn_unit *code_ptr = (tcg_insn_unit *)jmp_addr;
-    tcg_insn_unit *target = (tcg_insn_unit *)addr;
+    tcg_insn_unit adrp_insn = *code_ptr++;
+    tcg_insn_unit addi_insn = *code_ptr;
-    reloc_pc26_atomic(code_ptr, target);
-    flush_icache_range(jmp_addr, jmp_addr + 4);
+    ptrdiff_t offset = (addr >> 12) - (jmp_addr >> 12);
+
+    /* patch ADRP */
+    adrp_insn = deposit32(adrp_insn, 29, 2, offset & 0x3);
+    adrp_insn = deposit32(adrp_insn, 5, 19, offset >> 2);
+    /* patch ADDI */
+    addi_insn = deposit32(addi_insn, 10, 12, addr & 0xfff);
+    atomic_set((uint64_t *)jmp_addr, adrp_insn | ((uint64_t)addi_insn << 32));
+    flush_icache_range(jmp_addr, jmp_addr + 8);

(1) You don't need to load the ADRP and ADDI insns, because you know exactly what they are.

(2) You should check whether the branch is within 128MB and use a direct branch in that case; it will happen quite often. See the ppc64 ppc_tb_set_jmp_target for an example.

          /* actual branch destination will be patched by
+           aarch64_tb_set_jmp_target later, beware of retranslation */

We can actually drop the retranslation comment now; that's a cleanup that should be applied to all of the backends...

+        tcg_out_insn(s, 3406, ADRP, TCG_REG_TMP, 0);
+        tcg_out_insn(s, 3401, ADDI, TCG_TYPE_I64, TCG_REG_TMP, TCG_REG_TMP, 0);
+        tcg_out_callr(s, TCG_REG_TMP);

Don't use callr, use BR like you did for goto_long in the previous patch.


r~


