Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation


From: Alex Bennée
Subject: Re: [Qemu-devel] [RFC 00/20] Do away with TB retranslation
Date: Thu, 10 Sep 2015 19:55:23 +0100

Richard Henderson <address@hidden> writes:

> I've been looking at this problem off and on for the last week or so,
> prompted by the sparc performance work.  Although I haven't been able
> to get a proper sparc64 guest install working, I see the exact same
> problem with a mips guest.
>
<snip>
> In the past we've talked about getting rid of retranslation entirely.
> It's clever, but it certainly has its share of problems.  I gave it
> a go this weekend.
>
<snip>
> Thoughts on the approach?

I've only had a quick glance so far but I'm fairly familiar with the
concept from a previous life. I'll aim to do a full review later once
I've gotten through my MTTCG review backlog.

Anyway some quick points:

 * You can save data by only marking faulting instructions

Assuming that all asynchronous exceptions trigger at the end/prologue
of basic blocks, you only actually need to record the address of
potentially faulting instructions. In fact only a few backend
instructions will actually synchronously fault.

Of course this does have the downside of having to mark all those
instructions in the front end.
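
As a rough sketch of what such a sparse table could look like (the
FaultPoint type and fault_lookup() are made-up names for illustration,
not anything in QEMU or in this series), assuming entries are appended
in emission order so the table is already sorted by host offset:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record, not a QEMU type: one entry per host instruction
 * that may fault synchronously (typically guest memory accesses). */
typedef struct {
    uint32_t host_off;  /* offset of the host instruction within the block */
    uint64_t guest_pc;  /* guest PC of the instruction it was generated for */
} FaultPoint;

/* Look up the guest PC for a faulting host offset.
 * The table must be sorted by host_off. Returns 1 on a match, 0 if the
 * offset was never marked as a potential fault point. */
static int fault_lookup(const FaultPoint *tab, size_t n,
                        uint32_t host_off, uint64_t *guest_pc)
{
    size_t lo = 0, hi = n;

    assert(guest_pc != NULL);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tab[mid].host_off == host_off) {
            *guest_pc = tab[mid].guest_pc;
            return 1;
        }
        if (tab[mid].host_off < host_off) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return 0;
}
```

An unmarked offset returning 0 is deliberate here: it would indicate a
fault from an instruction the front end failed to mark.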

 * This method can also be used for additional rectification data

AIUI we currently ensure all load/stores are barriers and ensure the CPU
register file is updated before they occur. However if you wanted to you
could drop that requirement and mark the target-host register pair and
only fish it out when required on a fault.
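
To illustrate the idea only (GuestCPU, RegPair and rectify_regs() are
hypothetical names, and the register counts are arbitrary), the fault
path could copy back just the guest registers that were still live in
host registers at the faulting instruction:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NB_GUEST_REGS = 32, NB_HOST_REGS = 16 };  /* arbitrary sizes */

/* Hypothetical rectification record: guest register 'guest_reg' was
 * held in host register 'host_reg' when the instruction was emitted. */
typedef struct {
    uint8_t guest_reg;
    uint8_t host_reg;
} RegPair;

/* Toy guest register file standing in for the real CPU state. */
typedef struct {
    uint64_t regs[NB_GUEST_REGS];
} GuestCPU;

/* On a fault, write the saved host register values back into the guest
 * register file. Registers without a pair are left untouched, since
 * they were already up to date in memory. */
static void rectify_regs(GuestCPU *cpu, const uint64_t *host_regs,
                         const RegPair *pairs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(pairs[i].guest_reg < NB_GUEST_REGS);
        assert(pairs[i].host_reg < NB_HOST_REGS);
        cpu->regs[pairs[i].guest_reg] = host_regs[pairs[i].host_reg];
    }
}
```

The cost moves from every load/store to the (rare) fault path, at the
price of carrying the pairs alongside each marked instruction.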

 * Test suites are essential if you're going to get clever

Last time I went through this I built a SPARC test suite to cover all
faulting instructions in all the various addressing modes. It flushed
out a lot of bugs.

I appreciate that QEMU's aims may be a bit less demanding: it may not need
to be fully complete, and we can fix up problems as we hit them in the field.
However consider at least a framework of a testcase for checking PC
rectification as it will help in validating those fixes.
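
Something as small as a table-driven checker would do as a starting
point. The sketch below is an assumption about shape only: RestoreFn,
PCCase and run_pc_cases() are invented names, and toy_restore() is a
stand-in for whatever the real restore path ends up being.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Function under test: maps a host fault offset to a guest PC.
 * Returns 1 on success, 0 if the offset cannot be resolved. */
typedef int (*RestoreFn)(uint32_t host_off, uint64_t *guest_pc);

/* One expected result for a given fault offset. */
typedef struct {
    const char *name;
    uint32_t host_off;
    int expect_ok;
    uint64_t expect_pc;  /* only checked when expect_ok is 1 */
} PCCase;

/* Run every case and return the number of failures. */
static size_t run_pc_cases(RestoreFn restore, const PCCase *cases, size_t n)
{
    size_t failures = 0;

    assert(restore != NULL);
    for (size_t i = 0; i < n; i++) {
        uint64_t pc = 0;
        int ok = restore(cases[i].host_off, &pc);

        if (ok != cases[i].expect_ok || (ok && pc != cases[i].expect_pc)) {
            fprintf(stderr, "FAIL: %s\n", cases[i].name);
            failures++;
        }
    }
    return failures;
}

/* Toy stand-in for the real restore path: pretends a block of four
 * 4-byte guest instructions starting at 0x1000 expands to exactly
 * 8 bytes of host code per instruction. */
static int toy_restore(uint32_t host_off, uint64_t *guest_pc)
{
    if (host_off >= 4 * 8) {
        return 0;
    }
    *guest_pc = 0x1000 + (uint64_t)(host_off / 8) * 4;
    return 1;
}
```

New regressions found in the field then become one more line in the
case table.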

 * Delay slot/nPCs are a pain

Faults in delay slots are a pain to get right, although maybe QEMU's
architecture makes it a little easier to do. Fortunately for me I no
longer have to worry too hard about these architectures, good luck ;-)
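
For what it's worth, the simplest case looks like the sketch below. It
assumes fixed 4-byte instructions and an unconditionally taken branch;
PCState and both helpers are hypothetical names, and conditional or
annulled branches need more state than this.

```c
#include <assert.h>
#include <stdint.h>

/* The (PC, nPC) pair that has to be restored together on a fault. */
typedef struct {
    uint64_t pc;
    uint64_t npc;
} PCState;

/* State to record for the instruction at 'pc'. Outside a delay slot
 * nPC is simply pc + 4; in the delay slot of a taken branch it is the
 * branch target, so restoring PC alone would resume at the wrong place. */
static PCState pc_state_for(uint64_t pc, int in_delay_slot,
                            uint64_t branch_target)
{
    PCState s;

    assert((pc & 3) == 0);
    s.pc = pc;
    s.npc = in_delay_slot ? branch_target : pc + 4;
    return s;
}

/* Step past the current instruction: PC takes the old nPC and nPC
 * moves on by one instruction. */
static PCState pc_advance(PCState s)
{
    PCState next = { s.npc, s.npc + 4 };
    return next;
}
```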

Anyway anything that gets rid of the re-translation cost I'm broadly
supportive of. I shall review the code later!

>
>
> r~
>
>
> Richard Henderson (20):
>   tcg: Rename debug_insn_start to insn_start
>   target-*: Unconditionally emit tcg_gen_insn_start
>   tcg: Allow extra data to be attached to insn_start
>   target-arm: Add condexec state to insn_start
>   target-i386: Add cc_op state to insn_start
>   target-mips: Add delayed branch state to insn_start
>   target-s390x: Add cc_op state to insn_start
>   target-sh4: Add flags state to insn_start
>   target-cris: Mirror gen_opc_pc into insn_start
>   target-sparc: Tidy gen_branch_a interface
>   target-sparc: Split out gen_branch_n
>   target-sparc: Remove gen_opc_jump_pc
>   target-sparc: Add npc state to insn_start
>   tcg: Merge cpu_gen_code into tb_gen_code
>   target-*: Drop cpu_gen_code define
>   tcg: Add TCG_MAX_INSNS
>   tcg: Pass data argument to restore_state_to_opc
>   tcg: Save insn data and use it in cpu_restore_state_from_tb
>   tcg: Remove gen_intermediate_code_pc
>   tcg: Remove tcg_gen_code_search_pc
>
>  include/exec/exec-all.h       |   6 +-
>  target-alpha/cpu.h            |   1 -
>  target-alpha/translate.c      |  55 +++-------
>  target-arm/cpu.h              |   2 +-
>  target-arm/translate-a64.c    |  39 ++-----
>  target-arm/translate.c        |  75 ++++---------
>  target-arm/translate.h        |   8 +-
>  target-cris/cpu.h             |   1 -
>  target-cris/translate.c       |  64 +++---------
>  target-cris/translate_v10.c   |   3 -
>  target-i386/cpu.h             |   2 +-
>  target-i386/translate.c       |  86 ++++-----------
>  target-lm32/cpu.h             |   1 -
>  target-lm32/translate.c       |  55 ++--------
>  target-m68k/cpu.h             |   1 -
>  target-m68k/translate.c       |  64 +++---------
>  target-microblaze/cpu.h       |   1 -
>  target-microblaze/translate.c |  56 +++-------
>  target-mips/cpu.h             |   2 +-
>  target-mips/translate.c       |  73 ++++---------
>  target-moxie/cpu.h            |   1 -
>  target-moxie/translate.c      |  65 ++++--------
>  target-openrisc/cpu.h         |   1 -
>  target-openrisc/translate.c   |  54 ++--------
>  target-ppc/cpu.h              |   1 -
>  target-ppc/translate.c        |  56 +++-------
>  target-s390x/cpu.h            |   2 +-
>  target-s390x/translate.c      |  61 +++--------
>  target-sh4/cpu.h              |   2 +-
>  target-sh4/translate.c        |  71 ++++---------
>  target-sparc/cpu.h            |   2 +-
>  target-sparc/translate.c      | 189 ++++++++++++++-------------------
>  target-tricore/translate.c    |  53 ++++------
>  target-unicore32/translate.c  |  57 +++-------
>  target-xtensa/cpu.h           |   1 -
>  target-xtensa/translate.c     |  52 ++-------
>  tcg/tcg-op.h                  |  52 +++++++--
>  tcg/tcg-opc.h                 |   4 +-
>  tcg/tcg.c                     |  96 ++++++++---------
>  tcg/tcg.h                     |  14 ++-
>  tci.c                         |   9 --
>  translate-all.c               | 237 ++++++++++++++++++++++++------------------
>  42 files changed, 578 insertions(+), 1097 deletions(-)

-- 
Alex Bennée


