qemu-devel

Re: [Qemu-devel] Re: [PATCH] Updated Sparc support


From: Fabrice Bellard
Subject: Re: [Qemu-devel] Re: [PATCH] Updated Sparc support
Date: Wed, 14 May 2003 13:48:32 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020828

Rusty Russell wrote:
In message <address@hidden> you write:

I also plan to add direct block chaining. I will try to make it portable by using the 'goto *' gcc extension, but I don't know yet if it will work on every CPU. The direct block chaining will generate something like:

 'goto *addr'

at the end of some translated blocks to jump either to the CPU core or directly to the next translated block. 'addr' will be a global 'void *' variable. Since no code will be patched to change block chaining, it will simplify the instruction cache invalidation issues and the threading issues.
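For illustration only, here is a minimal sketch of that idea using the gcc "labels as values" extension (it needs gcc or a compatible compiler). It is not qemu code: the two toy blocks, the `pc` variable, the `chain_addr` slot and the `run_blocks` function are all made up for the example. Block 0 ends with `goto *chain_addr`, and changing what is stored in the slot decides whether control returns to the core loop or goes straight to block 1, without patching any code.

```c
#include <assert.h>

/* The jump target read at the end of block 0. Assumed name; it plays the
   role of the global 'void *' variable 'addr' mentioned above. */
static void *chain_addr;

/* Runs two toy "translated blocks" until *counter reaches limit.
   Returns how many times control passed through the core loop.
   If chain is non-zero, block 0 jumps directly to block 1. */
int run_blocks(int chain, int limit, int *counter)
{
    int pc = 0;            /* which toy block the core should run next */
    int core_entries = 0;

    assert(counter != 0 && limit >= 0);
    *counter = 0;

    /* Chaining is just a store to a data slot; no code is modified. */
    chain_addr = chain ? &&block1 : &&core;

core:
    core_entries++;
    if (*counter >= limit)
        return core_entries;
    if (pc == 0)
        goto block0;
    goto block1;

block0:
    *counter += 1;
    pc = 1;
    goto *chain_addr;      /* either back to the core or on to block 1 */

block1:
    *counter += 2;
    pc = 0;
    goto core;             /* this toy block is never chained */
}
```

Calling `run_blocks(0, 30, &c)` and `run_blocks(1, 30, &c)` does the same work on `c`, but the chained run goes through the core loop about half as often.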


Hmm, I had a more ambitious idea, and that was to keep simple stats on
which block last followed each block: if it goes to the same block
more than N times in a row, coalesce/chain them.
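
A rough sketch of what that bookkeeping might look like, purely as an illustration (no such code exists in the thread). The `Block` structure, `CHAIN_THRESHOLD` and `note_transition` are hypothetical names, and dropping the chain when a different successor shows up is an assumption of this sketch, not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>

#define CHAIN_THRESHOLD 3   /* the "N" above; the value is arbitrary */

typedef struct Block {
    struct Block *last_next;  /* block that followed this one last time */
    unsigned streak;          /* how many times in a row it followed */
    struct Block *chained;    /* non-NULL once the two are chained */
} Block;

/* Record that 'next' ran right after 'prev'.
   Returns 1 if this call caused the two blocks to be chained. */
int note_transition(Block *prev, Block *next)
{
    assert(prev != NULL && next != NULL);

    if (prev->last_next == next) {
        prev->streak++;
    } else {
        /* Different successor: restart the count and drop any chain. */
        prev->last_next = next;
        prev->streak = 1;
        prev->chained = NULL;
    }

    /* Chain only after more than N consecutive identical successors. */
    if (prev->chained == NULL && prev->streak > CHAIN_THRESHOLD) {
        prev->chained = next;
        return 1;
    }
    return 0;
}
```

With N = 3, the fourth consecutive transition from one block to the same successor is the one that triggers chaining.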

As blocks get longer, you have more opportunities for register
lifetime analysis, which could eliminate redundant stores to registers
in particular.

I haven't got actual code, so I haven't mentioned it before...

Thoughts?

That could be useful for avoiding some condition code computations. Currently it is not possible to do more, because qemu has no generic IR and I don't think I will have the time to add one. Julian Seward (of the valgrind project) is thinking about adding a more generic IR to valgrind to allow cross debugging, so it might be interesting for valgrind.

BUT, I have a much simpler approach "a la FX!32", which needs very few modifications to qemu:

You run your executable a first time to record statistics. Then you run a special tool, 'qemuopt', which uses gcc to generate, ahead of time, a dynamic library containing the host CPU code for the most frequently used basic block chains.

'qemuopt' is very easy to write: I noticed that gcc optimizes 'static inline' local functions very well. So you just have to generate a C source file containing approximately:

void genfunc(CPUX86State *env)
{
    uint32_t T0, EAX, EBX, ...;

    EAX = env->regs[R_EAX];
    EBX = env->regs[R_EBX];

#define OPPROTO static inline
#include "op-i386.c"

    op_movl_T0_EAX();
    op_movl_EBX_T0();

    env->regs[R_EAX] = EAX;
    env->regs[R_EBX] = EBX;
}

Then gcc does all the hard work for us :-)
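
For readers without the qemu sources at hand, here is a self-contained toy version of the same pattern. It is only a sketch under several assumptions: `ToyCPUState`, `Locals` and the two stand-in ops are invented for the example and replace `CPUX86State` and the contents of op-i386.c, and the ops take a pointer to the locals instead of being included into the function body.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins for the guest CPU state and register indices. */
enum { R_EAX, R_EBX, NB_REGS };
typedef struct { uint32_t regs[NB_REGS]; } ToyCPUState;

/* Guest registers and the temporary, kept in host locals. */
typedef struct { uint32_t T0, EAX, EBX; } Locals;

/* Stand-ins for two micro-operations; names follow the sketch above. */
static inline void op_movl_T0_EAX(Locals *l) { l->T0 = l->EAX; }
static inline void op_movl_EBX_T0(Locals *l) { l->EBX = l->T0; }

/* One generated function per hot chain of basic blocks. */
void genfunc(ToyCPUState *env)
{
    Locals l;

    assert(env != 0);

    /* Load the guest registers once on entry. */
    l.T0 = 0;
    l.EAX = env->regs[R_EAX];
    l.EBX = env->regs[R_EBX];

    /* The recorded chain, expressed as a sequence of op calls. */
    op_movl_T0_EAX(&l);
    op_movl_EBX_T0(&l);

    /* Write the guest registers back once on exit. */
    env->regs[R_EAX] = l.EAX;
    env->regs[R_EBX] = l.EBX;
}
```

Because the ops are small 'static inline' functions and all state lives in locals, an optimizing compiler is free to inline the calls and drop the intermediate copy through T0; how far it goes depends on the compiler and flags.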

Fabrice.




