Re: [Lightning] More on work on lightning
From: Paolo Bonzini
Subject: Re: [Lightning] More on work on lightning
Date: Sun, 26 Sep 2010 09:22:37 +0200
2010/9/25 Paulo César Pereira de Andrade
<address@hidden>:
> How to get forward/context information?
> 2. Add a standard field to jit_state_t or jit_local_state to be filled
> by the programmer
That's possible. However, I think this does not belong in lightning
at all. lightning users could do inlining at a high level to ensure
big enough subroutines are generated and the prolog overhead is not
important. Register allocation could be done at a higher level too,
and so could constant propagation.
> printf("%d\n", JIT_R0);
> =>
> subi_l %sp %sp 12
> str_i %sp %r0
> movi_p %r0 "%d\n"
> pushr_l %r0
> calli @printf
> addi %sp %sp 16
>
> it could have allocated stack in prolog for all calls done in
> the function, and the above could have become:
> stxi_i 4 %sp %r0
> movi_p %r0 "%d\n"
> str_p %r0 %sp
> calli @printf
I share the concern as far as code size goes. But regarding code
performance, I think there's hardly a difference between the two.
Using the alignment padding more efficiently is a good idea, however.
> it is even more appealing when there are several sequential
> function calls (like in my interpreter that currently uses
> lightning mostly to glue calls to C functions).
Why not use the regparm(3) calling convention on i386 instead?
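
As a rough sketch of the regparm suggestion, assuming GCC or Clang: the
attribute below passes up to three integer arguments in registers
(EAX, EDX, ECX) on i386, so the caller does not need to push them. The
names FASTARGS, add3 and call_through_pointer are made up for
illustration and are not part of lightning; on other targets the macro
expands to nothing so the code still builds.

```c
#include <stdio.h>

/* regparm only has an effect on 32-bit x86 with GCC-compatible
   compilers; elsewhere FASTARGS (a hypothetical name) is empty. */
#if defined(__i386__) && defined(__GNUC__)
# define FASTARGS __attribute__((regparm(3)))
#else
# define FASTARGS
#endif

/* Hypothetical runtime helper that JIT-generated code would call.
   With regparm(3) its three arguments arrive in registers. */
FASTARGS int add3(int a, int b, int c)
{
    return a + b + c;
}

/* Any function pointer used to call the helper must carry the same
   attribute, otherwise caller and callee disagree on the convention. */
int call_through_pointer(void)
{
    int (FASTARGS *fn)(int, int, int) = add3;
    return fn(1, 2, 3);
}
```

Code emitted by the JIT would then have to load the argument registers
itself before the call instead of pushing, since the C compiler is not
generating that call site.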
> Another issue is things like:
> foo(0,0,0,0);
> =>
> prepare 4
> xorr_i %r0 %r0
> pushr_i %r0
> xorr_i %r0 %r0
> pushr_i %r0
> xorr_i %r0 %r0
> pushr_i %r0
> xorr_i %r0 %r0
> pushr_i %r0
> finish @foo
> addi_l %sp %sp 16
>
> but this requires significant extra information to understand
> what is going on.
You can do constant propagation at a higher level for this, but
again I don't think there's anything to worry about regarding
performance.
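
One way to picture constant propagation "at a higher level" is a small
front end that remembers which constant each register holds and skips
redundant loads. The sketch below is only an illustration under that
assumption: struct regstate, load_const, push_const and emit_foo_call
are hypothetical names, and the code prints pseudo-instructions in the
style of the quoted listings rather than calling lightning.

```c
#include <stdio.h>
#include <string.h>

enum { NREGS = 3 };

/* Hypothetical front-end state: which constant, if any, each virtual
   register is known to hold. Not part of lightning. */
struct regstate {
    int  known[NREGS];   /* nonzero if value[] is valid for that register */
    long value[NREGS];
    int  emitted;        /* number of pseudo-instructions printed so far */
};

static void emit(struct regstate *s, const char *text)
{
    puts(text);
    s->emitted++;
}

/* Load a constant, unless the register is already known to hold it. */
static void load_const(struct regstate *s, int reg, long c)
{
    char buf[64];

    if (s->known[reg] && s->value[reg] == c)
        return;
    snprintf(buf, sizeof buf, "movi_i %%r%d %ld", reg, c);
    emit(s, buf);
    s->known[reg] = 1;
    s->value[reg] = c;
}

/* Push a constant argument through the given register. */
static void push_const(struct regstate *s, int reg, long c)
{
    char buf[64];

    load_const(s, reg, c);
    snprintf(buf, sizeof buf, "pushr_i %%r%d", reg);
    emit(s, buf);
}

/* Argument setup and call for foo(0,0,0,0); returns the number of
   pseudo-instructions printed. */
int emit_foo_call(void)
{
    struct regstate s;
    int i;

    memset(&s, 0, sizeof s);
    for (i = 0; i < 4; i++)
        push_const(&s, 0, 0);
    emit(&s, "finish @foo");
    return s.emitted;
}
```

Here the zero is loaded once and pushed four times. A real front end
would also have to forget what it knows about a register whenever it is
overwritten, and about caller-save registers after each call.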
> Another somewhat unrelated comment is about the
> initial trigonometric functions using x87. Probably they
> are not (easily?) available on other cpus
Not at all, actually. And the inter-unit moves probably make them not
much faster than libc's sin, which you have to use anyway on
non-x86.
I'm sorry that I disagree on (almost) everything you wrote in this
mail, it's usually not the case. :)
Paolo