Re: [Lightning] jit_calli requires jit_prolog
From: Paulo César Pereira de Andrade
Subject: Re: [Lightning] jit_calli requires jit_prolog
Date: Tue, 19 Aug 2014 13:50:23 -0300
2014-08-19 11:26 GMT-03:00 Paul Cercueil <address@hidden>:
> Hi Paulo,
Hi Paul,
> I think you misunderstood me; I'm not even trying to execute this code,
> Lightning segfaults when *generating* the code.
>
> Basically, if I do the following:
> _jit = jit_new_state();
> jit_prepare();
> jit_pushargi(0);
> jit_finishi(c_function);
Yes. Support for code outside of a prolog/epilog is very limited:
jit_finish* will not work, but jit_call* will, that is, calling a function
that does not receive arguments. This is a general limitation, there
to provide the same interface on all backends: it *may* be
required to pass arguments on the stack, so the assertion is there
to ensure there is a stack frame to work on.
Usage for code outside a prolog/epilog in lightning 2.x was expected
to be a single "jmpi main"-like first instruction, or a single jump to a
"calli abort"; almost everything else may have backend-specific issues,
e.g. some backends do not have a division instruction or do not have
a carry flag, and lightning then synthesizes the operation with function
calls or temporary registers, which may need a save/load.
> Then Lightning will segfault when calling "jit_prepare". When enabling
> assertions, I get:
> lightning.c: 1135: _jit_prepare: Assertion '_jit->comp->function' failed.
>
> The same code compiles correctly if I generate a function context using
> jit_prolog and jit_epilog.
>
> Now, the thing is that I don't need prologs and epilogs. Let's take this
> example:
> [Block A] -> [Block B] -> [Block C] -> [Block D]
>
> To each block of code corresponds its own Lightning jit state. Switches
> between blocks are not function calls, but GOTOs (jump register or jump
> immediate).
lightning 2.x does not provide proper support for this usage. It can be
done by hacking the internals: for example, call jit_allocai to get the stack
offset at which to save registers, learn the jit_state_t internals, set the
proper "_jitc->function->regoff[regno]" stack offset in the other jit_state_t
objects that generate code for Block B and C, remove the assertions... You
would also need to create special labels with "jit_indirect()" to make
lightning understand that code from anywhere may jump to that label. Well,
not trivial.
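One way to stay within what lightning 2.x supports is to keep every block a
complete function and chain the blocks from C instead of jumping between them
directly. Below is a minimal sketch of such a dispatch loop in plain C.
Ordinary C functions stand in for JIT-emitted blocks, and struct cpu_state,
block_fn, record and run_blocks are hypothetical names used for illustration
only; none of them are part of lightning.

```c
#include <assert.h>
#include <stddef.h>

enum { TRACE_MAX = 8 };

/* Hypothetical emulator state shared by all blocks. */
struct cpu_state {
    char trace[TRACE_MAX]; /* records the order in which blocks ran */
    int  ntrace;
};

/* A block returns the next block to run. C cannot typedef a function
   that returns its own type, so the pointer is wrapped in a struct. */
struct block;
typedef struct block (*block_fn)(struct cpu_state *);
struct block { block_fn fn; };

static void record(struct cpu_state *s, char id)
{
    assert(s->ntrace < TRACE_MAX);
    s->trace[s->ntrace++] = id;
}

/* Stand-ins for JIT-generated blocks A..D; D ends the chain. */
static struct block block_d(struct cpu_state *s)
{
    record(s, 'D');
    return (struct block){ NULL };
}

static struct block block_c(struct cpu_state *s)
{
    record(s, 'C');
    return (struct block){ block_d };
}

static struct block block_b(struct cpu_state *s)
{
    record(s, 'B');
    return (struct block){ block_c };
}

static struct block block_a(struct cpu_state *s)
{
    record(s, 'A');
    return (struct block){ block_b };
}

/* Dispatcher: call each block as a normal function until one returns
   a NULL successor. Returns the number of blocks executed. */
static int run_blocks(struct cpu_state *s, block_fn entry)
{
    struct block next = { entry };
    int executed = 0;

    while (next.fn != NULL) {
        next = next.fn(s);
        executed++;
    }
    return executed;
}
```

Calling run_blocks(&state, block_a) on a zero-initialized state runs the four
blocks in order. Every block enters and leaves through a normal call, so the
stack pointer is back at its starting value after each one; the price is one
indirect call and return per block, which is worth benchmarking against direct
chaining.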
> The first compiled block of code, Block A, is meant to save every single
> register used by Lightning to the stack (much like jit_prolog, but for the
> whole register set). The last one, Block D, will dequeue all the registers
> previously saved on the stack, and return to the calling function.
>
> This means that all the blocks in between, blocks B and C, can use freely
> the whole register set provided by Lightning, without taking care of
> saving/loading them on the stack. Doing so would be a waste of RAM and CPU
> time, especially since I have thousands of intermediate blocks.
>
> Those intermediate blocks can be correctly generated by Lightning if
> jit_prolog and jit_epilog are not used, provided that jit_calli/jit_finishi
> are not used... otherwise Lightning crashes when generating.
lightning 2.x can be updated to support this, but it would not be trivial,
because it would need to reserve 1 to 3 registers for scratch on register-rich
backends, or create stack space dynamically to save/restore
registers in "atomic" operations. That is, some backends would
need something like

    sub sp 8
    st 0 sp tmp
    <<actual jit_xyz implementation, that uses tmp>>
    ld tmp sp 0
    add sp 8

because lightning does not know whether tmp is live.
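To illustrate why the save/restore has to wrap the whole operation, here is a
small simulation in plain C. It is not lightning code: struct machine,
synthesized_op and atomic_op are hypothetical names, and the 8-byte slot simply
mirrors the pseudo-assembly above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { STACK_BYTES = 64, SLOT_BYTES = 8 };

/* Toy machine: one destination register, one scratch register and a
   small stack that grows downward. */
struct machine {
    int64_t       r0;
    int64_t       tmp;
    unsigned char stack[STACK_BYTES];
    size_t        sp; /* byte offset into stack */
};

static void machine_init(struct machine *m)
{
    memset(m, 0, sizeof *m);
    m->sp = STACK_BYTES; /* empty stack */
}

/* Stand-in for an operation the backend has to synthesize with a
   scratch register: r0 = r0 * 3, clobbering tmp on the way. */
static void synthesized_op(struct machine *m)
{
    m->tmp = m->r0 + m->r0;
    m->r0  = m->tmp + m->r0;
}

/* The same operation wrapped so that tmp survives it. */
static void atomic_op(struct machine *m)
{
    assert(m->sp >= SLOT_BYTES);
    m->sp -= SLOT_BYTES;                             /* sub sp 8    */
    memcpy(&m->stack[m->sp], &m->tmp, sizeof m->tmp); /* st 0 sp tmp */
    synthesized_op(m);                               /* uses tmp    */
    memcpy(&m->tmp, &m->stack[m->sp], sizeof m->tmp); /* ld tmp sp 0 */
    m->sp += SLOT_BYTES;                             /* add sp 8    */
}
```

With tmp holding a live value, atomic_op leaves both tmp and sp as they were,
while a bare synthesized_op overwrites tmp. The wrapper costs two memory
accesses and two stack adjustments per operation, which is why reserving
scratch registers is the cheaper option where the backend has enough of them.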
For the moment, I suggest you use my initial work on lightning
on GitHub, which happens to have a MIPS port.
> -Paul
>
> Le 19/08/2014 14:54, Paulo César Pereira de Andrade a écrit :
>
>> 2014-08-18 10:59 GMT-03:00 Paul Cercueil <address@hidden>:
>>>
>>> Hi,
>>
>>
>> Hi,
>>
>>> I am currently developing a simple MIPS-to-everything dynamic recompiler
>>> based on Lightning. The recompiler generates blocks of machine code that
>>> are meant to be chained: each block jumps to the following one. This is
>>> what I want to do (MIPS code):
>>>
>>> # lightrec_emit_end_of_block:emitter.c:37
>>> 0x77d97060 move a0,s6
>>> 0x77d97064 lui t9,0x77d7
>>> 0x77d97068 ori t9,t9,0x2198
>>> 0x77d9706c jalr t9
>>> 0x77d97070 nop
>>> 0x77d97074 jr v0
>>> 0x77d97078 nop
>>
>>
>> This code should never crash, but you are probably having trouble
>> with save/reload elsewhere.
>> I suggest making sure to test with lightning built with
>> --enable-assertions, so that it tells you when/where there is a fault.
>>
>> lightning 2.x has a simple register allocator, and requires
>> entering/leaving the jit as a function call, so that it knows and can
>> handle "bounds" of live registers.
>>
>>> Basically, jump to a C function that will return the address of the next
>>> block of code (optionally compile it beforehand), then jump to the
>>> address returned.
>>>
>>> The problem is that Lightning will segfault when generating this code, as
>>> soon as jit_calli() is executed. Apparently, jit_prolog() and
>>> jit_epilog() are required for jit_calli() to work.
>>
>>
>> For mips this should work if using only JIT_Rn, and not using overly
>> complex constructs with very long live ranges for JIT_Rn (or it could
>> allocate a callee save register as a temporary). It probably is crashing
>> because some JIT_Vn is being changed, and without a prolog/epilog it
>> cannot save (in prolog) and reload (in epilog) the callee save register.
>>
>>> However, if I add jit_prolog() and jit_epilog(), Lightning will generate
>>> some code to save/restore registers, while I don't want that. This is
>>> especially bad since the epilog will never be reached (as the code jumps
>>> to another block), so the stack pointer will decrease at each prolog but
>>> never increase.
>>
>>
>> I suggest you do some benchmarking to make sure it really is
>> better to not use a function call sequence. If no callee save registers
>> are changed, it would be basically a sub/dec of the sp register.
>> But it would be better if you could construct the jit to avoid jumps
>> for every small chunk; the major point of jit is avoiding branches
>> and jumps, usually at the cost of some code duplication.
>>
>>> What I would need, are either variants of jit_prolog() and jit_epilog()
>>> which don't save/restore the registers on the stack, or alternatively to
>>> have jit_calli() not require jit_prolog().
>>>
>>> Do you have suggestions on how to fix this issue?
>>
>>
>> There is a lightning 1.2 compatible mips port at
>> https://github.com/pcpa/lightning that probably will work for the need
>> you describe; main difference should be right to left argument
>> pushing (to match lightning 1.2 api).
>>
>>> Thanks,
>>>
>>> -Paul
>>
>>
>> Paulo
>>
>