Re: [Lightning] Memory Allocation on Mac OS X
From: Paulo César Pereira de Andrade
Subject: Re: [Lightning] Memory Allocation on Mac OS X
Date: Tue, 28 Sep 2010 14:52:56 -0300
On 28 September 2010 03:49, Paolo Bonzini <address@hidden> wrote:
> On 09/28/2010 06:52 AM, Paulo César Pereira de Andrade wrote:
>>
>> 2. Knowing the range from two jit_prolog() calls, it can figure the range
>> of a function. Could even know about dead code, but probably better to not
>> try to be too much smart about jumps and jump targets, just slightly smart
>> :-)
>
> I'd rather leave maximum flexibility to the user of lightning. For example
> in GNU Smalltalk I'm not using the stack at all. There is a trampoline
> using jit_prolog, but all compiled methods have their entry point after
> jit_prolog's expansion (they jump from method to method storing the return
> addresses in heap-allocated frames). This assumes that jit_prolog emits the
> same code for all of its occurrences.
This is basically the same approach I am using. Every VM function starts
at some point after jit_prolog, and calling a function saves the return
address in the heap stack. There is one stack per thread; the stack is
relocatable and GC-inspectable, and every entry has a type tag.
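
A minimal sketch of such a heap stack, in plain C and independent of lightning. All names here (vm_tag, vm_entry, vm_stack and the helper functions) are hypothetical and only illustrate the idea: entries carry a type tag so a collector can tell objects from return addresses, and the top is kept as an index so the stack can be moved by realloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical type tags; a real VM would have many more. */
enum vm_tag { VM_TAG_INT, VM_TAG_OBJECT, VM_TAG_RETURN };

struct vm_entry {
    enum vm_tag tag;
    union {
        intptr_t  i;    /* VM_TAG_INT */
        void     *obj;  /* VM_TAG_OBJECT: what a collector would trace */
        void     *ret;  /* VM_TAG_RETURN: an address inside jit code */
    } v;
};

struct vm_stack {
    struct vm_entry *base;  /* may move when the stack grows */
    size_t           top;   /* an index, so it survives relocation */
    size_t           cap;
};

static int vm_stack_init(struct vm_stack *s, size_t cap)
{
    s->base = malloc(cap * sizeof *s->base);
    s->top = 0;
    s->cap = s->base ? cap : 0;
    return s->base ? 0 : -1;
}

static int vm_stack_push(struct vm_stack *s, struct vm_entry e)
{
    if (s->top == s->cap) {
        /* Grow by doubling; realloc may relocate the whole stack. */
        size_t ncap = s->cap ? s->cap * 2 : 8;
        struct vm_entry *p = realloc(s->base, ncap * sizeof *p);
        if (p == NULL)
            return -1;
        s->base = p;
        s->cap = ncap;
    }
    s->base[s->top++] = e;
    return 0;
}

static struct vm_entry vm_stack_pop(struct vm_stack *s)
{
    assert(s->top > 0);
    return s->base[--s->top];
}

/* Count the entries a collector would have to trace. */
static size_t vm_stack_count_objects(const struct vm_stack *s)
{
    size_t n = 0;
    for (size_t i = 0; i < s->top; i++)
        if (s->base[i].tag == VM_TAG_OBJECT)
            n++;
    return n;
}
```

Generated code would push a VM_TAG_RETURN entry before jumping to the callee and pop it to find where to jump back.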
>> 3. Most important, this linked list should have information about non
>> relative labels, so that it could even reallocate the buffer and adjust
>> offsets. This is tricky, because to have it to work transparently, it would
>> need to modify the api, that is, instead of having a label as a
>> "jit_insn *", it would need something more like:
>>     jit_movi_p(&my_label, r0, pointer);
>>     jit_bner_i(&my_other_label, r0, 0);
>> otherwise, if it resizes the buffer, with either allocate a new one and
>> copy contents, or try to extend with realloc or mremap, it may need to
>> adjust "pending" labels.
>
> I'm just throwing away the compiled code and restarting from scratch
> currently if I detect an overflow.
My code generation is currently done in a single buffer, and information is
thrown away once a function is finished. That should be changed, or at least
enough information should be kept to either restart or repatch, because there
is a significant number of jit_patch_movi calls, usually in code like:
    return_address = jit_movi(JIT_R0, jit_forward());
    jit_pusharg_p(JIT_R0);
    ...
    jit_finish(some_vm_function);
    jit_patch_movi(return_address);
    jit_retval(JIT_R0);
    jit_jmpr(JIT_R0);
This is used to call a VM function and store the return address in the
heap stack; the returned value may also be a switch entry, a return from
an exception handler, etc. The pseudocode above is only meant to give an
idea of how jit_patch_movi and jit_jmpr are used.
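
To make the "keep information for repatch" idea concrete, here is a small sketch in plain C that does not use lightning at all. The names code_buf, cb_emit, cb_movi_forward and cb_patch_movi are hypothetical. The point is that a pending patch is remembered as an offset into the buffer rather than as a pointer, so it stays valid if the buffer is moved by realloc. The only real encoding used is the i386 "mov $imm32,%eax" opcode 0xB8; the immediate is written in host byte order, which matches x86 on a little-endian host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct code_buf {
    uint8_t *base;  /* may move when the buffer grows */
    size_t   len;
    size_t   cap;
};

/* Append raw bytes, growing (and possibly relocating) the buffer. */
static int cb_emit(struct code_buf *cb, const void *bytes, size_t n)
{
    if (cb->len + n > cb->cap) {
        size_t ncap = cb->cap ? cb->cap : 16;
        while (ncap < cb->len + n)
            ncap *= 2;
        uint8_t *p = realloc(cb->base, ncap);
        if (p == NULL)
            return -1;
        cb->base = p;
        cb->cap = ncap;
    }
    memcpy(cb->base + cb->len, bytes, n);
    cb->len += n;
    return 0;
}

/* Emit "mov $0,%eax" and return the OFFSET of its 32-bit immediate,
 * or (size_t)-1 on allocation failure. */
static size_t cb_movi_forward(struct code_buf *cb)
{
    static const uint8_t insn[5] = { 0xB8, 0, 0, 0, 0 };
    if (cb_emit(cb, insn, sizeof insn) != 0)
        return (size_t)-1;
    return cb->len - 4;
}

/* Fill in the immediate once the target is known. */
static void cb_patch_movi(struct code_buf *cb, size_t imm_off, uint32_t value)
{
    assert(imm_off + 4 <= cb->len);
    memcpy(cb->base + imm_off, &value, 4);
}
```

In this sketch the patched value is a buffer-relative offset. Real code needs an absolute address, so after a relocation every recorded site would have to be rewritten with the new base plus the target offset, which is exactly the information that is lost today when a function is finished.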
>> Just some ideas... This would also make it easier/safer to use the change
>> I did to adjust the stack alignment in i386/x86_64 by patching the
>> immediate to adjust %esp in the jit_prolog call, as it could figure out
>> what routine it did jump to, but forward jumps to another function would
>> still be tricky; the problem is that if it jumps to another function, and
>> call another from there, if it patches the stack adjustment due to building
>> a large frame, it will patch the adjustment of the wrong function, same if
>> it calls jit_allocai to have some %ebp relative buffer, like is done for
>> x87/sse conversion (but this one is mostly for testing, maybe should always
>> allocate the first 8 stack bytes for it, or just push/pop always).
>
> Yes, these are the problems I was hinting at. The root idea is that
> lightning _is_ an assembler, so it is hard to define what the user can and
> cannot do. The instances of "undefined behavior" should be very limited,
> and JIT compilers can use many tricks that lightning should support.
The logic of adjusting the stack only once could be offered as an
alternative somehow, and I believe it should be the default, as it will do
the right thing 95+% of the time. Supporting jumps from function to function
is like gcc supporting "goto *label;" with the label being in a different
function... The "prettier" assembly dump I mentioned is about avoiding too
many sequences like:
    sub $4,%esp
    push %eax
    xor %eax,%eax
    push %eax
    mov <<string>>,%eax
    push %eax
    call <<printf>>
    add $16,%esp
dummy_label:
    sub $4,%esp
    push %eax
    xor %eax,%eax
    push %eax
    mov <<string>>,%eax
    push %eax
    call <<printf>>
    add $16,%esp
and similar. The new approach becomes more like:
    mov %eax,8(%esp)
    xor %eax,%eax
    mov %eax,4(%esp)
    mov <<string>>,%eax
    mov %eax,(%esp)
    call <<printf>>
dummy_label:
    mov %eax,8(%esp)
    xor %eax,%eax
    mov %eax,4(%esp)
    mov <<string>>,%eax
    mov %eax,(%esp)
    call <<printf>>
If the jump is to the same function, that is, under the effect of the same
jit_prolog call, there are no problems.
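
As a rough illustration of "adjusting the stack only once", the sketch below tracks the largest outgoing argument area seen while a function is being generated and then patches a single "sub $imm32,%esp" in the prolog. It is plain C, not lightning code; frame_state, frame_begin, frame_note_call and frame_finish are hypothetical names, and it assumes 4-byte arguments, 16-byte alignment, and that %esp is already aligned when the patched instruction runs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_ALIGN 16u  /* assumed alignment at call sites */
#define WORD_SIZE    4u  /* i386 argument slot size */

struct frame_state {
    uint8_t *prolog_imm;     /* imm32 of the prolog's "sub $imm32,%esp" */
    uint32_t max_arg_bytes;  /* largest outgoing argument area so far */
};

/* Called when the prolog is emitted. */
static void frame_begin(struct frame_state *fs, uint8_t *prolog_imm)
{
    fs->prolog_imm = prolog_imm;
    fs->max_arg_bytes = 0;
}

/* Called for every call emitted under this prolog. */
static void frame_note_call(struct frame_state *fs, unsigned nargs)
{
    uint32_t bytes = nargs * WORD_SIZE;
    if (bytes > fs->max_arg_bytes)
        fs->max_arg_bytes = bytes;
}

/* Round the argument area up to the alignment and patch the prolog. */
static uint32_t frame_finish(struct frame_state *fs)
{
    uint32_t size = (fs->max_arg_bytes + STACK_ALIGN - 1) & ~(STACK_ALIGN - 1);
    assert(fs->prolog_imm != NULL);
    memcpy(fs->prolog_imm, &size, sizeof size);  /* host byte order */
    return size;
}
```

The sketch also shows where the cross-function problem comes from: frame_note_call only does the right thing if the frame_state it updates belongs to the prolog that is really in effect at that point in the code.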
The call sequence to align the stack, e.g. when calling an i386 function
with 3 arguments, is "sub $4,%esp" before pushing the arguments and
"add $16,%esp" after the call. The 2 extra instructions just look wrong:
the "add" of one call is frequently followed directly by the "sub" of the
next, and the two are not merged.
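
The arithmetic behind those two instructions can be written down as a small sketch, assuming 4-byte arguments and a 16-byte alignment requirement at the call. The function names call_padding and call_cleanup are hypothetical.

```c
#include <assert.h>

enum { WORD = 4, ALIGN = 16 };

/* Bytes to subtract from %esp before pushing nargs arguments, so that
 * padding plus arguments is a multiple of ALIGN. */
static unsigned call_padding(unsigned nargs)
{
    unsigned args = nargs * WORD;
    unsigned pad = (ALIGN - args % ALIGN) % ALIGN;
    assert(pad % WORD == 0);
    return pad;
}

/* Bytes to add back to %esp after the call returns. */
static unsigned call_cleanup(unsigned nargs)
{
    return call_padding(nargs) + nargs * WORD;
}
```

For 3 arguments this gives a padding of 4 and a cleanup of 16, matching the "sub $4,%esp" and "add $16,%esp" pair above.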
Also, there is something I changed for which there may be a "why not"
explanation: forcing/asserting that the declared number of arguments
matches the number of arguments actually pushed. The reason not to may be
the "stack pollution" described in lightning.info, which appears to allow
calling jit_prepare multiple times without matching the number of arguments
actually passed.
> Paolo
Paulo