
[Lightning] Re: Some questions about minor changes


From: Paulo César Pereira de Andrade
Subject: [Lightning] Re: Some questions about minor changes
Date: Fri, 24 Sep 2010 07:10:21 -0300

On 24 September 2010 06:49, Paulo César Pereira de Andrade
<address@hidden> wrote:
> On 24 September 2010 05:54, Paolo Bonzini <address@hidden> wrote:
>> On 09/24/2010 10:16 AM, Paulo César Pereira de Andrade wrote:

  Sorry for replying to myself...

>>>   About the questions:
>>> o Is it a problem to use byte or word opcodes, i.e. will it cause some
>>>   kind of register stall or similar, such as setting %al or testing %ax
>>>   when only that part is used, e.g. setting %al as the "hidden" count of
>>>   xmm registers used when calling a varargs function, or testing %ax
>>>   after fnstsw?
>>
>> I think it's good as long as you read the small register after possibly
>> writing the large one.  Writing the small register and reading the large one
>> later has performance penalties, instead.
>
>  I think it is OK then, as the ABI specification says the callee only reads
> %al (or only cares about %al) when "parsing" the varargs, to know the number
> of float arguments in xmm registers; a prototyped function should, at some
> point, write to %rax without having read it first.
>
>>> o Would it be a good idea to either add new interfaces, or change the
>>>   existing one? e.g. jit_prepare and jit_prolog would be better with 3
>>>   arguments, to properly calculate the stack offsets when keeping the
>>>   stack aligned at 16 bytes; otherwise, it requires a not so clear logic
>>>   to "intercept" the first call to jit_pusharg_t, jit_arg_t and/or
>>>   jit_allocai.
>>
>> Can you explain exactly why?
>
>  Example of calling a function:
> prepare 7      <- use 6 gp registers and 1 stack slot
> prepare_f 3    <- use 3 fp registers
> prepare_d 9    <- use 5 remaining fp registers and 4 stack slots
> <<pusharg>>
> finish function
>
> At the first pusharg, it must first pad the stack to align it to 16 bytes,
> but if the number of stack slots is even, no padding is needed.

  And this is actually bogus in the current code, because it should
check "_jitl.argssize&1" after jit_prolog, jit_prolog_f or jit_prolog_d,
and pad the stack in the first instruction after those. Without any
extra changes, it could require an add/sub in all 3 of the above.
The worst case would be an odd integer count in jit_prolog requiring a
"sub 8,%rsp", followed by a jit_prolog_d with an odd double count that
needs an "add 8,%rsp", and to make things worse, a jit_prolog_f with
an odd float count could follow...
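
  To make the arithmetic concrete, here is a minimal sketch in C of the
padding decision if all three counts were known up front. It assumes the
System V x86_64 convention (6 integer and 8 SSE argument registers, 8-byte
stack slots) and a stack that is 16-byte aligned before any argument is
pushed. The names stack_slots and call_padding are hypothetical helpers
for illustration, not lightning interfaces.

```c
/* Assumed System V x86_64 limits: 6 integer argument registers and
   8 SSE argument registers; everything else goes in 8-byte stack slots. */
enum { GP_ARG_REGS = 6, FP_ARG_REGS = 8, SLOT_BYTES = 8 };

/* Hypothetical helper: number of 8-byte stack slots a call needs.
   Floats and doubles share the same pool of SSE registers. */
static int stack_slots(int n_int, int n_float, int n_double)
{
    int n_fp = n_float + n_double;
    int slots = 0;

    if (n_int > GP_ARG_REGS)
        slots += n_int - GP_ARG_REGS;
    if (n_fp > FP_ARG_REGS)
        slots += n_fp - FP_ARG_REGS;
    return slots;
}

/* Hypothetical helper: bytes of padding to emit once, before the first
   push, so that the stack is 16-byte aligned at the call instruction. */
static int call_padding(int n_int, int n_float, int n_double)
{
    return (stack_slots(n_int, n_float, n_double) & 1) ? SLOT_BYTES : 0;
}
```

  For the example above (7 integers, 3 floats, 9 doubles) this gives
1 + 4 = 5 stack slots, so a single 8-byte pad is needed, instead of up to
three separate adjustments when each count is seen on its own.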

>  The i386 code does a "defensive" alignment of the stack in jit_prolog. I
> updated it to use the code that was OS/X specific, as it is also required
> for Linux if using sse, and there may be code that takes the
> address of stack variables as sse opcode arguments.
>  The x86_64 code now does it on the first push* or the first allocai, because
> it also needs a prolog_{f,d} to figure out the arguments on the stack.
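
  For reference, a "defensive" alignment amounts to rounding the stack
pointer down to a multiple of 16, since the stack grows toward lower
addresses. A minimal sketch in C, where align_down_16 is a hypothetical
name used only for illustration; this is not claimed to be the exact
sequence lightning emits:

```c
#include <stdint.h>

/* Hypothetical helper: round a stack pointer value down to the nearest
   16-byte boundary, the same effect as masking the low 4 bits
   (e.g. an "and $-16" on the stack pointer register). */
static uintptr_t align_down_16(uintptr_t sp)
{
    return sp & ~(uintptr_t)15;
}
```

  An already aligned value is left unchanged, so doing this unconditionally
costs at most 15 bytes of stack.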

Thanks,
Paulo


