Re: [Tinycc-devel] Code generator (x86_64-tccgen): help required
From: Michael Matz
Subject: Re: [Tinycc-devel] Code generator (x86_64-tccgen): help required
Date: Sun, 14 Feb 2021 01:09:59 +0100 (CET)
User-agent: Alpine 2.21 (LSU 202 2017-01-01)
Hello Dmitry,
On Sat, 13 Feb 2021, Dmitry Selyutin wrote:
> I work on support for std atomics and currently attempt to switch from
> the initial idea of calling the routines to inlining the assembly
> code. Unfortunately I'm lost in the relationships between the code
> generator and SValue stack.
Which inline code do you have in mind? I'm asking because, as TCC isn't
really about generating speedy code, we can at least try to make it no
larger than necessary. So, if the inlined code is actually larger than the
argument setup plus the call instruction, it might really be the better
idea not to inline. Having said this, to get you going with the experiment:
> Let's say I have a pair of values stored in SValue stack, and want to
> load them into the registers %rdi and %rsi, and exactly these. This, I
> presume, means that, unlike the situation with a function call, I have
> to save the registers.
Actually it's quite similar to function arg setup. (Probably not
coincidentally, %rdi and %rsi are exactly the first two argument registers
in the ELF psABI.) The routine to load an SValue into a specific register
is 'load'. So, if you have two values in vtop[0] and vtop[-1] and need
to load them into %rdi and %rsi, you would do roughly this:
  save_reg(TREG_RDI);    // first make sure rdi/rsi aren't used in the vstack
  save_reg(TREG_RSI);
  load(TREG_RDI, vtop);  // force vtop into rdi
  vtop->r = TREG_RDI;    // and record the fact that it's now in rdi
  vswap();               // vtop is now the other value
  load(TREG_RSI, vtop);  // do same with rsi
  vtop->r = TREG_RSI;
Now vtop[0] sits in rsi and vtop[-1] in rdi; both the generated code and
the vstack state reflect this. Next you generate your inline code, which
assumes its inputs are in rdi/rsi and leaves the result in rax:
  emit_your_code();
And now you want to make it so that the vstack reflects the fact that some
value sits in rax. First remove the top two entries, then generate a new
one:
  vpop();                    // remove the old two inputs in rdi/rsi
  vpop();
  vset(type, TREG_RAX, 0);   // make vtop represent %rax in given type
There are various variants of the above theme. If the value in rax
represents an int, then this might be easier:
  vpushi(0);
  vtop->r = TREG_RAX; // vtop is now %eax (i.e. as VT_INT)
If it's a pointer-sized value:
  vpushs(0);
  vtop->r = TREG_RAX; // vtop is now %rax as VT_SIZE_T aka VT_LLONG
Also, when exactly you vpop() the two input values depends a bit on how
you generate code in emit_your_code(). (Basically you want to pop them
from the vstack as soon as the values are consumed, but not earlier.)
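To make the bookkeeping in the sequence above easier to follow, here is a toy model of it. It is only a sketch: none of this is TCC code, the toy_* names and the ToyValue struct are made up for illustration, and the "registers" are plain enum values. It assumes an unrelated value x that already lives in rdi, to show why the save_reg calls come first.

```c
#include <assert.h>
#include <stdio.h>

/* Toy model, NOT TCC code: the toy_* names only mimic
   save_reg/load/vswap/vpop and a final push of the result. */
enum { TOY_RAX, TOY_RSI, TOY_RDI, TOY_CONST, TOY_SPILLED };

typedef struct { int r; const char *name; } ToyValue;

static ToyValue toy_vstack[8];
static int toy_depth;                        /* number of live entries */
#define TOY_VTOP (&toy_vstack[toy_depth - 1])

static void toy_vpush(int r, const char *name) {
    assert(toy_depth < 8);
    toy_vstack[toy_depth].r = r;
    toy_vstack[toy_depth].name = name;
    toy_depth++;
}

static void toy_vpop(void) {
    assert(toy_depth > 0);
    toy_depth--;
}

static void toy_vswap(void) {
    assert(toy_depth >= 2);
    ToyValue tmp = toy_vstack[toy_depth - 1];
    toy_vstack[toy_depth - 1] = toy_vstack[toy_depth - 2];
    toy_vstack[toy_depth - 2] = tmp;
}

/* Spill every entry that currently lives in register r. */
static void toy_save_reg(int r) {
    for (int i = 0; i < toy_depth; i++)
        if (toy_vstack[i].r == r) {
            printf("spill %s to a stack slot\n", toy_vstack[i].name);
            toy_vstack[i].r = TOY_SPILLED;
        }
}

/* "Emit" a move into register r. As in the sequence in the mail,
   the caller records the new location itself afterwards. */
static void toy_load(int r, ToyValue *sv) {
    printf("load %s into reg %d\n", sv->name, r);
}

static void toy_inline_op(void) {
    toy_save_reg(TOY_RDI);            /* free rdi/rsi first */
    toy_save_reg(TOY_RSI);
    toy_load(TOY_RDI, TOY_VTOP);
    TOY_VTOP->r = TOY_RDI;
    toy_vswap();
    toy_load(TOY_RSI, TOY_VTOP);
    TOY_VTOP->r = TOY_RSI;
    /* the inline code would be emitted here */
    toy_vpop();                       /* inputs are consumed */
    toy_vpop();
    toy_vpush(TOY_RAX, "result");     /* result now lives in rax */
}

void toy_demo(void) {
    toy_depth = 0;
    toy_vpush(TOY_RDI, "x");          /* unrelated live value sitting in rdi */
    toy_vpush(TOY_RSI, "a");
    toy_vpush(TOY_CONST, "b");
    toy_inline_op();
}
```

After toy_demo() the toy stack holds two entries: x, which was spilled rather than clobbered, and the result in rax.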
Generally I would advise against using direct emission of opcodes via g()
for the above argument setup (no matter that it's used in the call-arg
setup right now), and rather use more generic routines like save_reg and
load, like above. (With that you can be relatively sure that you don't
mess up other values that might still be live).
Of course, for special code you need in your inline code, you use direct
opcodes via g() and friends. But for some of the things you need you can
probably also use the normal tcc codegen via the vstack for values and
gen_op to generate operations on them.
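As an illustration of what "direct opcodes via g()" could look like for the special part, here is a sketch that emits a fetch-and-add. The choice of instruction is my assumption (the thread does not say which atomic operation is meant), and toy_g, toy_code and toy_ind are hypothetical stand-ins: TCC's real g() appends a byte to the current text section, whereas this one writes into a local buffer so the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the output section; not TCC's data structures. */
static unsigned char toy_code[32];
static size_t toy_ind;

/* Append one byte of machine code, in the spirit of g(). */
static void toy_g(int c) {
    assert(toy_ind < sizeof toy_code);
    toy_code[toy_ind++] = (unsigned char)c;
}

/* Emit:  lock xadd %rsi,(%rdi)   ; *(long *)rdi += rsi, rsi = old value
          mov  %rsi,%rax          ; leave the old value in rax
   Returns the number of bytes emitted. */
size_t toy_emit_fetch_add(void) {
    toy_ind = 0;
    toy_g(0xf0);                 /* lock prefix */
    toy_g(0x48);                 /* REX.W: 64-bit operand size */
    toy_g(0x0f); toy_g(0xc1);    /* xadd r/m64, r64 */
    toy_g(0x37);                 /* modrm: mod=00 reg=rsi rm=rdi */
    toy_g(0x48);                 /* REX.W */
    toy_g(0x89);                 /* mov r/m64, r64 */
    toy_g(0xf0);                 /* modrm: mod=11 reg=rsi rm=rax */
    return toy_ind;
}
```

Only this part needs raw bytes; getting the operands into rdi/rsi and publishing rax afterwards stays with the generic vstack routines.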
You might even consider writing your special code as macros in tccdefs.h
which you then simply feed into the parser and codegen as if the user had
written normal C code (see e.g. what we do for the stdarg code). The less
special machine-specific code the better. (E.g. your atomic code should
ideally work for other architectures, with possibly a very small
arch-specific core; doing that is harder when you start fiddling with
emitting opcodes directly).
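One possible shape for "a very small arch-specific core", sketched in plain C. The assumptions here are mine: that the core is a single compare-and-swap primitive, and that the GCC/Clang builtin __sync_val_compare_and_swap can stand in for it in the example (TCC would have to provide its own). The toy_* names are hypothetical.

```c
#include <assert.h>

/* Arch-specific core (assumed): one compare-and-swap primitive.
   The GCC/Clang builtin is used here purely as a stand-in. */
static long toy_cas(long *p, long expected, long desired) {
    return __sync_val_compare_and_swap(p, expected, desired);
}

/* Generic part: fetch-and-add built on the core as a retry loop.
   Returns the value *p held before the addition. */
long toy_fetch_add(long *p, long delta) {
    long old;
    assert(p);
    do {
        old = *(volatile long *)p;      /* re-read on every attempt */
    } while (toy_cas(p, old, old + delta) != old);
    return old;
}
```

Everything above the core is ordinary C, so it could be fed through the normal parser and codegen, and only toy_cas would need a per-architecture version.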
Hope this helps. Good luck.
Ciao,
Michael.