[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Tinycc-devel] [PATCH] arm: Handle __builtin_frame_address(1) correc
From: |
Kirill Smelkov |
Subject: |
Re: [Tinycc-devel] [PATCH] arm: Handle __builtin_frame_address(1) correctly |
Date: |
Thu, 13 Dec 2012 20:55:41 +0400 |
User-agent: |
Mutt/1.5.21 (2010-09-15) |
Daniel, Thomas, first of all I'm sorry it took me so long to reply.
( Also, if there are some stupid mistakes wrt arm assembly, please forgive
me. I've tried to make my analisys resilent to maybe-present
off-by-ones, and reply sooner than later, not spending more
20min*several-days to be 100% correct. )
On Thu, Dec 06, 2012 at 02:23:25PM +0100, Thomas Preud'homme wrote:
> Le jeudi 6 décembre 2012 04:50:55, vous avez écrit :
> > >
> > > Hi Kirill,
> > >
> > > did you make any progress on the issue since Daniel's comments? Could you
> > > let me know when you push a patch so that I can test it and bump master
> > > to equal the mob branch?
> >
> > Hi Thomas,
> >
> > Unfortunately not that much. I was trying to investigate
> > bounds-checking failures for `tcc -b -run tcc ...` tests first and it looks
> > I'm close to understand what is going on (old bcheck code is not adapted
> > to /proc/sys/kernel/randomize_va_space = 2). The plan was to go to arm
> > issues then, but I'm going very slow, because for tcc I have only 15-20
> > minutes per day and it is all in underground. Sorry. Could this issue
> > please wait me for some time? (probably week or so...)
>
> That's a worthwhile bug to fix. The ARM issue can wait, sure. We can also
> disable the test for ARM as this code never worked for ARM and it's thus not
> a
> regression.
On Fri, Nov 30, 2012 at 12:43:07AM +0100, Daniel Glöckner wrote:
> On Thu, Nov 29, 2012 at 09:43:33PM +0400, Kirill Smelkov wrote:
> > tcc first saves r0 & r1, and only then fp:
> >
> > $ ./arm-eabi-tcc -c y.c
> > $ arm-linux-gnueabi-objdump -d y.o
> >
> > 00000000 <f>:
> > 0: e1a0c00d mov ip, sp
> > 4: e92d0003 push {r0, r1}
> > 8: e92d5800 push {fp, ip, lr}
> > c: e28db00c add fp, sp, #12
> > 10: e1a00000 nop ; (mov r0, r0)
> > 14: e91ba800 ldmdb fp, {fp, sp, pc}
>
> I chose this stack frame format because it simplifies access to the
> arguments from within tinycc. At that time I didn't think of trying to
> be compatible to anything just to allow stack traces.
I see. Let's first see what is happenning with gcc and decide how it
should be, then come back here to maybe fixing/adjusting tcc.
> > gcc does not, but it will save r5,r6,etc. before fp(=r13 iirc) ...
> >
> > $ arm-linux-gnueabihf-gcc-4.7 -marm -c y.c
> > $ arm-linux-gnueabihf-objdump -d y.o
> >
> > 00000000 <f>:
> > 0: e52db004 push {fp} ; (str fp, [sp, #-4]!)
> > 4: e28db000 add fp, sp, #0
> > 8: e24dd00c sub sp, sp, #12
> > c: e50b0008 str r0, [fp, #-8]
> > 10: e50b100c str r1, [fp, #-12]
> > 14: e28bd000 add sp, fp, #0
> > 18: e8bd0800 pop {fp}
> > 1c: e12fff1e bx lr
>
> You should not look at a leaf function to derive the GCC stack frame.
> It is probably different from the generic stack frame because GCC
> knows this function will never be part of a stack trace done by
> another function of the final program.
Fair enough, thanks.
> Take a look at arch/arm/kernel/stacktrace.c inside the Linux kernel.
> They assume that every function pushes {fp, sp, lr, pc} and that
> fp points to the address where pc is stored. I don't know why
> GCC pushes pc. I can only imagine this being done to keep the
> stack aligned to 8 bytes. Or maybe it is for exception handling
> or association of stack frame debug info..
from arch/arm/kernel/stacktrace.c:
* With framepointer enabled, a simple function prologue looks like this:
* mov ip, sp
* stmdb sp!, {fp, ip, lr, pc}
* sub fp, ip, #4
*
* A simple function epilogue looks like this:
* ldm sp, {fp, sp, pc}
so since fp is initialized from ip-4 (old_sp-4) it points to (near ?)
fp, not pc, right?
also linux kernel is compiled with
-mapcs -mno-sched-prolog -mabi=apcs-gnu
and this changes frame layout compared to what gcc produces by default.
(sorry for that near - i don't remember stack and stmdb arm details
good enough)
>
> > 8: e92d0870 push {r4, r5, r6, fp} ; NOTE r4... go
> > before fp
>
> > 14: e28db00c add fp, sp, #12
>
> > 5c: e59b0000 ldr r0, [fp] ; BUG here!
>
> Why is this a bug in GCC? It will load the old value of fp to r0.
Hmm, after push here is how stack looks like:
r4 <- old_sp
r5 <- sp + 12
r6
fp
<- sp
(again, off-by-one error probably, but anywayy that sp+12 can't point to
old fp).
~~~~
Anyway, I've prepared more simple complete example to demonstrate the
bug:
---- 8< ---- fp.c
#include <stdio.h>
void *func(int x)
{
char *fp1 = __builtin_frame_address(1);
printf("Hello World! %i\n", x);
fp1 += x;
return fp1;
}
int main()
{
void *fp0 = __builtin_frame_address(0);
void *fp1 = func(0);
printf("fp0: %p fp1: %p\n", fp0, fp1);
return 0;
}
The code should output
Hello World! 0
fp0: <addr> fp1: <the-same-addr!>
right? on i386:
$ gcc -o fp_i386 -O2 fp.c
$ ./fp_i386
Hello World! 0
fp0: 0xafeda538 fp1: 0xafeda538
but it does not on arm, look:
$ arm-linux-gnueabi-gcc-4.4 -marm -o fp_arm -O2 fp.c
$ qemu-arm -L /usr/arm-linux-gnueabi ./fp_arm
Hello World! 0
fp0: 0x4080010c fp1: 0x8414
and the problem here is that gcc, as I think, incorrectly setups fp:
$ arm-linux-gnueabi-gcc-4.4 -marm -o fp.s -S -O2 fp.c
$ cat fp.s
.cpu arm9tdmi
.fpu softvfp
.eabi_attribute 20, 1
.eabi_attribute 21, 1
.eabi_attribute 23, 3
.eabi_attribute 24, 1
.eabi_attribute 25, 1
.eabi_attribute 26, 2
.eabi_attribute 30, 2
.eabi_attribute 18, 4
.file "fp.c"
.text
.align 2
.global func
.type func, %function
func:
@ Function supports interworking.
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 1, uses_anonymous_args = 0
stmfd sp!, {r4, r5, fp, lr}
add fp, sp, #12
mov r4, r0
ldr r5, [fp, #0] ; <- bug here, should be [fp, #-4]
mov r1, r4
ldr r0, .L3
bl printf
add r0, r5, r4
sub sp, fp, #12
ldmfd sp!, {r4, r5, fp, lr}
bx lr
.L4:
.align 2
.L3:
.word .LC0
.size func, .-func
.align 2
.global main
.type main, %function
main:
@ Function supports interworking.
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 1, uses_anonymous_args = 0
stmfd sp!, {fp, lr}
mov r0, #0
add fp, sp, #4
bl func
mov r1, fp
mov r2, r0
ldr r0, .L7
bl printf
mov r0, #0
sub sp, fp, #4
ldmfd sp!, {fp, lr}
bx lr
.L8:
.align 2
.L7:
.word .LC1
.size main, .-main
.section .rodata.str1.4,"aMS",%progbits,1
.align 2
.LC0:
.ascii "Hello World! %i\012\000"
.space 3
.LC1:
.ascii "fp0: %p fp1: %p\012\000"
.ident "GCC: (Debian 4.4.6-11) 4.4.6"
.section .note.GNU-stack,"",%progbits
and if I apply the following patch
diff --git a/t/fp.s b/t/fp1.s
index 2047b05..85bb402 100644
--- a/t/fp.s
+++ b/t/fp1.s
@@ -20,7 +20,7 @@ func:
stmfd sp!, {r4, r5, fp, lr}
add fp, sp, #12
mov r4, r0
- ldr r5, [fp, #0]
+ ldr r5, [fp, #-4]
mov r1, r4
ldr r0, .L3
bl printf
it does work:
$ arm-linux-gnueabi-gcc-4.4 -o fp fp.s
$ arm-linux-gnueabi-gcc-4.4 -o fp1 fp1.s
$ qemu-arm -L /usr/arm-linux-gnueabi ./fp
Hello World! 0
fp0: 0x4080012c fp1: 0x8414
$ qemu-arm -L /usr/arm-linux-gnueabi ./fp1
Hello World! 0
fp0: 0x4080012c fp1: 0x4080012c
that's why I think gcc generated wrong code here.
And to the reference, here is how code would look like if being part of linux:
$ arm-linux-gnueabi-gcc-4.4 -o fpapcs.s -marm -mapcs -mno-sched-prolog
-S -O2 fp.c
$ arm-linux-gnueabi-gcc-4.4 -o fpapcs_gnu.s -marm -mapcs -mno-sched-prolog
-mabi=apcs-gnu -S -O2 fp.c
$ diff -u fp.s fpapcs.s
--- fp.s 2012-12-13 20:34:09.000000000 +0400
+++ fpapcs.s 2012-12-13 20:34:09.000000000 +0400
@@ -17,16 +17,17 @@
@ Function supports interworking.
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 1, uses_anonymous_args = 0
- stmfd sp!, {r4, r5, fp, lr}
- add fp, sp, #12
+ mov ip, sp
+ stmfd sp!, {r4, r5, fp, ip, lr, pc}
+ sub fp, ip, #4
mov r4, r0
ldr r5, [fp, #0]
mov r1, r4
ldr r0, .L3
bl printf
add r0, r5, r4
- sub sp, fp, #12
- ldmfd sp!, {r4, r5, fp, lr}
+ sub sp, fp, #20
+ ldmfd sp, {r4, r5, fp, sp, lr}
bx lr
.L4:
.align 2
@@ -40,17 +41,18 @@
@ Function supports interworking.
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 1, uses_anonymous_args = 0
- stmfd sp!, {fp, lr}
+ mov ip, sp
+ stmfd sp!, {fp, ip, lr, pc}
+ sub fp, ip, #4
mov r0, #0
- add fp, sp, #4
bl func
mov r1, fp
mov r2, r0
ldr r0, .L7
bl printf
mov r0, #0
- sub sp, fp, #4
- ldmfd sp!, {fp, lr}
+ sub sp, fp, #12
+ ldmfd sp, {fp, sp, lr}
bx lr
.L8:
.align 2
$ diff -u fpapcs.s fpapcs_gnu.s
--- fpapcs.s 2012-12-13 20:34:09.000000000 +0400
+++ fpapcs_gnu.s 2012-12-13 20:32:51.000000000 +0400
@@ -1,13 +1,3 @@
- .cpu arm9tdmi
- .fpu softvfp
- .eabi_attribute 20, 1
- .eabi_attribute 21, 1
- .eabi_attribute 23, 3
- .eabi_attribute 24, 1
- .eabi_attribute 25, 1
- .eabi_attribute 26, 2
- .eabi_attribute 30, 2
- .eabi_attribute 18, 4
.file "fp.c"
.text
.align 2
but it does not work either:
$ arm-linux-gnueabi-gcc-4.4 -o fpapcs -mapcs -mno-sched-prolog fpapcs.s
$ qemu-arm -L /usr/arm-linux-gnueabi ./fpapcs
Hello World! 0
fp0: 0x4080011c fp1: 0x83e0
~~~~
There are many arm abis and maybe I'm doing something stupid, but to me
all this looks like gcc generates incorrect code.
> > ... as probably required by arm calling convention (not looked at the spec
> > yet, but it seems reasanoble, given how push is really a block str and
> > that str stores register in ascending order).
>
> I have once seen a page in MSDN describing the ARM stack frame of
> Windows CE. I don't know if ARM specified how stack frames should look
> like.
>From gcc manual:
`-mapcs-frame'
Generate a stack frame that is compliant with the ARM Procedure
Call Standard for all functions, even if this is not strictly
necessary for correct execution of the code. Specifying
`-fomit-frame-pointer' with this option will cause the stack
frames not to be generated for leaf functions. The default is
`-mno-apcs-frame'.
so at least there is "ARM Procedure Call Standard". Sorry, no time to
search for it...
Thanks,
Undertimed Kirill
- Re: [Tinycc-devel] [PATCH] arm: Handle __builtin_frame_address(1) correctly, Thomas Preud'homme, 2012/12/05
- Re: [Tinycc-devel] [PATCH] arm: Handle __builtin_frame_address(1) correctly, Kirill Smelkov, 2012/12/05
- Re: [Tinycc-devel] [PATCH] arm: Handle __builtin_frame_address(1) correctly, Thomas Preud'homme, 2012/12/06
- Re: [Tinycc-devel] [PATCH] arm: Handle __builtin_frame_address(1) correctly, Kirill Smelkov, 2012/12/09
- Re: [Tinycc-devel] [PATCH] arm: Handle __builtin_frame_address(1) correctly, grischka, 2012/12/09
- [Tinycc-devel] Fixes to bcheck and how it works correctly, Kirill Smelkov, 2012/12/11
- Re: [Tinycc-devel] Fixes to bcheck and how it works correctly, grischka, 2012/12/12
- Re: [Tinycc-devel] Fixes to bcheck and how it works correctly, grischka, 2012/12/12
- Re: [Tinycc-devel] [PATCH] arm: Handle __builtin_frame_address(1) correctly,
Kirill Smelkov <=
Re: [Tinycc-devel] [PATCH] arm: Handle __builtin_frame_address(1) correctly, grischka, 2012/12/06