From: sourceware-bugzilla at mhxnet dot de
Subject: [Bug ld/30333] New: [avr-ld] NOPs not removed after rcall for devices with >8k of flash even with -mrelax
Date: Tue, 11 Apr 2023 11:33:55 +0000
https://sourceware.org/bugzilla/show_bug.cgi?id=30333
Bug ID: 30333
Summary: [avr-ld] NOPs not removed after rcall for devices with
>8k of flash even with -mrelax
Product: binutils
Version: 2.40
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: ld
Assignee: unassigned at sourceware dot org
Reporter: sourceware-bugzilla at mhxnet dot de
Target Milestone: ---
Created attachment 14811
--> https://sourceware.org/bugzilla/attachment.cgi?id=14811&action=edit
Reproduction code & build script
I've been working on a bootloader for xmega3 cores recently and noticed that as
soon as I compile the code for devices with more than 8k of flash, the size of
the binary increases by more than 20 bytes (almost 5% of the bootloader binary
size). The issue isn't limited to xmega3, though, and I've used an older core
in the examples further down.
My assumption from reading various pieces of documentation is that `-mrelax` is
supposed to take care of replacing long calls with short calls and shrinking
the holes in the binary accordingly. If it doesn't shrink the binary, there's no
obvious benefit apart from the short calls executing in one cycle fewer. From
glancing at the code, it seems that shrinking of some sort is implemented, but
it's not clear to me whether it's doing what it's supposed to.
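For reference, here is a minimal sketch of the architectural condition under which a short call can reach its target at all. It assumes only the documented `rcall`/`rjmp` encoding (a 12-bit signed offset counted in 16-bit words, relative to the following instruction). The name `fits_rcall` is hypothetical, and this is not a description of what ld's relaxation pass actually checks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of binutils: checks only the architectural
 * reach of rcall/rjmp, whose 12-bit signed offset k is counted in 16-bit
 * words relative to the following instruction (-2048 <= k <= 2047). */
bool fits_rcall(uint32_t insn_addr, uint32_t target_addr)
{
    /* Byte distance from the instruction after the rcall to the target. */
    int32_t diff = (int32_t)target_addr - (int32_t)(insn_addr + 2);

    if (diff % 2 != 0)
        return false;           /* AVR instructions are word-aligned */

    int32_t k = diff / 2;       /* offset in words */
    return k >= -2048 && k <= 2047;
}
```

For the addresses that appear in the listings of this report (call at 0x0, `f` at 0x4 or 0x6), the check holds in both cases, so reach alone would not require a long call.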
Here's an example to reproduce the behaviour:
static void __attribute__((__noinline__)) f(void)
{
    *((volatile char *) 0x0140) = 42;
}

__attribute__((naked, section(".vectors"), noreturn)) void start(void)
{
    f();
    for(;;){}
    __builtin_unreachable();
}
Compiling this with
avr-gcc -mmcu=atmega88 -Os -mrelax -nostartfiles -nostdlib -o x.elf x.c
yields:
00000000 <start>:
0: 01 d0 rcall .+2 ; 0x4 <f>
00000002 <.L3>:
2: ff cf rjmp .-2 ; 0x2 <.L3>
00000004 <f>:
4: 8a e2 ldi r24, 0x2A ; 42
6: 80 93 40 01 sts 0x0140, r24 ; 0x800140 <_end+0x40>
a: 08 95 ret
Compiling it instead with `-mmcu=atmega168` yields:
00000000 <start>:
0: 02 d0 rcall .+4 ; 0x6 <f>
2: 00 00 nop
00000004 <.L3>:
4: ff cf rjmp .-2 ; 0x4 <.L3>
00000006 <f>:
6: 8a e2 ldi r24, 0x2A ; 42
8: 80 93 40 01 sts 0x0140, r24 ; 0x800140 <_end+0x40>
c: 08 95 ret
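To make the two listings easier to compare, the relative branch targets can be recomputed from the raw opcode bytes. The sketch below assumes the standard `rjmp` (0xCxxx) and `rcall` (0xDxxx) encodings and that the listing shows opcode bytes in little-endian order, so `01 d0` is the opcode 0xd001. The helper name `rel_target` is made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: byte address targeted by an rjmp (0xCxxx) or
 * rcall (0xDxxx) opcode located at byte address insn_addr. */
uint32_t rel_target(uint32_t insn_addr, uint16_t opcode)
{
    int32_t k = opcode & 0x0fff;    /* 12-bit offset field, in words */

    if (k & 0x0800)
        k -= 0x1000;                /* sign-extend the 12-bit value */

    /* The offset is relative to the instruction following this one. */
    return (uint32_t)((int32_t)insn_addr + 2 + 2 * k);
}
```

With the atmega88 bytes, `rel_target(0x0, 0xd001)` gives 0x4; with the atmega168 bytes, `rel_target(0x0, 0xd002)` gives 0x6. The offset grows by exactly one word, which is the `nop` (opcode 0x0000) left at address 0x2.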
Dropping `-mrelax` generates a `call` instead of the `rcall`+`nop` pair.
My expectation would be that, at least with `-mrelax`, I get an `rcall` without
a `nop` regardless of the flash size of the MCU.
If this isn't a bug, I'd like to understand why, as I haven't found any
documentation that would explain this behaviour.
I'm using `crossdev`-based builds of gcc/binutils on Gentoo Linux.
avr-gcc (Gentoo 13.0.1_pre20230305 p8) 13.0.1 20230305 (experimental)
GNU ld (Gentoo 2.40 p4) 2.40.0
The behaviour doesn't change if I e.g. use an older version of gcc.
I'm attaching a tarball with the reproduction code and a script to build ELF
and disassembly files for two MCUs.
I'm more than happy to provide more information if needed.