[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[avr-gcc-list] Optimisation of bit set/clear in uint32_t
From: |
Dale Whitfield |
Subject: |
[avr-gcc-list] Optimisation of bit set/clear in uint32_t |
Date: |
Tue, 21 Apr 2009 17:49:01 +0200 |
User-agent: |
Mutt/1.5.18 (2008-05-17) |
Hi,
All this talk of super-optimisers, etc. brings this issue I've had,
to the fore again.
Perhaps there are suggestions on how to encourage the compiler to do the
optimisations rather than having to hand-code to get the best result.
Take this code:
volatile uint32_t status;
static inline void set_status(uint32_t flag, uint8_t set)
__attribute__((always_inline));
void set_status(uint32_t flag, uint8_t set) {
if (set) status |= flag;
else status &= ~flag;
}
Which when used in an interrupt generates:
ISR (INT7_vect)
{
4e9a: 1f 92 push r1
4e9c: 0f 92 push r0
4e9e: 0f b6 in r0, 0x3f ; 63
4ea0: 0f 92 push r0
4ea2: 11 24 eor r1, r1
4ea4: 8f 93 push r24
4ea6: 9f 93 push r25
4ea8: af 93 push r26
4eaa: bf 93 push r27
4eac: 80 91 1a 18 lds r24, 0x181A
4eb0: 90 91 1b 18 lds r25, 0x181B
4eb4: a0 91 1c 18 lds r26, 0x181C
4eb8: b0 91 1d 18 lds r27, 0x181D
4ebc: a0 64 ori r26, 0x40 ; 64
4ebe: 80 93 1a 18 sts 0x181A, r24
4ec2: 90 93 1b 18 sts 0x181B, r25
4ec6: a0 93 1c 18 sts 0x181C, r26
4eca: b0 93 1d 18 sts 0x181D, r27
set_status(0x00400000UL, 1);
4ed0: bf 91 pop r27
4ed2: af 91 pop r26
4ed4: 9f 91 pop r25
4ed6: 8f 91 pop r24
4ed8: 0f 90 pop r0
4eda: 0f be out 0x3f, r0 ; 63
4edc: 0f 90 pop r0
4ede: 1f 90 pop r1
4ee0: 18 95 reti
That's way too much code... Its fairly obvious where the optimisations
should be, although I can see that some have been done already.
If I recode the routine as follows, I get what I want:
void set_status(uint32_t flag, uint8_t set) {
volatile uint8_t *pstatus = (uint8_t *)&status;
if (set) {
if (flag & 0x000000ff)
pstatus[0] |= (uint8_t)(flag & 0xff);
if (flag & 0x0000ff00)
pstatus[1] |= (uint8_t)(flag >> 8);
if (flag & 0x00ff0000)
pstatus[2] |= (uint8_t)(flag >> 16);
if (flag & 0xff000000)
pstatus[3] |= (uint8_t)(flag >> 24);
} else {
if (flag & 0x000000ff)
pstatus[0] &= ~(uint8_t)(flag & 0xff);
if (flag & 0x0000ff00)
pstatus[1] &= ~(uint8_t)(flag >> 8);
if (flag & 0x00ff0000)
pstatus[2] &= ~(uint8_t)(flag >> 16);
if (flag & 0xff000000)
pstatus[3] &= ~(uint8_t)(flag >> 24);
}
}
Which is:
ISR (INT7_vect)
{
6b80: 1f 92 push r1
6b82: 0f 92 push r0
6b84: 0f b6 in r0, 0x3f ; 63
6b86: 0f 92 push r0
6b88: 11 24 eor r1, r1
6b8a: 8f 93 push r24
6b8c: 80 91 4e 18 lds r24, 0x184E
6b90: 80 64 ori r24, 0x40 ; 64
6b92: 80 93 4e 18 sts 0x184E, r24
set_status(0x00400000UL, 1);
6b98: 8f 91 pop r24
6b9a: 0f 90 pop r0
6b9c: 0f be out 0x3f, r0 ; 63
6b9e: 0f 90 pop r0
6ba0: 1f 90 pop r1
6ba2: 18 95 reti
Compiler options are:
avr-gcc -mmcu=atmega2560 -Wall -gdwarf-2 -std=gnu99 -Os -funsigned-char
-funsigned-bitfields -fpack-struct -fshort-enums -finline-limit=10
-DF_CPU=7372800UL
version 4.3.3
In a 75k byte application this reduces code size by 1632 bytes.
Dale.
- [avr-gcc-list] Optimisation of bit set/clear in uint32_t,
Dale Whitfield <=
- Re: [avr-gcc-list] Optimisation of bit set/clear in uint32_t, Georg-Johann Lay, 2009/04/21
- [avr-gcc-list] Re: Optimisation of bit set/clear in uint32_t, David Brown, 2009/04/21
- Re: [avr-gcc-list] Re: Optimisation of bit set/clear in uint32_t, Dale Whitfield, 2009/04/22
- Re: [avr-gcc-list] Re: Optimisation of bit set/clear in uint32_t, Alex Wenger, 2009/04/22
- Re: [avr-gcc-list] Re: Optimisation of bit set/clear in uint32_t, Dale Whitfield, 2009/04/22
- Re: [avr-gcc-list] Re: Optimisation of bit set/clear in uint32_t, Alex Wenger, 2009/04/22
- [avr-gcc-list] Re: Optimisation of bit set/clear in uint32_t, David Brown, 2009/04/22
- Re: [avr-gcc-list] Re: Optimisation of bit set/clear in uint32_t, Paulo Marques, 2009/04/22
- [avr-gcc-list] Re: Optimisation of bit set/clear in uint32_t, David Brown, 2009/04/22
- Re: [avr-gcc-list] Re: Optimisation of bit set/clear in uint32_t, Dale Whitfield, 2009/04/22