
[Fwd: Re: [avr-gcc-list] GCC-AVR Register optimisations]


From: Andrew Hutchinson
Subject: [Fwd: Re: [avr-gcc-list] GCC-AVR Register optimisations]
Date: Thu, 10 Jan 2008 20:38:59 -0500
User-agent: Thunderbird 2.0.0.9 (Windows/20071031)


I tried the earlier example:

  char *p1 = x;
  foo(p1++);
  foo(p1++);
  foo(p1++);

etc

Even with different register allocation, the result is bad.
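
For anyone who wants to reproduce this, a self-contained sketch of the test is
below (x, foo and the wrapper test() are placeholders for whatever the original
test really used; something like "avr-gcc -Os -S test.c" and a look at the .s
output shows the register shuffling):

extern char *x;
extern void foo(char *p);

void test(void)
{
        char *p1 = x;
        /* post-increment inside the argument: gcc computes p1+1 before
           each call, so the old and the new value are live together */
        foo(p1++);
        foo(p1++);
        foo(p1++);
}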

a) The basic flow that gcc creates is something like:

b=a+1;
R24=a;
foo(R24);
c=b+1;
R24=b;
foo(R24)

This needs only one variable to be saved over the call, but due to the overlapped lifetimes it creates two: both a and b must exist at the same time.
If the ordering were reversed, this would not be needed. For example:

R24=a;
b=a+1
foo(R24);
R24=b;
c=b+1;
foo(R24)

With the reordering, a is already dead when b is created, so we only need one register.

I am not experienced enough to know why gcc cannot optimise this case, but it looks like a weakness in gcc itself (not in gcc-avr).

b) The register costs used to bias register allocation are all equal for the AVR, so there is no preference for the ADIW registers (even when I removed the "!w").

So the backend does not indicate a preference between R16=R16+1, R14=R14+1, or R28=R28+1.

In current gcc, the frame pointer (r28:r29) does not get used for register allocation - clearly it would be the best "call saved" register here, since it could use ADIW and avoid moves. It looks like allocation is done assuming the frame pointer is in use; then, if no frame is actually required, R28:R29 is simply left unused. Allocation is never retried without the frame pointer.
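
One way to experiment with this by hand is GCC's explicit register variable
extension, pinning the pointer to r28 just to see what the R28-based code would
look like. This is only a rough sketch (x and foo assumed external as before;
gcc does not guarantee a local register variable stays in its register outside
of asm statements, and it may not cooperate at all if the function ends up
needing a frame):

extern char *x;
extern void foo(char *p);

void test(void)
{
        /* experiment only: pin p to r28:r29 (the Y register) */
        register char *p asm("r28") = x;

        foo(p); p++;
        foo(p); p++;
        foo(p); p++;
}

If gcc goes along with it, each call should reduce to a movw into r24:r25 plus
the call itself, with the increments done in place on r28:r29.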

I tried the improved foo(p); p++; form and it produces much better code, though still not using R28 (for the same reason). In this case the increment is specified after the function call, so we don't have overlapped register lifetimes - only one register is then used and everything becomes simple.

Andy

PS: Please report this as a bug - gcc should be better than this.



Wouter van Gulik wrote:
The RTL dump will tell me why it chose R14 before.


What do you mean by an RTL dump exactly? I tried looking through some dumps
but I could not make sense of them. I used -dP and --save-temps, but they all
looked the same to me.

I recollect there were some odd "!" markers that stop reload from using the
ADIW registers for certain operations.

That might be the reason. If so, I'll have to dig out why they were put in -
maybe to fix some other problem.


Well, for all possible ADIW uses (addsi, addhi) it's a "!w".
If this could be undone, much pointer arithmetic could be done better, I
guess/hope.

Any clue on why "foo(p++)" gives even poorer code compared to "foo(p); p++"?

HTH,

Wouter


---- Wouter van Gulik <address@hidden> wrote:
Wouter van Gulik wrote:

Note that in some cases it could be very interesting to use the r27, or Y,
register.

Should have written R28 of course.

Since gcc seems to be down at the moment, I did some more testing.

Now consider this example:
/* assumed declarations for the externals used below */
extern char *x;
extern void foo(char *p);

void main(void)
{
        char *p = x;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
        foo(p); p+=65;
}
Since 65 is outside the ADIW immediate range (0-63), this must be done using a subi/sbci pair.

But the compiler now seems to realize that p is a constant offset to x.
So we now get:

main:
/* prologue: frame size=0 */
        push r16
        push r17
/* prologue end (size=2) */
        lds r16,x
        lds r17,(x)+1
        movw r24,r16
        call foo
        movw r24,r16
        subi r24,lo8(-(65))
        sbci r25,hi8(-(65))
        call foo
        movw r24,r16
        subi r24,lo8(-(130))
        sbci r25,hi8(-(130))

Here x is kept in r16:r17 and the cumulative offset is added into r24:r25.

But if the compiler can realize this... then why not do the same for adds
within the ADIW range?!
So for p++/p+=1 we would get something like:

        movw r24, r16
        adiw r24, 1
        call foo
        movw r24, r16
        adiw r24, 2
etc..

This is just as small as the earlier suggested use of R28!
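
For reference, the source that output corresponds to is just the test above
with the step changed, roughly as sketched here (externals assumed as before,
and only a few of the calls shown); every cumulative offset up to 63 still fits
in ADIW's 6-bit immediate:

extern char *x;
extern void foo(char *p);

void test(void)
{
        char *p = x;
        foo(p); p += 1;   /* cumulative offset 1: within adiw range */
        foo(p); p += 1;   /* cumulative offset 2: within adiw range */
        foo(p); p += 1;   /* cumulative offset 3: ... up to 63 fits */
}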

Wouter









