[avr-gcc-list] GCC-AVR Register optimisations


From: Andrew Hutchinson
Subject: [avr-gcc-list] GCC-AVR Register optimisations
Date: Wed, 09 Jan 2008 20:51:55 -0500
User-agent: Thunderbird 2.0.0.9 (Windows/20071031)

Hi all,

I just spent some days going over gcc-avr and its missed optimizations.

One area I looked at was register allocation. This is not gcc's strong
point. However, the current settings we use are making life more
difficult than it needs to be.

The current order is:

   R24,25,\
   18,19,\
   20,21,\
   22,23,\
   30,31,\
   26,27,\
   28,29,\
   17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,\

You can tweak it with the gcc-avr/winavr compile option -morder1 to get
a better result (or -morder2 to get a much worse one!)

So is there a better order?

Registers R17 downwards are call-saved and are pushed/popped in a
prescribed order by the prolog/epilog functions. Also, R28,29 is the
potential frame pointer, so that is best left alone. So the key
registers are: R18-R27 & R30,31
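
As a rough sketch of that grouping (the enum and function names are made
up for illustration, and R0/R1 are lumped into "other" only because the
post does not discuss them):

```c
#include <assert.h>

/* Register groups as described above; names are illustrative only. */
enum reg_group {
    REG_OTHER,         /* R0, R1: not discussed here */
    REG_CALL_SAVED,    /* R2-R17: pushed/popped by prolog/epilog */
    REG_FRAME_POINTER, /* R28, R29: potential frame pointer */
    REG_KEY            /* R18-R27, R30, R31: worth reordering */
};

enum reg_group classify_reg(int r)
{
    assert(r >= 0 && r <= 31);   /* 32 general-purpose registers */
    if (r >= 2 && r <= 17) return REG_CALL_SAVED;
    if (r == 28 || r == 29) return REG_FRAME_POINTER;
    if (r >= 18) return REG_KEY;
    return REG_OTHER;
}

/* Count how many registers fall into a given group. */
int count_group(enum reg_group g)
{
    int n = 0;
    for (int r = 0; r <= 31; r++)
        if (classify_reg(r) == g)
            n++;
    return n;
}
```

Counting with count_group(REG_KEY) gives the twelve registers whose
order is discussed in the rest of this mail.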

With the current order, there are several problems:

1) Initial register allocation fragments the register set. For example,
allocating R25 will prevent R24-25 being used as a 16-bit register and
prevent R22-25 and R24-27 being used as 32-bit registers. The gcc
register allocator does not seem to overcome this fragmentation.

2) The situation is made worse by the order of the 16-bit and wider
registers used for call and return values, which are "allocated" in
reverse order, e.g. R24-25, R22-25, R18-25. This means that function
parameters or return values are rarely in the right place, except for
16-bit values.

3) Allocating a byte to an odd-numbered register precludes it being
extended to a 16-bit value without a move.
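
Points 1 and 3 come down to simple arithmetic, sketched below. This
assumes multi-byte values occupy consecutive registers starting at an
even register number with the low byte in the even register, which
matches the R24-25 / R22-25 / R24-27 example in point 1. The helper names
are invented for illustration, and this is not how gcc's allocator is
implemented.

```c
#include <assert.h>

/* Does a value of `width` bytes starting at register `base` include `r`? */
int group_contains(int base, int width, int r)
{
    return r >= base && r < base + width;
}

/* Count the even-aligned groups of `width` registers that include `r`.
   These are the multi-byte slots lost once `r` holds a single byte. */
int groups_blocked_by(int r, int width)
{
    assert(width == 2 || width == 4);
    int n = 0;
    for (int base = 0; base + width <= 32; base += 2)
        if (group_contains(base, width, r))
            n++;
    return n;
}

/* Under the assumption above, a byte can be widened to 16 bits in
   place only if it sits in the even register of a pair. */
int can_widen_in_place(int r)
{
    return r % 2 == 0;
}
```

For R25, groups_blocked_by(25, 2) counts the single pair R24-25 and
groups_blocked_by(25, 4) counts the two 32-bit groups R22-25 and R24-27,
while can_widen_in_place(25) is false.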

So, I tried creating an order which would preserve the contiguous
register space and avoid the above issues as much as possible.
This is what I ended up with:

R18,26,22,30,20,24,19,21,23,25,27,31,28,29, \
   17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,\
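
A small sketch for comparing the two lists side by side. It covers only
the registers actually listed in this mail (R2-R31); the array and
function names are made up, and it checks properties of the lists
themselves rather than modelling what the allocator does with them.

```c
#include <assert.h>
#include <stddef.h>

/* The two orders as listed in this mail (R2-R31 only). */
const int current_order[] = {
    24, 25, 18, 19, 20, 21, 22, 23, 30, 31, 26, 27, 28, 29,
    17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2
};
const int proposed_order[] = {
    18, 26, 22, 30, 20, 24, 19, 21, 23, 25, 27, 31, 28, 29,
    17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2
};
#define ORDER_LEN (sizeof current_order / sizeof current_order[0])

/* Number of entries at the front of an order that are even registers. */
size_t leading_even(const int *order, size_t n)
{
    size_t i = 0;
    while (i < n && order[i] % 2 == 0)
        i++;
    return i;
}

/* Bitmask of the registers named in an order; bit r set means Rr. */
unsigned long reg_mask(const int *order, size_t n)
{
    unsigned long mask = 0;
    for (size_t i = 0; i < n; i++) {
        assert(order[i] >= 0 && order[i] <= 31);
        mask |= 1UL << order[i];
    }
    return mask;
}
```

Both lists name the same 30 registers, so reg_mask() is equal for the
two. The difference is at the front: the new order starts with six even
registers, one from each of the pairs R18-R27 and R30,31, where the
current order hands out the odd register R25 second.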


The result is a 1.25% saving in code size for a simple mixed
application. Pretty good for such a simple change!

For code with more floating point, the saving might well be higher, as
it demands more contiguous 32-bit registers.

On the same basis, the current order of the call-saved registers R2-R17,
dictated by the (mcall) prolog, limits further improvement and is
clearly imperfect. These are used less frequently, though their cost is
much higher, so it's difficult to gauge the impact. I might take a look
at some intense floating point functions to see if it is worth pursuing
reordering these too.


Andy










