Re: [avr-gcc-list] Optimization around mul and restoring r1


From: David Brown
Subject: Re: [avr-gcc-list] Optimization around mul and restoring r1
Date: Tue, 1 Oct 2013 13:50:16 +0200

On 01/10/13 09:41, Simon Kirby wrote:
> Hello!
> 
> How difficult would it be to convince avr-gcc to not re-zero r1 instantly
> after mul, but delay it until a zero is actually needed? For example,
> see this actual avr-gcc output:
> 
>         Vneutral = (uint16_t)Vbus * t >> 8;
> ->
>         lds     r25, 0x20FD
>         mul     r24, r25
>         movw    r24, r0
>         eor     r1, r1
>         sts     0x205C, r25
> 
> Could be just:
> 
>         lds     r25, 0x20FD
>         mul     r24, r25
>         sts     0x205C, r1
>         eor     r1, r1
> 
> Saving one whole cycle! ;)
> 
> I understand that some things can really benefit from having a zero
> register around (cpse, cpc, adc, sbc low reg, etc.), but could it not be
> restored before returning, or when one of the zero-desiring instructions
> is needed? Perhaps the clobbering of flags is difficult...
> 
> In an assembly project I am maintaining, I use Z as an atomic interrupt
> vector (vector table contains "ijmp") by packing the entry points such
> that the high byte is always zero, then use ZH (r31) as a 0 constant.
> Not even close to avr-gcc ABI-compatible, but it's fast, and it can be
> reloaded without side effects.
> 
> Simon-
> 

Hi Simon,

I've seen some discussion before about removing the fixed r1 "zero"
register.  As I understand it, having the zero register makes a few
things easier when implementing the AVR port of gcc, and removing it
(or delaying it) would be a lot of work.  It has been suggested that r3
could be used as the zero register and r2 as the temporary register
(instead of r0 as today), to avoid conflict with the fixed-register
instructions such as LPM and MUL.  However, the pressure to change is
lower now that newer AVRs can do an LPM into other registers, and the
cost of breaking the ABI is significant.

In general, it would be a good thing to delay the zeroing of r1.  It is
worth searching the bug database for existing reports, and filing a new
"missed optimisation" report if there is not one already:

<http://gcc.gnu.org/bugzilla/>

Don't hope for a quick resolution - there are very few people with the
time and ability to work on the AVR port of gcc, and there are lots of
things higher in the priority list.  But nonetheless it is good to get
these things filed - then the ideas will not be lost.

Of course, if you are able to help directly, then as Eric says you
should feel free to contribute.


Another point worth noting is that what you are seeing here is not
simply an effect of the zero register - what you are seeing is a missed
optimisation opportunity when saving the high byte of a 16-bit value.
There are a lot more cases when this sort of thing could be improved.
Again, it is not an easy matter - if a case is common enough it is
possible to make a pattern for it that gcc can match specifically, but
ideally one would want a more general system.
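
If you do file a report, a small self-contained test case helps.  As a
rough sketch only (the function name scale_high_byte is made up for
illustration, and I am assuming the pattern is the same as in your
Vneutral line), something like this isolates the "high byte of an
8x8 multiply" case:

```c
#include <stdint.h>

/* Hypothetical reduced test case: multiply two 8-bit values and keep
 * only the high byte of the 16-bit product.  The cast makes the
 * multiplication 16 bits wide on AVR, where int is 16 bits. */
uint8_t scale_high_byte(uint8_t vbus, uint8_t t)
{
    uint16_t product = (uint16_t)vbus * t;  /* 8x8 -> 16-bit product */
    return (uint8_t)(product >> 8);         /* keep the high byte only */
}
```

Compiling that with optimisation enabled and looking at the generated
assembly should show whether the 16-bit copy and the immediate clearing
of r1 still appear with your compiler version.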

mvh.,

David
