Re: [avr-gcc-list] Speed challenge...
From: Peter N Lewis
Subject: Re: [avr-gcc-list] Speed challenge...
Date: Sat, 27 Apr 2002 16:01:10 +0800
It occurs to me that some of the general principles for optimizing
that I applied to this would be useful to some folks.
First off, you need to look at the assembly code the compiler
generates. You can have avr-gcc write assembly output with:
avr-gcc -mmcu=at90s4433 -Wall -Os -I../include -S test.c -o test.s
(i.e. -S = compile only, without assembling, and output to a .s
assembly source file).
Then you find the section of code you're interested in, and you look
at the instructions. Using an Instruction Set summary (like the one
at <http://www.atmel.com/atmel/acrobat/doc0856.pdf>, or better yet,
one with a table of instructions and cycles), look at the cycles used
by each instruction. Generally, on the AVR, instructions that read
or write memory, and jumps, take two cycles, and most everything
else takes one cycle (check the table, though: in and out, for
example, are single-cycle, while calls and returns take more than
two). The time taken is directly proportional to the number of
cycles, so trace through the instructions that will be executed
(following each loop) and count all the cycles. Now you have a
baseline to refer back to, to see if you are making progress.
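To turn a hand-counted cycle total into a time, the arithmetic is
just cycles divided by clock frequency. A minimal sketch in C follows;
the helper name cycles_to_ns and the 8 MHz clock in the usage note are
my own assumptions for illustration, not anything from the original
code.

```c
#include <stdint.h>

/* Hypothetical helper: convert a hand-counted cycle total into
 * nanoseconds for a given CPU clock (in Hz). A 64-bit intermediate
 * is used so the multiplication cannot overflow. */
uint32_t cycles_to_ns(uint32_t cycles, uint32_t f_cpu_hz)
{
    return (uint32_t)(((uint64_t)cycles * 1000000000ULL) / f_cpu_hz);
}
```

For example, a loop body you counted at 8 cycles, on a part assumed to
be clocked at 8 MHz, works out as cycles_to_ns(8, 8000000), i.e. one
microsecond per iteration.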
After that, it is mostly a matter of looking at the assembly code and
looking for wasted cycles (excess memory accesses, excess jumps,
instructions more expensive than necessary). With loops, pay
particular attention to the "increment" and the end-of-loop test,
since they can often be rewritten to reduce the number of wasted
cycles. A loop that counts towards zero is generally preferable,
because the decrement sets the zero flag and so doubles as the test
for completion.
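As a rough illustration of the count-towards-zero idea, here are two
versions of the same loop in C. The function names sum_up and sum_down
are made up for this example. Whether the second form actually saves
cycles depends on the compiler version and flags (the optimizer may
already do this transformation for you), so compare the -S output of
both before assuming anything.

```c
#include <stdint.h>

/* Counts up: each pass needs the increment plus a separate
 * compare of i against n before the branch. */
uint16_t sum_up(const uint8_t *buf, uint8_t n)
{
    uint16_t sum = 0;
    for (uint8_t i = 0; i < n; i++)
        sum += buf[i];
    return sum;
}

/* Counts down: the decrement itself leaves the zero flag set or
 * clear, so the branch can test it directly with no extra compare. */
uint16_t sum_down(const uint8_t *buf, uint8_t n)
{
    uint16_t sum = 0;
    if (n != 0) {
        do {
            sum += *buf++;
        } while (--n != 0);
    }
    return sum;
}
```

Both return the same result; the guard on n in sum_down is there only
so that a zero-length buffer is handled correctly by the do/while.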
Of course, before you start doing any of this, you should look at the
high level code and ask yourself if you really need to be doing it
this way at all in the first place (for example, could your data be
pre-sorted). The techniques I'm talking about are for low level
optimization when you're sure you're doing the minimum that needs to
be done and just want to do it a bit faster. Generally you can get
about a factor of 2 improvement out of doing this - with high level
code changes you can often get *much* better improvement (factors of
ten or more).
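To make the pre-sorted example concrete: if a lookup table can be
kept sorted, a binary search needs only about log2(n) comparisons
instead of up to n for a straight scan. This is a sketch of that
idea only; find_sorted and its signature are hypothetical and not
taken from the code under discussion.

```c
#include <stdint.h>

/* Hypothetical example: binary search over a table that is assumed
 * to be sorted in ascending order. Returns the index of key, or -1
 * if it is not present. */
int16_t find_sorted(const uint8_t *table, uint8_t n, uint8_t key)
{
    uint8_t lo = 0, hi = n;              /* search window is [lo, hi) */
    while (lo < hi) {
        uint8_t mid = lo + (hi - lo) / 2;
        if (table[mid] == key)
            return mid;
        if (table[mid] < key)
            lo = mid + 1;                /* key can only be above mid */
        else
            hi = mid;                    /* key can only be below mid */
    }
    return -1;
}
```

With a 200-entry table that is at most 8 probes rather than up to 200,
which is the kind of gain no amount of cycle shaving on the linear
scan will give you.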
HTH,
Peter.
--
<http://www.interarchy.com/> <ftp://ftp.interarchy.com/interarchy.hqx>
avr-gcc-list at http://avr1.org