avr-gcc-list

Re: [avr-gcc-list] AVR assembly for fast bit bang


From: David Brown
Subject: Re: [avr-gcc-list] AVR assembly for fast bit bang
Date: Thu, 10 Nov 2005 15:31:15 +0100

I suspect you misunderstand what "optimization" means, especially when
applied to a small microcontroller.  Choosing "-Os" (or "-O2", which is very
similar) tells the compiler to generate small and fast code.  It is not
dangerous or risky.  Gcc does have a few risky optimisation passes that you
can explicitly enable - it always has cutting-edge features that have not
had enough testing to be considered safe enough for -O2 level.  But choosing
to run gcc with "-O0" will cripple your code as gcc tries to use the most
obvious code for each individual statement with a total disregard for
context or the features of the target processor.  It even disables many of
the warnings of gcc (enabled by -Wall and -Wextra), since they share
the same code analysis.  Very occasionally, it can be of interest to use
only -O1 optimisation in connection with debugging or viewing generated
assembly code.  But normally -Os is the right optimisation for common use.
And if your code works without optimisation but fails when -Os is used, then
you can be 99% sure it is your code that is at fault.  Optimising with -Os
is almost always a far better idea than optimising by writing your own
assembly.

And don't forget, TI's DSPs (at least the 320F24x family I have used) are
horrible processors with vastly over-priced and buggy tools.  Don't tar
avr-gcc with the same brush.

Having said that, the main cause of slowness in your code is that you
haven't really thought about what you are asking the compiler/cpu to do, and
how you expect it to do it.  In particular, you have a "(1 << n)" expression
inside the loop.  The AVR can only shift one bit per instruction, so a shift
by n costs n steps.  This gives your code an O(n^2) complexity instead of
O(n).  Either shift your data one bit each loop, or use a mask variable that
you shift one bit each loop.  As it stands, this expression will probably
cause a library function call in -O0 (it's probably inlined with -O2) for a
huge unnecessary overhead.
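
The mask-variable approach could look roughly like the sketch below.  It is
only an illustration under a few assumptions: the names write_data_mask,
BB_PORT and sim_port are made up here and are not from the original post,
uint16_t/uint8_t stand in for the poster's Word and Byte types, and the
sim_port variable exists only so the same code also builds on a PC.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#define BB_PORT PORTF              /* port used in the original post */
#else
static volatile uint8_t sim_port;  /* hypothetical host-side stand-in */
#define BB_PORT sim_port
#endif

#define SDIO_BIT 0x04              /* PORTF.2 */
#define CLK_BIT  0x02              /* PORTF.1 */

/* Send the nbits lowest bits of towrite, least significant bit first. */
void write_data_mask(uint16_t towrite, uint8_t nbits)
{
    uint16_t mask = 0x0001;        /* selects the bit sent on this pass */

    while (nbits--) {
        BB_PORT |= CLK_BIT;                /* rising clock edge */
        if (towrite & mask)
            BB_PORT |= SDIO_BIT;
        else
            BB_PORT &= (uint8_t)~SDIO_BIT;
        BB_PORT &= (uint8_t)~CLK_BIT;      /* falling clock edge */
        mask <<= 1;                        /* constant one-bit shift */
    }
}
```

The bit order and clock edges are the same as in the original write_data();
the only change is that the variable shift "(0x0001 << n)" is replaced by a
mask that moves one place per pass.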

Secondly, you are using 16-bit data unnecessarily.  As a rule of thumb,
using 16-bit data takes three times as long as 8-bit data on the AVR.
Re-organize the code to send two 8-bit bytes for much faster code.
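
As a rough sketch of the two-byte idea, assuming a full 16-bit word is always
sent and that the LSB-first order of the original code must be kept (so the
low byte goes out first).  The names write_byte, write_word, BB_PORT and
sim_port are hypothetical and not from the original post; sim_port is only
there so the code also builds on a PC.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#define BB_PORT PORTF              /* port used in the original post */
#else
static volatile uint8_t sim_port;  /* hypothetical host-side stand-in */
#define BB_PORT sim_port
#endif

#define SDIO_BIT 0x04              /* PORTF.2 */
#define CLK_BIT  0x02              /* PORTF.1 */

/* Clock out one byte, least significant bit first, using 8-bit data only. */
static void write_byte(uint8_t b)
{
    uint8_t i;

    for (i = 0; i < 8; i++) {
        BB_PORT |= CLK_BIT;                /* rising clock edge */
        if (b & 0x01)
            BB_PORT |= SDIO_BIT;
        else
            BB_PORT &= (uint8_t)~SDIO_BIT;
        b >>= 1;                           /* next bit moves into bit 0 */
        BB_PORT &= (uint8_t)~CLK_BIT;      /* falling clock edge */
    }
}

/* Send a 16-bit word as two bytes: low byte first, then high byte. */
void write_word(uint16_t w)
{
    write_byte((uint8_t)(w & 0xFF));
    write_byte((uint8_t)(w >> 8));
}
```

Here the data byte itself is shifted one bit per pass and tested against a
constant, so the loop never touches a 16-bit value.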

mvh.,

David



----- Original Message -----
From: "Mike S." <address@hidden>


Hello to all,
My code is:

#define ADS1210_PORT PORTF
#define SDIO_BIT     0x04 /* 0b0000 0100 PORTF.2*/
#define CLK_BIT      0x02 /* 0b0000 0010 PORTF.1*/
#define SDOUT_BIT    0x01 /* 0b0000 0001 PORTF.0*/

#define SDIO_LOW    (ADS1210_PORT &= ~SDIO_BIT)
#define SDIO_HIGH   (ADS1210_PORT |=  SDIO_BIT)
#define CLK_LOW     (ADS1210_PORT &= ~CLK_BIT)
#define CLK_HIGH    (ADS1210_PORT |=  CLK_BIT)
/*------------------------------------------------------------------------+
| Function: write_data
|
| Args:     towrite, nbits
|
| Action:   Writes <nbits> bits of <towrite> to the device SDIO input pin
|
| Effects:
+------------------------------------------------------------------------*/
void write_data (Word towrite, Byte nbits)
{
  Byte n;

  for(n = 0; n < nbits; n++)
  {

    CLK_HIGH;
    if( towrite & (0x0001 << n))
    {
      SDIO_HIGH;
    }
    else
    {
      SDIO_LOW;
    }
    CLK_LOW;

  }
}

I haven't decided yet whether I will use the AT90CAN128 SPI module or do it
by bit bang! My uC runs at 16 MHz. The device I'm interfacing with
supports serial clock rates up to 2 MHz. In the past, I wrote a
driver for this device for the Philips XA-S3 in C and in assembler
(the asm was for the bit bang part), and now I want to port that
driver to the AVR. The problem is that I don't have a lot of
experience with AVR assembly! That is why I'm asking this question
(wrong guess David Kelly, but thanks for the reply). BTW, is your code
snippet correct?
Thanks for the reply Daniel O'Connor, but I usually don't turn on
optimization until I have tried a couple of hand optimization techniques. I
have already had some bad experiences with the optimization in some Texas
Instruments DSPs...
I use AVR-GCC.


Thanks for all the replies.

Thanks in advance
Bruno Miguel


On 11/9/05, David Kelly <address@hidden> wrote:
> On Tue, Nov 08, 2005 at 03:45:54PM +0000, Mike S. wrote:
> > Hello to all,
> > Can anyone tell me the best (faster) way to implement bit shifting
> > (serial synch protocol -in a bit bang fashion-) with two general
> > purpose digital pins (one pin for data the other for clock)? Using C
> > is not fast enough! I need assembly!
>
> Sounds like a homework assignment. Smart instructors monitor this list.
>
> People keep saying "C isn't fast enough." I don't believe it. First
> attempt:
>
> #include <avr/io.h>
>
> #define CLOCK_B (1<<0)
> #define BIT_B   (1<<1)
>
> void
> myjunk(uint8_t byte) {
>          uint8_t i;
>
>          for( i = 0 ; i < 8 ; i++ ) {
>                  PORTA |= CLOCK_B;       //  rising edge of clock
>                  if( byte & (1<<7) )
>                          PORTA |= BIT_B;         // set
>                  else
>                          PORTA &= ~BIT_B;        // clear
>                  byte <<= 1;
>                  PORTA &= ~CLOCK_B;      //  falling edge of clock
>          }
> }
>
> Compiled with
> "avr-gcc -O -gdwarf-2 -Wall -ffreestanding -mmcu=atmega64 -c junk.c"
>
> Output of "avr-objdump -S junk.o":
>
> myjunk(uint8_t byte) {
>          uint8_t i;
>
>          for( i = 0 ; i < 8 ; i++ ) {
>     0:   90 e0           ldi     r25, 0x00       ; 0
>                  PORTA |= CLOCK_B;       //  rising edge of clock
>     2:   d8 9a           sbi     0x1b, 0 ; 27
>                  if( byte & (1<<7) )
>     4:   88 23           and     r24, r24
>     6:   14 f4           brge    .+4             ; 0xc <myjunk+0xc>
>                          PORTA |= BIT_B;         // set
>     8:   d9 9a           sbi     0x1b, 1 ; 27
>     a:   01 c0           rjmp    .+2             ; 0xe <myjunk+0xe>
>                  else
>                          PORTA &= ~BIT_B;        // clear
>     c:   d9 98           cbi     0x1b, 1 ; 27
>                  byte <<= 1;
>     e:   88 0f           add     r24, r24
>                  PORTA &= ~CLOCK_B;      //  falling edge of clock
>    10:   d8 98           cbi     0x1b, 0 ; 27
>    12:   9f 5f           subi    r25, 0xFF       ; 255
>    14:   98 30           cpi     r25, 0x08       ; 8
>    16:   a8 f3           brcs    .-22            ; 0x2 <myjunk+0x2>
>    18:   08 95           ret
>
>
> It's not going to get much better than that.
>
> --
> David Kelly N4HHE, address@hidden
> ========================================================================
> Whom computers would destroy, they must first drive mad.
>


_______________________________________________
AVR-GCC-list mailing list
address@hidden
http://lists.nongnu.org/mailman/listinfo/avr-gcc-list





