avr-libc-dev

Re: [avr-libc-dev] RFC: avr/bits.h


From: Nicolas Schodet
Subject: Re: [avr-libc-dev] RFC: avr/bits.h
Date: Fri, 4 Mar 2005 22:06:27 +0100
User-agent: Mutt/1.5.6+20040907i

* Joerg Wunsch <address@hidden> [050301 22:50]:
> > The interrupt flag bit is read as 1 and written back as 1, which clears
> > the interrupt flag! A warning could be added to the manual.
> There's a paragraph in the FAQ that explains why interrupt flags are
> written as 1 [...]

I mean, if there is a new bit manipulation header, it would be a good
idea to warn users about this there, or to add a link to the FAQ.
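
To illustrate the caveat, here is a rough host-side sketch of a
write-one-to-clear flag register. Everything in it (struct w1c_reg,
w1c_write, clear_flag_rmw, clear_flag_direct) is a made-up model for
illustration, not an avr-libc interface. It only shows why a
read-modify-write such as "REG |= mask" can clear more flags than
intended, while a plain "REG = mask" clears just the one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a flag register with write-one-to-clear bits. */
struct w1c_reg
{
    uint8_t flags;
};

/* Model a hardware write: each 1 bit in value clears that flag,
   each 0 bit leaves the flag untouched. */
static void
w1c_write (struct w1c_reg *r, uint8_t value)
{
    r->flags &= (uint8_t) ~value;
}

/* Model a hardware read: pending flags read back as 1. */
static uint8_t
w1c_read (const struct w1c_reg *r)
{
    return r->flags;
}

/* Equivalent of "REG |= mask": the read returns every pending flag,
   so the write-back clears all of them, not only mask. */
static void
clear_flag_rmw (struct w1c_reg *r, uint8_t mask)
{
    w1c_write (r, (uint8_t) (w1c_read (r) | mask));
}

/* Equivalent of "REG = mask": only the requested flag is cleared. */
static void
clear_flag_direct (struct w1c_reg *r, uint8_t mask)
{
    w1c_write (r, mask);
}

/* Small self-check: flags 0 and 2 are pending, we want to clear flag 0. */
static void
w1c_demo (void)
{
    struct w1c_reg a = { 0x05 };
    struct w1c_reg b = { 0x05 };
    clear_flag_rmw (&a, 0x01);
    clear_flag_direct (&b, 0x01);
    assert (a.flags == 0x00);   /* flag 2 was lost as well */
    assert (b.flags == 0x04);   /* flag 2 is still pending */
}
```

Calling w1c_demo () from a host program runs both variants on the same
starting state.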

> > Another question about bit manipulation: what is the best way to
> > read 4 bytes into a double word?
> IIRC, a union works best.
> [...]

Yes, it works best, because the code gcc generates for the shift-and-or
alternative is not very good, but the source is then non-portable.
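
To make the portability point concrete, here is a small host-side
sketch. The names (bytes_to_u32_shift, bytes_to_u32_union,
host_is_little_endian, byte_order_demo) are mine, for illustration
only. It assumes the host is either plain little-endian or plain
big-endian. The shift version gives the same value everywhere, while
the union version depends on the byte order of the target.

```c
#include <assert.h>
#include <stdint.h>

union byte_access
{
    uint8_t v8[4];
    uint32_t v32;
};

/* Portable: the position of each byte is fixed by the shifts. */
static uint32_t
bytes_to_u32_shift (uint8_t b3, uint8_t b2, uint8_t b1, uint8_t b0)
{
    return (uint32_t) b3 << 24 | (uint32_t) b2 << 16
        | (uint32_t) b1 << 8 | b0;
}

/* Non-portable: v8[0] is the least significant byte only on a
   little-endian target (which the AVR is). */
static uint32_t
bytes_to_u32_union (uint8_t b3, uint8_t b2, uint8_t b1, uint8_t b0)
{
    union byte_access ba;
    ba.v8[0] = b0;
    ba.v8[1] = b1;
    ba.v8[2] = b2;
    ba.v8[3] = b3;
    return ba.v32;
}

/* Returns 1 if the lowest-addressed byte is the least significant one. */
static int
host_is_little_endian (void)
{
    union byte_access ba;
    ba.v32 = 1;
    return ba.v8[0] == 1;
}

/* Self-check: compare both versions on the host. */
static void
byte_order_demo (void)
{
    assert (bytes_to_u32_shift (3, 2, 1, 0) == 0x03020100UL);
    if (host_is_little_endian ())
        assert (bytes_to_u32_union (3, 2, 1, 0) == 0x03020100UL);
    else
        assert (bytes_to_u32_union (3, 2, 1, 0) == 0x00010203UL);
}
```

On a little-endian host both functions agree, so the union only looks
portable until the code is built for a big-endian target.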

> > This could also be included in the bit manipulation header.
> Better not, as it is completely unrelated to *bits*.  That doesn't
> mean any such interface should be banned from avr-libc, but please
> don't try to hijack header files for things they have not originally
> been intended for.

Ok, then perhaps a byte.h header?

I ran some tests. The union solution produces much better code (-Os
-mmcu=atmega128), but I cannot understand why the code is not better still.
Gcc (3.4.3) misses even very simple optimisations. Am I missing something?

Here it is:

#include <stdint.h>

union _byte_access
{
    uint8_t v8[4];
    uint16_t v16[2];
    uint32_t v32;
};

static inline uint32_t
v8_to_v32 (uint8_t b3, uint8_t b2, uint8_t b1, uint8_t b0)
{
    union _byte_access ba;
    ba.v8[0] = b0;
    ba.v8[1] = b1;
    ba.v8[2] = b2;
    ba.v8[3] = b3;
    return ba.v32;
}

int
main (void)
{
    volatile uint32_t v32a, v32b;
    volatile uint8_t b3, b2, b1, b0;
    b3 = 3;
    b2 = 2;
    b1 = 1;
    b0 = 0;
    v32a = v8_to_v32 (b3, b2, b1, b0);
    /*
  e0:   59 85           ldd     r21, Y+9        ; 0x09
  e2:   4a 85           ldd     r20, Y+10       ; 0x0a
  e4:   3b 85           ldd     r19, Y+11       ; 0x0b
  e6:   2c 85           ldd     r18, Y+12       ; 0x0c

  Why are there mov instructions here? They seem useless.

  And why two mov instead of one movw?
  
  e8:   82 2f           mov     r24, r18
  ea:   93 2f           mov     r25, r19
  ec:   a4 2f           mov     r26, r20
  ee:   b5 2f           mov     r27, r21

  f0:   89 83           std     Y+1, r24        ; 0x01
  f2:   9a 83           std     Y+2, r25        ; 0x02
  f4:   ab 83           std     Y+3, r26        ; 0x03
  f6:   bc 83           std     Y+4, r27        ; 0x04
    */
    v32b = (uint32_t) b3 << 24 | (uint32_t) b2 << 16 | (uint32_t) b1 << 8 | b0;
    /*

This one takes 43 instructions instead of 8:

  This is the load...

  f8:   89 85           ldd     r24, Y+9        ; 0x09
  fa:   99 27           eor     r25, r25
  fc:   aa 27           eor     r26, r26
  fe:   bb 27           eor     r27, r27

  and the shift...

 100:   b8 2f           mov     r27, r24
 102:   aa 27           eor     r26, r26
 104:   99 27           eor     r25, r25
 106:   88 27           eor     r24, r24

  useless code, again...

 108:   2a 85           ldd     r18, Y+10       ; 0x0a
 10a:   33 27           eor     r19, r19
 10c:   44 27           eor     r20, r20
 10e:   55 27           eor     r21, r21
 110:   a9 01           movw    r20, r18
 112:   33 27           eor     r19, r19
 114:   22 27           eor     r18, r18
 116:   82 2b           or      r24, r18
 118:   93 2b           or      r25, r19
 11a:   a4 2b           or      r26, r20
 11c:   b5 2b           or      r27, r21
 11e:   2b 85           ldd     r18, Y+11       ; 0x0b
 120:   33 27           eor     r19, r19
 122:   44 27           eor     r20, r20
 124:   55 27           eor     r21, r21
 126:   54 2f           mov     r21, r20
 128:   43 2f           mov     r20, r19
 12a:   32 2f           mov     r19, r18
 12c:   22 27           eor     r18, r18
 12e:   82 2b           or      r24, r18
 130:   93 2b           or      r25, r19
 132:   a4 2b           or      r26, r20
 134:   b5 2b           or      r27, r21
 136:   2c 85           ldd     r18, Y+12       ; 0x0c
 138:   33 27           eor     r19, r19
 13a:   44 27           eor     r20, r20
 13c:   55 27           eor     r21, r21
 13e:   82 2b           or      r24, r18
 140:   93 2b           or      r25, r19
 142:   a4 2b           or      r26, r20
 144:   b5 2b           or      r27, r21
 146:   8d 83           std     Y+5, r24        ; 0x05
 148:   9e 83           std     Y+6, r25        ; 0x06
 14a:   af 83           std     Y+7, r26        ; 0x07
 14c:   b8 87           std     Y+8, r27        ; 0x08
   */
    v32b = (uint32_t) b3 << 24 | (uint32_t) b2 << 16 | (uint32_t) b1 << 8 | b0;
    return 0;
}

I had more confidence in compilers before this. Is this general to gcc or
specific to avr-gcc?

/Nicolas.



