Re: [avr-gcc-list] Avr-gcc versions comparison.


From: Joerg Wunsch
Subject: Re: [avr-gcc-list] Avr-gcc versions comparison.
Date: Wed, 21 Feb 2007 22:37:51 +0100 (MET)

"Bruce D. Lightner" <address@hidden> wrote:

> My mistake.  I thought that I had remembered that Dmitry worked on
> the AVR-specific code generator logic, but maybe that was long, long
> ago.  So, who's the "expert" now?

I guess you're thinking of Anatoly here.  He's also a member of the
avr-libc team, but recently became the second maintainer (besides
Denis Chertykov, the original author of AVR-GCC) who's got the right
to modify the GCC code for the AVR target directly in the repository.

> I suspect that the degenerate behavior with "return 0" is related to
> some "generic" GCC "subroutine return" optimization, and the "trick"
> will be to get GCC to stop "optimizing" this *just* for "avr-gcc".
> The "wonderful" optimization is likely destroying the information
> that allows the AVR-specific code generation logic to use the I/O
> bit test instructions.

You know a bit too little about compilers. ;-) (Not that I would
pretend to understand much more about GCC's details though.)  No,
things are not that simple.  Most optimizations are really performed
in what GCC calls the middle-end, i.e. in the generic part of the
compiler.  (The frontend is the language-specific part, so there are
frontends for C, C++, Ada, Objective-C, Java, and so on.  The backend
is the actual code generator for the target CPU.)  These
optimizations are performed on internal abstract instructions (called
insns), for which the target code describes the physical instructions
of its processor, or sequences of instructions that can be applied to
solve typical tasks (like handling 16-bit integers, for example).

Now the trick is to supply the best possible insns, together with the
patterns for when to apply them...  That way, the code generator will
be able to recognize all situations where it can combine several
operations into one, and then emit the respective code.

>> ...  For the time being, I've seen
>> better results by turning bit tests into inline functions (return type
>> "bool").

> I tested just that, and it made no difference in my previously
> mentioned "regression tests".  But, I'm not surprised.  I suspect
> that "bool" true/false is the same as 0/1, once we get to the level
> of machine code.

Sure, it is.  However, for me it makes quite a difference (but I don't
own a GCC 3.x for the AVR anymore, sorry, so that's all with GCC 4.1.x
here).

> Why did you think that "bool" would make a difference?  You want to
> "inline" the function, but that is just removing the "return 0" in
> exchange for using extra program space.  A fine idea in theory, but
> in a resource-constrained embedded application, not usually an
> option!

It's not so much the "bool" that makes any kind of difference
(except to the reader of the code), but encapsulating the actual test
in an inline function simply helps the middle-end to recognize the
actual intention of the person writing the code, i.e. to test a
particular bit.  Implementing it as an inline function causes no
further overhead, as the optimizer can easily recognize the
side-effects.  I get better code for your test2() then as well,
only test1() remains a pathological case:

.global test2
        .type   test2, @function
test2:
/* prologue: frame size=0 */
/* prologue end (size=0) */
        ldi r25,lo8(0)
        sbis 48-0x20,3
        ldi r25,lo8(1)
.L2:
        ldi r24,lo8(1)
        eor r25,r24
        mov r24,r25
        clr r25
/* epilogue: frame size=0 */
        ret
/* epilogue end (size=1) */
/* function test2 size 9 (8) */
        .size   test2, .-test2
.global test1
        .type   test1, @function
test1:
/* prologue: frame size=0 */
/* prologue end (size=0) */
        in r24,48-0x20
        clr r25
        ldi r18,3
1:      lsr r25
        ror r24
        dec r18
        brne 1b
        andi r24,lo8(1)
        andi r25,hi8(1)
/* epilogue: frame size=0 */
        ret
/* epilogue end (size=1) */
/* function test1 size 10 (9) */
        .size   test1, .-test1
.global test3
        .type   test3, @function
test3:
/* prologue: frame size=0 */
/* prologue end (size=0) */
        ldi r24,lo8(0)
        ldi r25,hi8(0)
        sbic 48-0x20,3
        rjmp .L17
        ldi r24,lo8(1)
        ldi r25,hi8(1)
.L17:
        ret
/* epilogue: frame size=0 */
/* epilogue: noreturn */
/* epilogue end (size=0) */
/* function test3 size 7 (7) */
        .size   test3, .-test3
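
For reference, a minimal sketch of what such an inline bit-test helper
might look like.  The helper itself is not shown in this message, so
everything here is an assumption: the name J1708_IN_PIN and the
fake_pin_reg variable are hypothetical stand-ins that let the sketch
compile on a host; on the target the macro would name a real I/O
register from <avr/io.h>.  Bit number 3 is taken from the listings.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the AVR input register, so that the
 * sketch compiles on a host.  On the target this would be an I/O
 * register provided by <avr/io.h>. */
static volatile uint8_t fake_pin_reg;

#define J1708_IN_PIN fake_pin_reg  /* assumption: the input register */
#define J1708_IN_BIT 3             /* bit 3, as in the listings */

/* Wrap the bit test in an inline function returning bool, so the
 * intent ("is this one bit set?") is visible to the compiler. */
static inline bool test_j1708_bit(void)
{
    return (J1708_IN_PIN & (1u << J1708_IN_BIT)) != 0;
}
```

Whether a given compiler version turns this into a single sbis/sbic
depends on the version and optimization level, so it is worth checking
the generated assembly (e.g. with -S) rather than assuming it.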

The good news is, if you get away from your silly
not-doing-anything-real test functions, to some more real world
example like:

void dosomethingreal(void)
{
  J1708_IN_PORT |= (1 << J1708_IN_BIT);
  while (!test_j1708_bit())
    ;
  J1708_IN_PORT &= ~(1 << J1708_IN_BIT);
  while (!test_j1708_bit())
    ;
}

it will generate really optimal code now:

.global dosomethingreal
        .type   dosomethingreal, @function
dosomethingreal:
/* prologue: frame size=0 */
/* prologue end (size=0) */
        sbi 50-0x20,3
.L7:
        sbis 48-0x20,3
        rjmp .L7
        cbi 50-0x20,3
.L9:
        sbis 48-0x20,3
        rjmp .L9
/* epilogue: frame size=0 */
        ret
/* epilogue end (size=1) */
/* function dosomethingreal size 7 (6) */
        .size   dosomethingreal, .-dosomethingreal
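
A note on the operands such as "48-0x20" in the listings: the compiler
prints the register's data-space address minus the 0x20 offset at
which the I/O registers are mapped into data space, and the
single-bit instructions (sbi, cbi, sbis, sbic) can only reach I/O
addresses 0 through 31.  The arithmetic can be sketched as follows;
the function and macro names are hypothetical and only serve the
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Offset between an I/O address and its data-space address,
 * as it appears in the "-0x20" of the listings. */
#define IO_TO_MEM_OFFSET 0x20u

/* sbi/cbi/sbis/sbic encode a 5-bit I/O address: 0..31. */
#define BIT_IO_LIMIT 0x20u

/* Convert a data-space address to the I/O address the
 * instruction actually encodes. */
static inline uint16_t io_addr_from_mem(uint16_t mem_addr)
{
    return (uint16_t)(mem_addr - IO_TO_MEM_OFFSET);
}

/* True if the single-bit I/O instructions can reach this address. */
static inline bool is_bit_addressable(uint16_t io_addr)
{
    return io_addr < BIT_IO_LIMIT;
}
```

With these, 48-0x20 works out to I/O address 0x10 and 50-0x20 to 0x12,
both within reach of the bit instructions.  A register located at data
address 0x60 or above would not be, and there the compiler has to fall
back to a load, mask and store sequence.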

-- 
cheers, J"org               .-.-.   --... ...--   -.. .  DL8DTL

http://www.sax.de/~joerg/                        NIC: JW11-RIPE
Never trust an operating system you don't have sources for. ;-)
