Re: [avr-gcc-list] [FIX] _clz and friends not found (test builtin-bitops

From:	Wouter van Gulik
Subject:	Re: [avr-gcc-list] [FIX] _clz and friends not found (test builtin-bitops-1)
Date:	Tue, 29 Jan 2008 10:22:50 +0100
User-agent:	Thunderbird 2.0.0.9 (Windows/20071031)

Dmitry K. wrote:

On Friday 25 January 2008 22:35, Wouter van Gulik wrote:

__clzqi2:
    clr     r_count     ; load with 0
    com     r_count     ; invert (load with -1) + set carry
__clzqi2_loop:
    rol     r_arg1L         ; Rotate through carry
    inc     r_count         ; inc does not touch carry
    brcc    __clzqi2_loop   ; Branch on no carry


That is splendid!

Thanks. The original idea comes from the person who posted in the gcc bug report for clz.
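
For readers who want to trace the loop by hand, here is a rough C model of the quoted __clzqi2 loop. It is only a sketch: the function name clz8_model and the explicit carry variable are my own illustration, not libgcc code. It assumes that com sets the carry flag, that rol shifts the old carry into bit 0 and bit 7 out into carry, and that inc leaves carry alone.

```c
#include <stdint.h>

/* Hypothetical C model of the quoted loop; not the libgcc source. */
static uint8_t clz8_model(uint8_t arg)
{
    uint8_t count = 0xFF;   /* clr + com: count = -1 */
    uint8_t carry = 1;      /* com always sets the carry flag */

    do {
        uint8_t msb = (uint8_t)(arg >> 7);    /* bit that rol moves into carry */
        arg = (uint8_t)((arg << 1) | carry);  /* rol: old carry enters bit 0 */
        carry = msb;
        count++;                              /* inc leaves carry untouched */
    } while (!carry);                         /* brcc: loop while carry is clear */

    return count;
}
```

The carry set by com acts as a sentinel: for a zero argument it is rotated all the way through the register and ends the loop on the ninth pass, so the model returns 8 for 0, 0 for 0x80 and 7 for 0x01.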

After a superficial look:
. A short rcall/rjmp is not safe for inter-module linking
against an unknown (big) library. Neither is a conditional branch.

Hmm, this is already the case for __mulqihi3 and __umulqihi3: they do an rjmp to __mulhi3. So I thought it was safe. Also note that avr-libc's libm uses this exclusively. Maybe this is fixed by linker relaxation?


On the conditional branch I totally agree. That is not a smart thing to do.

. Are these functions intended for math functions? If so,
strong size optimization is not the best solution (IMHO).
Adding a few words may make some 32/16-bit functions
several times faster.

Yes, they are intended for math, I guess. But I think they will never be used. CLZ/CTZ is only used when using floats and not linking with avr-libc's math library (libm). This gives poor results anyway, so the user will probably switch to libm quickly.

I mainly implemented CLZ/popcount to fix the huge use of RAM by these functions.

CLZ is already a little optimized for speed. I could optimize further: make the 16-bit version check whether the high byte is zero and then decide which byte to process with the byte-based routine. But then I would also need to link in the QI implementation, using even more flash. The qi loop is 4 cycles, the hi loop is 5, excluding the penalty for branching etc. So we could have a negative effect on speed for some values.
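
To make the byte-wise idea concrete, here is a minimal C sketch of it. The names clz8_sketch and clz16_bytewise are hypothetical and this is not the code from the patch; it only illustrates the dispatch on the high byte described above.

```c
#include <stdint.h>

/* Hypothetical helper: plain 8-bit count of leading zeros (returns 8 for 0). */
static uint8_t clz8_sketch(uint8_t x)
{
    uint8_t n = 0;
    while (n < 8 && !(x & 0x80)) {
        x = (uint8_t)(x << 1);
        n++;
    }
    return n;
}

/* Hypothetical 16-bit variant: pick the byte to scan, then reuse the 8-bit routine. */
static uint8_t clz16_bytewise(uint16_t x)
{
    uint8_t hi = (uint8_t)(x >> 8);

    if (hi != 0)
        return clz8_sketch(hi);                    /* leading one is in the high byte */
    return (uint8_t)(8 + clz8_sketch((uint8_t)x)); /* high byte empty: skip 8 bits */
}
```

The sketch shows the cost mentioned above: the 16-bit entry point now depends on the 8-bit routine, so both have to be linked in, and the up-front test on the high byte is paid on every call, including those where the leading one is found after only a few iterations.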

There is no bug report from a user who is missing the other functionality, so I thought it was more a matter of having it so that gcc passes the testsuite than of having a fast, lean and mean implementation.


Wouter


