Re: [avr-libc-dev] avr-libc: optimization for ltoa/printf

avr-libc-dev

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [avr-libc-dev] avr-libc: optimization for ltoa/printf

From:	Paulo Marques
Subject:	Re: [avr-libc-dev] avr-libc: optimization for ltoa/printf
Date:	Mon, 20 Jun 2011 11:04:00 +0100
User-agent:	Thunderbird 2.0.0.23 (X11/20090817)

Dmitry E. Oboukhov wrote:
> Hi, There!

Hi, Dmitry

> Now avr-libc uses division cycles when it converts number to string.
> 
> Benchmarks page (http://www.nongnu.org/avr-libc/user-manual/benchmarks.html)
> says that ltoa (12345L, s, 10) requires 3174 - 3136 cycles for the
> conversions.

We could probably improve this a lot for base 10 and small numbers. Just
look at a thread called "dev and mod may not be optimized", three years
ago...

> If we use the other algorithm, we can reduce the time more than three
> times.
> 
> in attache I placed a test-file that contains 'my_ultoa' function.
> 
> my_ultoa(12345L, s, 10)  requires 924 MCU cycles.
>     against ultoa(12345L, s, 10) - 3174

Yes, but how many cycles does "my_ultoa(99999L, s, 10)" require and how
does that fare against the standard ltoa(99999L, s, 10)?

> But that algorithm can be realized only for some radixes, so the
> function can be used if one of realized radixes is in third argument.
> 
> Also function __ultoa_invert can use the algorithm (it is called from
> *printf), so using the algorithm we can spend less time then current
> variant.
> 
> The algorithm can be realized for uint, it will more fast.
> 
> Sometimes conversion time is very important (for example if we realize
> some text network protocols) so I think that algorithm can be used in
> functions ltoa/itoa/*printf :)
> 
> What do You think about this algorithm?

I'm concerned that it might perform badly for very small numbers
compared to the regular ltoa. Also the fact that the time taken by the
algorithm depends a lot on the number (i.e. 99999 takes longer than 11111).

Attached is a version (that I didn't even compile test, to be honest)
that should alleviate the last problem a little and be faster in the
average case, but it is even larger than the original version. It only
works for bases up to 16, though.

-- 
Paulo Marques
Software Development Department - Grupo PIE, S.A.
Phone: +351 252 290600, Fax: +351 252 290601
Web: www.grupopie.com

"Nostalgia isn't what it used to be."

#include <avr/pgmspace.h>

static const unsigned long digits8[] PROGMEM = {
        1073741824,
        134217728,
        16777216,
        2097152,
        262144,
        32768,
        4096,
        512,
        64,
        8,
        1,
};

static const unsigned long digits10[] PROGMEM = {
        1000000000,
        100000000,
        10000000,
        1000000,
        100000,
        10000,
        1000,
        100,
        10,
        1,
};

static const unsigned long digits16[] PROGMEM = {
        268435456,
        16777216,
        1048576,
        65536,
        4096,
        256,
        16,
        1,
};

static const unsigned long digits32[] PROGMEM = {
        1073741824,
        33554432,
        1048576,
        32768,
        1024,
        32,
        1,
};

static const unsigned long digits36[] PROGMEM = {
        2176782336,
        60466176,
        1679616,
        46656,
        1296,
        36,
        1,
};

static const char letters[] PROGMEM =
        "0123456789"
        "abcdefghijklmnopqrstuvwxyz";

char *  my_ultoa (unsigned long __val, char *__s, int base) {
        unsigned char max;
        const unsigned long *d;
        unsigned char i, j, k, first = 1;
        unsigned long check, check2;

        switch(base) {
                case 8:
                        max = sizeof(digits8) / sizeof(unsigned long);
                        d = digits8;
                        break;

                case 16:
                        max = sizeof(digits16) / sizeof(unsigned long);
                        d = digits16;
                        break;

                case 32:
                        max = sizeof(digits32) / sizeof(unsigned long);
                        d = digits32;
                        break;

                case 36:
                        max = sizeof(digits36) / sizeof(unsigned long);
                        d = digits36;
                        break;

                default:
                        max = sizeof(digits10) / sizeof(unsigned long);
                        d = digits10;
                        break;
        }

        j = 0;
        check = pgm_read_dword(d);
        if (check < __val) {
                first = k = 0;
                while (check <= __val) {
                        __val -= check;
                        k++;
                }
                __s[j++] = pgm_read_byte(letters + k);
        }

        for (i = 1; i < max; i++) {
                check = pgm_read_dword(d + i);
                if (check > __val) {
                        if (first)
                                continue;
                        __s[j++] = '0';
                        continue;
                }

                first = 0;
                check2 = check << 3;
                if (check2 < __val) {
                        __val -= check2;
                        k += 8;
                }
                check2 >>= 1;
                if (check2 < __val) {
                        __val -= check2;
                        k += 4;
                }
                check2 >>= 1;
                if (check2 < __val) {
                        __val -= check2;
                        k += 2;
                }
                if (check < __val) {
                        __val -= check;
                        k ++;
                }

                __s[j++] = pgm_read_byte(letters + k);
        }

        if (first)
                __s[j++] = '0';
        __s[j] = 0;

        return __s;
}

char * my_ltoa (long __val, char *__s, int base) {
        if (__val >= 0)
                return my_ultoa((unsigned long)__val, __s, base);
        __s[0] = '-';
        return my_ultoa((unsigned long)(-__val), __s + 1, base) - 1;
}

[Prev in Thread]

Current Thread

[Next in Thread]

[avr-libc-dev] avr-libc: optimization for ltoa/printf, Dmitry E. Oboukhov, 2011/06/16
- Re: [avr-libc-dev] avr-libc: optimization for ltoa/printf, Weddington, Eric, 2011/06/16
  - Re: [avr-libc-dev] avr-libc: optimization for ltoa/printf, Dmitry E. Oboukhov, 2011/06/16
    - Re: [avr-libc-dev] avr-libc: optimization for ltoa/printf, Joerg Wunsch, 2011/06/16
    - Re: [avr-libc-dev] avr-libc: optimization for ltoa/printf, Dmitry E. Oboukhov, 2011/06/17
- Re: [avr-libc-dev] avr-libc: optimization for ltoa/printf, Paulo Marques <=
  - Re: [avr-libc-dev] avr-libc: optimization for ltoa/printf, Dmitry E. Oboukhov, 2011/06/20
  - Re: [avr-libc-dev] avr-libc: optimization for ltoa/printf, Dmitry E. Oboukhov, 2011/06/20

Prev by Date: Re: [avr-libc-dev] AVR-libc-dev Digest, Vol 103, Issue 1
Next by Date: Re: [avr-libc-dev] avr-libc: optimization for ltoa/printf
Previous by thread: Re: [avr-libc-dev] avr-libc: optimization for ltoa/printf
Next by thread: Re: [avr-libc-dev] avr-libc: optimization for ltoa/printf
Index(es):
- Date
- Thread