Re: Built-in base2dec and dec2base


From: Daniel J Sebald
Subject: Re: Built-in base2dec and dec2base
Date: Sun, 19 Aug 2012 20:58:16 -0500
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.24) Gecko/20111108 Fedora/3.1.16-1.fc14 Thunderbird/3.1.16

Rik,

I've gone over the builtin base2dec/dec2base code and created a second version of the patch file.

https://savannah.gnu.org/patch/index.php?7825

Changes are:

1) Added checks for non-negative and non-fractional input.
2) Vary the width of the division (via a template) based on the maximum value, to improve efficiency (a rough sketch of the idea follows this list).
3) Modifications to ensure unsigned long values are handled properly.
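
To illustrate what (2) and (3) are getting at, here is a rough sketch of the approach. This is not the patch code itself; to_base and dec2base_one are made-up names, and the real implementation works in terms of Octave's array classes.

#include <algorithm>
#include <cstdint>
#include <string>

// Convert one value to the given symbol set by repeated division,
// templated on the integer width used for the divide.
template <typename T>
std::string
to_base (T value, unsigned int base, const std::string& symbols)
{
  std::string digits;
  do
    {
      T q = value / static_cast<T> (base);      // quotient
      T r = value - q * static_cast<T> (base);  // remainder via multiply/subtract
      digits.push_back (symbols[static_cast<std::size_t> (r)]);
      value = q;
    }
  while (value != 0);
  std::reverse (digits.begin (), digits.end ());
  return digits;
}

// Dispatch on the largest value in the input so that small inputs use a
// narrow (cheaper) divide; only values >= 2^32 take the 64-bit path.
// Using the fixed-width uint64_t (rather than "unsigned long") keeps the
// 64-bit case correct regardless of platform.
std::string
dec2base_one (std::uint64_t value, std::uint64_t max_value,
              unsigned int base, const std::string& symbols)
{
  if (max_value <= UINT8_MAX)
    return to_base (static_cast<std::uint8_t> (value), base, symbols);
  else if (max_value <= UINT16_MAX)
    return to_base (static_cast<std::uint16_t> (value), base, symbols);
  else if (max_value <= UINT32_MAX)
    return to_base (static_cast<std::uint32_t> (value), base, symbols);
  else
    return to_base (value, base, symbols);
}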

I think everything is covered now, and I've verified all the tests and a little more.

What I propose is that someone with a more advanced understanding of the Octave classes suggest ways to program this slightly better through code reuse or other member functions. John or you could also make some mods and create a new patch on Savannah. (I'd appreciate that, actually, as it would help me understand the Octave classes better, along with quirks such as having to work with "uint64" as opposed to "unsigned long" in order to get proper results.)
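
For what it's worth, one possible explanation for the uint64-vs-unsigned-long quirk (this is an assumption on my part, not something established above) is that "unsigned long" is only 32 bits wide on some platforms. A trivial check:

#include <climits>
#include <cstdint>
#include <iostream>

int main ()
{
  // "unsigned long" is 32 bits under LLP64 ABIs (e.g. 64-bit Windows)
  // but 64 bits under LP64 (typical 64-bit Linux), so it cannot be
  // relied on to hold values at or above 2^32.
  std::cout << "unsigned long: " << sizeof (unsigned long) * CHAR_BIT << " bits\n";
  // uint64_t (like Octave's uint64) is exactly 64 bits everywhere.
  std::cout << "uint64_t:      " << sizeof (std::uint64_t) * CHAR_BIT << " bits\n";
  return 0;
}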

I'm including some numbers below because the Savannah site tosses out extra spaces so columns don't come through.

Dan


PERFORMANCE
-----------
I've redone some performance tests and found roughly a 10% increase in the times for the built-in functions, due to the added checks described in (1) above. Here are some performance results using a 10e6-element vector (strings or random numbers, depending upon base2dec vs. dec2base). The figures are seconds of CPU time.

COMMAND                      BUILT-IN        CURRENT
                             VERSION         SCRIPT
                                             VERSION
_______                      ________        _______

bin2dec(<char matrix>)       0.15298         0.56591
bin2dec(<cell vector>)       0.24196         1.7647
hex2dec(<char matrix>)       0.15298         0.55892
hex2dec(<cell vector>)       0.23996         1.7747
base2dec(<char mat>, '01')   0.18997         0.53792
base2dec(<cell vec>, '01')   0.27496         1.7217

dec2bin(<int vector>)        0.25796         1.1488
dec2bin(<cell vector>)       0.26296         3.6314
dec2hex(<int vector>)        0.14098         0.30096
dec2hex(<cell vector>)       0.16097         2.7606
dec2base(<int vec>, '01')    0.25696         1.1498
dec2base(<cell vec>, '01')   0.25896         3.5345

Furthermore, here are some related times:

cellstr(<char matrix>)   0.82187
num2cell(<int vector>)   0.064990


DIVISION SIZE
-------------
I've also done some tests to check whether varying the division size is worth it. For example, to investigate 8 bits, I checked 2^8-1, 2^8 and 2^9. The idea is to get a feel for what portion of the increase in CPU consumption is due to the result being one character longer and what portion is due to the wider division: the step from 2^8-1 to 2^8 adds a character and widens the divide, while the step from 2^8 to 2^9 adds a character only.

Judging from the numbers, it seems clear that there is some increase attributable to the wider division, most noticeably at the transition from a 32-bit divide to a 64-bit divide.

Here are some performance results using a 10e6-element vector, but with non-random values chosen at the 8-, 16-, 32- and 64-bit transition boundaries. The figures are seconds of CPU time.

COMMAND                 BUILT-IN        CURRENT
                        VERSION         SCRIPT
                                        VERSION
_______                 ________        _______

dec2bin(2^8-1)          0.17397         0.68489
dec2bin(2^8)            0.19097         0.74089
dec2bin(2^9)            0.20797         0.88487
dec2bin(2^16-1)         0.25996         1.3968
dec2bin(2^16)           0.26996         1.3738
dec2bin(2^17)           0.28396         1.4608
dec2bin(2^32-1)         0.45993         2.6836
dec2bin(2^32)           0.58791         2.6416
dec2bin(2^33)           0.63190         2.6996
dec2bin(2^64-1)         1.1818          5.2662
dec2bin(2^64)           1.1788          5.2262
dec2bin(2^65)           1.1738          5.2612

dec2bin({2^8-1})        0.17297         3.2095
dec2bin({2^8})          0.18497         3.1865
dec2bin({2^9})          0.20297         3.2615
dec2bin({2^16-1})       0.26596         3.8494
dec2bin({2^16})         0.28996         3.7954
dec2bin({2^17})         0.28696         3.8954
dec2bin({2^32-1})       0.45493         5.1152
dec2bin({2^32})         0.59191         5.0582
dec2bin({2^33})         0.63190         5.3472
dec2bin({2^64-1})       1.1768          7.7158
dec2bin({2^64})         1.1788          7.6908
dec2bin({2^65})         1.1848          7.6748



On 07/29/2012 11:30 PM, Daniel J Sebald wrote:
I've done a first pass of built-in versions of base2dec() and
dec2base(), and I put a patch in the features portion of SourceForge

Carnë informed me I used the wrong tracker. (Thanks Carnë.) I moved the
patch to:

https://savannah.gnu.org/patch/index.php?7825

...

I tested a version of dec2base that uses the C++ library div() routine

http://www.cplusplus.com/reference/clibrary/cstdlib/div/

to generate the quotient and remainder at the same time. Although I thought it
might cut the computation some, it turns out to be slower than using a
plain divide followed by a multiply and subtract. My guess is that the
div() routine for some reason isn't optimized for the processor, or the
CPU doesn't produce the remainder as part of its divide instruction, so
div() just introduces the extra overhead of the div_t structure.

COMMAND                      BUILT-IN        BUILT-IN
                             DIVIDE          div_t
                             MULTIPLY        div()
_______                      ________        _______

dec2bin(<int vector>)        0.22697         0.25996
dec2bin(<cell vector>)       0.23996         0.28496
dec2hex(<int vector>)        0.11998         0.12998
dec2hex(<cell vector>)       0.13898         0.16198
dec2base(<int vec>, '01')    0.22497         0.24796
dec2base(<cell vec>, '01')   0.23996         0.28396
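
For reference, the two inner-loop variants being compared above would look roughly like this (a sketch only; the next_digit_* helpers are made-up names, not code from the patch):

#include <cstdlib>   // std::ldiv, std::ldiv_t

// (a) plain divide, then multiply-and-subtract to recover the remainder
inline long
next_digit_divmul (long& value, long base)
{
  long q = value / base;
  long r = value - q * base;
  value = q;
  return r;
}

// (b) the C library ldiv() (same family as div()/lldiv()), which returns
// quotient and remainder together in a struct; in the timings above this
// path was slower, presumably because of the extra struct round-trip.
inline long
next_digit_div (long& value, long base)
{
  std::ldiv_t qr = std::ldiv (value, base);
  value = qr.quot;
  return qr.rem;
}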

Dan


--

Dan Sebald
email: daniel(DOT)sebald(AT)ieee(DOT)org
URL: http://www(DOT)dansebald(DOT)com

