Built-in base2dec and dec2base
From: Daniel J Sebald
Subject: Built-in base2dec and dec2base
Date: Sun, 29 Jul 2012 12:46:03 -0500
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.24) Gecko/20111108 Fedora/3.1.16-1.fc14 Thunderbird/3.1.16
Rik,
I've done a first pass of built-in versions of base2dec() and
dec2base(), and I put a patch in the features portion of SourceForge here:
https://sourceforge.net/tracker/?func=detail&aid=3551454&group_id=2888&atid=352888
Although the code is fairly well organized and functional, I'm just
becoming familiar with the classes and conventions for return values,
and there will need to be another pass to get the variable types
correct. A particular snag right now is accessing the Octave values as
long ints as opposed to doubles. This became evident from the smart
hunk of test code:
% test
s0 = "";
for n = 1:13
  for b = 2:16
    pp = dec2base (b^n+1, b);
    assert (dec2base (b^n, b), ['1',s0,'0']);
    assert (dec2base (b^n+1, b), ['1',s0,'1']);
  endfor
  s0 = [s0,'0'];
endfor
which originally failed with a combination of n and b corresponding to
2^24, i.e., the limit of the single-precision float mantissa (24 bits).
I understand the classes, but it is a case of finding the right member
functions to do the trick.
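A rough C++ sketch of why the test trips at 2^24, assuming that some intermediate value passes through a single-precision float (the patch itself may differ). A float carries a 24-bit significand, so 2^24 + 1 rounds to 2^24, whereas a digit loop on a 64-bit integer stays exact. The names to_base_u64 and through_float are hypothetical and do not come from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: exact conversion of a non-negative integer to a
// digit string in the given base (2..36), using integer arithmetic only.
std::string to_base_u64 (std::uint64_t value, unsigned base)
{
  assert (base >= 2 && base <= 36);
  const char *symbols = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
  std::string digits;
  do
    {
      // Prepend the least significant remaining digit.
      digits.insert (digits.begin (), symbols[value % base]);
      value /= base;
    }
  while (value != 0);
  return digits;
}

// Hypothetical helper: round-trip a value through a single-precision
// float to show where precision is lost.
std::uint64_t through_float (std::uint64_t value)
{
  float f = static_cast<float> (value);
  return static_cast<std::uint64_t> (f);
}
```

With these helpers, to_base_u64 (16777217, 2) ends in '1', while converting through_float (16777217) first drops that trailing 1, which is the kind of mismatch the b^n+1 assertion catches.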
Also, there is a slight amount of code duplication I'd like to reduce.
I'd like to experiment with strings as well. Somehow it seems like
there is a slight performance loss due to using the strings class, but I
may find it not worth optimizing that portion of things.
However, I will need to set this aside for a couple weeks. If you would
like to tweak some things and put a new patch on SourceForge, feel free.
OK, so here are some performance results using a vector of size 10e6
(strings or numbers, depending upon base2dec vs. dec2base). The numbers
are seconds of CPU time.
COMMAND                       BUILT-IN   CURRENT SCRIPT
                              VERSION    VERSION
__________________________    ________   ______________
bin2dec(<char matrix>)        0.13398    0.55891
bin2dec(<cell vector>)        0.21897    1.7447
hex2dec(<char matrix>)        0.14298    0.55192
hex2dec(<cell vector>)        0.22097    1.7277
base2dec(<char mat>, '01')    0.19697    0.52792
base2dec(<cell vec>, '01')    0.28396    1.7387
dec2bin(<int vector>)         0.22697    1.1598
dec2bin(<cell vector>)        0.23996    3.5465
dec2hex(<int vector>)         0.11998    0.30195
dec2hex(<cell vector>)        0.13898    2.7226
dec2base(<int vec>, '01')     0.22497    1.1548
dec2base(<cell vec>, '01')    0.23996    3.5535
Furthermore, here are some related times:
cellstr(<char matrix>)        0.86387
num2cell(<int vector>)        0.062990
Some observations:
1) The improvement is roughly 2.5 to 3 times at minimum (dec2hex and
base2dec on non-cell input). In the case of cells, where the speedup
ranges from about 6 to 20 times, the built-in version is cooking with gas.
2) In theory the string-based version of base2dec should be fastest
because there are no ASCII tests to deal with. But I think the string
classes are a bit slower than the raw C strings. This could be
optimized further, but it is to the point where base2dec is so fast that
its cost is small compared to other string manipulations.
3) The built-in version brings the times down such that cellstr() stands
out as a critical time. It does no processing, yet is roughly three to
four times slower than base2dec(), which does processing. I'm going to
look into cellstr() at a later time. (I see now how "is_cellstr" checks
the cache, as mentioned at OctConf 2012.)
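Observation 2 can be illustrated with a rough C++ sketch; this is not the patch's actual code. With a fixed base such as hex, each character goes through ASCII range tests, whereas a caller-supplied symbol string allows one table lookup per character. The function names and the 256-entry table are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Decode one hex digit with ASCII range tests; returns -1 if invalid.
int hex_digit_by_tests (char c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  return -1;
}

// Build a 256-entry table mapping each symbol to its digit value
// (-1 marks characters that are not in the symbol string).
std::array<int, 256> make_symbol_table (const std::string& symbols)
{
  std::array<int, 256> table;
  table.fill (-1);
  for (std::size_t i = 0; i < symbols.size (); i++)
    table[static_cast<unsigned char> (symbols[i])] = static_cast<int> (i);
  return table;
}

// Convert a digit string using the table; returns false on an invalid
// symbol.  No range tests are needed, only one lookup per character.
bool base2dec_by_table (const std::string& s,
                        const std::array<int, 256>& table,
                        unsigned base, std::uint64_t& result)
{
  assert (base >= 2);
  result = 0;
  for (char c : s)
    {
      int d = table[static_cast<unsigned char> (c)];
      if (d < 0)
        return false;
      result = result * base + static_cast<unsigned> (d);
    }
  return true;
}
```

The table is built once per call to the conversion routine, so its cost is amortized over all strings in the input. Whether this beats the range tests in practice depends on how the characters are accessed, which is the string-class overhead mentioned above.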
Questions (and I know the answer is "compatibility"):
1) In dec2base, if the input is a cell array, the output is a character
matrix. I would think that a cell array of strings, without the leading
zero padding, would be preferred... unless LEN is set.
2) In base2dec, if the input is a cell array row, the output is a cell
array column. My first inclination was to make the dimensions match
(until I tried the test case that checks this at which point I changed
dimensions). To me, it seems the logical thing to do with a cell array
is keep the output dimensions the same as the input dimensions. That is
sort of the point of cells.
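To make the padding difference in question 1 concrete, here is a small C++ sketch under stated assumptions: a vector of strings stands in for both the cell array and the rows of a character matrix, and min_len plays the role of LEN. The function names are hypothetical and do not reflect the patch or Octave's API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Exact integer-to-string conversion for bases 2..36.
std::string to_base (std::uint64_t value, unsigned base)
{
  assert (base >= 2 && base <= 36);
  const char *symbols = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
  std::string digits;
  do
    {
      digits.insert (digits.begin (), symbols[value % base]);
      value /= base;
    }
  while (value != 0);
  return digits;
}

// Cell-style result: one string per input value, no padding.
std::vector<std::string>
dec2base_cell (const std::vector<std::uint64_t>& values, unsigned base)
{
  std::vector<std::string> out;
  for (std::uint64_t v : values)
    out.push_back (to_base (v, base));
  return out;
}

// Char-matrix-style result: every row is left-padded with '0' to a
// common width, which is at least min_len.
std::vector<std::string>
dec2base_matrix (const std::vector<std::uint64_t>& values, unsigned base,
                 std::size_t min_len = 0)
{
  std::vector<std::string> rows = dec2base_cell (values, base);
  std::size_t width = min_len;
  for (const std::string& r : rows)
    width = std::max (width, r.size ());
  for (std::string& r : rows)
    r = std::string (width - r.size (), '0') + r;
  return rows;
}
```

For the values 1, 5 and 255 in base 2, the cell-style result holds "1", "101" and "11111111", while the matrix-style result pads the first two rows to eight characters.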
Dan