Re: bin2dec behavior different from Matlab?

octave-maintainers


From:	Daniel J Sebald
Subject:	Re: bin2dec behavior different from Matlab?
Date:	Fri, 16 Mar 2012 22:15:26 -0500
User-agent:	Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.24) Gecko/20111108 Fedora/3.1.16-1.fc14 Thunderbird/3.1.16

On 03/16/2012 07:38 PM, Rik wrote:

On 03/16/2012 04:53 PM, Daniel J Sebald wrote:

One can do this.  In general, cellstr are slower than using indexing on
character arrays.  I tried the following and it works

s = char (strrep (cellstr (s), " ", ""));
s = strjust (s, "right");



Why is strjust necessary here?
If the white space is removed from the string the string will already be
justified as a consequence.  Remove the strjust() command and benchmark
again.

The algorithm depends on the character matrix being right justified.  The
char function produces a left-justified matrix.  Try 'char ("1", "111")' as
an example.
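
A minimal sketch of that example, assuming a stock Octave session (the variable names m and r are only illustrative): char pads the shorter row with trailing blanks, and strjust moves the padding to the front.

```octave
% char() pads shorter rows on the right, so the matrix is left-justified.
m = char ("1", "111");       % rows are "1  " and "111"

% strjust() shifts each row so that the padding ends up on the left.
r = strjust (m, "right");    % rows are "  1" and "111"

disp (m);
disp (r);
```

With the digits right-justified, column k counted from the right always carries the weight 2^(k-1), which is what a column-wise algorithm relies on.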

OK, we aren't thinking along the same lines. What I'm wondering is if there is some method of doing the bin2dec group of functions without the character matrix approach. With the advent of the cell array, the group of routines that worked with character strings in a matrix configuration sort of fell out of favor. So now people programming scripts might think in terms of a cell array of character strings of binary numbers. That data might come from a file or whatever; it's just that it is more convenient to work with strings contained in a cell array.

Also, strrep may not be so efficient because it is general.  It works
with two strings.  This process is only interested in the one character
' ', so the isspace or != test might prove much faster.

You can use indexing for deletion within ordinary arrays but not for cell
strings.  Try ' cstr = {"1 0 1"; "1"}; cstr(isspace(cstr)) = "" ' and it
will simply error out.  regexprep() would work but it is slower than strrep.
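
For reference, a small sketch of the two approaches that do work on a cell array of strings, assuming strrep and regexprep both accept cellstr input as they do in recent Octave versions. The variable names a and b are only illustrative, and no timing claim is made here.

```octave
cstr = {"1 0 1"; "1"};

% strrep() replaces every occurrence of one fixed string in each cell element.
a = strrep (cstr, " ", "");

% regexprep() does the same job through a pattern; '\s' matches any whitespace.
b = regexprep (cstr, '\s', '');

disp (a);
disp (b);
```

Both calls return a cell array of the same shape as cstr with the blanks removed from each element.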


Also, there may be a technique of using cellfun instead of converting
back to char that can save time.

I've benchmarked cellfun many times and it is slower than straight indexing.


There are a lot of optimization methods to explore here.

Feel free to improve the code.  It is available in Mercurial.  The
changeset is 14472:e995b1c97e13.

To create a test matrix I used

tvec = char (randi ([48 49], 1e6, 10));
tvec(randi(1e7, 1e6,1)) = " ";

which creates 1 million 10 digit binary numbers with about 10% of the
values being spaces.
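
A scaled-down sketch of the same construction, useful for checking the share of spaces without waiting on a million rows. The names rows, cols, and frac are only illustrative. Because randi can pick the same position more than once, the share should come out slightly below 10%, near 1 - exp(-0.1).

```octave
rows = 1e4;                                  % smaller than the 1e6 rows used above
cols = 10;

% Character codes 48 and 49 are "0" and "1".
tvec = char (randi ([48 49], rows, cols));

% Overwrite randomly chosen positions (linear indices, repeats possible).
tvec(randi (rows * cols, rows, 1)) = " ";

% Fraction of all characters that are now spaces.
frac = mean (tvec(:) == " ");
disp (frac);
```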

The input you are choosing is a character matrix. Let's also create the equivalent cell array of character strings:


ctvec = cellstr(tvec);

I'm saying that ctvec is more likely to be the user's starting point these days, and that converting ctvec to a character array might not be the way to go.

Unfortunately, I don't have the latest Octave and until I can get Mercurial working on my machine I can't do benchmark comparisons.

From my rough estimate, you have a fast machine. I have a quad-core Xeon running at 3 GHz and I'm not getting near the times you are with bin2dec. Perhaps there has been a lot of optimization in Octave over the past couple of versions.

Well, here is what I'm doing for a simple test, and you can experiment with this little bit of code on your machine. I'm attaching a script file called test_bin2dec.m which uses cellfun() to implement bin2dec. It is bare bones and doesn't do any sanity check that the characters are '0' or '1', but my point is to illustrate there is a different approach.

Now, if cellfun() turns out to be slightly slower, the clean code might give it an advantage. Or maybe we'll have to ask John to look at the internal cellfun() routine because the point of that routine is looping efficiency.
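
The attachment is not reproduced in this archive page, so the following is only a hypothetical stand-in for the idea, not the contents of test_bin2dec.m. The name sketch_bin2dec is made up, and polyval (Horner's rule) is swapped in here where the attached script apparently used power(). Like the original, it does no input validation.

```octave
% Hypothetical stand-in for a cellfun()-based bin2dec (not the attached file).
% For each string: drop the spaces, map the characters "0"/"1" to 0/1, and
% evaluate those digits as polynomial coefficients at x = 2.
sketch_bin2dec = @(c) cellfun (@(s) polyval (s(s != " ") - "0", 2), c);

d = sketch_bin2dec ({"1 01"; "1"; "1010"});
disp (d);
```

The result has the same shape as the input cell array, one number per string, and no intermediate character matrix or justification step is needed.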


Running this little script

  tvec = char (randi ([48 49], 1e6, 10));
  tvec(randi(1e7, 1e6,1)) = " ";
  ctvec = cellstr(tvec);

  cpuzero = cputime();
  junk = bin2dec(tvec);
  cputime() - cpuzero

  cpuzero = cputime();
  junk = test_bin2dec(ctvec);
  cputime() - cpuzero

on my machine produces

octave:19> tvec = char (randi ([48 49], 1e6, 10));
octave:20> tvec(randi(1e7, 1e6,1)) = " ";
octave:21> ctvec = cellstr(tvec);
octave:22>
octave:22> cpuzero = cputime();
octave:23> junk = bin2dec(tvec);
octave:24> cputime() - cpuzero
ans =  423.04
octave:25>
octave:25> cpuzero = cputime();
octave:26> junk = test_bin2dec(ctvec);
octave:27> cputime() - cpuzero
ans =  68.398

So two questions come to mind from this:

1) The cellfun() based approach is about six times faster than the version 3.2.4 approach here (423.04 versus 68.398 seconds of CPU time; granted, I left out several things), and the conversion ctvec = cellstr(tvec) is relatively fast compared to these benchmark times, so maybe converting to a cell array approach is better. (John's point, I believe.) There may be a better approach than the power() routine, but I was just trying to illustrate cellfun().


2) What machine are you using that is so fast?!

Dan

test_bin2dec.m
Description: Text Data
