From: Daniel J Sebald
Subject: Re: bin2dec behavior different from Matlab?
Date: Fri, 16 Mar 2012 13:12:06 -0500
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.24) Gecko/20111108 Fedora/3.1.16-1.fc14 Thunderbird/3.1.16

On 03/16/2012 12:52 PM, Rik wrote:
> On 03/16/2012 04:53 AM, Jordi Gutiérrez Hermoso wrote:
>> On 16 March 2012 01:30, Daniel J Sebald <address@hidden> wrote:
>>> This change happened with changeset 11172:7e8ce65f73cf, shown here:
>>>
>>>   http://hg.savannah.gnu.org/hgweb/octave/rev/7e8ce65f73cf
>>>
>>> which overhauled the string number conversion routines. (Maybe the intent was to move white space removal inside base2dec.) In my opinion, tossing away white space makes better sense than treating white space as zeros. I also suggest adding a test to the file. There currently are several, but none that test the behavior of white space. E.g.,
>>>
>>>   %!assert (bin2dec ("1 0 1"), 5)
>>
>> This seems like a clear unintentional regression. Can you prepare a changeset?
>
> 3/16/12
>
> Daniel,
>
> The change is unintentional; however, the fix is not absolutely straightforward. Matlab's version of the function is quite limited in that it works on just a single string. Octave's version also accepts character matrices (one string per row) and cell arrays of strings. In base2dec the space character is treated as 0 so that the following will work:
>
>   bin2dec ([" 101"; "1101"])
>
> For the first string in the char matrix, the leftmost value is 2^3 (determined by position) * 0 (value of space), so that any spaces used for padding out a character matrix do not influence the resulting conversion.

I was wondering if this sort of construct tied in somehow. The thing is, that use of strings is sort of falling out of favor because character arrays were always a little clumsy to work with. The cell array construct is preferred.
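
To make the quoted padding argument concrete, here is a minimal sketch that does the positional arithmetic by hand instead of calling base2dec. The variable names are illustrative only, and the explicit mapping of a space to 0 is an assumption that mirrors the behaviour described in the quote, not a copy of the actual base2dec.m code.

```octave
# Character matrix: the first row is left-padded with one space.
s = [" 101"; "1101"];

# Convert characters to digit values. A space would come out as -16,
# so map it to 0 explicitly (assumed to mirror the quoted description).
d = double (s) - double ("0");
d(s == " ") = 0;

# Positional weights 2^3, 2^2, 2^1, 2^0 as a column vector.
n = columns (s);
w = 2 .^ (n-1:-1:0)';

# The leading space contributes 2^3 * 0, so padding is harmless.
padded = d * w;

# The same rule applied to a single string with interior spaces:
# the spaces now occupy real digit positions (digits 1 0 0 0 1).
t = "1 0 1";
dt = double (t) - double ("0");
dt(t == " ") = 0;
m = columns (t);
interior = dt * (2 .^ (m-1:-1:0)');
```

Here `padded` comes out as [5; 13], while `interior` comes out as 17 rather than the 5 that the proposed `%!assert` expects. That difference is the regression under discussion.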
> The code for 3.2.4 handled "1 0 1" by using a for loop over every row of the character matrix to remove spaces BEFORE calling base2dec (str, 2). Looping in Octave is abysmally slow and not the way to do this. Even if we have to go back to some sort of looping structure, it would be better to use regexprep, arrayfun, or cellfun.

Well, my philosophy for optimization is to optimize the method which is the most likely to be used. Since the character matrix approach isn't so common anymore, optimizing that is lower priority in deference to a more often used technique. Plus, the for loop is only slow if it is in fact used. Also, if someone is using the string matrix approach, I would guess there is somewhere else in the person's code that is just as inefficient as the bin2dec and base2dec routines might be on character matrices.
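
As a rough sketch of the loop-free alternative mentioned in the quote, the spaces can be stripped with regexprep on a cell array of strings before conversion. This is only one possible arrangement, not the code from any Octave release, and it assumes that bin2dec accepts a cell array of strings, as stated earlier in the thread.

```octave
# Character matrix with interior and leading spaces.
s = ["1 0 1"; " 1101"];

# One string per cell; cellstr drops trailing blanks only.
c = cellstr (s);

# Delete every space in each string without an explicit loop.
c = regexprep (c, ' ', '');

# Convert the cleaned cell array of strings.
v = bin2dec (c);
```

After the regexprep call, `c` holds "101" and "1101", so the conversion should give [5; 13] whether or not padding spaces are treated as zeros.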
> One option might be to write a custom string function that both right justifies and removes excess spaces. This would replace this part of base2dec.m starting at line 84:
>
>   ## Right justify the values before anything else.
>   s = strjust (s, "right");

Rather than redundantly calling strjust and then s = s(s != ' '), using only the latter would be better.
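
For a single string, the indexing expression above can be sketched as follows. This is a minimal illustration, not a proposed patch.

```octave
s = "1 0 1";

# Logical indexing keeps only the non-space characters.
s = s(s != " ");

v = bin2dec (s);
```

Here `s` becomes "101" and `v` is 5. Note that for a multi-row character matrix the same expression returns a single column of characters in column-major order, so the rows would presumably need to be handled separately (for example as a cell array of strings).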
> This hypothetical function, strjustsqueeze [justify and squeeze], can be based on strjust.m, which is efficient. Perhaps you can investigate and propose some possible solutions and we can take it from there.

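
One way the hypothetical strjustsqueeze might look is sketched below. It is an illustration built on plain indexing (cumsum, find, sub2ind), not on the actual contents of strjust.m, and every name other than strjustsqueeze is invented for the example.

```octave
1;  # script-file marker so the function below can be defined inline

function out = strjustsqueeze (s)
  # Hypothetical helper: drop all spaces in each row of the char
  # matrix S, then right-justify the remaining characters.
  nr = rows (s);
  nonbl = (s != " ");

  # Number of non-blank characters at or to the right of each position.
  cnt = fliplr (cumsum (fliplr (double (nonbl)), 2));

  # Output width is the largest non-blank count over all rows.
  width = max (sum (nonbl, 2));

  # Target column of every character (only used where nonbl is true).
  newcol = width - cnt + 1;

  # Row index of every non-blank character, in column-major order.
  [r, c] = find (nonbl);

  # Start from an all-blank matrix and place the non-blank characters.
  out = repmat (" ", nr, width);
  out(sub2ind ([nr, width], r, newcol(nonbl))) = s(nonbl);
endfunction

m = strjustsqueeze (["1 0 1"; " 1101"]);
single = strjustsqueeze ("1 0 1");
```

With this sketch, `m` is [" 101"; "1101"] and `single` is "101", so only left padding remains, which the space-as-zero rule in base2dec then ignores. Whether this is faster than the cellstr/regexprep route would need to be measured.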
I would say to simplify as much as possible with string cell functions in mind as the most likely scenario. I would guess that putting the white space removal inside base2dec makes the most sense if there are other string/number conversion routines utilizing base2dec. Otherwise, leave it in bin2dec with the idea that base2dec remains lean and fast.
Dan