Re: bin2dec and cellfun improvements


From: Daniel J Sebald
Subject: Re: bin2dec and cellfun improvements
Date: Tue, 20 Mar 2012 03:29:24 -0500
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.24) Gecko/20111108 Fedora/3.1.16-1.fc14 Thunderbird/3.1.16

John,

I've had this problem of slow cell construction in the back of my mind. Please let me know whether the following idea sounds plausible, or whether there is perhaps a more elegant solution.

Recall that constructing cell variables seems to take quite a long time, as in this example:

On 03/18/2012 03:33 PM, Daniel J Sebald wrote:

octave:8> cpustart = cputime(); x = repmat('abcd', 1e7, 1); cputime() - cpustart
ans = 0.035994
octave:9> cpustart = cputime(); y = cellstr(x); cputime() - cpustart
ans = 11.118

I had suggested a sort of brainstorming idea in my second observation of that email.


2) cellstr seems to be way slower than it needs to be. Again, I don't know
the innards, but my gut feeling is that it should be doable much more
quickly. I suppose each cell has to have its own memory allocation for
potential upcoming modifications to individual cells. C memory
allocation isn't flexible (it can only malloc and free big hunks of
memory). Does C++ memory allocation bring anything to the table? That
is, does using "new" on an object of size, in this case, 1e7 allow one to
inherently compute the pointers to individual items, treat them
independently, and then reassign memory? (I'm guessing perhaps not, because
it would be like creating Swiss cheese out of the memory that the core OS
gave up.)
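
To make that brainstorm a little more concrete, here is a minimal C++ sketch (my own made-up example, nothing taken from Octave's sources): one big allocation from which every per-element pointer follows by simple arithmetic, so the allocator is hit once instead of 1e7 times.

#include <cstring>
#include <cstddef>

int main ()
{
  const std::size_t n = 10000000;     // 1e7 elements, as in the cellstr example
  const std::size_t len = 4;          // every string is "abcd"

  char *pool = new char [n * len];    // one request for the whole block

  for (std::size_t i = 0; i < n; i++)
    {
      char *elem = pool + i * len;    // pointer computed, not allocated
      std::memcpy (elem, "abcd", len);
    }

  delete [] pool;                     // the whole block goes back at once
  return 0;
}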

I wonder if this approach is still plausible even if C++ new/delete might not allow a single dynamic memory allocation for the whole cell group (1e7 of them) followed by fine control of reassigning cells. Imagine if a cell variable were allowed to manage its own pool of memory, in a way. When a cell is first created, the size of the initial memory is often known from context. For example, in the cellstr() case we know all cells are of char type and the string length is the same for all of them. When doing a general copy of a cell, we could index through the original cell variable and determine how much memory is needed. How about the following strategy (a rough C++ sketch follows the list):

1) When creating a cell variable, compute the overall amount of memory needed to accommodate every cell.

2) Request that amount of memory from the OS. (In the case of 1e7 elements, quite a big hunk.)

3) Treat that big hunk of memory like a local pool and compute the memory pointers for all the elements within that pool. In the case of cellstr(), this is an easy computation. In other cases (e.g., nested cells) it may not be so easy.

4) Given that all the pointers are assigned, fill in the data for the cell elements just as though each element's memory had been created with its own OS dynamic memory allocation. I think this is where the big CPU savings come in, i.e., no need to get memory 1e7 times, just once.

5) When later commands modify the contents of the cell in question, don't destroy the cell element's old memory. Instead, just leave it be and dynamically allocate new OS memory, just as is currently done (I presume).

6) When the cell variable is cleared from Octave, delete the big pool of memory as well as the dynamically allocated memory from any cells modified by the user. It would be like creating Swiss cheese out of the original block of memory, but that's OK because the whole cheese goes back to the operating system. (Need a new analogy.)
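
In code, the strategy might look roughly like the sketch below. This is just my own illustration (the class and names are made up, not Octave's actual Cell internals), under the assumption that all elements are fixed-length char strings as in the cellstr() case:

#include <cstring>
#include <cstddef>
#include <vector>

// Rough sketch of a cell-of-strings that owns a single memory pool.
// Steps 1-3 happen in the constructor, step 4 in fill (), step 5 in
// replace (), and step 6 in the destructor.  Copying/assignment and
// the lengths of replaced elements are not tracked; this only shows
// the allocation pattern.
class string_cell_pool
{
public:
  string_cell_pool (std::size_t n, std::size_t len)
    : nelem (n), elem_len (len),
      pool (new char [n * len]),   // steps 1-2: one big request from the OS
      elem (n)
  {
    for (std::size_t i = 0; i < n; i++)
      elem[i] = pool + i * len;    // step 3: pointers computed, not allocated
  }

  // Step 4: fill every element without touching the allocator again.
  void fill (const char *s)
  {
    for (std::size_t i = 0; i < nelem; i++)
      std::memcpy (elem[i], s, elem_len);
  }

  // Step 5: a later modification gets its own allocation; the old slot
  // in the pool is simply abandoned (a hole in the Swiss cheese).
  void replace (std::size_t i, const char *s, std::size_t len)
  {
    char *p = new char [len];
    std::memcpy (p, s, len);
    if (! in_pool (elem[i]))
      delete [] elem[i];           // element had already been replaced once
    elem[i] = p;
  }

  // Step 6: free the individually allocated elements, then give the
  // whole pool back to the OS in one shot.
  ~string_cell_pool ()
  {
    for (std::size_t i = 0; i < nelem; i++)
      if (! in_pool (elem[i]))
        delete [] elem[i];
    delete [] pool;
  }

private:
  bool in_pool (const char *p) const
  { return p >= pool && p < pool + nelem * elem_len; }

  std::size_t nelem;
  std::size_t elem_len;
  char *pool;
  std::vector<char*> elem;
};

Note that replace() never frees pool memory; the abandoned slot just sits there until the destructor returns the whole pool at once.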

Does this sound like too tricky a bit of programming? There would need to be a pointer to the original pool of memory, and probably a variable in each cell object indicating whether the object's memory resides in the local pool or came from the OS; new/delete would apply to the latter but not the former.
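
Instead of the address-range test used in the sketch above, that per-element variable could be an explicit flag, something like this (again just an illustration, not Octave's actual data structures):

#include <cstddef>

// Hypothetical per-element descriptor carrying the ownership flag.
struct cell_elem
{
  char *data;        // points into the shared pool, or at its own heap block
  std::size_t len;
  bool in_pool;      // true: storage belongs to the pool, never deleted here
};

// Conditional cleanup when an element is overwritten or the whole cell
// variable is cleared; the pool itself is freed separately, in one shot.
inline void release_elem (cell_elem& e)
{
  if (! e.in_pool)
    delete [] e.data;   // only individually allocated memory is freed here
  e.data = 0;
  e.len = 0;
}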

Dan

