octave-maintainers

Re: Rethinking octave_idx_type


From: Daniel J Sebald
Subject: Re: Rethinking octave_idx_type
Date: Sat, 26 Nov 2016 14:25:56 -0600

On 11/26/2016 01:07 PM, Bernardo Sulzbach wrote:
On 11/25/2016 03:27 PM, Michael D Godfrey wrote:

Instead, it seems that we could define octave_idx_type to be ssize_t
(or ptrdiff_t, I think they are equivalent in practice).  Then things
like fread, fwrite, or simple element-by-element array operations that
don't require BLAS or LAPACK functions could work on larger arrays.


This appears to be a significant improvement.


This would also fix numel() calls returning overflowed values for big
matrices on 64-bit systems.

... something like 8-bit image data could fit in 8-32 GB of memory and still overflow the index. But if the programmer writes y = f(x), there needs to be enough memory for two such large matrices. Otherwise one has to write x = f(x), because operating on individual elements using indexing is certainly slow.

Rik's concern about the speed of 32-bit indexing vs. 64-bit indexing is a good one. The answer is somewhat CPU- and bus-dependent, but it certainly seems to me that speed takes precedence over the rarer case of an 8+ GB matrix/vector. There are certainly quite common cases where the user has more than 8 GB of data to process, but typically that is done not by bringing the whole data record in at once, but by processing it in blocks. E.g., the filter routines return a state vector that can be passed back in for the next block of data, etc. Being crafty about processing data efficiently, in both CPU and memory, is what it's about.
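To make the block-processing idea concrete, here is a minimal C++ sketch. It is not Octave's filter code: `lowpass_block` is a hypothetical name and the one-pole low-pass is just a stand-in for any recursive filter. What it shows is that a state value carried from one block to the next gives the same output as filtering the whole record at once, while only one block has to be in memory (and the block is overwritten in place, in the spirit of x = f(x)).

```cpp
#include <cassert>
#include <vector>

// Hypothetical one-pole low-pass: y[n] = a*y[n-1] + (1-a)*x[n].
// The block is filtered in place; `state` holds y[n-1] on entry and
// the last output sample on exit, so it can be fed to the next call.
void lowpass_block(std::vector<double>& block, double a, double& state)
{
  assert(a >= 0.0 && a < 1.0);  // keeps the recursion stable
  for (double& v : block) {
    state = a * state + (1.0 - a) * v;
    v = state;
  }
}
```

Calling it once per block with the same `state` variable, starting from 0, produces the same samples as a single call on the concatenated data.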

I wonder if it makes sense to have 64-bit indexing be a different Octave type, because internally Octave would know when > 32-bit indexing is needed, e.g., ones(2^17). But that seems unnecessary too.

In fact, let's see what this does:

octave:9> x = ones(bitshift(int64(1), 17));
error: out of memory or dimension too large for Octave's index type
octave:9> x = ones(bitshift(int64(1), 16));
error: out of memory or dimension too large for Octave's index type
octave:9> x = ones(bitshift(int64(1), 15));

OK, after the last command my system essentially tried to swap out all of its memory and is only now gradually coming back to life.
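The arithmetic behind those three results can be sketched in C++, assuming this build uses a 32-bit signed index and 8-byte doubles. The helper names are made up for illustration and are not Octave functions. An n-by-n matrix has n*n elements, so n = 2^17 and n = 2^16 give 2^34 and 2^32 elements, both above the 32-bit limit of 2^31 - 1, while n = 2^15 gives 2^30 elements, which fits the index but needs 8 GiB of storage.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: element count of an n-by-n matrix.
std::uint64_t square_numel(std::uint64_t n)
{
  assert(n <= 0xFFFFFFFFull);  // guarantees n * n fits in 64 bits
  return n * n;
}

// True when the element count is representable in a 32-bit signed index.
bool fits_int32_index(std::uint64_t numel)
{
  return numel <= static_cast<std::uint64_t>(
                    std::numeric_limits<std::int32_t>::max());
}

// Bytes needed to hold `numel` doubles (no overflow check; fine for
// the sizes discussed here).
std::uint64_t bytes_as_double(std::uint64_t numel)
{
  return numel * sizeof(double);
}
```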

I like the error message. Perhaps it would make sense to differentiate those two conditions, i.e., rather than say it is this or that, something like

error: out of memory

vs.

error: dimension too large for Octave's index type (rebuild using option --index64)

Of course, one of these has to take precedence under the condition they are both true. I think the only way to test how much memory is available in a system is to actually try the malloc, right? So maybe the latter should take precedence.
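A rough C++ sketch of that ordering, under my own assumptions rather than Octave's actual allocation code: `AllocStatus` and `try_alloc_doubles` are hypothetical names, and the index limit defaults to the 32-bit signed maximum. The index test is pure arithmetic, so it runs first and takes precedence; the memory test is an actual allocation attempt.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <new>

enum class AllocStatus { ok, dimension_too_large, out_of_memory };

// Hypothetical check for a rows-by-cols matrix of doubles.
AllocStatus try_alloc_doubles(
  std::uint64_t rows, std::uint64_t cols,
  std::int64_t max_index = std::numeric_limits<std::int32_t>::max())
{
  assert(max_index > 0);
  const std::uint64_t limit = static_cast<std::uint64_t>(max_index);

  // Index test first. Dividing instead of multiplying avoids overflow
  // in rows * cols itself.
  if (rows != 0 && cols > limit / rows)
    return AllocStatus::dimension_too_large;

  const std::uint64_t numel = rows * cols;
  if (numel == 0)
    return AllocStatus::ok;

  // The byte count must also fit in size_t before we ask for it.
  if (numel > std::numeric_limits<std::size_t>::max() / sizeof(double))
    return AllocStatus::out_of_memory;

  // Memory test: the only portable probe is to try the allocation.
  double* p = new (std::nothrow) double[static_cast<std::size_t>(numel)];
  if (p == nullptr)
    return AllocStatus::out_of_memory;

  delete[] p;  // a real caller would keep the buffer
  return AllocStatus::ok;
}
```

One caveat: on systems that overcommit memory (Linux by default), the allocation can succeed even when the pages cannot all be backed, so the out-of-memory branch may not fire until the memory is actually touched.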

My point is that, with the above, only the users who know they need 64-bit indexing would build Octave that way, assuming that 64-bit indexing takes a CPU hit compared to 32-bit indexing.

Dan

PS:  This struck me as odd:

octave:1> x = ones(2^33);
octave:2>
octave:2> size(x)
ans =

   0   0


