octave-maintainers

Re: Slowness in function 'open'


From: Jordi Gutierrez Hermoso
Subject: Re: Slowness in function 'open'
Date: Fri, 22 Jun 2007 21:37:24 -0500

On 22/06/07, John W. Eaton <address@hidden> wrote:
> On 22-Jun-2007, Jordi Gutierrez Hermoso wrote:
>
> | I can't quite understand what they're doing at first glance. But if
> | the problem is simply to read a matrix ASCII file, looks like there
> | might be a better way to do it.
>
> Before comparing speed, let's make sure we are talking about the same
> function and that they handle the same set of requirements.

I made some changes:

    http://platinum.linux.pl/~jordi/octave/matrix-read-example.tar.gz

The only additions are in the operator>>(...) function for matrices in
linalg.cpp. My function is still slightly faster than Octave's, but it
naturally slowed down a little:

    octave2.9:3> tic, load A.m; toc
    Elapsed time is 7.072884 seconds.
    octave2.9:4> tic, system("foo"); toc
    Elapsed time is 2.766716 seconds.

> The function that reads matrices must
>
>   1. ignore comments (beginning with '%' or '#') anywhere on the line,
>      not just when those characters appear in the first column

Check.

>   2. ignore commas that separate numeric entries

Check, I think. What about whitespace, though? Should a comma be
treated exactly like whitespace? I mean, is this a valid ASCII matrix
file

   1, , , ,, 2
   3 4

and equivalent to [1 2; 3 4]? If so, I need to think a bit harder
about how to tell a stringstream that a comma should also be treated
as whitespace.
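
One possible way to make a stream treat commas as whitespace is to imbue it with a custom std::ctype<char> facet. The following is only a sketch: the names comma_is_space and parse_row are hypothetical and do not come from the tarball above, and it assumes that runs of commas such as ", , ,," really should collapse into a single separator.

```cpp
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical facet: the classic "C" classification table, except that
// ',' is additionally classified as whitespace.
struct comma_is_space : std::ctype<char> {
    comma_is_space() : std::ctype<char>(make_table()) {}

    static const mask* make_table() {
        // Copy the classic table once, then mark ',' as a space character.
        static std::vector<mask> table(classic_table(),
                                       classic_table() + table_size);
        table[static_cast<unsigned char>(',')] |= space;
        return &table[0];
    }
};

// Hypothetical helper: extract every number on one line of text.
std::vector<double> parse_row(const std::string& line) {
    std::istringstream in(line);
    // The locale takes ownership of the facet and deletes it when done.
    in.imbue(std::locale(in.getloc(), new comma_is_space));
    std::vector<double> row;
    double x;
    while (in >> x)
        row.push_back(x);
    return row;
}
```

With this facet, the two lines of the example file above would parse as the rows {1, 2} and {3, 4}. Whether empty fields between commas ought to be an error instead is a separate design question that this sketch does not address.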

>   3. ensure that all rows have the same number of elements and produce an
>      error otherwise

Check. I already had this one, of course.

>   4. handle reading Inf and NaN values even when the underlying C/C++
>      library does not

Oy. That's complicated. Why do we need that? You mean we can't assume
std::numeric_limits<double>::quiet_NaN() will yield the value we want?
A standards-abiding C++ implementation has to report through the
constants std::numeric_limits<double>::has_quiet_NaN and is_iec559
whether it has decided to not adhere to IEC 559. Are there still any
major C++ implementations out there for which this matters? Surely
even the PlayStation 3 adheres to IEEE arithmetic, doesn't it? That's
about the weirdest platform I can remember on which someone has tried
to run Octave as reported on the help list. :-)


> | One pass should suffice. You can read all the data into an
> | std::list, which is what I do, and there are a few simple checks to
> | make sure that the formatting of the ASCII matrix is correct.
>
> How much extra memory is required to manage each element of a
> std::list object?

An std::list is almost always implemented as a doubly-linked list (the
C++ standard specifies that finding an element in the list must be
O(n), for example). This means that besides all the doubles it has to
store, it has two store 2n pointers to doubles. On a 64-bit system
where I think pointers are the size of doubles, this may mean that the
list itself occupies 3 times the size of the matrix data it's storing.
Perhaps this is unacceptable, but on the other hand, you can pop
elements from the list as you're reading it, so it's only a temporary
storage space.
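
The factor-of-three estimate can be checked with a rough model of a list node. The struct below is a hypothetical stand-in, not the node type of any particular standard library, and the figures ignore whatever bookkeeping the allocator adds per node.

```cpp
#include <cstddef>

// Hypothetical stand-in for a doubly-linked list node holding one double.
struct node_sketch {
    node_sketch* prev;
    node_sketch* next;
    double value;
};

// Estimated bytes for n elements kept in such a list
// (allocator overhead per node is ignored).
inline std::size_t list_bytes(std::size_t n) {
    return n * sizeof(node_sketch);
}

// Bytes for the same n doubles in contiguous storage.
inline std::size_t array_bytes(std::size_t n) {
    return n * sizeof(double);
}
```

On a platform where both pointers and doubles are 8 bytes, sizeof(node_sketch) would be 24, which matches the 3x figure above.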

Can you give me a quick description of how Octave matrices are
represented internally? Are they contiguous in memory? Column-major
Fortran order? Is reshaping a matrix O(1)? If the std::list overhead
is unacceptable, we could use the same basic algorithm I wrote and
manipulate raw memory and then shape that memory into an Octave matrix
after reading the file. In C++, std::vector is the way to do such
raw-memory manipulation, but then I don't see how to avoid the
double-pass problem. Maybe read from the system the size of the file
to read from and use that as an estimate of how much space to allocate
with an std::vector.

The only std::list-specific functionality my code uses is that it
avoids the resizing problem that you solved with a double-pass. In a
way, it does this by hogging up space. Maybe there's a better
tradeoff.
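
As one possible tradeoff, here is a single-pass sketch that appends to a std::vector instead of a std::list. It relies on push_back growing the buffer in amortized constant time, with an optional size hint (for example, derived from the file size) passed to reserve. The names ascii_matrix and read_matrix are hypothetical; the data is stored row-major as read, and comma handling is left out.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical result type: elements stored row-major, in reading order.
struct ascii_matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> data;
    ascii_matrix() : rows(0), cols(0) {}
};

// Read a whitespace-separated matrix in one pass.
// size_hint is an optional estimate of the number of elements.
ascii_matrix read_matrix(std::istream& in, std::size_t size_hint = 0) {
    ascii_matrix m;
    m.data.reserve(size_hint);

    std::string line;
    while (std::getline(in, line)) {
        // Drop a comment starting with '%' or '#' anywhere on the line.
        std::string::size_type pos = line.find_first_of("%#");
        if (pos != std::string::npos)
            line.erase(pos);

        std::istringstream row(line);
        std::size_t n = 0;
        double x;
        while (row >> x) {
            m.data.push_back(x);
            ++n;
        }
        // Extraction stopped before the end of the line: bad token.
        if (!row.eof())
            throw std::runtime_error("non-numeric entry in matrix file");

        if (n == 0)
            continue;  // blank or comment-only line
        if (m.rows == 0)
            m.cols = n;
        else if (n != m.cols)
            throw std::runtime_error("rows differ in length");
        ++m.rows;
    }
    return m;
}
```

If the hint is too small, the vector reallocates and copies as it grows, so peak memory can briefly exceed the final size. Whether that is better or worse than the list overhead depends on the file sizes involved.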

> | I try to always avoid macros, for instance. To
> | paraphrase Stroustrup, macros are a deficiency in the code, the coder,
> | or the coding language. ;-)
>
> I guess maybe C/C++ is the wrong language for implementing Octave
> then.  Either that, or you are saying that I'm deficient.

No, no. You're not deficient, not as I've seen so far. ;-) In C, since
it's *meant* to be ugly, you're supposed to use macros. In C++, you're
not, but sometimes it is unavoidable, especially for
implementation-specific details where the specific implementation of
C++ itself forces you to use a macro. However, C++ has acknowledged
deficiencies; after all, this is why they're trying to come up with a
new standard by 2009. It is indeed quite possible that with the newer
C++ standard Octave may still need a few macros to get things working.
I'm personally really looking forward to the new standard; a lot of my
pet peeves with C++ that Boost is fixing are scheduled to be
completely fixed by 2009.

- Jordi G. H.

