octave-maintainers
New importdata function testing


From: Rik
Subject: New importdata function testing
Date: Sun, 21 Oct 2012 09:07:37 -0700

10/20/12

Erik,

I did just a small test with importdata and it doesn't seem to work as
expected.

For a file, I used import.tst containing

1,2,3
4,5,6

And then in Octave, I used
importdata ('import.tst', ',')
warning: unrecognized escape sequence '\S' -- converting to 'S'
ans =

   NaN   NaN   NaN
   NaN   NaN   NaN

The warning is a bad sign, and the returned data is certainly wrong: the
result should be the 2x3 matrix [1 2 3; 4 5 6], not all NaN.
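
A warning of this form is what Octave prints when a double-quoted string
contains a backslash sequence it does not recognize, because double-quoted
strings process escapes and single-quoted strings do not.  Whether that is
the cause inside importdata is an assumption, since the implementation is
not shown here.  The sketch below only illustrates the quoting difference
for a regexp pattern; the variable names are hypothetical.

```octave
% Single quotes keep the backslash, so regexp receives the pattern \S+
pat_single = '\S+';

% In double quotes the backslash itself must be escaped; writing "\S+"
% instead would trigger the "unrecognized escape sequence" warning and
% hand regexp the pattern S+.
pat_double = "\\S+";

% Both spellings produce the same three-character pattern.
same = strcmp (pat_single, pat_double);

% With the backslash intact, runs of non-whitespace are matched.
tok = regexp ('12 34', pat_single, 'match');
```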

I am also concerned that the implementation reads the entire file into a
string and then uses a number of for loops and regexp calls, which will be
slow in Octave.  I did a benchmark with the following:

x = rand (1e4, 10);
dlmwrite ('tst.csv', x, ',')
tic; y = dlmread ('tst.csv', ','); toc
Elapsed time is 0.209933 seconds.
tic; y = importdata ('tst.csv', ','); toc
Elapsed time is 3.2 seconds.

I believe it would be faster to have importdata check only the header lines
and then hand the numeric work off to dlmread where possible.  dlmread is
written in C++ and, per the benchmark above, is very fast.  This will work
whenever there are only header lines and column labels.  When there are row
labels the situation becomes messy, but I think you can still be faster by
avoiding loops entirely.  One solution would be to split the long string
returned from fileread into a character array or a cell array and then use
arrayfun or cellfun with a custom function which removes the row label and
then calls sscanf on the remaining piece of string.
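
A minimal sketch of both ideas follows.  The helper names
count_header_lines and parse_labeled_rows are hypothetical and are not part
of importdata or of Octave.  The sketch assumes a single-character
delimiter, treats the first line whose fields are all numeric as the start
of the data, and assumes every row carries exactly one leading label in the
row-label case.

```octave
1;  % mark this file as a script so the functions below can be defined here

% Hypothetical helper: count leading lines that are not fully numeric.
function n = count_header_lines (fname, delim)
  n = 0;
  fid = fopen (fname, 'r');
  if (fid < 0)
    error ('count_header_lines: cannot open %s', fname);
  endif
  while (true)
    line = fgetl (fid);
    if (! ischar (line))
      break;                      % end of file reached
    endif
    fields = strsplit (line, delim);
    if (all (! isnan (str2double (fields))))
      break;                      % first fully numeric row: data starts here
    endif
    n++;
  endwhile
  fclose (fid);
endfunction

% Hypothetical helper: strip one leading row label per line, then let
% sscanf convert the rest.  No explicit for loop is used.
function data = parse_labeled_rows (txt, delim)
  lines = strsplit (strtrim (txt), "\n");
  fmt = ['%f' delim];
  rows = cellfun (@(s) sscanf (s(strfind (s, delim)(1)+1:end), fmt).', ...
                  lines, 'UniformOutput', false);
  data = vertcat (rows{:});
endfunction
```

With only header lines and column labels, the read then reduces to one
dlmread call that skips the counted rows, for example
dlmread (fname, ',', count_header_lines (fname, ','), 0).  With row labels,
parse_labeled_rows (fileread (fname), ',') covers the messier case.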

Overall, I think the function should eventually be in core Octave, but it
needs more testing and refining.  If that work is going to be done
immediately then we can keep it where it is.  Otherwise, I would move it to
Octave-Forge until it can graduate to core Octave.

--Rik

