From: Dan Sebald
Subject: [Octave-patch-tracker] [patch #8140] Speed up importdata() ASCII CSV processing using dlmread() as core
Date: Sat, 24 Aug 2013 04:31:11 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:18.0) Gecko/20100101 Firefox/18.0 SeaMonkey/2.15

Follow-up Comment #5, patch #8140 (project octave):

I suspected the existing header handling wasn't exactly right because it
seemed to assume too much about spaces and multiple lines.  I generally think
of CSV as having just one possible header line, so as a first start the patch
handles only one, barring any knowledge of how it is expected to behave more
generally.  This also hinges on how dlmread() handles headers--dlmread(), or
its core code, may need adjustment.  Nonetheless, the current header behavior
is obviated by what is in the patch.

I just ran the tests on the patched file.  6 of the 13 tests failed.

Two of the failed tests have to do with the multiple text lines for column
headers.  This probably shouldn't be too hard to fix, but I'm still wondering
if multiple lines of header text should be supported.

The row headers example fails because the first column is being read as NaN
instead of being skipped--since dlmread() allows specifying a range to
extract, I think I can fix that one easily enough.
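
As a rough sketch of the range idea (not what the patch does; the sample file
contents are made up for illustration), dlmread() accepts a zero-based
starting row and column, so the row-label column can be skipped at read time:

```octave
% Hypothetical sample file: the first column holds row labels, not numbers.
fname = tempname ();
fid = fopen (fname, "w");
fputs (fid, "row1,1,2,3\nrow2,4,5,6\n");
fclose (fid);

% Start reading at zero-based row 0, column 1, so the label column is
% never returned and cannot show up as a non-numeric placeholder.
data = dlmread (fname, ",", 0, 1);

unlink (fname);   % remove the temporary file
```

How importdata() would detect that the first column is a row header is left
open here; the sketch assumes that is already known.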

The next failure is the one that combines both column and row headers--no
surprise there.  The "exceptional values" test also fails; it has some comment
in there about not knowing whether exceptional values in CSV are supported.

The missing values test fails, but the data is read properly--the only problem
is that the return format, which depends on the number of output arguments,
isn't correct.

The very last one uses carriage return (\r) only for line breaks--dlmread()
fails on that and is creating complex numbers.  I think dlmread() should be
modified to support bare \r line breaks.  If not, then it is an easy fix of
replacing all \r by \n as a first step.  I don't like that fix though because
it necessitates scanning/processing the whole file in script code before
letting dlmread() do the work.  Right now, it is only if dlmread() creates
NaNs that the patch resorts to script code to do much of anything.
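
For reference, the script-level workaround would look roughly like the sketch
below.  It is only an illustration of the "replace \r by \n first" step, with
a made-up sample file, and it shows the cost mentioned above: the whole file
is read and rewritten before dlmread() ever sees it.

```octave
% Hypothetical sample file using bare carriage returns as line breaks.
fname = tempname ();
fid = fopen (fname, "w");
fputs (fid, "1,2,3\r4,5,6\r");
fclose (fid);

% Read the whole file as text and normalize the line endings.
txt = fileread (fname);
txt = strrep (txt, "\r\n", "\n");   % collapse CRLF first to avoid blank lines
txt = strrep (txt, "\r", "\n");     % then turn any remaining bare CR into LF

% Write the normalized text back out so dlmread() can parse it.
fid = fopen (fname, "w");
fputs (fid, txt);
fclose (fid);

data = dlmread (fname, ",");

unlink (fname);   % remove the temporary file
```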

I won't have time for a few days, but I'll create a patch that behaves
according to the tests whether we know they are right or wrong, but I'm going
to leave the \r test as is.

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/patch/?8140>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/



