help-octave
Re: reading text data with textscan annoyingly slow


From: PhilipNienhuis
Subject: Re: reading text data with textscan annoyingly slow
Date: Wed, 2 Nov 2011 12:52:20 -0700 (PDT)

bpabbott wrote:
> 
> On Nov 2, 2011, at 6:40 AM, MarcelK wrote:
> 
>> http://octave.1599824.n4.nabble.com/file/n3972499/example1.ncf
>> example1.ncf 
>> 
>> Hi,
>> 
>> I'm using Octave 3.2.4 on Windows XP (i686-pc-mingw32),
>> with GUIOctave 1.5.3 as the frontend.
>> 
>> I'm having trouble reading data from a .ncf text file.
>> I've attached an example of such a file (I hope that worked).
>> It takes about 60 seconds to read a single ncf file in Octave,
>> whereas in Matlab it takes less than a second.
>> 
>> Here's the code I use to read the data in:
>> 
>> 
>> function [Date1,headlines,nummatrix] = ncfread (filename)
>> 
>> fid=fopen(filename,'r'); 
>> 
>> %# read data headers  
>> headerdata=fgets(fid);
>> index=findstr(headerdata,'}');
>> ncols=length(index); 
>> headlines={};
>> headlines(1)=headerdata(1:index(1));
>> for mm=2:ncols
>>        headlines(mm)=headerdata(index(mm-1)+1:index(mm));
>> endfor
>> 
>> textformat=['%s %s',repmat('%f',1,ncols-2)];
>> 
>> datacell=textscan(fid,textformat);
>> 
>> Date1=datacell{1,1}{1};
>> 
>> 
>> timedata=datacell{2};
>> 
>> fclose(fid);
>> 
>> %# generate time vector (time in hours)
>> t=zeros(numel(timedata),1);
>> timestring=char(timedata);
>> for jj=1:size(timestring,1)
>>    tstruct=strptime(timestring(jj,:),'%R');
>>        t(jj)=tstruct.hour+tstruct.min/60;
>> endfor
>> 
>> %# conversion cell>matrix
>> nummatrix=zeros(length(datacell{1}),size(datacell,2));
>> nummatrix(:,2)=t;
>> 
>> for ii=3:size(nummatrix,2)
>>      nummatrix(:,ii)=datacell{ii};
>> endfor
>> 
>> nummatrix(:,1)=[];
>> 
>> endfunction
>> 
>> 
>> My way of converting the time string (e.g. '10:00') to time in hours
>> (e.g. 10.00) seems quite complicated to me. Is there a better way
>> to achieve this?
>> 
>> Thanks in advance,
>> 
>> Marcel
> 
> Octave's textscan() is currently implemented as an m-file, while Matlab's
> has been written in c++. I expect large differences in speed. The
> developers are planning to implement Octave's textscan() in c++ as well.
> I'm optimistic the result will be very fast.
> 
> Even so, I am able to run your script in about 1 sec.
> 
>       tic (); ncfread ('example1.ncf'); toc()
>       Elapsed time is 1 seconds.
> 
> I'm running the developer's sources on MacOS, so it is possible that
> Octave's textread() has been improved or the slow performance is due to
> some problem between Octave and Windows.
> 
> I don't have an older copy of Octave to try, nor do I have a windows
> machine to work with.
> 
> Anyone else?
> 

There was a similar complaint some months ago about the string/text file
reading functions (in the bug tracker, I think).
Rik found that (IIRC) strtrim() was the culprit. After it was replaced,
execution times were much better.

I doubt that textscan.m/textread.m/strread.m from the development sources will
work with 3.2.4, so an Octave upgrade is needed in your case anyway.
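
To confirm which version a session is actually running before trying newer
m-files, a small check along these lines may help. This is only a sketch:
the '3.4.0' threshold and the variable names are assumptions for
illustration, not a tested minimum for the development textscan.m.

```octave
% Assumption: '3.4.0' is an illustrative threshold, not a tested minimum
min_version = '3.4.0';
running = version ();                 % version string of this session

% compare_versions returns true when "running >= min_version"
is_new_enough = compare_versions (running, min_version, '>=');

if is_new_enough
  msg = sprintf ('Octave %s: at or above %s', running, min_version);
else
  msg = sprintf ('Octave %s: older than %s, consider upgrading', ...
                 running, min_version);
end
disp (msg);
```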

You can try the 3.4.3 zip (7z) files (see
https://mailman.cae.wisc.edu/pipermail/octave-maintainers/2011-October/025505.html);
on my box these work with GUIOctave as well (but you need to explicitly set
gnuplot as graphics backend using "graphics_toolkit gnuplot").
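
On the time conversion question in the original post: the strptime loop
could probably be replaced by plain character arithmetic on the char matrix.
A minimal sketch, assuming every time string has exactly the form 'HH:MM'
(two digits each); the sample timedata below is made up and stands in for
datacell{2}, and the names d, hh and mm are only illustrative.

```octave
% Hypothetical sample standing in for datacell{2} in ncfread
timedata = {'10:00'; '10:15'; '23:30'};

% Assumption: every entry is exactly 'HH:MM' (two digits each)
timestring = char (timedata);            % N-by-5 char matrix
d = timestring(:, [1 2 4 5]) - '0';      % N-by-4 matrix of digit values

hh = d(:,1) * 10 + d(:,2);               % hours as numbers
mm = d(:,3) * 10 + d(:,4);               % minutes as numbers
t = hh + mm / 60;                        % time in hours, one value per row
```

This avoids one strptime call per row, which may matter for long files. It
would give wrong results for strings such as '9:05', so the format
assumption needs checking against the real data.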

Philip

--
View this message in context: 
http://octave.1599824.n4.nabble.com/reading-text-file-with-textscan-annoyingly-slow-tp3972499p3982636.html
Sent from the Octave - General mailing list archive at Nabble.com.

