octave-maintainers

Re: C++ version of regexprep.cc


From: Paul Kienzle
Subject: Re: C++ version of regexprep.cc
Date: Tue, 2 May 2006 23:21:57 -0400


On May 2, 2006, at 7:45 PM, David Bateman wrote:

Paul Kienzle wrote:

On May 2, 2006, at 10:57 AM, David Bateman wrote:

I just noticed that you didn't say whether this improved the speed of your xml code sufficiently, or whether there is another speed problem elsewhere.

mat2cell is slower than I expect. It needs an OCTAVE_QUIT in its loop.

The particular file that I'm trying to load is 500 kB, with about 6000 separate values, so 12000 separate open/close tags, and 24000 elements in the partition. I was expecting this to take a couple of seconds but instead it takes 2 1/2 minutes.

Running some tests, the behaviour of mat2cell is quadratic over [1000,10000]. Looking at the code I can't tell why.

You can try running 'speed' on it (I'm not tunneling X so I can't right now):

speed("v=mat2cell(s,1,p);","s=repmat('a',1,n);p=ones(1,n);",10000);
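If 'speed' is not convenient (for example without a display), a rough check of the scaling can be done with plain tic/toc. This is only a sketch: the sizes below are arbitrary choices, not taken from the thread, and single-run timings are noisy. Doubling n should roughly double the time if mat2cell is linear and roughly quadruple it if it is quadratic.

```octave
% Time mat2cell on a 1-by-n char row split into n one-character cells.
sizes = [1000 2000 4000 8000];   % assumed test sizes, each double the last
t = zeros(size(sizes));
for k = 1:numel(sizes)
  n = sizes(k);
  s = repmat('a', 1, n);
  p = ones(1, n);
  tic;
  v = mat2cell(s, 1, p);
  t(k) = toc;
end
% Ratio of successive timings: near 2 suggests linear, near 4 quadratic.
ratios = t(2:end) ./ t(1:end-1);
disp([sizes(:), t(:)]);
disp(ratios);
```

Repeating the measurement a few times and taking the minimum for each size gives steadier ratios.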


mat2cell is still fast compared to the for loop in xml2mat which builds the cell structure from the xml.

At this point I'm going to declare defeat and say that xml2mat won't run on octave with large files in reasonable time.

- Paul

Paul,

I think I've eliminated the quadratic behavior of mat2cell with the attached version. I now get

octave:1> n = 1000;
octave:2> s = repmat('a',1,n);p=ones(1,n);
octave:3> tic;v=mat2cell(s,1,p);toc
Elapsed time is 0.021194 seconds.
octave:4> tic;v=mat2cell(s,1,p);toc
Elapsed time is 0.013100 seconds.
octave:5> n = 5000;
octave:6> s = repmat('a',1,n);p=ones(1,n);
octave:7> tic;v=mat2cell(s,1,p);toc
Elapsed time is 0.044708 seconds.
octave:8> n = 10000;
octave:9> s = repmat('a',1,n);p=ones(1,n);
octave:10> tic;v=mat2cell(s,1,p);toc
Elapsed time is 0.088394 seconds.

which is much more respectable. Does this help your xml2mat code, or are there other points that are slow?

The splitting step is now a much more reasonable 2.5 seconds.

The remainder of the file is messy for loops, which could
probably be made faster if the code were restructured.

The code is full of puzzles such as the following:

   iis='i1';
   for ii=2:length(w.size);
     iis=[iis,',i',num2str(ii)];
   end
   nn=prod(w.size); %number of elements
   eval(['[',iis,']=ind2sub(w.size,[1:nn]);']); % generation of indexes
   iis='i1(ind)';
   for ii=2:length(w.size);
     iis=[iis,',i',num2str(ii),'(ind)'];
   end % indexes of indexes
   for ind=1:nn
     eval(['valor(',iis,')=tag_contents{i,4}(ind);'])
   end
   if exist('valor')==1;
     tag_contents{i,4}=valor;
     clear valor;
   end

Which as far as I can tell is just:

   tag_contents{i,4} = reshape(tag_contents{i,4},w.size);

The latter runs much faster 8-)
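The equivalence can be checked on a small case. The sketch below uses made-up stand-ins (sz for w.size, x for tag_contents{i,4}) and writes the subscripts out by hand for three dimensions instead of building them with eval. It assumes the size vector has at least two entries, since reshape needs at least two dimensions.

```octave
sz = [2 3 4];                 % hypothetical stand-in for w.size
x  = 1:prod(sz);              % hypothetical stand-in for tag_contents{i,4}

% Element-by-element fill, as the eval loop does, with explicit subscripts.
[i1, i2, i3] = ind2sub(sz, 1:prod(sz));
valor = zeros(sz);
for ind = 1:prod(sz)
  valor(i1(ind), i2(ind), i3(ind)) = x(ind);
end

% One-step version: reshape fills in the same column-major order.
r = reshape(x, sz);
assert(isequal(valor, r));
```

Both versions walk the linear indices 1:nn in column-major order, which is why the loop reduces to a single reshape.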

With the above transformation, I can now get almost all the way
through in reasonable time.

Now it is hitting a 'max recursion limit exceeded' message in
code even uglier than the above.  In addition, I believe this
code relies on the fact that fields in a structure are ordered.

So again I declare defeat and say this code cannot be made to
run in Octave.

Thanks for trying.

        - Paul


