Re: C++ version of regexprep.cc
From: David Bateman
Subject: Re: C++ version of regexprep.cc
Date: Tue, 02 May 2006 21:39:50 +0200
User-agent: Thunderbird 1.5 (Windows/20051201)
Paul Kienzle wrote:
> On May 2, 2006, at 10:57 AM, David Bateman wrote:
>> I just noted, you didn't state whether this improved the speed of
>> your xml code sufficiently or not... Or whether there is another
>> speed problem elsewhere.
>
> mat2cell is slower than I expect. It needs an OCTAVE_QUIT in its loop.
>
> The particular file that I'm trying to load is 500 kb, with about 6000
> separate values, so 12000 separate open/close tags, and 24000 elements
> in the partition. I was expecting this to take a couple of seconds
> but instead it takes 2 1/2 minutes.
>
> Running some tests, the behaviour of mat2cell is quadratic over
> [1000,10000]. Looking at the code I can't tell why.
>
> You can try running 'speed' on it (I'm not tunneling X so I can't
> right now):
>
> speed("v=mat2cell(s,1,p);","s=repmat('a',1,n);p=ones(1,n);",10000);
>
> mat2cell is still fast compared to the for loop in xml2mat which
> builds the cell structure from the xml.
>
> At this point I'm going to declare defeat and say that xml2mat won't
> run on octave with large files in reasonable time.
I think we can do better than that. In matlab I get
>> n = 1000;
>> s = repmat('a',1,n);p=ones(1,n);
>> tic;v=mat2cell(s,1,p);toc
Elapsed time is 0.030060 seconds.
>> n = 5000;
>> s = repmat('a',1,n);p=ones(1,n);
>> tic;v=mat2cell(s,1,p);toc
Elapsed time is 0.148529 seconds.
>> n = 10000;
>> s = repmat('a',1,n);p=ones(1,n);
>> tic;v=mat2cell(s,1,p);toc
Elapsed time is 0.295874 seconds.
So pretty clearly linear time. In octave (though not on the same
machine, as I'm using the vpn to reach a machine that runs matlab) I get
octave:1> n = 1000;
octave:2> s = repmat('a',1,n);p=ones(1,n);
octave:3> tic;v=mat2cell(s,1,p);toc
Elapsed time is 0.096954 seconds.
octave:4> n = 5000;
octave:5> s = repmat('a',1,n);p=ones(1,n);
octave:6> tic;v=mat2cell(s,1,p);toc
Elapsed time is 0.715096 seconds.
octave:7> n = 10000;
octave:8> s = repmat('a',1,n);p=ones(1,n);
octave:9> tic;v=mat2cell(s,1,p);toc
Elapsed time is 2.910286 seconds.
so, as you say, pretty clearly quadratic time, and not that competitive
(though I am comparing a bi-proc 2.4GHz Xeon P4 running matlab against a
1.6GHz P4M running octave). This is all the more depressing as the
matlab code is an m-file...
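As a rough check on the "linear" and "quadratic" readings of the timings above, one can estimate an empirical scaling exponent from two measurements. This is only a sketch: it assumes the run time behaves roughly like t = c * n^k, and the helper name scaling_exponent is made up for illustration.

```cpp
#include <cmath>

// Empirical scaling exponent k, assuming t is roughly proportional to n^k.
// From t1 = c * n1^k and t2 = c * n2^k it follows that
//   k = log(t2 / t1) / log(n2 / n1).
// (Hypothetical helper, not part of Octave.)
double scaling_exponent(double n1, double t1, double n2, double t2)
{
  return std::log(t2 / t1) / std::log(n2 / n1);
}
```

With the timings quoted above, scaling_exponent(1000, 0.030060, 10000, 0.295874) comes out at about 1.0 for matlab, and scaling_exponent(5000, 0.715096, 10000, 2.910286) at about 2.0 for octave. Two points are a thin basis for a fit, so treat these as indicative only.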
I think I see what the issue is, though. We are reallocating a
ColumnVector in the interior of the loop from the input arguments. I
don't really see a good way around this, but it can be confirmed that
this is the issue: if we change the call to mat2cell a bit, so that the
ColumnVector is allocated only once, the quadratic behaviour goes away.
That is
octave:23> n = 1000;
octave:24> s = repmat('a',1,n);p=ones(1,n);
octave:25> tic;v=mat2cell(s.',p);toc
Elapsed time is 0.011963 seconds.
octave:26> n = 5000;
octave:27> s = repmat('a',1,n);p=ones(1,n);
octave:28> tic;v=mat2cell(s.',p);toc
Elapsed time is 0.035385 seconds.
octave:29> n = 10000;
octave:30> s = repmat('a',1,n);p=ones(1,n);
octave:31> tic;v=mat2cell(s.',p);toc
Elapsed time is 0.064461 seconds.
Which is much nicer. Can you use this to speed up your code? I'll see
what I can do to get the same speed-up in the other case.
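
To illustrate why allocating a vector inside the loop can turn a linear loop into a quadratic one, here is a minimal sketch. It is not the actual mat2cell code: std::vector stands in for ColumnVector, and the names Partition, copy_inside_loop and copy_outside_loop are invented for the example. Each function returns the number of elements copied so the cost is visible.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the partition vector; not Octave's ColumnVector.
using Partition = std::vector<double>;

// Slow pattern: a fresh n-element copy is made on every iteration, so the
// loop copies n * n elements in total, i.e. O(n^2) work.
std::size_t copy_inside_loop(const Partition& p)
{
  std::size_t copied = 0;
  for (std::size_t i = 0; i < p.size(); ++i)
    {
      Partition local(p);        // O(n) allocation and copy each time round
      copied += local.size();
      (void) local[i];           // the iteration only needs one element
    }
  return copied;
}

// Hoisted pattern: copy once before the loop, so O(n) work in total.
std::size_t copy_outside_loop(const Partition& p)
{
  Partition local(p);            // single O(n) allocation and copy
  std::size_t copied = local.size();
  for (std::size_t i = 0; i < local.size(); ++i)
    (void) local[i];
  return copied;
}
```

For a partition of 1000 elements the first version copies 1000000 elements and the second 1000, which is the kind of gap the timings above suggest.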
D.