octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: fast set operations / verdict


From: Levente Torok
Subject: Re: fast set operations / verdict
Date: Sat, 9 Aug 2008 22:09:01 +0200
User-agent: KMail/1.9.6 (enterprise 0.20070907.709405)

Hi Jaroslav,

Thanks for the fast repsonse.

I copied out sources from octave.3.0.1 to make this comparision. I suffexed 
them with number 3 for the experiment.
Please note that set complement is much much more expensive than intersection. 
Here are my measurements:

l=10000000;
octave:7> x= floor(l * rand(l,1)); y=floor(l * rand(l,1)) ; t=time; 
z=intersection2( x, y ); dt=time -t
dt = 37.698
octave:8> x= floor(l * rand(l,1)); y=floor(l * rand(l,1)) ; t=time; 
z=intersection3( x, y ); dt=time -t
dt = 90.695
ratio > 2.4

octave:14> l=50000;x= floor(l * rand(l,1)); y=floor(l * rand(l,1)) ; t=time; 
z=complement2( x, y ); dt=time -t
dt = 0.12633
octave:15> l=50000;x= floor(l * rand(l,1)); y=floor(l * rand(l,1)) ; t=time; 
z=complement3( x, y ); dt=time -t
dt = 57.126

The reason why I hassle with is that I need I need to load data from a database 
with bare columns (dataware house)
and I need to make the joint of these columns here in octave unfortunately.

Levente

On Saturday 09 August 2008, you wrote:
> On Sat, Aug 9, 2008 at 4:40 PM, Levente Torok <address@hidden> wrote:
> > Dear All,
> >
> > I was in the need to have a fast set operations such as intersection and 
> > complement but the
> > those that I could use with the current octave versions (<3.0) are very 
> > very slow to my needs.
> 
> Well, set operations are not a typical bottleneck in computations
> people employ Octave for, so there's not much demand for improvement.
> Still, 2.9.x series are quite old - benchmarking should be done
> against development version (or a recent snapshot at least) to be of
> any value. There have been improvements to set functions since (though
> mostly for functionality rather than performance) and to sorting
> (which may make a significant difference).
> 
> > For this reason I wrote a version in STL C++.
> 
> OK, but if I were to do that, I'd simply wrap set_intersection from 
> <algorithm>.
> 
> > May be not optimal in terms of strorage (I convert everything to STL vector 
> > since it has fast iterators)
> > however this is still 100-500 times faster for vectors with 10^5 elements, 
> > at least,
> > than the versions currently supplied by octave. I enclose the sources.
> >
> 
> If you have any interest, I can test this against development version
> if I have the time. 100-500 is too much; if this is still true, we
> should improve something.
> Also note that if the set sizes are not asymptotically equal, then
> your approach is sub-optimal. (Instead of O(N*log(N) + M*log(M)) you
> can easily get O(N*log(N)+M))
Sure. 
See the first 10 lines of my sources.
They solve this problem, I guess, with worst case n log n optimality.

Levente


-- 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]