octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: benchmarks - sort


From: David Bateman
Subject: Re: benchmarks - sort
Date: Mon, 19 Jan 2004 17:34:37 +0100
User-agent: Mutt/1.4.1i

According to Paul Kienzle <address@hidden> (on 01/19/04):
> If you are going to implement it, keep compatibility
> for complex sort, since any code that uses it is going
> to depend on the order.
> 
> Complex sort could be used to match complex pairs.
> Chances are, though, that it is not going to be very
> reliable since the complex pairs may have differing
> magnitudes due to machine precision issues.  Has
> anyone actually seen complex sorts in the wild?
> 
> It is tempting to just punt on  complex sort and tell
> the users to use the following instead:
> 
>       [junk,idx] = sortrows([abs(x(:)),arg(x(:))])
>       y = x(idx);
> 
> That way they will know they are doing a meaningless
> operation.

Yes, but matlab has it so, if we do this we introduce another
annoying incompatiability....

> A couple of other points:
> 
> (1) Octave's existing sort is stable (i.e., equal values
> retain their order).

Is, I thought the divide and conquer algorithm it uses makes
it inevetiable that there are misorderings. In any case, radix
sorts are also stable, so the order is also maintained.

The complex ordering in octave doesn't respect the values phase
however.

> (2) Make sure to check the case of a sorted or almost
> sorted array.  These occur quite a bit in existing code
> (e.g., lookup and hist).

This is not so easy to do in radix sort, as there is no comparisons in
the code per-se. Its all done with table, so you tested for almost sorted
arrays would just significantly slow things down. Radix-sorts also loose
out in cases where there are lots of identical values in the array. But hey,
I've gained a factor of 8 in speed, so I think you still win with my version
of the code.

> BTW, the python folk are quite proud of their sort.  A description of  
> the algorithm is here:
> http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Objects/ 
> listsort.txt?rev=1.1&content-type=text/vnd.viewcvs-markup
> 
> And the code is here:
> http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/ 
> Objects/listobject.c?rev=2.131&content-type=text/vnd.viewcvs-markup

Well, except for the case of partially sorted lists the Radix sort should
be faster as it breaks the O(lg(n!)) lower bound on comparison based sorts.
In fact it runs in linear time, and so it will be increasing better than 
a comparison based sort as n increases. Where the radix sort will loose
out is small partially sorted lists. The question is how small is small and
what structure in the original array we can exploit.

It will also be determined by the cost of the compare operation. It seems
that what is really costing us is the check to for NaN's...

Anyway, need more work to find the best solution. It might be a case of find
the break point where a radix sort is faster than a comparison sort and 
using the faster of the two algorithms...

Regards
David

-- 
David Bateman                                address@hidden
Motorola CRM                                 +33 1 69 35 48 04 (Ph) 
Parc Les Algorithmes, Commune de St Aubin    +33 1 69 35 77 01 (Fax) 
91193 Gif-Sur-Yvette FRANCE

The information contained in this communication has been classified as: 

[x] General Business Information 
[ ] Motorola Internal Use Only 
[ ] Motorola Confidential Proprietary



reply via email to

[Prev in Thread] Current Thread [Next in Thread]