bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] sort: Add --threads option, which parallelizes internal sort


From: Ralf Wildenhues
Subject: Re: [PATCH] sort: Add --threads option, which parallelizes internal sort.
Date: Sat, 4 Apr 2009 08:38:01 +0200
User-agent: Mutt/1.5.18 (2008-05-17)

Hello Paul, Glen, Jim,

[ I already wrote this in part to Glen off-list, sorry for the
  duplication ]

* Paul Eggert wrote on Fri, Apr 03, 2009 at 09:57:54PM CEST:
> Of course we cannot reasonably expect this one performance improvement
> to make 'sort' run 16x faster on a 16-CPU machine.

No.  Let's discuss the 1 vs. 2 threads issue for now: the part running
concurrently takes O(N log N) work.  The parts not running concurrently
are: startup code, file reading, list splitting, list merging, and
output.  All of the latter are O(N) or less, with N being the input size.
The concurrent parts do not require any communication/synchronization
between the 2 threads.

This means that a parallelization which doesn't hit the memory bus
bandwidth limit in the concurrent stage, has to have (slowly) improving
1:2 thread behavior when N grows.  But the experiment does not show
that.  This indicates a possible issue.  To exclude bandwidth issues,
one can try to run this code on a system which has as big as possible
memory busses, and e.g., run the threads on separate processors (as
opposed to only separate cores or HT threads).

I'm not trying to veto this patch, it is certainly an improvement over
not having it, and improving things in steps is good; but I'm sure it
could be better even without parallelizing the merge part of the
algorithm.

I already noted off-list to Glen that you too can get an account on the
GCC compile farm, which has some decent multi-way systems.  But for the
above issue a bigger system isn't necessary.

Cheers,
Ralf




reply via email to

[Prev in Thread] Current Thread [Next in Thread]