[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
bug#7182: sort -R slow
From: |
Davide Brini |
Subject: |
bug#7182: sort -R slow |
Date: |
Sat, 9 Oct 2010 22:06:28 +0100 |
On Sat, 9 Oct 2010 14:52:41 +0200 Ole Tange <address@hidden> wrote:
> I recently needed to randomize some lines. So I tried using 'sort -R'.
> I was astonished how slow that was. So I tested how slow a competing
> strategies are. GNU sort is two magnitudes slower than unsort and more
> than one magnitude slower than perl:
>
> $ time unsort file
> real 0m1.388s
>
> $ unsort --version
> unsort 1.1.2
>
> $ time perl -e 'print sort { rand() <=> rand() } <>' file
> real 0m6.621s
>
> $ time sort -R file
> real 4m8.403s
>
> $ sort --version
> sort (GNU coreutils) 8.5
>
> What is even scarier: sort without -R is faster than sort -R:
>
> $ time sort file
> real 0m53.553s
>
> I would expect sort -R to be faster than sort and faster than Perl if
> not as fast as unsort.
On my system, locale settings seem to impact the runtime significantly:
$ wc -l bigfile
1000000 bigfile
$ time LC_ALL=en_US.utf8 sort -R bigfile > /dev/null
real 1m29.302s
user 1m21.009s
sys 0m0.155s
$ time LC_ALL=C sort -R bigfile > /dev/null
real 0m38.881s
user 0m35.276s
sys 0m0.118s
However, shuf is much faster, and seems mostly unaffected by the locale
used:
$ time shuf bigfile > /dev/null
real 0m1.044s
user 0m0.833s
sys 0m0.042s
--
D.