Re: feature request: gzip/bzip support for sort
From: Jim Meyering
Subject: Re: feature request: gzip/bzip support for sort
Date: Tue, 16 Jan 2007 13:20:16 +0100
Dan Hipschman <address@hidden> wrote:
> Here's the patch for comments. Thanks,
I tried it and did some timings.
Bottom line: with a 4+GB file on a dual-processor machine, I see a 19% speed-up,
but I think most of the savings come from reduced I/O.
--------------------------------------------
Little difference for a 324MB file on a uniprocessor amd-64 3400: the gzip
run was about 8% slower in elapsed time (1:48.67 vs. 1:40.35).
The input was created like this:
$ seq 99999 > k
$ cat k k k k k k k k k k k k k k k k k k k k k k k k > j
$ mv j k
$ cat k k k k k k k k k k k k k k k k k k k k k k k k > j
$ shuf < j > sort-in
$ /usr/bin/time ./sort --compress=gzip < sort-in > out
100.11user 4.69system 1:48.67elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (1major+761524minor)pagefaults 0swaps
$ /usr/bin/time ./sort < sort-in > out
93.16user 3.35system 1:40.35elapsed 96%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+137435minor)pagefaults 0swaps
-----------------------------------------
A similar test, but with a 4.6GB file, on a dual-processor machine with
2GB of RAM. The file was created like this:
$ seq 9999999 > k              # 10M lines / 78888888 bytes
$ cat k k k k k k k k k > j    # 90M lines
$ cat j j j j j j j > k        # 630M lines / 4.62 GB
$ mv k sort-in
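
The sizes quoted for that input can be checked with a little arithmetic. The sketch below assumes a POSIX shell with awk and 64-bit shell arithmetic; `seq_bytes` is a hypothetical helper introduced here, not part of coreutils, and it only handles an upper bound made entirely of nines.

```shell
# Bytes written by `seq N` when N is all nines (e.g. 9999999):
# there are 9 * 10^(d-1) numbers with d digits, and each one takes
# d digits plus one newline.
seq_bytes() {
  awk -v n="$1" 'BEGIN {
    total = 0
    for (d = 1; d <= length(n); d++)
      total += 9 * 10 ^ (d - 1) * (d + 1)
    printf "%d\n", total
  }'
}

base=$(seq_bytes 9999999)   # size of k after the seq step
final=$((base * 9 * 7))     # 9 copies of k into j, then 7 copies of j
echo "$base $final"         # prints: 78888888 4969999944
```

4969999944 bytes is a little over 4.6 GiB, in line with the figure noted next to the last cat command.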
What does /tmp look like a few minutes into the --compress=gzip run?
$ du -sh /tmp/sort*
11M /tmp/sort0Sdnkk
11M /tmp/sort6cTaAE
11M /tmp/sortABogDY
...
Contrast that with the temp file sizes during the run with no compression:
216M /tmp/sort5NTqQu
216M /tmp/sortAjx50R
216M /tmp/sortKvyGIT
...
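
As a rough sketch of what those two listings imply, the ratio of the temp file sizes can be computed as below. It assumes each temp file holds the same amount of sorted input in both runs, which the listings suggest but do not prove; `ratio` is a hypothetical helper name.

```shell
# Ratio of uncompressed to compressed temp file size, in the
# (rounded) megabytes reported by `du -sh`.
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

ratio 216 11   # prints: 19.6
```

The result is only approximate because du -sh rounds its output, and input made up solely of decimal digits and newlines compresses far better than typical data would.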
$ /usr/bin/time ./sort -T /tmp --compress=gzip < sort-in > out
1535.33user 71.76system 27:15.72elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (1major+4985207minor)pagefaults 0swaps
$ /usr/bin/time ./sort -T /tmp < sort-in > out
./sort: write failed: /tmp/sortieA1nv: No space left on device
Command exited with non-zero status 2
588.79user 17.76system 17:20.70elapsed 58%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (2major+225191minor)pagefaults 0swaps
[Exit 2]
$ df -hT /tmp
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda3 reiserfs 12G 6.4G 5.4G 55% /
$ /usr/bin/time ./sort -T . < sort-in > out
754.03user 38.99system 33:42.35elapsed 39%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (2major+210437minor)pagefaults 0swaps
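So, with just one trial each, I see a 19% speed-up: 27:15.72 elapsed with
--compress=gzip versus 33:42.35 without. Note that the uncompressed run
had to use -T . because /tmp ran out of space.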
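
For reference, the 19% figure follows from the elapsed times that /usr/bin/time reported for the two runs that completed. The sketch below assumes a POSIX shell with awk; `to_seconds` and `percent_saved` are hypothetical helpers, and they only understand the M:SS.ss form shown above, not H:MM:SS.

```shell
# Convert an elapsed time of the form M:SS.ss into seconds.
to_seconds() {
  echo "$1" | awk -F: '{ printf "%.2f\n", $1 * 60 + $2 }'
}

# Percentage of elapsed time saved by the first run relative to the
# second. A negative result means the first run was slower.
percent_saved() {
  awk -v new="$(to_seconds "$1")" -v old="$(to_seconds "$2")" \
    'BEGIN { printf "%.1f\n", (1 - new / old) * 100 }'
}

percent_saved 27:15.72 33:42.35   # gzip vs. plain, 4+GB file; prints: 19.1
```

Applied to the 324M test (percent_saved 1:48.67 1:40.35), the same calculation gives -8.3, that is, the compressed run was slower there.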