Re: feature request: gzip/bzip support for sort
From: Dan Hipschman
Subject: Re: feature request: gzip/bzip support for sort
Date: Mon, 15 Jan 2007 12:33:04 -0800
User-agent: Mutt/1.5.9i

On Sat, Jan 13, 2007 at 10:07:59PM -0800, Paul Eggert wrote:
> 3. I can see where the user might be able to specify a better
> algorithm, for a particular data set. For that, how about if we have
> a --compress-program=PROGRAM option, which lets the user plug in any
> program that works as a pipeline? E.g., --compress-program=gzip would
> use gzip. The default would be to use "PROGRAM -d" to decompress; we
> could have another option if that doesn't suffice.
>
> An advantage of (3) is that it should work well on two-processor
> hosts, since compression can be done in one CPU while sorting is done
> on another. (Hmm, perhaps we should consider forking even if we use a
> built-in default compressor, for the same reason.)
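
For illustration only, the pipeline idea quoted above might look roughly like this. It is a Python sketch, not sort's C code: the helper names `compress` and `decompress` are hypothetical, gzip is assumed to be on PATH, and "PROGRAM -d" is taken as the decompress invocation, as proposed.

```python
import subprocess


def compress(program: str, data: bytes) -> bytes:
    """Feed data to PROGRAM on stdin and return what it writes to stdout."""
    result = subprocess.run([program], input=data,
                            stdout=subprocess.PIPE, check=True)
    return result.stdout


def decompress(program: str, data: bytes) -> bytes:
    """Undo compress() by running "PROGRAM -d" (the proposed default)."""
    result = subprocess.run([program, "-d"], input=data,
                            stdout=subprocess.PIPE, check=True)
    return result.stdout


if __name__ == "__main__":
    # Stand-in for the contents of one temp file (assumed sample data).
    lines = b"pear\napple\nfig\n" * 1000
    packed = compress("gzip", lines)
    assert decompress("gzip", packed) == lines
    print(len(lines), "->", len(packed), "bytes")
```

Any filter that reads stdin and writes stdout (bzip2, for instance) should slot in the same way, provided it accepts -d.
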
I've started working on this, and have made good progress so far. There
are a lot of subtleties, though, like making sure the forked child
doesn't receive SIGINT and unlink all our temp files before it execs
(I've solved that problem), and making sure the compress process
finishes compressing the temp file before the corresponding decompress
process starts processing it (I've got a plan for that). Anyway, my
point is, I've gotten off to a good start, but it's going to take a lot
of testing to make sure I've done it right due to all these race
conditions.
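
To make the two races above concrete, here is a minimal Python sketch (not the actual patch, and not sort's C code). It assumes a POSIX system with gzip on PATH; `spawn`, `wait_ok` and `roundtrip` are hypothetical names. SIGINT is blocked across fork so the child cannot run an inherited cleanup handler before it execs, and the compressor is waited for before the decompressor is started on the same temp file.

```python
import os
import signal
import tempfile


def spawn(argv, stdin_fd, stdout_fd):
    """Fork and exec argv, with SIGINT blocked across the fork/exec window."""
    # Block SIGINT before forking, so the child cannot run a handler
    # inherited from the parent (e.g. one that unlinks temp files).
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGINT})
    try:
        pid = os.fork()
        if pid == 0:
            try:
                # Child: drop the inherited handler first, then restore the
                # mask, so a pending SIGINT simply terminates the child.
                signal.signal(signal.SIGINT, signal.SIG_DFL)
                signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
                os.dup2(stdin_fd, 0)
                os.dup2(stdout_fd, 1)
                os.execvp(argv[0], argv)
            finally:
                os._exit(127)  # reached only if setup or exec failed
        return pid
    finally:
        # Parent: restore the original signal mask.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


def wait_ok(pid):
    """Wait for the child and report whether it exited with status 0."""
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0


def roundtrip(program, data):
    """Compress data into a temp file, then decompress it again."""
    with tempfile.TemporaryFile() as raw, \
            tempfile.TemporaryFile() as packed, \
            tempfile.TemporaryFile() as out:
        raw.write(data)
        raw.flush()
        os.lseek(raw.fileno(), 0, os.SEEK_SET)
        # Wait for the compressor to finish before anything reads its output.
        if not wait_ok(spawn([program], raw.fileno(), packed.fileno())):
            raise RuntimeError("compress failed")
        os.lseek(packed.fileno(), 0, os.SEEK_SET)
        if not wait_ok(spawn([program, "-d"], packed.fileno(), out.fileno())):
            raise RuntimeError("decompress failed")
        os.lseek(out.fileno(), 0, os.SEEK_SET)
        return out.read()
```

Waiting right after each spawn is the simplest way to order the two processes, but it gives up the overlap between compressing and sorting that makes the idea attractive on multi-CPU hosts; a real implementation would presumably defer the wait until just before the decompressor for that file is launched.
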
The compression ratio is obviously a lot better with gzip or bzip2,
and it shouldn't be hard to extend the code so sort can read and write
externally compressed files, which is what the OP wanted. The sort
isn't any faster on my machine, though (not even close). Of course,
I've only got one CPU, and a slow one at that :-)
Dan