coreutils

Re: Sort with header/skip-lines support


From: Pádraig Brady
Subject: Re: Sort with header/skip-lines support
Date: Fri, 11 Jan 2013 00:11:14 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120615 Thunderbird/13.0.1

On 01/10/2013 09:57 PM, Assaf Gordon wrote:
Hello,

I'd like to re-visit an old issue: adding header-line/skip-lines support to 
'sort'.

It has been discussed a few times in the past, but IMHO the suggested workarounds 
fall short:
1. Sometimes using 'bash' specific constructs [1]
2. No error checking (e.g. running head/tail/sed without checking for errors)
3. Using multiple input files is convoluted.
4. Suggestions work for regular files, but not for pipes [2].

The attached draft patch is based on Jim Hester's patch [3], rebased to the 
latest sort, with some fixes and tests.
It seems to work fine, except one glaring omission: it only works when output 
is STDOUT because creating the output file is a brute-force ugly hack.

The syntax is
   sort --skip-lines=N [other options]

That's a bit ambiguous and might suggest that the header line
was not output after the sort? Maybe keep consistent with
`join` and `numfmt` and use --header.


The two tests are:
   make check TESTS=tests/misc/sort-skip-lines SUBDIRS=.
   make check TESTS=tests/misc/sort-skip-lines-bigfiles SUBDIRS=. RUN_EXPENSIVE_TESTS=yes

If this is something you are willing to consider, I'm happy to hear comments 
and suggestions and improve it.

Alternatively, perhaps this is a good candidate for a "contrib" script, but I'm 
not sure how to go about developing a shell script that is POSIX compliant, has 
robust error checking, and is still a full 'drop-in' replacement for sort (many 
option combinations).

Thanks,
  -gordon

[1] - bash work-around: 
http://lists.gnu.org/archive/html/coreutils/2010-11/msg00084.html
[2] - no pipe support: 
http://lists.gnu.org/archive/html/bug-coreutils/2007-07/msg00215.html

Note the pipe issue might seem solvable with `stdbuf -i0 head ...`,
but head doesn't use stdio, so that won't work.
Recent sed can be used for this, though, like: `sed -u 1q`
http://git.sv.gnu.org/gitweb/?p=sed.git;a=commit;h=737ca5e
Note that commit is 4 years old, but only the recently released sed 4.2.2 
contains it.
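
As a rough sketch of that sed-based workaround (not a drop-in replacement
for sort): the helper name `sort_keep_header` below is hypothetical, and it
assumes GNU sed 4.2.2 or later, where `-u` reads just enough input for
`1q` to leave the rest of a pipe untouched for the next command.

```shell
# Hypothetical helper, not part of coreutils: pass the first line of
# stdin through unchanged, then sort the remaining lines.
# Assumes GNU sed >= 4.2.2, so that `sed -u 1q` does not read past the
# header line when stdin is a pipe.
sort_keep_header() {
    # Emit the header line; stop early if sed itself fails.
    sed -u 1q || return
    # Sort whatever is left on stdin. Arguments are meant to be sort
    # OPTIONS only (e.g. -n, -r, -k2,2); a file operand would make
    # sort ignore the rest of stdin.
    sort "$@"
}
```

It could be used as `printf 'name\nb\na\n' | sort_keep_header`, or with
sort options such as `sort_keep_header -r`. It only covers a single header
line read from stdin, so the multiple-input-file case is still not handled.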

[3] - Jim's patch: 
http://lists.gnu.org/archive/html/coreutils/2010-11/msg00091.html

Thanks for collating the previous threads on this subject.

I'm on the fence on how warranted this is TBH.
We'd need stronger arguments for it I think.

thanks,
Pádraig.