bug-coreutils

bug#34488: Add sort --limit, or document workarounds for sort|head error


From: Assaf Gordon
Subject: bug#34488: Add sort --limit, or document workarounds for sort|head error messages
Date: Fri, 15 Feb 2019 08:37:48 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0

severity 34488 wishlist
retitle 34488 doc: sort: expand on "broken pipe" (SIGPIPE) behavior
stop

Hello,

On 2019-02-15 7:43 a.m., 積丹尼 Dan Jacobson wrote:
> Things start out cheery, but quickly get ugly,
>
> $ for i in 9 99 999 9999 99999; do seq $i|sort -n|sed 5q|wc -l; done
> 5
> 5
> 5
> 5
> sort: write failed: 'standard output': Broken pipe
> sort: write error
> 5
> sort: write failed: 'standard output': Broken pipe
> sort: write error
>
> Therefore, kindly add a sort --limit=n,

I don't think this is wise, as "head -n5" does exactly that in a much more
generic way.

> and/or on (info "(coreutils) sort invocation")
> admit the problem, and give some workarounds, lest
> our scripts occasionally spew error messages seemingly randomly,
> just when the boss is looking.

Just to clarify: why do you think this is a "problem"?

This is the intended behavior of most proper programs:
Upon receiving SIGPIPE they should terminate with an error,
unless SIGPIPE is explicitly ignored.
The errors are not "random" - they happen because you explicitly
cut short the output of a program.

It is an important indication about how your pipe works,
and sort is not to blame, e.g.:

    $ seq 100000 | head -n1
    1
    seq: write error: Broken pipe

    $ seq 1000000 | cat | head -n1
    1
    cat: write error: Broken pipe
    seq: write error: Broken pipe

This is a good indication that the output was not fully consumed,
and is very useful and important in some cases, e.g. when a program
crashes before consuming all of its input.
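A script can also check for this condition instead of relying on the message. The following is only a sketch, assuming a POSIX shell; the variable name seq_status and the use of file descriptor 3 are my own choices for illustration. It reports the exit status of "seq" after "head" has closed the pipe:

```shell
# Report seq's exit status on fd 3 so it survives the pipeline (POSIX sh).
# head exits after one line, so seq's later writes hit a closed pipe.
seq_status=$( { { seq 1000000 2>/dev/null; echo "$?" >&3; } | head -n1 >/dev/null; } 3>&1 )
echo "seq exit status: $seq_status"
```

The status is non-zero whenever the writer was cut short. On Linux it is typically 141 (128 + signal 13) when SIGPIPE has its default disposition, and 1 when SIGPIPE is ignored and seq reports the write error itself.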

Here's a contrived example:

    $ seq 1000000 | sort -S 200 -T /foo/bar
    sort: cannot create temporary file in '/foo/bar': No such file or directory
    seq: write error: Broken pipe

I force "sort" to fail (limiting its memory usage and pointing it to a
non-existent temporary directory).
It is then good to know that seq's output was cut short and not consumed.

If you know in advance you will trim the output of a program,
either hide the stderr with "2>/dev/null",
or use the shell's "trap PIPE" mechanism.
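As a rough sketch of the first option (the variable name first_five is only for illustration), the redirection can be attached to "sort" alone, so that the other commands in the pipeline still report their own errors:

```shell
# Discard only sort's stderr; the rest of the pipeline still reports errors.
# Caveat: this also hides genuine sort failures (e.g. an unusable temp directory).
first_five=$(seq 100000 | sort -n 2>/dev/null | sed 5q)
printf '%s\n' "$first_five"
```

Note the trade-off: with stderr discarded, a real failure in "sort" becomes silent as well, so this is best limited to pipelines where the output is trimmed on purpose.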

> And no fair saying "just save the output" (could be big) "into a file
> first, and do head(1) or sed(1) on that."

If you want to consume all input and just print the first 5 lines,
you can use "sed -n 1,5p" instead of "sed 5q" - no need for a temporary
file.
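A small sketch of that variant, reusing the reporter's pipeline (the variable names line_count and errors are assumptions for illustration). It counts the printed lines and separately collects anything the pipeline writes to stderr:

```shell
# sed -n '1,5p' prints lines 1-5 but keeps reading to end of input,
# so sort can finish writing and never sees a closed pipe.
line_count=$(seq 100000 | sort -n | sed -n '1,5p' | wc -l)
echo "$line_count"

# Collect stderr from the whole pipeline; it should stay empty here.
errors=$( { seq 100000 | sort -n | sed -n '1,5p' >/dev/null; } 2>&1 )
echo "stderr: ${errors:-<none>}"
```

The cost is that sed reads the whole stream, so on very large inputs this is slower than quitting early.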


I'm marking this as a documentation "wishlist" item,
and patches are always welcome.

regards,
 - assaf





