Re: [bug-gawk] system() should return != 0 when the process is killed


From: Aharon Robbins
Subject: Re: [bug-gawk] system() should return != 0 when the process is killed
Date: Thu, 10 Mar 2016 22:33:19 +0200
User-agent: Heirloom mailx 12.5 6/20/10

Hi. Re this:

> Date: Thu, 3 Mar 2016 17:58:25 +0100
> From: Tobia Conforto <address@hidden>
> To: address@hidden
> Subject: [bug-gawk] system() should return != 0 when the process is killed
>
> Hello
>
> It looks like gawk's system() only returns bits 8 to 15 (or s>>8) of the
> value returned by system(3), discarding any information about the
> termination of the child process by a signal.
>
> This breaks use cases like this (and probably others):
>
>     do {
>         ...
>     } while (! system("sleep 10"))
>
> where the intention is to break the loop when the user interrupts (^C) the
> child process.
>
> It would arguably be better if gawk returned a composite code, as
> traditionally done by most shells and interpreters, for example (s&127)+128
> if killed, s>>8 otherwise, as in Bash.
>
> Other awk implementations (nawk, bwk-awk, and mawk) always return values !=
> 0 when the child process executed by system() is killed by a signal.
>
> -Tobia

As I mentioned earlier, your final paragraph is incorrect. That does
not matter much though.

Here is the note that I sent to Brian Kernighan:

> Hi.
>
> There's a bunch of discussion on the gawk bug list about the return
> value of system().  Your version takes the return value of the C system(3)
> function and divides it by 256.  This is fine and good for a normal exit:
>
> $ nawk 'BEGIN { r = system("exit 42") ; print "got", r }'
> got 42
>
> It's somewhat weirder if the process run by system(3) is killed by
> a signal, say SIGINT (^C):
>
> $ nawk 'BEGIN { r = system("sleep 2"); print "got", r }'
> ^Cgot 0.0078125
>
> This is SIGINT:
>
> $ nawk 'BEGIN { print  0.0078125 * 256 }'
> 2
>
> Is this what you were aiming for?
>
> My guess from reading the code is that you were looking to get only
> the high 8 bits out of the exit status but that because awk uses
> floating point internally you ended up with the fractional result when
> death-by-signal occurs.
>
> In any case, I'm going to try to rationalize gawk's behavior a bit. But
> I'd like to know what the intent was, if you can tell me.
>
> Thanks!
>
> Arnold
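
To make the arithmetic in the note above concrete, here is a minimal C
sketch. It is illustrative only, not code taken from BWK awk or gawk: the
function names are made up, and it assumes the traditional Unix encoding
in which the exit value sits in the high 8 bits of the status and the
signal number in the low 7 bits.

```c
#include <stdio.h>

/*
 * Hypothetical helper: divide a raw wait status by 256 and keep the
 * result as floating point, which is what the note describes.
 */
double divide_status_by_256(int raw_status)
{
    return raw_status / 256.0;
}

/* Print the two cases from the note: a normal exit and death by SIGINT. */
void show_division_examples(void)
{
    int exit_42 = 42 << 8;  /* assumed encoding: exit value in bits 8-15 */
    int sigint = 2;         /* assumed encoding: signal number in bits 0-6 */

    printf("exit 42 -> %g\n", divide_status_by_256(exit_42));
    printf("SIGINT  -> %g\n", divide_status_by_256(sigint));
}
```

Under those assumptions the normal exit divides cleanly to 42, while the
signal case leaves the fraction 2/256, matching the 0.0078125 shown in
the nawk session.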

Here is his response (published by permission):

| Hi, Arnold --
| 
| The return value of system() was problematic, since it was
| so oddly encoded in the early days of Unix, and perhaps still
| is.  I think your guess about what's going on is correct.
| Looking at the code now, it's embarrassing; at the least it
| should be converted to an integer before being sent back.
| 
| Let me know what the rational behavior should be and I can try
| to fix mine too, though probably I will just let sleeping
| features lie.
| 
| Brian

Here is the diff I will be pushing to the repo. The return value
of system() will vary based on command line options:

* Default: exit status if normal exit, 256 + signum if death-by-signal

* --traditional: Full 16 bit value divided by 256, as in current BWK awk

* --posix: Full 16 bit value.
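
The three modes listed above can be sketched as a standalone C function.
This is a simplified illustration rather than the patch itself: the enum
and the function name are hypothetical, and the mode is passed in as a
parameter instead of being read from gawk's option flags.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical stand-ins for the default, --traditional and --posix modes. */
enum status_mode { MODE_DEFAULT, MODE_TRADITIONAL, MODE_POSIX };

/*
 * Map the value returned by system(3) to the number an awk program
 * would see. The result is a double because the traditional mode can
 * produce a fraction.
 */
double map_system_status(int status, enum status_mode mode)
{
    if (status == -1)
        return -1;                      /* system() itself failed */
    if (mode == MODE_POSIX)
        return status;                  /* full value, untouched */
    if (mode == MODE_TRADITIONAL)
        return status / 256.0;          /* as in current BWK awk */
    if (WIFEXITED(status))
        return WEXITSTATUS(status);     /* normal exit: 0-255 */
    if (WIFSIGNALED(status))
        return 256 + WTERMSIG(status);  /* death-by-signal: above 255 */
    return 0;                           /* not expected for system() */
}
```

For example, map_system_status(system("exit 42"), MODE_DEFAULT) yields 42,
and a command killed by signal 9 yields 265 in the default mode, so any
result above 255 means death-by-signal.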

Brian Kernighan may adjust his awk, or he may not. I've sent him
diffs to do exit value / 256 + signal, but I don't yet know whether or
not he will incorporate the changes.

Thank you for the report.

Arnold

P.S. I know this does not address the return value of close().
That's a whole 'nother can of worms.
-------------------------------------------------------
diff --git a/ChangeLog b/ChangeLog
index fa434b2..59eff5c 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,7 @@
+2016-03-10         Arnold D. Robbins     <address@hidden>
+
+       * builtin.c (do_system): Improve return values of system().
+
 2016-03-08         Arnold D. Robbins     <address@hidden>
 
        * profile.c (print_instruction): Fix duplicate case not caught
@@ -7,6 +11,8 @@
 
        * profile.c (print_instruction): Further improvements in
        instruction dump, especially for when pretty-printing.
+       * builtin.c (do_system): Augment the logic for the return
+       value so that death-by-signal info is available too.
 
 2016-03-03         Arnold D. Robbins     <address@hidden>
 
diff --git a/NEWS b/NEWS
index 9386bae..31c8471 100644
--- a/NEWS
+++ b/NEWS
@@ -27,6 +27,9 @@ Changes from 4.1.3 to 4.1.x
 7. The profiler / pretty-printer now chains else-if statements instead
    of causing cascading elses.
 
+8. The return value of system() has been enhanced to convey more information.
+   See the doc.
+
 Changes from 4.1.2 to 4.1.3
 ---------------------------
 
diff --git a/builtin.c b/builtin.c
index a62437a..da664e3 100644
--- a/builtin.c
+++ b/builtin.c
@@ -2061,7 +2061,7 @@ NODE *
 do_system(int nargs)
 {
        NODE *tmp;
-       int ret = 0;
+       AWKNUM ret = 0;         /* floating point on purpose, compat Unix awk */
        char *cmd;
        char save;
 
@@ -2081,8 +2081,26 @@ do_system(int nargs)
 
                os_restore_mode(fileno(stdin));
                ret = system(cmd);
-               if (ret != -1)
-                       ret = WEXITSTATUS(ret);
+               /*
+                * 3/2016. What to do with ret? It's never simple.
+                * POSIX says to use the full return value. BWK awk
+                * divides the result by 256.  That normally gives the
+                * exit status but gives a weird result for death-by-signal.
+                * So we compromise as follows:
+                */
+               if (ret != -1) {
+                       if (do_posix)
+                               ;       /* leave it alone, full 16 bits */
+                       else if (do_traditional)
+                               ret /= 256.0;
+                       else if (WIFEXITED(ret))
+                               ret = WEXITSTATUS(ret); /* normal exit */
+                       else if (WIFSIGNALED(ret))
+                               /* use 256 since exit values are 8 bits */
+                               ret = WTERMSIG(ret) + 256;
+                       else
+                               ret = 0;        /* shouldn't get here */
+               }
                if ((BINMODE & BINMODE_INPUT) != 0)
                        os_setbinmode(fileno(stdin), O_BINARY);
 
diff --git a/doc/ChangeLog b/doc/ChangeLog
index 87a1bc9..afe5841 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,8 @@
+2016-03-07         Arnold D. Robbins     <address@hidden>
+
+       * gawktexi.in: Document system() return values.
+       * gawk.1: Add a pointer to the manual about same.
+
 2016-02-23         Arnold D. Robbins     <address@hidden>
 
        * sidebar.awk: Globally replace [[:space:]] with [ \t] so that
diff --git a/doc/gawk.1 b/doc/gawk.1
index da1f583..ca51a53 100644
--- a/doc/gawk.1
+++ b/doc/gawk.1
@@ -13,7 +13,7 @@
 .              if \w'\(rq' .ds rq "\(rq
 .      \}
 .\}
-.TH GAWK 1 "Dec 17 2015" "Free Software Foundation" "Utility Commands"
+.TH GAWK 1 "Mar 7 2016" "Free Software Foundation" "Utility Commands"
 .SH NAME
 gawk \- pattern scanning and processing language
 .SH SYNOPSIS
@@ -2259,6 +2259,7 @@ Execute the command
 .IR cmd-line ,
 and return the exit status.
 (This may not be available on non-\*(PX systems.)
+See the manual for the full details on the exit status.
 .TP
 \&\fBfflush(\fR[\fIfile\^\fR]\fB)\fR
 Flush any buffers associated with the open output file or pipe
diff --git a/doc/gawktexi.in b/doc/gawktexi.in
index ff5672a..c284f84 100644
--- a/doc/gawktexi.in
+++ b/doc/gawktexi.in
@@ -17530,7 +17530,7 @@ it is all buffered and sent down the pipe to @command{cat} in one shot.
 @cindex interacting with other programs
 Execute the operating system
 command @var{command} and then return to the @command{awk} program.
-Return @var{command}'s exit status.
+Return @var{command}'s exit status (see further on).
 
 For example, if the following fragment of code is put in your @command{awk}
 program:
@@ -17569,6 +17569,33 @@ When @option{--sandbox} is specified, the @code{system()} function is disabled
 (@pxref{Options}).
 @end quotation
 
+On POSIX systems, a command's exit status is a 16-bit number. The exit
+value passed to the C @code{exit()} function is held in the high-order
+eight bits. The low-order bits indicate if the process was killed by a
+signal (bit 7) and if so, the guilty signal number (bits 0--6).
+
+Traditionally, @command{awk}'s @code{system()} function has simply
+returned the exit status value divided by 256. In the normal case this
+gives the exit status but in the case of death-by-signal it yields
+a fractional floating-point value.@footnote{In private correspondence,
+Dr.@: Kernighan has indicated to me that the way this was done
+was probably a mistake.} POSIX states that @command{awk}'s
+@code{system()} should return the full 16-bit value.
+
+@command{gawk} steers a middle ground.
+By default, it returns just the exit status. The
+@option{--traditional} option causes @command{gawk} to divide
+the return value by 256, just as Brian Kernighan's @command{awk} does.
+With @option{--posix}, it returns the full 16-bit value.
+
+If the process was killed by a signal, @command{gawk}'s @code{system()}
+returns 256 + @var{sig}, where @var{sig} is the number of the signal
+that killed the process.  Since exit values are eight bits, where the
+values range from 0--255, using 256 + @var{sig} lets you clearly distinguish
+normal exit from death-by-signal.
+
+If some kind of error occurred, @code{system()} returns @minus{}1.
+
 @end table
 
 @sidebar Controlling Output Buffering with @code{system()}


