Re: Question on $IFS related differences (Was: Question on $@ vs $@$@)
From: Robert Elz
Date: Wed, 18 Sep 2024 22:00:59 +0700
Date: Wed, 18 Sep 2024 07:21:30 -0400
From: Greg Wooledge <greg@wooledge.org>
Message-ID: <Zuq3umfwkcx3TAtM@wooledge.org>
| On Wed, Sep 18, 2024 at 08:05:10 +0300, Oğuz wrote:
| > It boils down to this:
| > f(){ echo $#;}; set "" "" ""; IFS=x; f $*
| > bash, NetBSD and FreeBSD sh, and ksh88 all agree and print 2. pdksh
| > prints 3 but mksh and oksh print 1. dash, ksh93, yash, and zsh print
| > 0.
There is no single right answer there: 0 and 3 are the most likely results,
but 1 and 2 are also possible.
| At the risk of sounding like a broken record, using an unquoted $* or $@
| in a context where word splitting occurs is just *begging* for trouble.
That's true, and while bug-bash perhaps isn't the best list for this
(I deleted the other shell lists from this reply, as I don't think I
get to send to those, as I'm not subscribed) that also isn't the point.
As best I can tell there isn't really a shell implementors list (the
Austin Group list gets some of it, but that covers the whole of the
POSIX standard, and weird details of how some library function should
work aren't really relevant to shell implementors).
Steffen is concerned with what the implementation is supposed to do
in these situations, not how some random script behaves, or doesn't.
Giving guidance that you would give to a user having problems with
a script they're trying to write isn't appropriate here; the only
discussion that matters is what the correct behaviour is.
And for the example from Oğuz, the value of IFS should be completely
irrelevant, as there's nothing anywhere in that example which actually
needs splitting. The expansion of $* (unquoted, in a context where
field splitting occurs) is supposed to produce 3 fields (since there are
3 positional parameters set), each of which contains nothing (""). Each of
those is then subject to field splitting, but when there's nothing, there's
nothing.
The standard says that "any empty fields may be discarded" (that's
actually before the field splitting is to happen, but here it makes no
difference). Note the "any ... may", so the implementation is allowed
to, but not required to, discard any of the three empty fields that
have been produced. So it can discard none (answer 3) or discard all
of them (answer 0), which are the more reasonable choices, or it can
discard 1 (answer 2) or 2 (answer 1) of the three. Any of those is possible.
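
To see which of those choices a particular shell makes, the test from Oğuz
can be wrapped in a small script. This is only a sketch: count_args and
fields are names made up for the illustration, and the number reported
depends entirely on the shell that runs it, so no particular output is
claimed here.

```sh
#!/bin/sh
# Report how many fields the running shell produces for an unquoted $*
# when all three positional parameters are empty.
# count_args and fields are illustrative names, not from the thread.
count_args() { echo "$#"; }

fields=$(
    # Subshell (command substitution), so the IFS change does not leak out.
    set -- "" "" ""
    IFS=x
    count_args $*
)

case $fields in
    0) echo "0 fields: all three empty fields were discarded" ;;
    3) echo "3 fields: none of the empty fields were discarded" ;;
    *) echo "$fields fields: an intermediate result" ;;
esac
```

Running the same file under several shells (for example with "bash file",
"dash file", "mksh file") is one way to reproduce the spread of answers
reported above.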
| Please don't do this in your scripts.
So for a script writer, that's good advice. For an implementor of a
shell, it is useless; it is also useless for determining whether or
not a shell has a bug.
Here, as best I can tell, none do - though I know that the shell I
maintain (the NetBSD sh) gets to this point more by a fluke than
anything else, treating the $* as if it were "$*" - except unquoted
and thus subject to field splitting (perhaps bash does the same thing).
Then the expansion of $* above gives xx (not nothing) which is then
field split, which produces 2 fields (as each x is really a field terminator,
no field follows the final one). It doesn't matter what IFS[0] is for
this, as long as it isn't white space, the same result will always happen
(the expansion inserts it, field splitting removes it). When IFS[0] is
white space, different rules apply to the field splitting algorithm, which
is why the results differ in that case.
Nothing allows empty fields produced by field splitting to be discarded,
so we end up with 2 fields remaining.
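
The join-then-split path described above can be emulated by hand in any
shell, which makes the 2 easier to see. This is a sketch of the idea only,
not the actual NetBSD sh (or bash) code; join_then_split and joined are
hypothetical names. It assumes a shell that does POSIX field splitting on
unquoted variable expansions (so not zsh in its native mode).

```sh
#!/bin/sh
# Emulate "treat $* like "$*", then field split the result".
# $1 is the IFS value to try. The function body is a subshell, so the
# caller's IFS and positional parameters are left alone.
join_then_split() (
    IFS=$1
    set -- "" "" ""
    joined="$*"        # three empty parameters joined by the first IFS char
    set -- $joined     # unquoted on purpose: field split the joined string
    echo "joined=[$joined] fields=$#"
)

join_then_split x      # joined=[xx] fields=2
join_then_split ' '    # joined=[  ] fields=0
```

With a non-white-space first character each delimiter terminates an (empty)
field, so two fields survive; with a space, the joined string is nothing but
IFS white space and no fields are produced at all.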
How 0 or 3 are produced is easy to see (either all, or none, of the empty
fields are discarded, either of which is a reasonable choice) - then field
splitting would happen on the ones not discarded, but cannot split anything
when the fields are empty.
I'm not sure what the implementation mechanics are which would actually
produce 1 field as the result.
So Steffen, if you were writing a shell, then you could do whatever you
like in this case; the value of IFS really should not matter at all, and
either 0 or 3 fields are sensible answers. For a MUA, I think you get
to do whatever you like, and trying to copy the various bizarre shell
behaviours in this case (that different shells implement this differently
is why the standard is so vague about what happens) doesn't make much sense.
And certainly, if you're writing a script, just don't do things like this.
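
For the script writer's side of this, a minimal sketch of the portable
alternative: with "$@" quoted, each positional parameter becomes exactly
one field, empty or not, and IFS plays no part. The names count_args and
quoted_count are again made up for the illustration.

```sh
#!/bin/sh
# Quoted "$@" yields one field per positional parameter regardless of IFS.
count_args() { echo "$#"; }

quoted_count=$(
    set -- "" "" ""
    IFS=x
    count_args "$@"
)

echo "$quoted_count"   # 3
```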
kre