Re: FIELDWIDTHS broken in awk 3.1.5
From: Glenn Zazulia
Subject: Re: FIELDWIDTHS broken in awk 3.1.5
Date: Wed, 25 Jan 2006 20:01:18 -0700
User-agent: Thunderbird 1.5 (X11/20051201)
Aharon,
I found your FIELDWIDTHS patch, which fixes a problem that I had also
noticed in the gawk 3.1.5 release.
I have noticed a second, possibly related problem with FIELDWIDTHS in
the same release. In gawk 3.1.4 and earlier, the statement
FIELDWIDTHS = ""
used to work -- that is, it was executed without error and a subsequent
"print FIELDWIDTHS" statement showed that any previous value was unset.
Running that statement through gawk 3.1.5 produces the following fatal
error:
gawk: (FILENAME=- FNR=1) fatal: invalid FIELDWIDTHS value, near `'
Now, one might question what that statement should do, and one might
argue that instead of being a new bug, this might be a fix of previously
improper behavior. I don't think so. Since the FIELDWIDTHS variable is
a gawk extension and since the documentation doesn't state what the
behavior should be in this case, it's difficult for me to argue one way
or the other. However, this change in behavior in 3.1.5 breaks existing
gawk scripts, which is a problem.
Let me illustrate with an existing script segment:
...
line = $0;
FIELDWIDTHS = "2 2 2 2";
$0 = hexmask;
ddecmask = "";
for (i = 1; i < NF; i++)
    ddecmask = ddecmask strtonum("0x" $i) ".";
ddecmask = ddecmask strtonum("0x" $NF);
FIELDWIDTHS = "";
FS = FS;
$0 = line;
...
Notice that this script temporarily switches to fixed field mode to
process a particular value, and then it switches back to regular
variable field-separator mode for further input processing. Granted,
there are alternative equivalent methods that could avoid using the
FIELDWIDTHS mechanism entirely, but that's beside the point. This code
segment used to work properly in previous gawk releases and now breaks.
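For illustration only, one such alternative might look like the sketch
below. It is not the script discussed in this message: it assumes that
hexmask holds exactly eight hex digits with no "0x" prefix, and the
names hexmask_to_dotted and hex2dec are hypothetical. It slices the
value with substr() and converts each pair by hand, so it never touches
FIELDWIDTHS, FS, or $0 and does not need the gawk-only strtonum().

```shell
# Hypothetical helper: convert an 8-digit hex mask to dotted decimal.
# Assumes the argument is exactly 8 hex digits; input is not validated.
hexmask_to_dotted() {
    awk -v hexmask="$1" '
    # Illustrative replacement for strtonum("0x" h), usable in POSIX awk.
    function hex2dec(h,    i, n) {
        h = tolower(h)
        n = 0
        for (i = 1; i <= length(h); i++)
            n = n * 16 + index("0123456789abcdef", substr(h, i, 1)) - 1
        return n
    }
    BEGIN {
        out = ""
        # Walk the four 2-character groups at offsets 1, 3, 5, 7.
        for (i = 1; i <= 7; i += 2) {
            sep = (i > 1) ? "." : ""
            out = out sep hex2dec(substr(hexmask, i, 2))
        }
        print out
    }'
}

hexmask_to_dotted ffffff00    # prints 255.255.255.0
```

Because the current record is never reassigned, nothing needs to be
saved or restored afterwards.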
Honestly, the 'FIELDWIDTHS = ""' statement is actually unimportant in
this case since the actual statement that restores the default record
splitting behavior is the subsequent "FS = FS" statement. So, for the
time being, I have removed the statement that clears out the FIELDWIDTHS
variable so that the script works with the latest gawk release.
However, I had previously cleared it out as an extra precaution and to
avoid confusion.
[The fact that FS and FIELDWIDTHS could both be set simultaneously is
the confusion that I was trying to avoid. I think this issue stems from
the design decision that setting either variable both defines the field
separator and (possibly) switches record splitting modes. That's not a
big deal, though, and I don't mean to go off on that tangent.]
So, let me return to the question of whether 'FIELDWIDTHS = ""' should
be a legal statement. Again, since this is an extension with no
standards to follow and lacking any documentation that states otherwise,
I would argue that this statement should not suddenly cause a fatal
error because it never used to do so and the previous behavior did not
seem to be problematic -- even if not apparently useful. Notice the
following behavior in gawk 3.1.4 and earlier:
$ echo 'ab cd' | gawk '{ print NF, $1, $2 }'
2 ab cd
$ echo 'ab cd' | gawk 'BEGIN { FS = "" } { print NF, $1, $2 }'
5 a b
$ echo 'ab cd' | gawk 'BEGIN { FIELDWIDTHS = "" } { print NF, $1, $2 }'
0
I see where in the source code to make the fix to restore the previous
behavior, but I wasn't going to bother creating a proposed patch if you
weren't convinced that the changed behavior needs to be fixed.
Please let me know.
Glenn