Hi,
On Sun, May 21, 2023 at 09:46:50AM -0500, Ed Morton wrote:
In the gawk manual under
https://www.gnu.org/software/gawk/manual/html_node/Regexp-Operator-Details.html
we have this statement:
In POSIX |awk| and |gawk|, the ‘*’, ‘+’, and ‘?’ operators stand
for themselves when there is nothing in the regexp that precedes
them.
while in the POSIX spec under
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_04_03
we have this statement:
*+?{
The <asterisk>, <plus-sign>, <question-mark>, and <left-brace>
shall be special except when used in a bracket expression (see RE
Bracket Expression
<https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_05>).
Any of the following uses produce undefined results:
* If these characters appear first in an ERE
So the gawk manual statement says that /+foo/ in any POSIX awk will
match the literal string "+foo" while the POSIX spec statement says
it's undefined behavior.
Should the gawk manual be tweaked to clarify/explain what it
currently says about POSIX awk since it apparently contradicts the
POSIX spec?
Stupid question: when something says that the behavior is undefined, is
it not the case that a given implementation is entitled to make its
own choice about how to handle that situation? If so, why is gawk's
choosing to match "+foo" at odds with POSIX? If it's "undefined", do
you instead expect it to throw an error?
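
For what it's worth, here is a rough sketch of how one might check what a particular awk does with the regexp in question. It assumes some awk is on PATH; the two input lines and the variable name are made up for illustration. Since POSIX leaves a leading '+' undefined, I am deliberately not claiming any particular output for the first command.

```shell
# Two made-up input lines: one with a literal '+', one without.
# A leading '+' in an ERE is undefined per POSIX, so this may print only
# "+foo", print something else, or fail, depending on the awk in use.
printf '%s\n' '+foo' 'foo' | awk '/+foo/' || echo "this awk rejected /+foo/"

# Portable spelling: inside a bracket expression '+' is an ordinary
# character, so this matches only the line containing a literal "+foo".
portable=$(printf '%s\n' '+foo' 'foo' | awk '/[+]foo/')
echo "$portable"
```

The second form avoids the question entirely, because the spec text quoted above says these characters are special except when used in a bracket expression.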
Regards,
Andy