bug-gawk

doc tweak re backslashes in bracket expressions


From: Ed Morton
Subject: doc tweak re backslashes in bracket expressions
Date: Sun, 3 Nov 2024 07:50:10 -0600
User-agent: Mozilla Thunderbird

Just a small tweak suggestion for the gawk documentation regarding backslashes inside bracket expressions.
https://www.gnu.org/software/gawk/manual/html_node/Bracket-Expressions.html 
currently says (**emphasis mine**):
The treatment of ‘\’ in bracket expressions is compatible with other awk implementations **and is also mandated by POSIX**.
but POSIX, at least the 2024 edition of the spec, seems pretty 
clear (see references below*) that a backslash inside a bracket 
expression is not an escape character, so per POSIX the following would 
be compliant behavior:
$ printf 'a\\d\n' | grep -E '[\]'
a\d
$ printf 'a\\d\n' | sed -En '/[\]/p'
a\d
while these would not:

$ printf 'a\\d\n' | awk '/[\]/'
awk: cmd. line:1: /[\]/
awk: cmd. line:1:  ^ unterminated regexp
$ printf 'a\\d\n' | awk --posix '/[\]/'
awk: cmd. line:1: /[\]/
awk: cmd. line:1:  ^ unterminated regexp
so maybe either remove that "and is also mandated by POSIX" statement or 
provide a reference to where that behavior IS mandated by POSIX to clear 
up any confusion.
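
For comparison, here is a minimal sketch of the spelling awk does accept today: doubling the backslash inside the bracket expression. It assumes an awk (gawk, for example) that treats `\\` inside `[...]` as a single literal backslash, which is the behavior the manual page describes; the extra `abc` input line is only there to show that non-matching lines are filtered out.

```shell
# Assumption: awk treats '\\' inside a bracket expression as one literal
# backslash (the behavior described in the gawk manual).
out=$(printf 'a\\d\nabc\n' | awk '/[\\]/')
printf '%s\n' "$out"    # prints: a\d
```

Under the POSIX reading quoted below, `[\\]` would instead be a bracket expression listing the backslash twice, which still matches a single backslash, so this spelling should behave the same under either interpretation.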
    Ed.

*From the current, 2024, POSIX regexp spec, https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap09.html (**emphasis mine**):
> [9.1 Regular Expression Definitions](https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap09.html#tag_09_01)
> ...
> escape sequence
>
> The escape character followed by any single character, which is
> thereby "escaped". The escape character is a \<backslash\> that is
> **neither in a bracket expression** nor itself escaped.
which tells us that a backslash within a bracket expression is not an 
escape character, and this:
> [9.3.5 RE Bracket Expression](https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap09.html#tag_09_03_05)
>
> ... When the bracket
> expression appears within an ERE, the special characters ... and '`\`'
> (... and \<backslash\>, respectively) shall **lose their special meaning
> within the bracket expression**
which reiterates that a backslash within a bracket expression has no 
special meaning, and there's nothing I can see in [the POSIX awk 
spec](https://pubs.opengroup.org/onlinepubs/9799919799/utilities/awk.html) 
to override the above definitions.




