doc tweak re backslashes in bracket expressions
From: Ed Morton
Subject: doc tweak re backslashes in bracket expressions
Date: Sun, 3 Nov 2024 07:50:10 -0600
User-agent: Mozilla Thunderbird
Just a small tweak suggestion for the gawk documentation regarding
backslashes inside bracket expressions.
https://www.gnu.org/software/gawk/manual/html_node/Bracket-Expressions.html
currently says (**emphasis mine**):
The treatment of ‘\’ in bracket expressions is compatible with other
awk implementations **and is also mandated by POSIX**.
but POSIX, at least in the 2024 edition of the spec, seems pretty
clear (see references below*) that a backslash inside a bracket
expression is not an escape character, so per POSIX these would be
compliant behaviors:
$ printf 'a\\d\n' | grep -E '[\]'
a\d
$ printf 'a\\d\n' | sed -En '/[\]/p'
a\d
while these would not:
$ printf 'a\\d\n' | awk '/[\]/'
awk: cmd. line:1: /[\]/
awk: cmd. line:1: ^ unterminated regexp
$ printf 'a\\d\n' | awk --posix '/[\]/'
awk: cmd. line:1: /[\]/
awk: cmd. line:1: ^ unterminated regexp
so maybe either remove that "and is also mandated by POSIX" statement or
provide a reference to where that behavior IS mandated by POSIX to clear
up any confusion.
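For anyone who just needs a pattern that works everywhere in the meantime, doubling the backslash inside the brackets appears to be the safe spelling. This is only a minimal sketch, reusing the sample input from above: awk reads `[\\]` as one escaped backslash, while a tool following the POSIX rules reads it as a set that happens to list the same character twice.

```shell
# Doubled backslash inside the brackets: awk treats it as one escaped
# backslash; grep -E treats it as a set containing backslash twice.
printf 'a\\d\n' | awk '/[\\]/'
printf 'a\\d\n' | grep -E '[\\]'
```

Both commands should print `a\d`.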
Ed.
*From the current, 2024, POSIX regexp spec,
https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap09.html
(**emphasis mine**):
> [9.1 Regular Expression Definitions](https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap09.html#tag_09_01)
> ...
> escape sequence
>
> The escape character followed by any single character, which is
> thereby "escaped". The escape character is a \<backslash\> that is
> **neither in a bracket expression** nor itself escaped.
which tells us that a backslash within a bracket expression is not an
escape character, and this:
> [9.3.5 RE Bracket Expression](https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap09.html#tag_09_03_05)
>
> ... When the bracket expression appears within an ERE, the special
> characters ... and '`\`' (... and \<backslash\>, respectively) shall
> **lose their special meaning within the bracket expression**
which reiterates that a backslash within a bracket expression has no
special meaning, and there's nothing I can see in [the POSIX awk
spec](https://pubs.opengroup.org/onlinepubs/9799919799/utilities/awk.html)
to override the above definitions.
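
A quick way to see the quoted rule at work is to put an ordinary character after the backslash. This is a minimal sketch, assuming a grep that implements bracket expressions as the spec text above describes; the one-character input lines are made up for illustration. Under that reading `[\d]` is simply a two-character set, backslash and `d`.

```shell
# Under the POSIX reading, [\d] is a two-character set: backslash and 'd'.
# A line holding only a backslash should therefore match...
printf '\\\n' | grep -E '[\d]'
# ...and so should a line holding only a 'd'.
printf 'd\n' | grep -E '[\d]'
# A line with neither character should not match (grep exits non-zero).
printf 'x\n' | grep -E '[\d]' || echo 'no match'
```

If the backslash were an escape character here, the first line would not be expected to match.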