bug-grep

bug#20638: BUG: standard & extended RE's don't find NUL's :-(


From: Linda Walsh
Subject: bug#20638: BUG: standard & extended RE's don't find NUL's :-(
Date: Mon, 25 May 2015 19:13:06 -0700
User-agent: Thunderbird



Paul Eggert wrote:
Linda Walsh wrote:

Perhaps you want to tell me where the documentation on the
standard and/or extended RE's is that you use?
----
Here is another:
*POSIX Extended Regular Expression Syntax: (http://www.boost.org/doc/libs/1_43_0/libs/regex/doc/html/boost_regex/syntax/basic_extended.html)


Escapes

The POSIX standard defines no escape sequences for POSIX-Extended regular expressions, except that:

   * Any special character preceded by an escape shall match itself.
   * The effect of any ordinary character being preceded by an escape is undefined.
   * An escape inside a character class declaration shall match itself: in other words the escape character is not "special" inside a character class declaration; so [\^] will match either a literal '\' or a '^'.

However, that's rather restrictive, so the following standard-compatible extensions are also supported by Boost.Regex:
Escapes matching a specific character

The following escape sequences are all synonyms for single characters:

Escape      Character
------      ---------
\a          '\a'
\e          0x1B
\f          \f
\n          \n
\r          \r
\t          \t
\v          \v
\b          \b (but only inside a character class declaration).
\cX         An ASCII escape sequence - the character whose code point is X % 32
\xdd        A hexadecimal escape sequence - matches the single character whose code point is 0xdd.
\x{dddd}    A hexadecimal escape sequence - matches the single character whose code point is 0xdddd.
\0ddd       An octal escape sequence - matches the single character whose code point is 0ddd.
\N{Name}    Matches the single character which has the symbolic name Name. For example \N{newline} matches the single character \n.

*

We're talking about grep, so the relevant documentation is the grep manual, not the awk manual or other random stuff you might find on the Internet. Type 'info grep'. Or if you're in Emacs, type 'C-h i m grep RET'.
-----
Again another example of \000 octal and \x hex.
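
To make the octal and hex forms concrete, here is a minimal sketch using Python's re module. That is an assumption made purely for illustration: it is neither grep nor Boost.Regex, and it says nothing about what grep itself accepts. The helper control_char is hypothetical and only applies the "X % 32" rule from the table quoted above.

```python
import re

def control_char(letter: str) -> str:
    """Character denoted by \\cX under the 'X % 32' rule quoted above (hypothetical helper)."""
    return chr(ord(letter) % 32)

# A byte string with a NUL at offset 3.
data = b"abc\x00def"

# Python's re accepts both a hex escape and an octal escape for NUL.
hex_match = re.search(rb"\x00", data)
oct_match = re.search(rb"\000", data)

print(hex_match.start(), oct_match.start())  # offsets of the NUL byte
print(repr(control_char("@")))               # ord('@') is 64, and 64 % 32 is 0
print(repr(control_char("M")))               # ord('M') is 77, and 77 % 32 is 13
```

In this engine both patterns find the NUL at the same offset, and \c@ under the quoted rule would also denote NUL.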

Most descriptions of the chars grep takes say it was designed so that
awk, sed, tr -- any core Linux util that takes regexes -- would be *the same*, so people didn't have to learn a different syntax for each tool.




