Re: Unquoted special characters in regexps
From: Luc Teirlinck
Subject: Re: Unquoted special characters in regexps
Date: Mon, 27 Feb 2006 18:30:13 -0600 (CST)

None of the messages I sent on this (or on anything else) in the last
few days made it to emacs-devel, although all other people's
responses did, be it after some delay. I just got messages saying
that local delivery failed. So I will have to repeat some things that
I already said before.
Richard Stallman wrote:
    However, that doesn't necessarily mean the manual is wrong.
    There is more than one way to understand the word "special".
    At the most literal level, ] is not special; if you write it
    without \\, the regexp compiler won't misunderstand it.

`]', like `-', is special only in the context of a character
alternative, that is, only if you are already inside a character
alternative when you type it. By contrast, `[' and all other special
characters (except `^') are special only outside that context.
A character that is special outside character alternatives is never
special there if you precede it with a backslash. This is true even
for `^'. This is why it is good to precede such characters with a
backslash even in positions where they happen not to be special. That
way, the reader can see that they are not special, without studying
the regexp.
On the other hand, a backslash _never_ eliminates the special meaning
of a `]' or `-' that has a special meaning.
There are two questions here: whether a `]' outside a character
alternative should be quoted, and whether any changes to the Elisp
manual are required. In this posting, I will discuss only the first.
First of all, there are (surprisingly) many occurrences of "\\]" in
the Emacs source where the `]' _is_ special and closes a character
alternative that contains a backslash. Reportedly, quoting a `]' with
a backslash _inside_ a character alternative works in some other
regexp implementations, such as AWK. So if I see "\\]", I have to
worry about three possibilities: it might deliberately close a
character alternative that includes a backslash, it might do so by
accident because the author tried to quote a `]' inside a character
alternative (and hence the regexp is buggy), or it might be a
deliberately quoted `]' outside a character alternative.
If I see `]' without a preceding "\\", I only have to worry about
whether or not it closes a character alternative, and not about the
additional possibility of a bug.
In summary I believe that quoting a `]' outside a character
alternative only adds clutter and a third possibility to worry about.
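
To make the readings of "\\]" described above concrete, here is a
rough sketch in C of a scanner that labels every `]' in a regexp
string. It is not code from Emacs: the function name
`classify_brackets' and the one-letter labels are made up for this
illustration, and the handling of [:class:] forms is a deliberate
simplification (it just skips to the next ":]"). As in Lisp, the C
string "\\]" stands for the two regexp characters \ and ].

```c
#include <stddef.h>
#include <string.h>

/* Illustrative helper (hypothetical, not part of Emacs).
   For each position I in RE, store in ROLES[I]:
     'L'  a `]' outside a character alternative, not quoted
     'Q'  a `]' outside a character alternative, quoted as \]
     'M'  a `]' that is a member because it comes first in [...]
     'C'  a `]' that closes a character alternative
     'B'  like 'C', but the last member before it is a backslash
     '.'  anything else
   ROLES must have room for strlen (RE) + 1 bytes.  */
void
classify_brackets (const char *re, char *roles)
{
  size_t n = strlen (re);
  size_t i = 0;

  memset (roles, '.', n);
  roles[n] = '\0';

  while (i < n)
    {
      if (re[i] == '\\' && i + 1 < n)
        {
          /* Outside an alternative, backslash quotes the next char.  */
          if (re[i + 1] == ']')
            roles[i + 1] = 'Q';
          i += 2;
        }
      else if (re[i] == '[')
        {
          i++;
          if (i < n && re[i] == '^')    /* complemented alternative */
            i++;
          if (i < n && re[i] == ']')    /* first position: a member */
            roles[i++] = 'M';
          while (i < n && re[i] != ']')
            {
              const char *end;
              /* Simplification: treat "[:" ... ":]" as one unit.  */
              if (re[i] == '[' && i + 1 < n && re[i + 1] == ':'
                  && (end = strstr (re + i + 2, ":]")) != NULL)
                i = (size_t) (end - re) + 2;
              else
                i++;    /* ordinary member; backslash is one too */
            }
          if (i < n)
            {
              /* Backslash does not quote here, so this `]' closes.  */
              roles[i] = (re[i - 1] == '\\') ? 'B' : 'C';
              i++;
            }
        }
      else
        {
          if (re[i] == ']')
            roles[i] = 'L';
          i++;
        }
    }
}
```

For the C string "[^\\]\\]" this yields "...B.Q": the first "\\]"
closes an alternative whose last member is a backslash, and the
second is a quoted `]' outside any alternative. The scanner cannot
tell whether a 'B' was intended or is a failed attempt at quoting;
that still takes a human reader.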
There are places in the Emacs code that quote a `]' outside a
character alternative. Even if we decide that this is undesirable, I
do not fancy finding and changing them all. But we could change the
behavior of `regexp-quote' and `regexp-opt', which currently quote
such a `]'. That could be done with the following trivial patch,
which I could install if that is what we decide to do:
===File ~/search.c-diff=====================================
*** search.c 06 Feb 2006 16:02:24 -0600 1.206
--- search.c 27 Feb 2006 00:16:42 -0600
***************
*** 3066,3072 ****
    for (; in != end; in++)
      {
!       if (*in == '[' || *in == ']'
            || *in == '*' || *in == '.' || *in == '\\'
            || *in == '?' || *in == '+'
            || *in == '^' || *in == '$')
--- 3066,3072 ----
    for (; in != end; in++)
      {
!       if (*in == '['
            || *in == '*' || *in == '.' || *in == '\\'
            || *in == '?' || *in == '+'
            || *in == '^' || *in == '$')
============================================================
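
To show what the patch would change in practice, here is a
stand-alone sketch of the quoting loop in C. It is not the code from
search.c: the function name `quote_regexp', its signature, and the
`quote_close_bracket' flag are invented for this illustration, and
the loop body (emit a backslash, then the character) is my assumption
about what follows the `if' shown in the diff. The flag selects
between the current test (nonzero) and the patched one (zero).

```c
#include <stddef.h>

/* Illustrative helper (hypothetical, not the Emacs function).
   Copy LEN bytes from IN to OUT, putting a backslash before each
   character in the special set.  If QUOTE_CLOSE_BRACKET is nonzero,
   `]' counts as special (current behavior); if zero, it does not
   (patched behavior).  OUT must have room for 2 * LEN + 1 bytes.
   Return the length of the result, not counting the final NUL.  */
size_t
quote_regexp (const char *in, size_t len, char *out,
              int quote_close_bracket)
{
  const char *end = in + len;
  char *start = out;

  for (; in != end; in++)
    {
      if (*in == '[' || (quote_close_bracket && *in == ']')
          || *in == '*' || *in == '.' || *in == '\\'
          || *in == '?' || *in == '+'
          || *in == '^' || *in == '$')
        *out++ = '\\';          /* assumed body: emit the quote */
      *out++ = *in;
    }
  *out = '\0';
  return (size_t) (out - start);
}
```

For the input a[1].b the current test gives a\[1\]\.b and the
patched test gives a\[1]\.b. Both match the same text, since the `]'
in the second result is outside any character alternative and so is
not special.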