emacs-devel

Re: Unquoted special characters in regexps


From: Luc Teirlinck
Subject: Re: Unquoted special characters in regexps
Date: Mon, 27 Feb 2006 18:59:34 -0600 (CST)

I can install either or both of the two chunks of the patch to
lispref/searching.texi included below, if desired.  (I sent this
before, but it never arrived at emacs-devel.)

The first chunk would just remove `]' from the list of characters
that are described as special outside a character alternative.

The second chunk rephrases the following:

    For example, a string with unbalanced square brackets is invalid
    (with a few exceptions, such as `[]]'),

That is incorrect, or at least ambiguous (how exactly do you define
"balanced"?), as the examples below show.

ELISP> (string-match "]]]]" "]]]]")
0
ELISP> (string-match "[[]" "[")
0

One accurate way to restate it would be that a string whose square
brackets _with special meaning_ do not balance is invalid.  This
would be (unless I overlook something) without exceptions: in `[]]'
the square brackets with special meaning do balance.  In the patch
below I formulated it differently.
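
To make the distinction concrete, here is a small sketch that checks
whether a string is accepted as a regexp by catching the
`invalid-regexp' error.  The helper name `my-regexp-valid-p' is made
up for this illustration; it is not part of Emacs or of the patch.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-regexp-valid-p (regexp)
  "Return t if REGEXP can be used without an `invalid-regexp' error."
  (condition-case nil
      ;; `string-match' returns nil on a failed match, so force t here;
      ;; only a signaled `invalid-regexp' makes this function return nil.
      (progn (string-match regexp "") t)
    (invalid-regexp nil)))

;; Valid: no `]' here closes a character alternative, so each is ordinary.
(my-regexp-valid-p "]]]]")
;; Valid: the second `[' is a member of the alternative opened by the first.
(my-regexp-valid-p "[[]")
;; Valid: the first `]' is a member, the second terminates the alternative.
(my-regexp-valid-p "[]]")
;; Invalid: the string ends inside a character alternative.
(my-regexp-valid-p "[a")
;; Invalid: the string ends with a single `\'.
(my-regexp-valid-p "a\\")
```

The first three calls should return t and the last two nil, which is
consistent with the wording proposed in the second chunk.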

===File ~/searching.texi-diff===============================
*** searching.texi      06 Feb 2006 16:02:08 -0600      1.68
--- searching.texi      26 Feb 2006 10:25:06 -0600      
***************
*** 237,243 ****
  special constructs and the rest are @dfn{ordinary}.  An ordinary
  character is a simple regular expression that matches that character and
  nothing else.  The special characters are @samp{.}, @samp{*}, @samp{+},
! @samp{?}, @samp{[}, @samp{]}, @samp{^}, @samp{$}, and @samp{\}; no new
  special characters will be defined in the future.  Any other character
  appearing in a regular expression is ordinary, unless a @samp{\}
  precedes it.
--- 237,243 ----
  special constructs and the rest are @dfn{ordinary}.  An ordinary
  character is a simple regular expression that matches that character and
  nothing else.  The special characters are @samp{.}, @samp{*}, @samp{+},
! @samp{?}, @samp{[}, @samp{^}, @samp{$}, and @samp{\}; no new
  special characters will be defined in the future.  Any other character
  appearing in a regular expression is ordinary, unless a @samp{\}
  precedes it.
***************
*** 740,747 ****
  
  @kindex invalid-regexp
    Not every string is a valid regular expression.  For example, a string
! with unbalanced square brackets is invalid (with a few exceptions, such
! as @samp{[]]}), and so is a string that ends with a single @samp{\}.  If
  an invalid regular expression is passed to any of the search functions,
  an @code{invalid-regexp} error is signaled.
  
--- 740,747 ----
  
  @kindex invalid-regexp
    Not every string is a valid regular expression.  For example, a string
! that ends inside a character alternative without terminating @samp{]}
! is invalid, and so is a string that ends with a single @samp{\}.  If
  an invalid regular expression is passed to any of the search functions,
  an @code{invalid-regexp} error is signaled.
  
============================================================