emacs-devel

Re: Matches for multiline regexps


From: Luc Teirlinck
Subject: Re: Matches for multiline regexps
Date: Fri, 17 Jun 2005 21:48:32 -0500 (CDT)

Richard Stallman wrote:

       Additional remark: from simpler examples, it appears that they are
       _intended_ to be line numbers.  If so, this is a bug.

   Yes, it seems to be a bug in counting the line numbers.  Could you fix
   that too?

I will take a look at it, but first a decision has to be made on how
we treat overlapping matches.  (I am talking about matches that
themselves overlap.  I have no problem handling a match that starts on
the same line on which a previous match ended, but later on the line,
so that the matches themselves do not overlap, only one of their lines.)
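To make the unproblematic case concrete, here is a minimal sketch in
Python (using Python's `re' module as a stand-in, not the Emacs regexp
engine; the helper name `match_line_spans' and the sample text are
made up for illustration).  Two matches of a two-line regexp share
line 2, yet the matches themselves do not overlap:

```python
import re

def match_line_spans(pattern, text):
    """Return 1-based (first_line, last_line) for each non-overlapping match."""
    spans = []
    for m in re.finditer(pattern, text):
        first = text.count("\n", 0, m.start()) + 1
        # Position of the last character of the match (or its start, if empty).
        last_pos = max(m.end() - 1, m.start())
        last = text.count("\n", 0, last_pos) + 1
        spans.append((first, last))
    return spans

# Hypothetical three-line buffer: "a", "b a", "b".
# The first match ends on line 2; the second starts later on that same line.
shared_line_spans = match_line_spans(r"a\nb", "a\nb a\nb\n")
print(shared_line_spans)
```

The first match covers lines 1-2 and the second covers lines 2-3, so
only line 2 is shared between the two entries.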

The current occur implementation for multiline regexps has _several_
problems.  Apart from getting the line numbers wrong, the matches do
not get correctly displayed: only their first line is shown.  The
current implementation _tries_ to "correctly" (in one of the two
possible interpretations of what is "correct") find all matches in
case there are overlapping matches.  But it does not come close to
succeeding in that.  Worse, it has to pay for its attempt to do so by
failing to find all matches in more natural cases where there are no
overlapping matches and only one possible interpretation of "correct".
The present occur implementation differs radically in philosophy from
all other word or regexp search functions in Emacs and is backward
incompatible with Emacs 21.

I propose to have occur treat overlapping matches the same as the
other Emacs search functions do, which is also the way occur behaved
before Emacs 22.  That is, given a buffer with the following five lines:

11
11
11
11
11

`M-x occur RET 11 C-q C-j 11 RET' will find two matches, one on line 1
and one on line 3.  Those are the only matches that
`C-M-s 11 C-q C-j 11 RET C-s C-s C-s...', started at the beginning of
the buffer, is going to find.  It is what occur does in Emacs 21.  Implementing this
correctly seems relatively easy and does not require paying a price in
efficiency.  If this interpretation is good enough for C-M-s, then why
not for occur?
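As a rough illustration of this interpretation (a sketch in Python,
not the occur code; Python's `re.finditer' happens to resume each
search where the previous match ended, which is the behavior proposed
here; the names `buffer_text' and `line_number' are made up):

```python
import re

# The five-line example buffer from above.
buffer_text = "11\n11\n11\n11\n11\n"

def line_number(text, pos):
    """1-based line number of character offset POS in TEXT."""
    return text.count("\n", 0, pos) + 1

# Each search resumes after the end of the previous match, so a line
# already consumed by one match cannot begin another.
non_overlapping_lines = [
    line_number(buffer_text, m.start())
    for m in re.finditer(r"11\n11", buffer_text)
]
print(non_overlapping_lines)
```

This reports matches starting on lines 1 and 3 only; the trailing
line 5 has no partner left to match with.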

Trying to fix occur to handle the other interpretation of "correct"
(matches at lines 1, 2, 3 and 4) is possible but more difficult.  (The
current occur version can do that correctly in this example, but fails
for many other examples.)  Even a completely correct implementation
would still present problems.  It could make the handling of more
natural regexps less efficient, it clashes with all other search
functions in its philosophy, and it would not be clear how to display
all multiline matches in a way that is clear and avoids excessive
redundancy, because there could be a _lot_ of overlapping lines
between matches.  With my proposal only _consecutive_ entries in the
*Occur* buffer could overlap and the overlap would be at most one
line.  With a correct implementation of the other interpretation,
there is no limit on the amount of overlap.
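For comparison, the other interpretation can be sketched as follows
(again in Python as a stand-in, assuming that "all matches" means
retrying the search one character after each match start; the names
`overlapping_matches' and `line_span' are made up and this is not how
the current occur code works):

```python
import re

buffer_text = "11\n11\n11\n11\n11\n"

def overlapping_matches(pattern, text):
    """Yield a match for every start position, allowing matches to overlap."""
    regex = re.compile(pattern)
    pos = 0
    while pos <= len(text):
        m = regex.search(text, pos)
        if m is None:
            return
        yield m
        # Resume just after the START of this match, not after its end.
        pos = m.start() + 1

def line_span(text, m):
    """1-based (first_line, last_line) covered by match M."""
    first = text.count("\n", 0, m.start()) + 1
    last = text.count("\n", 0, max(m.end() - 1, m.start())) + 1
    return (first, last)

# Two-line regexp: a match starts on each of lines 1, 2, 3 and 4.
overlapping_lines = [
    line_span(buffer_text, m)[0]
    for m in overlapping_matches(r"11\n11", buffer_text)
]

# Three-line regexp: consecutive entries now share two lines each.
three_line_spans = [
    line_span(buffer_text, m)
    for m in overlapping_matches(r"11\n11\n11", buffer_text)
]
print(overlapping_lines, three_line_spans)
```

With the three-line regexp the entries cover lines 1-3, 2-4 and 3-5,
so most lines would be displayed two or three times, and the
redundancy grows with the number of lines the regexp spans.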

Sincerely,

Luc.
