[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: will we ever have zero width assertions in regexps?
From: |
Ilya Zakharevich |
Subject: |
Re: will we ever have zero width assertions in regexps? |
Date: |
Sat, 29 Jan 2011 22:28:48 +0000 (UTC) |
User-agent: |
slrn/0.9.8.1pl1 (Linux) |
On 2011-01-29, Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>> As I said, I put some optimizations which in most (AFAIK) practical
>> senses remove such pathologies. (The underlying problems remain; the
>> optimizations are only "heuristic"; but one needs to be extra
>> inventive to circumvent the optimizations.)
>
> A typical case could look something like "foo *(.*?) *bar". when
> matching "foo ..<many space>.. baZ".
No, this is a polynomial-time problem. My optimization does nothing
for such cases. And I do not think such a REx would provide any
problem in real life - unless you have many hundreds of consecutive
spaces. (And unless Emacs' REx engine is particularly slow per OPCODE.)
But I start to see the difference - it is in usage scenarios. Many
Perl REx matches are done "per-line", not "per-file". And in Emacs,
one would rarely narrow the buffer before a match. This creates a
major skew in REx matches - in Emacs the string to match against would
be quite often a couple of orders of magnitude larger.
Essentially, in Perl, a sloppy REx which leads to a non-linear
polynomial time matching would be mostly unnoticed speed-wise. Only
those which lead to exponential-time match bite hard.
So my patches for Perl improve only the exponential cases of
"sloppyness". In Emacs, polynomial-time may give quite noticable
problems too. Hmm, AFAICS, one can slighly modify my patches to
handle polynomial-time matches too. I would think about it more (but
do not promise any action! ;-).
> Basically, provide a primitive like (match-string RE STRING LIMIT) that
> can not only say "matched between START and END", but also "reached
> LIMIT within yet finding a match, here's the suspended SEARCH-STATE at
> LIMIT", so you can later resume the search starting at LIMIT by passing
> that state.
match-with-continuation. An interesting idea. I already implemented
it for Perl (to support (??{}), but it is not exposed to the user.
Would one want this in non-interactive situations?
Thanks,
Ilya
- will we ever have zero width assertions in regexps?, Le Wang, 2011/01/26
- Re: will we ever have zero width assertions in regexps?, Stefan Monnier, 2011/01/26
- Re: will we ever have zero width assertions in regexps?, Le Wang, 2011/01/26
- Message not available
- Re: will we ever have zero width assertions in regexps?, Stefan Monnier, 2011/01/26
- Re: will we ever have zero width assertions in regexps?, Ilya Zakharevich, 2011/01/27
- Re: will we ever have zero width assertions in regexps?, Stefan Monnier, 2011/01/27
- Re: will we ever have zero width assertions in regexps?, Ilya Zakharevich, 2011/01/28
- Re: will we ever have zero width assertions in regexps?, Stefan Monnier, 2011/01/28
- Re: will we ever have zero width assertions in regexps?,
Ilya Zakharevich <=
- Re: will we ever have zero width assertions in regexps?, Stefan Monnier, 2011/01/31
- Re: will we ever have zero width assertions in regexps?, Ilya Zakharevich, 2011/01/31
- Re: will we ever have zero width assertions in regexps?, Stefan Monnier, 2011/01/31