emacs-devel
Re: [PATCH] add compiled regexp primitive lisp object


From: dmcc2
Subject: Re: [PATCH] add compiled regexp primitive lisp object
Date: Wed, 31 Jul 2024 22:33:54 +0000

> On Tuesday, July 30th, 2024 at 09:02, Philip Kaludercic <philipk@posteo.net> wrote:
> 
> No comments on the patch from me, I am just curious, did you notice any
> performance improvements? Or is this just cleaning up the codebase?
> 
> --
> Philip Kaludercic on peregrine

I failed to provide context, so this is a very reasonable question! ^_^ This
was spurred by a discussion from the day before on how to introduce a
Lisp-level API for composing search patterns
(https://lists.gnu.org/archive/html/emacs-devel/2024-07/msg01201.html), where I
concluded that codifying compiled regexps into a Lisp object would be a useful
first step towards understanding the tradeoffs of introducing other matching
logic beyond regex-emacs.c. I received a reply
(https://lists.gnu.org/archive/html/emacs-devel/2024-07/msg01203.html)
indicating that patches would be the appropriate next step, and then got to
work. I was incredibly pleased by how delightful and straightforward it was
to create this first draft and wanted to share progress, but didn't think
further than that before falling asleep ^_^!

(btw, the pdumper API is incredibly cool and much less complex than I expected.)

I think a useful prototype of this workstream would involve:
(1) add new Lisp_Regexp primitive object constructed via `make-regexp' (this 
patch; done),
(2) store match-data in the Lisp_Regexp instead of a thread-local (done 
locally) & extend match data accessors like `match-data' to extract from an 
optional Lisp_Regexp arg (the way `match-string' accepts an optional string 
arg),
(3) add new Lisp_Match primitive object (or maybe just use a list for now) for 
match functions to write results into instead of mutating the Lisp_Regexp 
match-data (I believe this will make regexp matching entirely 
reentrant/thread-safe) & extend match data accessors to accept Lisp_Match as 
well.
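
To make the shape of steps (1)-(3) concrete, here is a minimal standalone
sketch. It is not the patch and does not use regex-emacs.c; it swaps in POSIX
regcomp/regexec, and the names `compiled_regexp', `match_result',
`regexp_compile', `regexp_search' and `match_string' are hypothetical
stand-ins for Lisp_Regexp, Lisp_Match and their accessors. The point is only
that the compiled pattern and the match results live in separate objects, so
one search does not overwrite the results of another.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

#define MAX_GROUPS 8

/* Hypothetical analogue of Lisp_Regexp: owns only the compiled pattern.  */
struct compiled_regexp
{
  regex_t re;
};

/* Hypothetical analogue of Lisp_Match: each caller owns its own results,
   so a search never writes into state shared with other searches.  */
struct match_result
{
  regmatch_t groups[MAX_GROUPS];
};

/* Compile PATTERN once (compare step 1).  Returns 0 on success.  */
static int
regexp_compile (struct compiled_regexp *r, const char *pattern)
{
  return regcomp (&r->re, pattern, REG_EXTENDED);
}

/* Search TEXT and write group positions into the caller's M (compare
   step 3).  Returns 1 on a match, 0 otherwise.  */
static int
regexp_search (const struct compiled_regexp *r, const char *text,
               struct match_result *m)
{
  assert (m != NULL);
  return regexec (&r->re, text, MAX_GROUPS, m->groups, 0) == 0;
}

/* Copy group N of M, taken from TEXT, into BUF.  Returns the group's
   length, or -1 if the group did not match or BUF is too small.  */
static int
match_string (const struct match_result *m, int n, const char *text,
              char *buf, size_t buflen)
{
  if (n < 0 || n >= MAX_GROUPS || m->groups[n].rm_so == -1)
    return -1;
  size_t len = (size_t) (m->groups[n].rm_eo - m->groups[n].rm_so);
  if (len + 1 > buflen)
    return -1;
  memcpy (buf, text + m->groups[n].rm_so, len);
  buf[len] = '\0';
  return (int) len;
}

/* Two searches with one compiled pattern; the second search leaves the
   first result intact.  Returns 0 if everything behaved as expected.  */
static int
regexp_demo (void)
{
  struct compiled_regexp r;
  struct match_result a, b;
  char buf[16];
  const char *s1 = "foo-42", *s2 = "bar-7";

  if (regexp_compile (&r, "([a-z]+)-([0-9]+)") != 0)
    return 1;
  int ok = regexp_search (&r, s1, &a) && regexp_search (&r, s2, &b);
  ok = ok && match_string (&a, 1, s1, buf, sizeof buf) == 3
       && strcmp (buf, "foo") == 0;
  ok = ok && match_string (&b, 2, s2, buf, sizeof buf) == 1
       && strcmp (buf, "7") == 0;
  regfree (&r.re);
  return ok ? 0 : 1;
}
```

Calling `regexp_demo ()' from a `main' exercises the whole flow. Whether the
real Lisp_Match ends up as a primitive object or a plain list, the caller-owned
result is the part that matters for reentrancy.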

At that point, I am guessing it will be relatively easy to construct a
benchmark that shows a very clear speedup (construct 100 random regexps and
search them in a loop) and shows, via profiler output, that recompilation is
avoided. There are also likely to be benchmarks more representative of typical
Emacs workloads, which I would be delighted to receive suggestions for.
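
As a rough illustration of the benchmark described above, here is a standalone
sketch that searches 100 patterns in a loop, once recompiling on every search
and once with the patterns compiled up front. It is an analogy only: it uses
POSIX regcomp/regexec rather than Emacs, has no pattern cache at all (so it
shows the worst case for recompilation), and uses deterministic patterns
instead of random ones so that runs are comparable. All names here are
hypothetical, and the timings say nothing about what the Emacs patch will
achieve.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

#define N_PATTERNS 100
#define N_ROUNDS 200

/* Write pattern number I into BUF, e.g. "key17=[0-9]+".  */
static void
make_pattern (int i, char *buf, size_t len)
{
  snprintf (buf, len, "key%d=[0-9]+", i);
}

/* Compile each pattern again for every search.  Returns the number of
   matching searches, or -1 if a pattern fails to compile.  */
static long
search_recompiling (const char *text)
{
  long hits = 0;
  char pat[32];
  for (int round = 0; round < N_ROUNDS; round++)
    for (int i = 0; i < N_PATTERNS; i++)
      {
        regex_t re;
        make_pattern (i, pat, sizeof pat);
        if (regcomp (&re, pat, REG_EXTENDED | REG_NOSUB) != 0)
          return -1;
        hits += regexec (&re, text, 0, NULL, 0) == 0;
        regfree (&re);
      }
  return hits;
}

/* Compile every pattern once, then reuse the compiled objects.  */
static long
search_precompiled (const char *text)
{
  regex_t res[N_PATTERNS];
  long hits = 0;
  char pat[32];
  for (int i = 0; i < N_PATTERNS; i++)
    {
      make_pattern (i, pat, sizeof pat);
      if (regcomp (&res[i], pat, REG_EXTENDED | REG_NOSUB) != 0)
        {
          for (int j = 0; j < i; j++)
            regfree (&res[j]);
          return -1;
        }
    }
  for (int round = 0; round < N_ROUNDS; round++)
    for (int i = 0; i < N_PATTERNS; i++)
      hits += regexec (&res[i], text, 0, NULL, 0) == 0;
  for (int i = 0; i < N_PATTERNS; i++)
    regfree (&res[i]);
  return hits;
}

/* Run FN on TEXT, store its hit count in *HITS, and return the CPU
   seconds it took.  */
static double
seconds_for (long (*fn) (const char *), const char *text, long *hits)
{
  assert (fn != NULL && hits != NULL);
  clock_t start = clock ();
  *hits = fn (text);
  return (double) (clock () - start) / CLOCKS_PER_SEC;
}

/* Print one timing line per strategy.  */
static void
benchmark_report (const char *text)
{
  long h1, h2;
  double t1 = seconds_for (search_recompiling, text, &h1);
  double t2 = seconds_for (search_precompiled, text, &h2);
  printf ("recompiling: %ld hits in %.4fs\n", h1, t1);
  printf ("precompiled: %ld hits in %.4fs\n", h2, t2);
}
```

Calling `benchmark_report ("key3=10 key17=42 key250=1")' from a `main' prints
both timings. Both strategies must report the same hit count, which gives a
basic sanity check before comparing the times.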

I think the next steps are clear enough, so I'm planning to ping this list 
again when I have a working prototype achieving such a benchmark. Since the 
inline diff seemed ok this time, I will also provide an inline diff for that 
unless the diff exceeds +1000 lines (not expected), in which case I will attach 
a patch file.

Thanks,
Danny


