emacs-devel

Re: What's missing in ELisp that makes people want to use cl-lib?


From: Dmitry Gutov
Subject: Re: What's missing in ELisp that makes people want to use cl-lib?
Date: Fri, 17 Nov 2023 04:09:37 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0

On 16/11/2023 16:30, João Távora wrote:

But it's not only a matter of performance.  Take the
'seq-remove-at-position' generic.  Presumably someone thought that
operation was common enough to merit a separate entry point. In CL,
this very operation can be done (probably faster) with cl-remove using
keyword arguments.
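
For concreteness, here is a minimal sketch of the keyword-argument approach. The helper name my-remove-at is hypothetical, and it uses cl-remove-if with :start and :end rather than cl-remove proper; no claim is made here about which variant is faster.

```elisp
(require 'cl-lib)
(require 'seq)

;; Hypothetical helper, not part of cl-lib or seq.el.
(defun my-remove-at (sequence n)
  "Return a copy of SEQUENCE without the element at index N."
  ;; The predicate accepts everything; :start/:end restrict the
  ;; removal to the single position N.
  (cl-remove-if (lambda (_elt) t) sequence :start n :end (1+ n)))

;; (my-remove-at '(a b c d) 2)            ; => (a b d)
;; (seq-remove-at-position '(a b c d) 2)  ; seq.el counterpart, Emacs 29 and later
```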

Probably. Like I said, it looks more like Clojure's stdlib, but due to the absence of efficient "persistent data structures" the performance profile is different.

If that is the only drawback we can find, it is an unavoidable one.

Could you try to explain what I should find in the second example?
What do you mean by "outlaw"?

What I meant is that the only way to get the same performance out of
seq.el is to have early #'sequencep checks that bypass the generics
completely, and this makes custom sequences based on sequences
impossible.

If we optimize for short lists -- maybe. But for longer lists the dynamic dispatch (if performed once) shouldn't be a problem.

Does causing worse performance for a short time constitute "outlawing"
something?

But you should also find in that "m6sparse" example that the logic is
broken -- not only in terms of performance -- until its author
implements seq-contains-pred.  So this is pure "Incompatible Lisp
changes" material (which I also think tanked performance should be btw.)

But in fact it is already broken by all the "list optimizations" in
seq.el. Optimizations, before yours, that caused whatever the contract
was to be violated.  To be able to use `seq-drop-while` for my m6sparse
sequence, I have to add implementations to all those generic entry
points, which is just awkward.

Were those the 'list' type specializations?

And even outlaw more stuff.  For example, these generics even have
specializers on all mandatory arguments?
For example, why does seq-do have FUNCTION as a required argument???

Because FUNCTION is applied to SEQUENCE?


It should be &optional or a &key, with a default value.  Which
cl-defgeneric obviously supports. That way specializations on FUNCTION
become impossible,  or at least become much harder and there's less
risk of tanking user code.  Design mistake IMO.

I'm reasonably sure nobody expects function to be anything but a
straight function (symbol or a lambda), because that's how 'seq-do' is
used throughout the code.

Yes, but putting it as a required argument in the arglist means users can
specialize for it, and that isn't needed.  Not sure anyone does it or
even if that makes the generic even slower.

And what would happen if FUNCTION were optional and omitted by the caller?
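
To make the question concrete, here is one possible reading of the proposal, as a sketch only. The generic my-seq-do is hypothetical and not part of seq.el, and falling back to `ignore' is my assumption about what a "default value" would mean. Since cl-defmethod can only specialize mandatory arguments, FUNCTION could no longer be dispatched on in this form.

```elisp
(require 'cl-lib)

;; Hypothetical generic, not part of seq.el.
(cl-defgeneric my-seq-do (sequence &optional function)
  "Apply FUNCTION to each element of SEQUENCE and return nil.
FUNCTION defaults to `ignore' (an assumption made for this sketch)."
  (mapc (or function #'ignore) sequence)
  nil)

;; With the default, omitting FUNCTION silently does nothing:
;; (my-seq-do '(1 2 3))
```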

But that is the price one has to pay for correcting a design mistake.
We're not going to do this every Tuesday.
Sure, but that price increases manyfold if we start suggesting
seq.el as a replacement for all your sequence processing needs.

We can first fix the mistake and then go on to continue "suggesting it
as a replacement". Or not.

I don't exactly see it that way, though. And you give an impression of
arguing for the opposite: toward never using it at all.

Not at all.  Maybe you missed some of my previous messages.  I think
seq.el's support for polymorphic sequences, though a little bit flawed
in some respects, is very useful.  For example, I've been pondering
using it in eglot.el to try to speed up JSON parsing.  If some kind of
seq-plist-get and seq-plist-destructuring-bind can be designed, I might
be able to skip consing much of the useless elements of a gigantic JSON
blob and parse just the parts I need.  Of course, not easy, but I think
seq.el is the tool for that.

plist like Elisp plist? It's difficult to write a type predicate for.

While working on the :m6-sparse extension, I noticed Emacs becomes
noticeably slower, and I suspect that's because while I was editing,
all the seq functions I was writing were being probed for
applicability, while core things like completion, buffer-switching
etc are calling seq-foo generics.

It could be helpful to do some profiling and see where the slowdown
came from. Could it come exactly from the set operations?

Not sure.  It might have come from tracing seq.el functions, for
example.  You might say that it's my fault I was tracing them, but
should I be punished in Emacs usability just for trying to use Emacs to
iteratively develop a seq.el extension?

I'm not sure I understand. If you added tracing to 'car', it would also slow Emacs down, wouldn't it? Is that a problem with the design of 'car'?

Anyway tracing basic staples such as seq-do and seq-let gives some
insight as to where they are used and what shape of arguments they are
called with in your normal programming activities.  Small lists seem to
appear a lot more often.  But expect a massive slowdown while tracing:
even with modest seq.el usage in current core, these generics are called
a lot already.

That both sounds like a compliment and a pressing motivation to iron out the most apparent performance pitfalls.

I find this performance aspect very bad.  Maybe it can be obviated,
but only if you drop the '(if (sequencep seq)' bomb into seq.el
somehow.  I don't see how we can avoid that one.

I don't quite see the need. And it's unlikely to be of reliable help:
my observations say that method dispatch simply becomes slower as soon
as a generic function gets a second implementation. And that
implementation might arrive from any third-party code.
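
A rough way to check this is to time a generic before and after it gains a second method. The sketch below uses a hypothetical generic my-probe (not part of seq.el) together with benchmark-run. Absolute numbers depend on the Emacs build and on whether the code is byte-compiled, so none are claimed here.

```elisp
(require 'cl-lib)
(require 'benchmark)

;; Hypothetical generic used only for timing.
(cl-defgeneric my-probe (x)
  "Return X unchanged."
  x)

(defun my-time-probe (n)
  "Return the elapsed seconds for N calls to `my-probe' on a list."
  (let ((arg '(1 2 3)))
    (car (benchmark-run n (my-probe arg)))))

(let ((before (my-time-probe 100000)))
  ;; A second implementation, as third-party code might add one.
  (cl-defmethod my-probe ((x vector)) x)
  (let ((after (my-time-probe 100000)))
    (message "one method: %.4fs, two methods: %.4fs" before after)))
```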

Exactly.  The entry point generics probably can never be avoided.  I
think we agreed that no one -- user or library -- should add
implementations to them.

Um, no. I'm saying that seq.el functions should be written in a way that causes the dynamic dispatch to happen only once (or perhaps a few times), as opposed to doing it N times (for every element of a sequence) or more.

seq-let might suffer from a similar problem.

Having overrides for entry points, OTOH, should be fine enough, and whatever added cost the dispatch itself incurs, it still happens once, and can be made up for by the more efficient specialized implementation of the function's logic.
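
A minimal sketch of what I mean, with a hypothetical entry point my-count-if (not an existing seq.el function): the generic dispatches once per call, the list method then loops with plain dolist, and only the fallback goes through seq-do for custom sequence types.

```elisp
(require 'cl-lib)
(require 'seq)

;; Hypothetical entry point: dispatch happens once per call.
(cl-defgeneric my-count-if (pred sequence)
  "Return the number of elements of SEQUENCE for which PRED is non-nil."
  ;; Fallback for any sequence type that implements `seq-do'.
  (let ((n 0))
    (seq-do (lambda (x) (when (funcall pred x) (setq n (1+ n))))
            sequence)
    n))

(cl-defmethod my-count-if (pred (sequence list))
  ;; List override: a plain loop, no generic call per element.
  (let ((n 0))
    (dolist (x sequence n)
      (when (funcall pred x) (setq n (1+ n))))))

;; (my-count-if #'cl-evenp '(1 2 3 4))  ; => 2, via the list method
;; (my-count-if #'cl-evenp [1 2 3 4])   ; => 2, via the fallback
```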

That's why I think not making them defuns was
another design mistake.  But other intermediary generics _can_ be
skipped and would bring a performance boost to seq.el.

Other intermediate generics, in general, can be skipped by defining different methods at earlier points.

Alright.  That all said, here are the latest results, which I gathered
using the attached sequence-benchmarks.el; they are also attached in
results.txt.

I gathered each set of timings by running these two things

    src/emacs -Q -nw sequence-benchmarks.el -f emacs-lisp-byte-compile-and-load

First of all, when I try to run the above command (on your branch, with your attachments from the last email), it ends with:

Compiling file /home/dgutov/examples/cl-lib-vs-seq/sequence-benchmarks.el at Fri Nov 17 03:57:31 2023
sequence-benchmarks.el:73:2: Error: Wrong type argument: listp, more

So before I proceed further, could you make sure that you ran the tests with these exact files? Superficially, it looks like just (require 'cl-lib) is missing, but maybe something else inside was also different?

I also tried to produce some pretty-printed comparisons, but the above command (even with the "fixed" require statement) doesn't print anything to *Messages* or stdout when run with --batch.

I can evaluate individual joaot/with-benchmark-group forms, and they print -- how'd you call it -- minified Lisp data to messages, but it's not formatted the same way as in your results.txt, nor is it easy to get out of the -nw session because the clipboard is naturally not shared. Is -nw needed? I expected it'd be used to print something to stdout.



