Re: What's missing in ELisp that makes people want to use cl-lib?

emacs-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: What's missing in ELisp that makes people want to use cl-lib?

From:	Dmitry Gutov
Subject:	Re: What's missing in ELisp that makes people want to use cl-lib?
Date:	Tue, 14 Nov 2023 13:47:32 +0200
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0

On 14/11/2023 12:34, João Távora wrote:

[Replying to both of your mails here]

On Tue, Nov 14, 2023 Dmitry Gutov <dmitry@gutov.dev<mailto:dmitry@gutov.dev>> wrote:

 > And if the list has just 1 element (or zero?), the gap would be even
 > larger.
 >
 > Only seeing a 1.6x difference here is a nice surprise, actually.

These two statements together seem contradictory.  Why would
you be happy to see a particular factor for a given
list length if you know the factor is unbounded in general
for small lists?

I don't think it's unbounded, just high. And 12-element list is a nicesize for this particular comparison. Programs using tiny list would intheir majority be tiny as well, so it wouldn't matter.

Though of course there are exceptions: compilers, for example.

And it's only in the small lists case.  If I pass my own predicate to
both cl-set-difference and your best seq-difference-3

(defun myequal (a b)
   (equal a b))

(setq cc (make-list 12 "blabla"))
(setq list2 (make-list 12 "shooveedoowaa"))

(when nil
   ;; (0.106080265 4 0.03849865399999963)
   (benchmark-run 10000 (cl-set-difference cc list2 :test #'myequal))
   ;; (0.5504704210000001 39 0.394207416)
   (benchmark-run 10000 (seq-difference-3 cc list2 #'myequal))

    )

I get the 5.5x slowdown again (in one experiment, not shown,

Right. This is still for the small lists case. Here we suffer thedynamic dispatch at least twice, along with funcall indirection. Again,would be great to get these ratios down, but the main scenarios where I*did* worry about the performance of a sequence primitive, have alwaysbeen large lists.

I got a whopping 200x slowdown, but I can't reproduce it right now
from Emacs -Q, maybe I had some method on some crucial seq.el generic)

That would be helpful to know as well.

 > > This is all interesting, until one ponders what happens if an existing
 > > seq.el user somewhere has:
 > >
 > > (cl-defmethod seq-contains-p ((seq my-voodoo-seq)

> > (elt (eql :secret-voodoo)) &optional_tesfn)

 > >    (invoke-voodoo-priests seq))
 > >
 > > making use of seq.el's support for abstract polymorphic sequences.
 > >
 > > With seq.el 2.24 a seq-difference operation would consider this user's
 > > method, with seq.el 2.24.dmitry (i.e. your fast seq-difference-3) it
 > > simply won't.  This user's code is clearly broken.
 > >
 > > But was the user allowed to do that in the first place?  If not,
 > > why is seq-contains-p a public generic function?
 >
 > It was allowed, and yes, indeed, it's a breaking change. So we should at
 > least examine the existing public code out there and see whether such
 > overrides exist.

Not so easy, I'm afraid.  seq-contains-p is not the only such generic,
it's just one I happened to spot first.  seq-do is a generic and in that
group mentioned in the sparse seq.el documentation about mandatory
customization, meaning presumably it is essential for custom sequences.

See https://github.com/search?q=%22%28cl-defmethod+seq-do%22&type=code<https://github.com/search?q=%22%28cl-defmethod+seq-do%22&type=code> for

a list of custom sequences that use it.

Your seq-difference-3 doesn't call seq-do like the original seq-difference
does, so the set difference operations for those custom sequences might
well be broken.  What's even more problematic is that it skips seq-do
entirely for certain predicates and not entirely for certain other
predicates.

The way I understand this, is any new sequence type has to implementseq-do. As soon as that happens, a lot of (probably all) sequencefunctions in seq.el start working on that type.

But the author is also free to provide specialized implementations forindividual functions (usually for better performance) -- and thoseimplementations don't have to use seq-do. Indeed, in most cases theoptimization would involve cutting out the overhead that seq-do bringsin the general case.

It's still bizarre.  Here's how I think: if an application is customizing
the entry point directly, why even call the entry point?  OK, so what if
a library is customizing the entry point?   It's presumably because that
library provides a data type for applications to instantiate and use.  But
if the library does that, all calling guarantees of other generic
functions are lost for, say, user method for subtypes of that data type
(or :around, or :after, etc).  So it's also nonsensical.  The only think

where it _could_ make a shred of sense is in a kind of "private library"where

the application controls both the data type and knows exactly the shortcuts
taken.  But then "private library" is an oxymoron.

An application shouldn't be customizing separate entry points. Separatelibs (providing new data types) can do that. E.g. if we have a libproviding lazy sequences, it can provide its own set of implementationsfor seq.el. And when doing that, make sure the resulting behavior isconsistent between functions.

I haven't though much about the interaction with :around, :after, etc.Maybe they'll be fine, or maybe it would be better to avoid it. Lookfine on the superficial level, though.

Oh, don't get me wrong: cl-lib.el's implementation details are not pretty
and not easy, likely typical library code (though i've seen much much
worse).  What's fundamentally easier about optimizing cl-lib is that the
contract it offers is much, much more well specified.

Possibly because it comes as a result of careful design in committees of
very capable programmers in the 90's, before the CL winter, that were
already on to this class of difficulties and wanted to make a common
interface.
> The interface they made for sequences is not fully generic, but there
_are_ a lot of different implementations for that interface, each CL
compiler has  one.  Most of the tricks, like what I did to optimize
cl-{n}set-difference  are standard stuff in the CL world.   There are even
reference LOOP implementations (likely much more performant than ours,
though possibly not compatible with some non-standard edge cases our
cl-loop has of which I know a couple).

I do believe it's helpful to have it around.

 > > The case in the CL world against generic functions' performance
 > > is often not that the dispatch is slow, but that
 > > employing them too liberally makes framework optimization
 > > hard, because you restrict yourself from optimization
 > > opportunities.
 >
 > I don't know if that's a major issue: just like cl-some has different
 > branches, seq-some could have three different definitions. The
 > approaches to optimization seem transferable, more or less.

Seq-some also calls into a lot of user customizable code, so it'll
suffer from the same class of problems.  Say, are you going to skip
the seq-do generic sometimes there as well?

I believe so.

 > I believe the actual value it provides is a list of implementations that
 > are quite compact and as such easy to fix/extend/modify. And if cl-lib
 > is handy for someone with CL experience, I'll guess that seq.el rings
 > more familiar to the Clojure users.

Maybe in the superficial user interface, because of names etc.  Or
are you saying seq.el interface is extracted from Clojure's standard much
like cl-lib is from CL standard?  (Do Clojure generic functions work like
ours?)  Then by all means we should rush to study that contract closely,
to know what we can and can't do as optimizers.

It looks very much inspired, both in naming and in the implementationapproach. But it's not a carbon copy (much farther from it than incl-lib's example), and our VM is also quite different from JVM inperformance characteristics.

If someone were to write up a direct comparison, that would be great, ofcourse.

[Prev in Thread]

Current Thread

[Next in Thread]

Re: What's missing in ELisp that makes people want to use cl-lib?, (continued)

Prev by Date: Re: What's missing in ELisp that makes people want to use cl-lib?
Next by Date: Re: What's missing in ELisp that makes people want to use cl-lib?
Previous by thread: Re: What's missing in ELisp that makes people want to use cl-lib?
Next by thread: Re: What's missing in ELisp that makes people want to use cl-lib?
Index(es):
- Date
- Thread