bug#10386: CODE wishlist: íncludedelete-duplicates.el

bug-gnu-emacs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#10386: CODE wishlist: íncludedelete-duplicates.el

From:	Drew Adams
Subject:	bug#10386: CODE wishlist: íncludedelete-duplicates.el
Date:	Wed, 28 Dec 2011 08:56:48 -0800

> >     http://mwolson.org/static/dist/elisp/delete-duplicates.el
> 
> This is already provided in Emacs24+.

`delete-dups' has been in Emacs since Emacs 22, actually.

> What is not provided is a fast version of remove-duplicates.
> http://article.gmane.org/gmane.emacs.devel/139546/match=remove+dups

+1

And the cl-seq.el version of `remove-duplicates' is *particularly* slow.  Even a
classic list dups removal algorithm is much faster.

Here's a comparison using `equal' and a list of strings (`elp-results'):

hash-remove-dups     1           0.031         0.031
list-remove-dups     1           5.813         5.813
remove-duplicates    1           122.875       122.875

Where:

(defun list-remove-dups (list)
  (let ((tail  list)
        new)
    (while tail
      (unless (member (car tail) new) (push (car tail) new))
      (pop tail))
    (nreverse new)))

(defun* hash-remove-dups (seq &key (test 'equal))
  (let ((cont  (make-hash-table :test test)))
    (loop for elm in seq
       unless (gethash elm cont)
       do (puthash elm elm cont)
       finally return (loop for i being the hash-values
                                in cont collect i))))

With all 3 functions byte-compiled, using these calls:

(hash-remove-dups    B :test 'equal)
(list-remove-dups    B) ; uses `equal'
(remove-duplicates   B :test 'equal)

And with this list B (initialized anew each time):

(let ((seq (loop for i from 1 to 10000
             collect
             (format "%s" (random most-positive-fixnum)))))
  (append seq seq))

With B 10 times smaller (1000):

hash-remove-dups     1           0.0           0.0
list-remove-dups     1           0.047         0.047
remove-duplicates    1           1.172         1.172

With B 10 times bigger (100000), the difference between hash and classic list is
even greater (fuggedabowt cl-seq's `remove-duplicates' in this case):

hash-remove-dups     1           0.359         0.359
list-remove-dups     1           1209.578      1209.578
remove-duplicates    la-la...

[Prev in Thread]

Current Thread

[Next in Thread]

bug#10386: CODE wishlist: ínclude delete-duplicates.el, Jari Aalto, 2011/12/28
- bug#10386: CODE wishlist: ínclude delete-duplicates.el, Thierry Volpiatto, 2011/12/28
  - bug#10386: CODE wishlist: íncludedelete-duplicates.el, Drew Adams <=

Prev by Date: bug#10348: 24.0.92; Save and load window states
Next by Date: bug#10381: 24.0.92; delete-other-windows-vertically depends on the window split order
Previous by thread: bug#10386: CODE wishlist: ínclude delete-duplicates.el
Next by thread: bug#10387: CODE wishlist: search-prop.el
Index(es):
- Date
- Thread