[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: What to do for faster `remove-duplicates'?
From: |
Oleh Krehel |
Subject: |
Re: What to do for faster `remove-duplicates'? |
Date: |
Wed, 06 May 2015 15:30:45 +0200 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux) |
Stefan Monnier <address@hidden> writes:
>> So should `cl-remove-duplicates' be a 2-in-1 (lists for small input,
>
> Don't know/don't care about cl-remove-duplicates, but for delete-dups, I'd
> welcome a patch which switches to a hash-table after the Nth element.
I attach the patch. I did a bunch of `benchmark-run' and it seems that
100 elements is the breaking point.
> PS: BTW, helm-fast-remove-dups's docstring says:
>
> This is same as `remove-duplicates' but with memoisation.
>
> but it doesn't actually use memoization, AFAICT.
It does in a way, for an optimized-out "is-element-in-collection"
function.
Oleh
>From c845e6384283c17b61c3a0401213e4a1837807c5 Mon Sep 17 00:00:00 2001
From: Oleh Krehel <address@hidden>
Date: Wed, 6 May 2015 15:21:23 +0200
Subject: [PATCH] lisp/subr.el (delete-dups): Use a hash table
* lisp/subr.el (delete-dups): When there are more than 100 candidates,
use a hash table. This can result in ~500 times speed-up for typical
collections of size 5000, like that of `load-library'.
---
lisp/subr.el | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/lisp/subr.el b/lisp/subr.el
index 0fec29c..591980d 100644
--- a/lisp/subr.el
+++ b/lisp/subr.el
@@ -417,11 +417,19 @@ If N is omitted or nil, remove the last element."
Store the result in LIST and return it. LIST must be a proper list.
Of several `equal' occurrences of an element in LIST, the first
one is kept."
- (let ((tail list))
- (while tail
- (setcdr tail (delete (car tail) (cdr tail)))
- (setq tail (cdr tail))))
- list)
+ (if (> (length list) 100)
+ (let ((hash (make-hash-table :test #'equal))
+ res)
+ (dolist (elt list)
+ (unless (gethash elt hash)
+ (puthash elt elt hash)
+ (push elt res)))
+ (nreverse res))
+ (let ((tail list))
+ (while tail
+ (setcdr tail (delete (car tail) (cdr tail)))
+ (setq tail (cdr tail))))
+ list))
;; See http://lists.gnu.org/archive/html/emacs-devel/2013-05/msg00204.html
(defun delete-consecutive-dups (list &optional circular)
--
1.8.4
- What to do for faster `remove-duplicates'?, Oleh Krehel, 2015/05/06
- Re: What to do for faster `remove-duplicates'?, Stefan Monnier, 2015/05/06
- Re: What to do for faster `remove-duplicates'?,
Oleh Krehel <=
- Re: What to do for faster `remove-duplicates'?, Stefan Monnier, 2015/05/06
- Re: What to do for faster `remove-duplicates'?, Oleh Krehel, 2015/05/06
- Re: What to do for faster `remove-duplicates'?, Thierry Volpiatto, 2015/05/06
- Re: What to do for faster `remove-duplicates'?, Oleh Krehel, 2015/05/06
- Re: What to do for faster `remove-duplicates'?, Artur Malabarba, 2015/05/06
- Re: What to do for faster `remove-duplicates'?, Artur Malabarba, 2015/05/06
- Re: What to do for faster `remove-duplicates'?, Oleh Krehel, 2015/05/06
- Re: What to do for faster `remove-duplicates'?, Artur Malabarba, 2015/05/06
- Re: What to do for faster `remove-duplicates'?, Oleh Krehel, 2015/05/06
- Re: What to do for faster `remove-duplicates'?, Artur Malabarba, 2015/05/06