[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [New module] Persistent Hash Array Mapped Tries (HAMTs)
From: |
Bruno Haible |
Subject: |
Re: [New module] Persistent Hash Array Mapped Tries (HAMTs) |
Date: |
Sun, 11 Oct 2020 03:28:33 +0200 |
User-agent: |
KMail/5.1.3 (Linux/4.4.0-189-generic; KDE/5.18.0; x86_64; ; ) |
Hi Marc,
Some comments:
* GL_HAMT_THREAD_SAFE can be defined to 1 not only if
__STDC_VERSION__ >= 201112 && ! defined __STD_NO_ATOMICS__
but also if
__GNUC__ + (__GNUC_MINOR >= 9) > 4
(see the other mail).
* Still '#ifdef GL_HAMT_THREAD_SAFE'.
* Typo s/comparision/comparison/
* Hamt_processor: The comment does not say what the return value means. Maybe it
would be clearer if this type was moved down to the "Iteration" section.
* hamt_lookup: If the caller is allowed to modify the payload stored in the
returned entry, this function should return a 'Hamt_entry *', not a
'const Hamt_entry *'. Just like 'strchr' and 'strstr' return a 'char *',
not a 'const char *'.
* In trienode_count, you don't need count_one_bits_l, only count_one_bits.
It's OK to assume that 'uint32_t' and 'unsigned int' are the same type.
* If subtrie_lookup was inlined in entry_lookup, you could get rid of the
forward declaration of entry_lookup, and compilers would be more likely
to turn the recursion of entry_lookup into a loop (tail-recursion
elimination).
* If subtrie_do_while was inlined in entry_do_while, you could get rid of the
forward declaration of entry_do_while.
* It would be useful to be able to construct modules 'hamt-set' and 'hamt-map',
analogous to 'hash-set' and 'hash-map', but the hamt API currently has two
issues that block it:
1) The iterator API is such that you cannot write the iteration using a loop;
it involves a function call that is active during the entire iteration.
This is also problematic for two other reasons:
- Creating a closure in C (unlike Scheme or Lisp) is tedious hand-work.
(The GNU C extension that allows local functions inside functions [1]
is not a solution, because
- it is unportable,
- it forces an executable stack on many platforms, which is
considered bad for security.)
- In Common Lisp, iteration through a hash table is available not only
through a function-call interface (MAPHASH [2]) but also through an
iteration-loop interface (WITH-HASH-TABLE-ITERATOR [3], LOOP [4]).
For example, in GNU clisp, LOOP expands to
[1]> (macroexpand-1 '(loop for x being each hash-key of h do (foo x)))
(MACROLET ((LOOP-FINISH NIL (SYSTEM::LOOP-FINISH-ERROR)))
(BLOCK NIL
(LET ((#:WHTI-3214 (SYSTEM::HASH-TABLE-ITERATOR H)))
(MULTIPLE-VALUE-BIND (#:MORE?3215 #:HASH-KEY-3216 #:HASH-VALUE-3217)
(SYSTEM::HASH-TABLE-ITERATE #:WHTI-3214)
(DECLARE (IGNORE #:HASH-VALUE-3217))
(LET ((X NIL))
(LET NIL
(MACROLET ((LOOP-FINISH NIL '(GO SYSTEM::END-LOOP)))
(TAGBODY SYSTEM::BEGIN-LOOP (UNLESS #:MORE?3215 (LOOP-FINISH))
(SETQ X #:HASH-KEY-3216) (PROGN (PROGN (FOO X)))
(MULTIPLE-VALUE-SETQ
(#:MORE?3215 #:HASH-KEY-3216 #:HASH-VALUE-3217)
(SYSTEM::HASH-TABLE-ITERATE #:WHTI-3214))
(GO SYSTEM::BEGIN-LOOP) SYSTEM::END-LOOP
(MACROLET
((LOOP-FINISH NIL (SYSTEM::LOOP-FINISH-WARN)
'(GO SYSTEM::END-LOOP)))))))))))) ;
You can see that there is an initial function call HASH-TABLE-ITERATOR
upfront, and then a function call to HASH-TABLE-ITERATE in each round
of the loop.
This principle makes it possible to iterate e.g. through a hash table
and a list in parallel.
2) We have a dilemma regarding use of malloc vs. xmalloc. While modules
meant for use in programs can readily use xmalloc, modules meant for use
(also) in libraries cannot use xmalloc, since a library is not supposed
to terminate the program, even when memory is tight.
The solution we found for this dilemma is that the *-set and *-map modules
use just malloc, and we have thin wrapper modules (xset, xmap) that do
call xalloc_die() in case of out-of-memory.
Unfortunately, the API of many of the functions need to be adjusted to
cope with the possibility of an ENOMEM failure. That is tedious work, and
the code does not look so pretty afterwards... But I see no other way to
make it fit for use in libraries.
Bruno
[1] https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Nested-Functions.html
[2]
http://www.ai.mit.edu/projects/iiip/doc/CommonLISP/HyperSpec/Body/fun_maphash.html
[3]
http://www.ai.mit.edu/projects/iiip/doc/CommonLISP/HyperSpec/Body/mac_with-hash_ble-iterator.html
[4]
http://www.ai.mit.edu/projects/iiip/doc/CommonLISP/HyperSpec/Body/sec_6-1-2-1-6.html
- Re: [New module] Persistent Hash Array Mapped Tries (HAMTs), (continued)
- Re: [New module] Persistent Hash Array Mapped Tries (HAMTs), Bruno Haible, 2020/10/10
- Re: [New module] Persistent Hash Array Mapped Tries (HAMTs), Marc Nieper-Wißkirchen, 2020/10/10
- Re: [New module] Persistent Hash Array Mapped Tries (HAMTs), Paul Eggert, 2020/10/10
- Re: [New module] Persistent Hash Array Mapped Tries (HAMTs), Marc Nieper-Wißkirchen, 2020/10/10
- Re: [New module] Persistent Hash Array Mapped Tries (HAMTs), Marc Nieper-Wißkirchen, 2020/10/10
- Re: [New module] Persistent Hash Array Mapped Tries (HAMTs),
Bruno Haible <=
- Re: [New module] Persistent Hash Array Mapped Tries (HAMTs), Marc Nieper-Wißkirchen, 2020/10/11
- Re: [New module] Persistent Hash Array Mapped Tries (HAMTs), Marc Nieper-Wißkirchen, 2020/10/11
- Re: HAMT iterator, Bruno Haible, 2020/10/11
- Re: HAMT iterator, Marc Nieper-Wißkirchen, 2020/10/11
- Re: HAMT iterator, Bruno Haible, 2020/10/11
- Re: HAMT iterator, Marc Nieper-Wißkirchen, 2020/10/11
- Re: HAMT iterator, Bruno Haible, 2020/10/11
- Re: HAMT iterator, Bruno Haible, 2020/10/11
- Re: HAMT iterator, Marc Nieper-Wißkirchen, 2020/10/11
- Re: HAMT iterators, Bruno Haible, 2020/10/11