[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: First try (was: Re: [Pika-dev] Re: willing to help)
From: |
Tom Lord |
Subject: |
Re: First try (was: Re: [Pika-dev] Re: willing to help) |
Date: |
Tue, 23 Dec 2003 10:58:02 -0800 (PST) |
> From: Jose A Ortega Ruiz <address@hidden>
>> I'll take a look at that on monday.
Ok, I've reviewed and merged your changes. Overall, they're just fine
-- thanks for your help and I hope you're interested in doing more.
Your code revealed a design bug on my part. There's a change I think
should be made to your vector hashing code. Other than that, there's
just a few style nits and a trivial bug. Review notes enclosed.
-t
* The Hard Thing (my bad)
Reviewing your code made me realize that there's a bug in the way
EQUAL? is framed out.
EQUAL? is inherently recursive (really recursive, not iterative).
It can consume an unbounded amount of space and, called on circular
strutures, can fail by trying to consume an infinite amount of
space.
As currently implemented, EQUAL? takes its space from the C stack
rather than the Scheme heap.
These facts raise two issues for long-lived, interactive,
dynamically extensible systems built using Pika:
a) EQUAL? (and similar primitives) should be space-bound.
That is: it's preferable for some calls to fail by signalling
an error, "this call tried to recurse too deep", than to have
the process crash.
b) Even given a space-bound EQUAL?, there's a second question:
should it _really_ allocate on the C stack or should it use
the Scheme heap?
Regarding (b), one thing I want to avoid is _requiring_ recursive
invocations of EVAL with intervening continuation state on the
C stack. Presuming that Pika's EQUAL? may eventually be extensible
in Scheme, the protocol for implementing EQUAL? in C will have to be
changed.
I've left your EQUAL? routines for pairs and vectors in place for
now, but they'll certainly have to be modified as things progress.
* Keeping HASH Fast
As you wrote it, scm_vector_hash_fn recurses to compute hashes for
every element of the vector. I suspect that it would be better
to do like SCM and only hash _some_ of the elements (so that hashing a
very large vector does not take too long). I have _not_ changed
this yet but I did add a FIXME:
#undef FIXME
/* don't hash every element of a large array.
*
* In fact, the total number of elements hashed should
* vary with `depth'.
*
* SCM uses the trick of hashing the leading "depth/2" elements
* of a large array (where large means "has more than 5 elements).
*
* If you consider something like a "tree of vectors", SCM's trick
* nicely bounds the cost of `hash'.
*/
* Minor Bug
In scm_vector_equal_fn you left out a pair of SCM_(UN)PROTECT_FRAME
calls. (In several other places you used them correctly.)
* Coding Style Issues
You wrote reps/hash-values-imp.c(scm_values_hash_acc) with a local
variable named `new'. I changed that to `new_hash' because
otherwise, eventually, someone who likes C++ will complain.
In reps/pairs-imp.c(pair_equal_fn) you wrote:
result = scm_values_equal (arena, &l.car_a, &l.car_b)
&& scm_values_equal (arena, &l.cdr_a, &l.cdr_b);
Emacs' autoindentation makes that look nicer with an extra pair of
parentheses:
result = (scm_values_equal (arena, &l.car_a, &l.car_b)
&& scm_values_equal (arena, &l.cdr_a, &l.cdr_b));
In a few places, you had unused variables.
In general, please be aggressive with parentheses rather than relying
on C precedence. This is just a quirky preference of mine. I.e.,
|| (a == b)
rather than
|| a == b