pika-dev
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: First try (was: Re: [Pika-dev] Re: willing to help)


From: Tom Lord
Subject: Re: First try (was: Re: [Pika-dev] Re: willing to help)
Date: Tue, 23 Dec 2003 10:58:02 -0800 (PST)

    > From: Jose A Ortega Ruiz <address@hidden>

    >> I'll take a look at that on monday.  

Ok, I've reviewed and merged your changes.  Overall, they're just fine
-- thanks for your help and I hope you're interested in doing more.

Your code revealed a design bug on my part.  There's a change I think
should be made to your vector hashing code.  Other than that, there's
just a few style nits and a trivial bug.  Review notes enclosed.

-t

* The Hard Thing (my bad)

  Reviewing your code made me realize that there's a bug in the way
  EQUAL? is framed out.

  EQUAL? is inherently recursive (really recursive, not iterative).
  It can consume an unbounded amount of space and, called on circular
  strutures, can fail by trying to consume an infinite amount of
  space.

  As currently implemented, EQUAL? takes its space from the C stack
  rather than the Scheme heap.

  These facts raise two issues for long-lived, interactive,
  dynamically extensible systems built using Pika:

    a) EQUAL? (and similar primitives) should be space-bound.
       That is: it's preferable for some calls to fail by signalling
       an error, "this call tried to recurse too deep", than to have
       the process crash.

    b) Even given a space-bound EQUAL?, there's a second question:
       should it _really_ allocate on the C stack or should it use
       the Scheme heap?

  Regarding (b), one thing I want to avoid is _requiring_ recursive
  invocations of EVAL with intervening continuation state on the 
  C stack.   Presuming that Pika's EQUAL? may eventually be extensible 
  in Scheme, the protocol for implementing EQUAL? in C will have to be 
  changed.

  I've left your EQUAL? routines for pairs and vectors in place for
  now, but they'll certainly have to be modified as things progress.




* Keeping HASH Fast

  As you wrote it, scm_vector_hash_fn recurses to compute hashes for 
  every element of the vector.  I suspect that it would be better
  to do like SCM and only hash _some_ of the elements (so that hashing a 
  very large vector does not take too long).   I have _not_ changed 
  this yet but I did add a FIXME:

  #undef FIXME 
    /* don't hash every element of a large array.
     * 
     * In fact, the total number of elements hashed should
     * vary with `depth'.   
     * 
     * SCM uses the trick of hashing the leading "depth/2" elements
     * of a large array (where large means "has more than 5 elements).
     * 
     * If you consider something like a "tree of vectors", SCM's trick
     * nicely bounds the cost of `hash'.
     */



* Minor Bug

  In scm_vector_equal_fn you left out a pair of SCM_(UN)PROTECT_FRAME
  calls.   (In several other places you used them correctly.)



* Coding Style Issues

  You wrote reps/hash-values-imp.c(scm_values_hash_acc) with a local
  variable named `new'.   I changed that to `new_hash' because
  otherwise, eventually, someone who likes C++ will complain.
  
  
  In reps/pairs-imp.c(pair_equal_fn) you wrote:
  
        result = scm_values_equal (arena, &l.car_a, &l.car_b)
          && scm_values_equal (arena, &l.cdr_a, &l.cdr_b);
  
  Emacs' autoindentation makes that look nicer with an extra pair of
  parentheses:
  
        result = (scm_values_equal (arena, &l.car_a, &l.car_b)
                  && scm_values_equal (arena, &l.cdr_a, &l.cdr_b));
  
  
  In a few places, you had unused variables.
  
  In general, please be aggressive with parentheses rather than relying
  on C precedence.   This is just a quirky preference of mine.   I.e., 
  
        || (a == b)
  
  rather than
  
         || a == b
  





reply via email to

[Prev in Thread] Current Thread [Next in Thread]