[Pika-dev] Re: hashq values and weak references


From: Tom Lord
Subject: [Pika-dev] Re: hashq values and weak references
Date: Mon, 10 May 2004 12:56:40 -0700 (PDT)

    > From: Andreas Rottmann <address@hidden>

    > Yep, while attempting to implement object locks, I discovered a
    > generalization of a problem common to both the "hashq problem" and
    > object locks: associating a data area with a scheme object.  I thus
    > implemented an interface to do so, see [0].

    > I then implemented the solution to the hashq problem by using object
    > data areas. My integration branch should have a solution to the hashq
    > problem already, even if it's not the way you thought it should be
    > solved (you never commented on my mail[0]...)
    > 
    > I'll comment on the divergence of your proposed approach from the
    > current state in another mail.

    > [0] http://article.gmane.org/gmane.lisp.scheme.pika.devel/221


(This is partly just summarizing a conversation we had on IRC.)


There are four problems to solve:

1) The hashq problem

  We don't want to eagerly allocate space for every object to hold a 
  hashq value.   However, with a copying collector, there's no
  information intrinsic to an object from which we could recompute
  a hashq value on demand.

  In a non-copying collector, hashq returns just a hash of the (fixed)
  address of the object.   But with a copying collector, the address
  can change, so if we try that we wind up returning different hashq
  values for a given object at different times.
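
  As a rough illustration (hypothetical names, not Pika's actual API):
  one way to avoid the eager per-object cost is to hand out a hashq
  value from a counter the first time it is asked for and remember it
  in a side table.  A real copying collector would also have to re-key
  the table entries whenever it moves an object; the sketch below skips
  that part.

    #include <stddef.h>
    #include <stdlib.h>

    typedef void * obj_t;            /* stand-in for a heap object reference */

    struct hashq_entry { obj_t obj; unsigned long hash; };

    static struct hashq_entry hashq_table[1024];
    static size_t hashq_count = 0;
    static unsigned long next_hashq = 1;    /* monotonic; values never reused */

    unsigned long
    object_hashq (obj_t obj)
    {
      size_t i;

      /* Linear scan for simplicity; a real side table would itself be hashed. */
      for (i = 0; i < hashq_count; ++i)
        if (hashq_table[i].obj == obj)
          return hashq_table[i].hash;

      if (hashq_count == sizeof (hashq_table) / sizeof (hashq_table[0]))
        abort ();                    /* sketch only: no growth strategy */

      /* First request for this object: pay for the hashq value now,
         not at allocation time.  The collector must re-key this entry
         if it later moves `obj'. */
      hashq_table[hashq_count].obj = obj;
      hashq_table[hashq_count].hash = next_hashq++;
      return hashq_table[hashq_count++].hash;
    }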


2) The weak reference problem

  A weak reference is a single-cell container that does not GC-protect
  the object it refers to.   If the object is GC'ed, the reference is 
  replaced with an indicator of that fact.

  Collectors (all kinds) have two (conceptual) phases: trace and
  reclaim.  The trace phase finds all live objects.   The reclaim
  phase discards all dead objects.

  Copying collection is nifty and swell because, at least before we
  start adding features, the reclaim phase is completely trivial -- an
  O(1) operation.  Something around 0..10 instructions.

  Weak references complicate things, though.   Sometime during
  collection we have to update all the weak references to dead objects
  to record that their referent has died.   This can't happen during
  the trace phase because we don't know, during the trace phase, which
  objects will live.  

  So, after the trace phase, we have to scan all of the weak references
  and mark the ones that no longer point to a live object.
  To avoid messing up our copy collector and its O(1) reclaim phase too
  badly, we have to make it fast to find all the weak references
  (e.g., we could keep them all in a doubly linked list so that after
  the copy phase we'll have two lists of weak references -- one whose
  referents still live, another whose referents have died).
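
  A minimal sketch of that shape, with hypothetical names
  (`gc_live_copy' is assumed to report where the trace phase copied an
  object, or NULL if it died).  For brevity it keeps one singly linked
  chain and breaks dead references in place rather than splitting them
  into two lists; a real implementation would more likely be doubly
  linked so a weak ref can unregister itself in O(1).

    #include <stddef.h>

    typedef void * obj_t;

    #define WEAK_REF_BROKEN ((obj_t) 0)    /* "the referent was collected" */

    struct weak_ref
    {
      obj_t referent;
      struct weak_ref * next;        /* every weak ref is on one global chain */
    };

    static struct weak_ref * all_weak_refs = NULL;

    /* Assumed to be supplied by the collector: the to-space copy of an
       object that survived the trace phase, or NULL if it is dead. */
    extern obj_t gc_live_copy (obj_t obj);

    /* Run between the trace phase and the reclaim phase. */
    void
    gc_scan_weak_refs (void)
    {
      struct weak_ref * w;

      for (w = all_weak_refs; w; w = w->next)
        {
          obj_t copy = gc_live_copy (w->referent);

          if (copy)
            w->referent = copy;             /* survived: point at its new home */
          else
            w->referent = WEAK_REF_BROKEN;  /* died: record that fact */
        }
    }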


3) The finalization problem

  Some types require finalization.   After the object dies, some other
  asynchronous computation has to be scheduled.

  This can be regarded as a generalization of the weak-reference
  problem:  an object with a finalizer is like an object that has a
  "special" weak reference -- one that, when the referent dies,
  schedules an async computation.
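
  Continuing the same sketch (again hypothetical names, and glossing
  over how the dead object's storage is kept around long enough for its
  finalizer to run): after the trace phase, finalizers whose referents
  died are unlinked and moved onto a queue that the system drains
  later, outside the collector.

    #include <stddef.h>

    typedef void * obj_t;
    extern obj_t gc_live_copy (obj_t obj);    /* as in the weak-ref sketch */

    struct finalizer
    {
      obj_t referent;
      void (* run) (obj_t obj);      /* the finalization action */
      struct finalizer * next;
    };

    static struct finalizer * all_finalizers = NULL;
    static struct finalizer * pending_finalizers = NULL;  /* scheduled, not run */

    /* Run between the trace phase and the reclaim phase. */
    void
    gc_scan_finalizers (void)
    {
      struct finalizer ** link = &all_finalizers;

      while (*link)
        {
          struct finalizer * f = *link;
          obj_t copy = gc_live_copy (f->referent);

          if (copy)
            {
              f->referent = copy;            /* still live: keep waiting */
              link = &f->next;
            }
          else
            {
              *link = f->next;               /* dead: unlink it ... */
              f->next = pending_finalizers;  /* ... and schedule the async run */
              pending_finalizers = f;
            }
        }
    }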


4) The object lock problem

  Sometimes objects have to be sheltered from the GC, other threads,
  and async execution for brief periods of time (as when, for example,
  accessing string data directly).  We use object locks for that.

  Object locks are short-lived and only a very small number of locks
  will be held at any one time.
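
  For comparison, a use of an object lock might look something like the
  following (an entirely hypothetical API; the point is only that the
  lock brackets a short, direct access to the object's storage).

    #include <string.h>

    typedef void * obj_t;

    /* Hypothetical lock interface: while the lock is held, the GC, other
       threads, and async execution are assumed to leave `obj' alone. */
    extern void object_lock (obj_t obj);
    extern void object_unlock (obj_t obj);

    /* Hypothetical accessors for a Scheme string's raw bytes. */
    extern char * string_data (obj_t str);
    extern size_t string_length (obj_t str);

    /* Copy a Scheme string's bytes into a C buffer without the data
       moving (or being mutated) out from under us. */
    size_t
    copy_string_bytes (obj_t str, char * buf, size_t buf_len)
    {
      size_t n;

      object_lock (str);
      n = string_length (str);
      if (n > buf_len)
        n = buf_len;
      memcpy (buf, string_data (str), n);
      object_unlock (str);

      return n;
    }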


What we worked out while chatting is:

A) problems 1, 2, and 3 should all use the same solution

   Hashq, weak refs, and finalizers all have the property that

        many objects need them

   They have _similar_ lifetimes.  Hashq values _must_ live 
   as long as the object they are associated with.   Finalizers
   are _expected_ to live that long but sometimes won't (if a
   finalizer is canceled).   Weak refs may or may not live that 
   long but probably usually live a long time.

   The hash-bucket-linked-to-GC-page solution I sketched is a good
   implementation for these with our copying collector.

   We don't _obviously_ need any of these facilities for R5RS
   but, in fact, we probably really do.   In particular, ports
   need finalizers -- so we may as well do the whole shebang.


B) problem 4, object locks, needs a different solution

   The lifetime and number of object locks make it a very different
   problem.

   Andreas' hashtree-based object data areas are likely to be tunable
   into a fast and very scalable solution here, so let's keep that, but
   hide the "object data area" abstraction.


-t




