pika-dev
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Pika-dev] Re: hashq values and weak references


From: Andreas Rottmann
Subject: [Pika-dev] Re: hashq values and weak references
Date: Mon, 10 May 2004 19:03:06 +0200
User-agent: Gnus/5.1002 (Gnus v5.10.2) Emacs/21.3 (gnu/linux)

Tom Lord <address@hidden> writes:

> I'm assuming that you're working solving the hashq problem: that with
> our copying collector an address-based HASHQ is no good -- the address
> of an object can change over time.
>
> Here is how I think we should solve that problem:
>
I'll explain a bit on the approach I've taken so far, which I think
can be modified to fit your proposal.

>
> Pick some size for "GC-pages".  The GC'ed heap is divided into
> GC-pages, each page divided up into objects of a certain size.  
>
> Reserve one of those objects at the start of each page for a hash
> bucket.   So it's (initially, I'll change this further down):
>
>
[snip]

Currently, I have one hashtree per Arena, where the hashq values get
stored.

> The hash-bucket only needs to exist if HASHQ has been called for any
> of the objects on a page.
>
> During copy collection, have _two_ to-space pages for each size of
> object:  one for objects with hash values, another for objects without
> hash values.   That way, after a copy collection, the number of hash
> buckets will be minimized.
>
> Also note that, after copy collection, the order of objects in the
> hash bucket is sorted by address (because we fill up to-space in
> address order).  HASHQ can preserve that sorting when adding new hash
> values to a bucket.  If we want to make hash buckets as small as
> possible then a lookup in one could be a binary search.
>
> We can do better than a binary search if we don't mind making hash
> buckets larger.  So, let's make it (I'll change this again further
> down):
>
[snip]

> where the number of bucket slots equals the number of objects on the
> page.   Now HASHQ is, once again, an O(1) operation.
>
While direct random access is certainly faster than access via a
hashtree, will the difference be significant? (or rather, do we care
ATM?) I somewhat like elegance that lies in using the same underlying
mechanism (object data areas) for both hashq and object locks...

> That space consumption is not so bad at all.   Many scheme
> implementations these days add a hash-value slot to every object
> (e.g., a cons-pair has storage for CAR, CDR, and HASHQ-VALUE (and
> usually something else, too).   We'll be using less space than those
> implementations and getting very close to the same speed.
>
> Finally, I think we can safely assume that less than 1/2 of all
> objects will ever be hashq'ed.  Plus, on every copy collection, we'll
> be compacting the hash buckets.
>
Hmm, what I don't quite get is the following: The probability of any
GC page having a hashq'ed value will be significantly larger than 0.5,
if we assume 1/2 of all objects are hashq'ed. Therefore, most GC pages
will have a hash bucket, wich in turn has as many elements as objects
in the page; the space overhead will thus hit most objects, not just
1/2. Might be I'm totally off track here, though...

["weak reference" stuff snipped]

Andy
-- 
Andreas Rottmann         | address@hidden      | address@hidden | address@hidden
http://yi.org/rotty      | GnuPG Key: http://yi.org/rotty/gpg.asc
Fingerprint              | DFB4 4EB4 78A4 5EEE 6219  F228 F92F CFC5 01FD 5B62

Python is executable pseudocode, Perl is executable line-noise.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]