emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Proposal: block-based vector allocator


From: Stefan Monnier
Subject: Re: Proposal: block-based vector allocator
Date: Fri, 18 May 2012 13:40:24 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.1.50 (gnu/linux)

> IIRC from the previous discussions, you're suggesting rounding-up allocator
> for the sake of simplicity and better speed;

Actually, the main reason is because it has proved over the years to be
a good choice, for example in terms of fragmentation.

> I'm still voting for exact-fit allocator with splitting/coalescing,
> for the sake of best possible memory utilization.

It's not at all clear that the memory utilization will be better in
practice, for example because of fragmentation or because of the need to
keep the "next" field in vectors.

> It would be helpful to get a broader discussion on this.

At this point, I'm OK with using your approach, so I only want to make
sure the code is as simple and clean as possible.

Here are some questions and comments about your code:

> +#define VECTOR_BLOCK_SIZE (4096)

No need to put parens, right?

> +    roundup_size = 8

This deserves a comment explaining the choice of 8 and describing what
this constant is used for.

> +/* One pointer is malloc overhead, others are BUMP and NEXT fields of struct
> +   vector_block.  Rounding helps to maintain alignment constraints.  */
> +#define VECTOR_BLOCK_BYTES \
> +  (VECTOR_BLOCK_SIZE - roundup (3 * sizeof (void *), roundup_size))

Why do we care about malloc overhead?

> +/* Free lists for all possible block-allocated vectors.  */
> +
> +#define VECTOR_FREE_LISTS ((VECTOR_BLOCK_BYTES / roundup_size) + 1)

Some comment here or earlier should make it clear that we have one free
list per vector size.  Maybe just put this declaration together with the
one for "vector_free_lists", after all they go together.

And this macro should be called something like
VECTOR_MAX_FREE_LIST_INDEX; its current names makes it sound like it
returns the free lists (or a pointer to them).

> +/* When the vector is on a free list, vectorlike_header.SIZE is set to
> +   this special value ORed with vector's memory footprint size.  */
> +
> +#define VECTOR_FREE_LIST_SIZE \
> +  (((EMACS_INT) ~0) & ~(ARRAY_MARK_FLAG | PSEUDOVECTOR_FLAG | \
> +                     (VECTOR_BLOCK_SIZE - 1)))

The name doesn't sound right, it's not a real size, so it should be
VECTOR_FREE_LIST_FLAG.

Also, do we really have to put this info in the `size' field?  E.g. for
cons-cells we put Vdead in the `car' slot.  Of course having it in
`size' is not terrible, but `size' is pretty complicated already.

BTW, if needed we could add a special case so there is only 1 vector of
size 0 and (make-vector 0 ?) always returns that same vector (so it
doesn't have to come from a vector_block).

> +/* Current vector block.  */
> +static struct vector_block *vector_block;

Do we actually want a "current vector block"?
Here's what I mean by that: how often are we going to allocate from this
"current vector block" as opposed to allocating from one of the
free lists?
It would be simpler, when we allocate a new vector block, to just carve
out the vector we need from it and put all the rest directly on the
appropriate free list, so there's never such a thing as a "current
vector block".  That also should lets us get rid of the "bump" field.

> +/* Allocate vector from the vector block.  */
                           ^^^
                            a

> +  /* Next, check free lists contains larger vectors.  */
> +  for (index = (nbytes + sizeof (struct Lisp_Vector)) / roundup_size;

Add a comment explaining that the "+ sizeof (struct Lisp_Vector)"
eliminates the case where the "excess bytes" aren't sufficient to make
them into a valid vector.

> +/* Return amount of Lisp_Objects can be stored in that vector.  */
                                  ^^^
                                 which

> +  while (block)
> +    {
> +      struct Lisp_Vector *free_vectors[VECTORS_PER_BLOCK_MAX];
> +      int index, used;

You should be able to avoid this two-step "put them in free_vectors,
then loop over free_vectors to put them on the free lists": just at the
end of the coalescing loop, if we bump into the end of the vector_block
check if we're also the first vector in the block, and if so free the
whole (otherwise push the vector on the free list).

> +  if (nbytes > VECTOR_BLOCK_BYTES)
> +    {
> +      p = (struct Lisp_Vector *) lisp_malloc (nbytes, MEM_TYPE_VECTORLIKE);
> +      p->header.next.vector = large_vectors;
> +      large_vectors = p;
> +    }
> +  else
> +    /* Rounding is for 32-bit code to preserve alignment.  */
> +    p = allocate_vector_from_block (roundup (nbytes, roundup_size));

Maybe we should only allocate from vector blocks vectors that are
smaller than (say) VECTOR_BLOCK_BYTES/2.  The reason is that allocating
from a vector block incurs an overhead (memory overhead of "roundup (3 *
sizeof (void *), roundup_size)", and CPU overhead of setting up the
block, allocating the vector out of it, scanning the block, ...).
This is shared among all the vectors in the block, so for smallish
vectors it's worth the trouble (since we save other overheads) but if
the vector is large, there won't be many vectors with which to share
that overhead.

> +           /* P is in the block's allocation range. Scan the block
> +              up to P and see whether P points to the start of some
> +              vector which is not on a free list.  */
> +           while (vector <= (struct Lisp_Vector *) p)
> +             {
> +               if ((vector->header.size & VECTOR_FREE_LIST_SIZE)
> +                   == VECTOR_FREE_LIST_SIZE)
> +                 vector = ADVANCE (vector, (vector->header.size & 
> +                                            (VECTOR_BLOCK_SIZE - 1)));
> +               else if (vector == p)
> +                 return 1;
> +               else
> +                 vector = ADVANCE (vector, vector->header.next.nbytes);
> +             }
> +         }

This deserves a FIXME mentioning that this could be a significant
overhead.  I don't see how we can get rid of this overhead without using
a different allocation technique, but it's always good to make this
choice explicit.


        Stefan



reply via email to

[Prev in Thread] Current Thread [Next in Thread]