help-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: why are there [v e c t o r s] in Lisp?


From: Pascal J. Bourguignon
Subject: Re: why are there [v e c t o r s] in Lisp?
Date: Fri, 16 Oct 2015 05:57:29 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

Random832 <random832@fastmail.com> writes:

> Emanuel Berg <embe8573@student.uu.se> writes:
>> If a list just contains say a bunch of known integers
>> that don't need to be computed, is there anything that
>> stops this list from being stored in "contiguous
>> memory" with the list functions as well as the
>> constant access time "vector" functions available?

Emanuel, notice that in some implementations of lisp in the 1980s, there
was an optimization for lists called cdr-coding which stored the list
when it was copied as a vector of consecutive elements (eliminating thus
the cdr, which basically was reduced to a bit in tag of the slot).  This
still allowed to call cdr (instead of derefencing that slot, we'd just
increment the address in the vector).  But if you used rplacd, then it
had to "split" the cdr-coded vector, creating a new cons cell where to
store the new cdr pointer.

    (copy-list '(1 2 3 4 5))

would create a vector of CARs and point to the first one.  The slot
would have a bit set indicating that the CDR has been eliminated, and
the CAR of the next cell is stored in the next address:

+-----------------------------------------------------------+
| list = (1 2 3 4 5)   +----------------------------f       |
|   |                  |                                    |
|   v                  v                                    |
| +-----+-----+-----+-----+-----+-----+                     |                   
       
| |  * '|  * '|  * '|  * '|  *  | NIL |                     |                   
      
| +-----+-----+-----+-----+-----+-----+                     |                   
      
|    |     |     |     |     |                              |                   
   
|    v     v     v     v     v                              |
|  +---+ +---+ +---+ +---+ +---+                            |
|  | 1 | | 2 | | 3 | | 4 | | 5 |                            |
|  +---+ +---+ +---+ +---+ +---+                            |
+-----------------------------------------------------------+

The last cell would be a CDR (so the preceeding slot doesn't have the
bit in the tag set), pointing to a following cons cell or cdr vector, or
the atom in the dotted list (here, NIL, to indicate the end of the
list).

Now the problem occurs when you use rplacd in the middle of the list:
                                                             
(let* ((list (list 1 2 3 4 5))
       (f    (nthcdr 3 list)))
  (rplacd (cddr list) (list* 4.0 4.2 (cddddr list)))         
  (list list f))
--> ((1 2 3 4.0 4.2 . #1=(5))
     (4 . #1#))

To do that in the presence of cdr-coding, we have to introduce a new
cons cell for 4, and more problematic, we have to scan the whole memory
for references to this cell, and update them (in this case, the variable
f must now point to a new cons cell, instead of in the middle of the
cdr-vector):


+-----------------------------------------------------------+
| list = (1 2 3 4 5)                                        |
|   |                                                       |
|   |                      +-----+-----+-----+              |
|   |                  +-->|  * '|  *  |  *  |              |
|   |                  |   +-----+-----+-----+              |
|   |                  |      |     |     |                 |
|   |                  |      v     v     |                 |
|   |                  |    +---+ +---+   |                 |
|   |                  |    |4.0| |4.2|   |                 |
|   |                  |    +---+ +---+   |                 |
|   |                  |                  |                 |
|   |                  |     +------------+<-----------+    |
|   |                  |     |        f-------+        |    |
|   |                  |     |                |        |    |
|   v                  v     v                v        |    |
| +-----+-----+-----+-----+-----+-----+     +---+---+  |    |                   
       
| |  * '|  * '|  *  |  *  |  * '| NIL |     | * | * |--+    |                   
      
| +-----+-----+-----+-----+-----+-----+     +---+---+       |                   
      
|    |     |     |           |                |             |                   
   
|    v     v     v           v                v             |
|  +---+ +---+ +---+       +---+            +---+           |
|  | 1 | | 2 | | 3 |       | 5 |            | 4 |           |
|  +---+ +---+ +---+       +---+            +---+           |
+-----------------------------------------------------------+

> Well, what happens if you cdr that list? Does it make a
> copy? 

cdr'ing is no problem you just increment your pointer.

> Wasted memory, wasted time, won't track if you change
> the original. Yeah, yeah, in platonic ideal lisp you *can't*
> change lists, but this is the real world, where even scheme
> lets you do it. Does it keep a reference to the original
> list? Then what if you take the cd^{4000}r of a list of
> 5000, then throw away the original?  Wasted memory
> again. Java used to have this issue with strings. I suppose
> you could have the underlying array somehow know what the
> earliest index that a reference actually exists for is, and
> magically throw away everything before it during garbage
> collection, if there's too much wasted space. But that seems
> rather a lot of work.
>
>> By the way: in my previous post strings were mentioned
>> and it sounded like they were sugar for lists of chars
>> - this isn't the case (you can't `car' a string) but
>> it could have been and it isn't harmful (I think) to
>> think of strings that way.
>
> The thing is, if you can car a string people will wonder why
> you can't cdr it. And with mutable objects it's hard to make
> cdr work right. (fsvo "right")

It's rplacd which becomes much more complex.

Also, when you program with such a system, you might be afraid that
rplacd leads to fragmented chains, and will have a tendency to call
copy-list to compact them.  But indeed, copy-list means that less
structure is shared, and therefore would waste time and memory.  This is
what they do in C++ and why they believe lists are costly.


-- 
__Pascal Bourguignon__                 http://www.informatimago.com/
“The factory of the future will have only two employees, a man and a
dog. The man will be there to feed the dog. The dog will be there to
keep the man from touching the equipment.” -- Carl Bass CEO Autodesk


reply via email to

[Prev in Thread] Current Thread [Next in Thread]