RE: [Chicken-users] External representation for continuations
From: Jonah Beckford
Subject: RE: [Chicken-users] External representation for continuations
Date: Wed, 1 Sep 2004 19:07:40 -0400

Thomas Chust wrote:
>
> Felix Winkelmann wrote:
> > [...]
> >
> > Basically everything is possible... ;-)
> >
> > I pondered about this quite a bit. Basic serialization of data is not
> > a big problem, but code is somewhat tricky. A closure (or
> > continuation) contains a code-pointer to some C routine. To properly
> > serialize a closure, the code-pointer would have to be converted into
> > some unique id, which, when deserialized back into a running process
> > must be converted into a code-pointer again (we can't just write the
> > code-pointer directly: in a process that deserializes a closure, the
> > code-addresses may be at completely different locations, or might even
> > not be available, for example if a library is not linked, which was
> > used and referenced in a serialized continuation/closure).
> >
> > One possible solution would be to generate a function-table for each
> > compiled file that maps code-pointers to unique identifiers. On
> > serialization a code-pointer would be looked up in that table, the id
> > written and on deserialization the id is searched in all loaded/linked
> > units/modules and converted back into a code-pointer.
>
> Information like that already exists in every file compiled
> with debugging information, so perhaps one would be able to
> access it without too much trouble. The bigger problem will
> probably be the serialization of code that is dynamically
> loaded or generated at runtime, because it also requires
> transmission of the code itself, not only of an identifier.
>
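
To make the function-table idea quoted above concrete, here is a minimal C sketch. Every name in it (fn_entry, fn_table, code_ptr_to_id, id_to_code_ptr, the "worker:..." ids) is made up for illustration and is not part of CHICKEN's runtime; a real implementation would have the compiler emit one such table per compiled file and search all loaded units.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this is not CHICKEN's runtime API. */
typedef void (*code_ptr)(void);

struct fn_entry {
    const char *id;  /* stable identifier, e.g. "unit:procedure" */
    code_ptr    fn;  /* address that is only valid in this process */
};

/* Stand-ins for two compiled procedures. */
static int steps;
static void worker_step(void) { steps++; }
static void worker_done(void) { steps = 0; }

/* One table per compiled file; assumed to be generated by the compiler. */
static const struct fn_entry fn_table[] = {
    { "worker:step", worker_step },
    { "worker:done", worker_done },
};
static const size_t fn_table_len = sizeof fn_table / sizeof fn_table[0];

/* Serialization side: code-pointer -> id, or NULL if it is not in the table. */
const char *code_ptr_to_id(code_ptr fn)
{
    assert(fn != NULL);
    for (size_t i = 0; i < fn_table_len; i++)
        if (fn_table[i].fn == fn)
            return fn_table[i].id;
    return NULL;
}

/* Deserialization side: id -> code-pointer, or NULL if no linked unit has it. */
code_ptr id_to_code_ptr(const char *id)
{
    assert(id != NULL);
    for (size_t i = 0; i < fn_table_len; i++)
        if (strcmp(fn_table[i].id, id) == 0)
            return fn_table[i].fn;
    return NULL;
}
```

The NULL return from id_to_code_ptr is the "library is not linked" case from the quote: the reader of a serialized closure has to be prepared to reject it.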
I see at least two main uses of serializing continuations. The first [1] is
to save a session so that you have a) a persistent Scheme environment, or b)
an early LISP-ish way to distribute applications for those LISP-ish
environments that do not have compilers. The second [2] is to create a
distributed framework or application.
As far as I can tell, the second [2] should be easy to do right now with
CHICKEN. Massively oversimplifying, just (object-evict kontinuation) and
save the result to disk or network, and make sure that your CHICKEN
executables are static builds and that all machines on your distributed
platform are homogeneous (same CPU architecture and operating system). Correct
me if I am wrong, but you would have no need for a function table or
something similar because the process code image would be the same on all
machines. And it would definitely be a weird distributed design if one was
relying on ports/files to be accessible across different machines. All the
above assumes distribution along the lines of standard MPI design, where
each node is to be used for computational activities, and the resultant data
is accumulated to one master node (hence no need to distribute files or
ports).
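
The homogeneity requirement could be guarded with a small header written in front of the saved bytes. The C sketch below is only an assumption about how one might do that; image_header, its "KONT" magic, and the two functions are hypothetical names, not anything CHICKEN provides. It checks byte order and pointer size only, so it cannot tell whether the sender and receiver are the same static build or the same operating system.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header; not part of CHICKEN. */
struct image_header {
    uint8_t magic[4];    /* "KONT" */
    uint8_t big_endian;  /* 1 if the writer was big-endian */
    uint8_t word_size;   /* sizeof(void *) on the writer */
};

/* Detect the byte order of the machine running this code. */
static uint8_t host_is_big_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 0;
}

/* Writer side: describe the current host. */
void image_header_fill(struct image_header *h)
{
    assert(h != NULL);
    memcpy(h->magic, "KONT", 4);
    h->big_endian = host_is_big_endian();
    h->word_size = (uint8_t)sizeof(void *);
}

/* Reader side: nonzero only if the header was written by a matching host. */
int image_header_matches_host(const struct image_header *h)
{
    assert(h != NULL);
    return memcmp(h->magic, "KONT", 4) == 0
        && h->big_endian == host_is_big_endian()
        && h->word_size == sizeof(void *);
}
```

A receiver would call image_header_matches_host before touching the evicted data and refuse the continuation on a mismatch.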
[1] a) is very cool as it would give you something like a long-running,
persistent CSI interpreter; you definitely have to go down the function
table or external info approach. I guess this would be for those who like
CHICKEN for its interpreter skills. I definitely don't see a practical need
for this, but in the same breath I definitely would want this. It's the
cool factor, but it sounds like a huge amount of work for little gain.
[1] b) is the hardest of all, but thankfully it is not necessary. It is
hard because the saved continuation must run on arbitrary platforms. It is
not necessary because CHICKEN has a decent compiler, which lets you
distribute applications in a more portable way.
> >
> > But serializing continuations is a tricky business: ports and foreign
> > pointers have to be handled specially and one has to make sure that a
> > cont. does not pull in the whole global environment.
>
> That's certainly a point. I think stuff like ports and
> foreign pointers should simply become closed / invalid during
> serialization, unless one wants to implement a sophisticated
> callback system allowing access to those data objects on the
> original host. But if one also allows remote access to data
> objects, garbage collection becomes a nightmare.
>
Yep.
> The problem with the environment was solved in the original
> Kali Scheme implementation (which was implemented on a stack
> based Scheme) by only transmitting the top few stackframes
> with a continuation and loading more frames over the network
> if they were actually needed. This is reasonable for example
> when migrating tasks across a network. I have no idea how to
> port this concept to CHICKEN, though.
>
Communication time is definitely a big issue with distributed apps/tasks. I
wonder though, have you tried (object-size kontinuation) on your prototype?
I have found that the object sizes are reasonably small (on the order of a
few kilobytes) with some internal task-oriented applications that I might
make distributable in the medium-term future (need a communication layer,
among other things). These are small because each worker task is a
computational workhorse and has little need of the global environment.
I am curious if CHICKEN might already be good enough to satisfy your
requirements ...
Jonah