Re: Storing serialised graph along with packages
From: Ricardo Wurmus
Subject: Re: Storing serialised graph along with packages
Date: Mon, 24 Jul 2017 18:43:23 +0200
User-agent: mu4e 0.9.18; emacs 25.2.1
Hi,
> Ricardo Wurmus <address@hidden> wrote:
>
>> it always bothered me that after building a package we lose all of the
>> beautiful features that Guix as a Guile library gives us. We always
>> need to keep track of the Guix version at the time of building the
>> package and only then can we hope to rebuild the same thing again at
>> some point in the future.
>>
>> What do you think about storing the serialised subset of the package
>> graph in a separate output of the package? Currently, the only place
>> where we store anything meta is the database. Wouldn’t it be great if
>> we could “dump an image” of the state of Guile when it has evaluated the
>> section of the package graph that is needed to build it?
>>
>> Then we could just load the serialised state into Guile at a later point
>> and inspect the package graph as if we had Guix checked out at the given
>> version. I suppose we could also store this kind of information in the
>> database.
>>
>> I’d really like the graph to stay alive even after Guix has moved on to
>> later versions. It also sounds like a really lispy thing to do.
>
> I sympathize with the goal, and I like the parallel with Lisp.
>
> However I’m skeptical about our ability to do something that is robust
> enough. The package → bag → derivation compilation process is “lossy”
> in the sense that at each layer we lose a bit of context from the higher
> layers. Each arrow potentially involves all the code and package
> definitions of Guix, as opposed to just a subset of the package
> definitions. We could certainly serialize package objects to sexps, but
> that would not capture the implementation of build systems,
> ‘package-derivation’, or even lower-level primitives. So this would be
> a rough approximation, at best.
Yes, indeed. My goal is to get a *better* approximation than what the
references database currently gives us.

Out of curiosity I’ve been playing with serialisation on the train ride,
and build systems are indeed a problem. In my tests I have just skipped
them for now, until I figure something out.

I played with cutting out the sources for the package expression (using
“package-location”) and compiling the record to a file. Unfortunately,
this won’t work for packages that are the result of generator procedures
(like “gfortran”).

My current approach is just to go through each field of a package record
to generate an S-expression representing the package object, and then to
compile that. In a clean environment I can load that module along with
copies of the modules under the “guix” directory that implement things
like “url-fetch” or the search-path-specifications record.

To be able to traverse the dependency graph, one must load additional
modules for each of the store items making up the package closure.
(In addition to the embedded references, this would require recording
the store items that were present at build time, but that’s easy.)
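
To make the field-by-field idea a bit more concrete, here is a rough toy
model. It is Python rather than Guile and does not use any Guix code: the
"Package" class, its three fields, and the sample names and versions are
all made up for illustration, and build systems are left out entirely.
Each record is rendered as S-expression text with its inputs referenced
by name and version, and the closure is walked dependencies first.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Package:
    """Hypothetical stand-in for a package record (not the Guix type)."""
    name: str
    version: str
    inputs: tuple = ()  # direct dependencies, as Package objects


def to_sexp(pkg):
    """Render one record as S-expression text, one field at a time."""
    parts = []
    for f in fields(pkg):
        value = getattr(pkg, f.name)
        if f.name == "inputs":
            # Refer to inputs by name and version instead of nesting them,
            # so every package can be serialised on its own.
            refs = " ".join(f'("{p.name}" "{p.version}")' for p in value)
            parts.append(f"(inputs ({refs}))")
        else:
            parts.append(f'({f.name} "{value}")')
    return "(package " + " ".join(parts) + ")"


def closure(pkg):
    """Return pkg and its transitive inputs, each once, dependencies first."""
    seen, order = set(), []

    def visit(p):
        key = (p.name, p.version)
        if key in seen:
            return
        seen.add(key)
        for dep in p.inputs:
            visit(dep)
        order.append(p)

    visit(pkg)
    return order


# Made-up sample graph.
zlib = Package("zlib", "1.2.11")
libpng = Package("libpng", "1.6.29", inputs=(zlib,))
emacs = Package("emacs", "25.2", inputs=(libpng, zlib))

for p in closure(emacs):
    print(to_sexp(p))
```

Referencing inputs by name and version rather than embedding them is
what makes it necessary to load one serialised module per item in the
closure before the graph can be traversed.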
> The safe way to achieve what you want would be to store the whole Guix
> tree (+ GUIX_PACKAGE_PATH), or a pointer to that (a Git commit).
>
> There’s also the problem of bit-for-bit reproducibility: there’s an
> infinite set of source trees that can lead to a given store item. If we
> stored along with, say, Emacs, the Guix source tree/commit that led to
> it, then we’d effectively remove that equivalence (whitespace would
> become significant, for instance[*].)
Hmm, that’s true. And it’s not just a problem of sources. We might
still introduce unimportant differences if we only serialised the
compiled objects and completely excluded the plain text source code,
e.g. when we refactor supporting code that has no impact on the value of
the result but which would lead to a change in the compiled module.
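
The equivalence from the quoted text can be illustrated with a small
analogy. This is Python standing in for Guix source, not a statement
about how Guix or Guile hash anything; the two source strings and both
helper functions are made up. Two texts that differ only in layout and
comments have different hashes as text, but parse to the same structure.

```python
import ast
import hashlib


def text_hash(source):
    """Hash the raw text: whitespace and comments are significant."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def structure_hash(source):
    """Hash the parsed structure: layout and comments are ignored."""
    tree = ast.parse(source)
    # ast.dump omits line and column attributes by default.
    return hashlib.sha256(ast.dump(tree).encode("utf-8")).hexdigest()


a = "version = '25.2'\nname = 'emacs'\n"
b = "version   =   '25.2'   # reformatted\n\nname = 'emacs'\n"

print(text_hash(a) == text_hash(b))            # False
print(structure_hash(a) == structure_hash(b))  # True
```

Storing the raw text alongside a store item would tie its identity to
the first kind of hash. Storing compiled objects has the analogous
problem one level down, since refactored supporting code changes the
compiled module without changing the result.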
Can we separate the two? Instead of installing modules (or the whole
Guix tree) into the output directory of a store item, could we instead
treat them like a table in the database? Building that part would not
be part of the package derivation; it would just be a pre- or
post-processing step, like registering the references in the database.
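
A minimal sketch of what I mean, again as a Python toy and not the
actual Guix database: the table name "package_graph", its columns, the
two helper functions, and the store file name are all assumptions made
up for this example. The serialised graph is keyed by store item and
written in a step that runs after the build, so the contents of the
output itself are unaffected.

```python
import sqlite3

# In-memory database standing in for the real one.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE package_graph (
        store_item TEXT PRIMARY KEY,  -- store file name of the output
        sexp       TEXT NOT NULL      -- serialised package graph
    )""")


def register_graph(db, store_item, sexp):
    """Post-processing step: record the graph outside the derivation."""
    with db:  # commit on success, roll back on error
        db.execute(
            "INSERT OR REPLACE INTO package_graph VALUES (?, ?)",
            (store_item, sexp),
        )


def lookup_graph(db, store_item):
    """Return the serialised graph for a store item, or None if absent."""
    row = db.execute(
        "SELECT sexp FROM package_graph WHERE store_item = ?",
        (store_item,),
    ).fetchone()
    return row[0] if row else None


# Made-up store file name and graph text.
item = "/gnu/store/aaaaaaaa-emacs-25.2"
register_graph(db, item, '(package (name "emacs") (version "25.2"))')
print(lookup_graph(db, item))
```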
--
Ricardo
GPG: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC
https://elephly.net