bug-guix

bug#26201: hydra.gnu.org uses ‘guix publish’ for nars and narinfos


From: Ludovic Courtès
Subject: bug#26201: hydra.gnu.org uses ‘guix publish’ for nars and narinfos
Date: Wed, 03 May 2017 11:25:38 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)

Hello,

Mark H Weaver writes:

> Actually, IIUC, the build slaves are _already_ compressing everything,
> and they always have.  They compress the build outputs for transmission
> back to the master machine.  In the current framework, the master
> machine immediately decompresses them upon receipt, and this compression
> and decompression is considered an internal detail of the network
> transport.
>
> Currently, the master machine stores all build outputs uncompressed in
> /gnu/store, and then later recompresses them for transmission to users
> and other build slaves.  The needless decompression and recompression is
> a tremendous amount of wasted work on our master machine.  That it's all
> stored uncompressed is also a significant waste of disk space, which
> leads to significant additional costs during garbage collection.
>
> Essentially, my proposal is for the build slaves to be modified to
> prepare the compressed NARs in a form suitable for delivery to end users
> (and other build slaves) with minimal processing by our master node.
> The master node would be significantly modified to receive, store, and
> forward NARs explicitly, without ever decompressing them.  As far as I
> can tell, this would mean strictly less work to do and less data to
> store for every machine and in every case.

I agree that the redundant compression/decompression is terrible.  Yet
I’m not sure how to architect a solution where compression is performed
by build machines.  The main issue is that offloading and publication
are two independent mechanisms, as things are.

Maybe for a build-farm use case we could have a
“semi-offloading” mechanism whereby the master spawns a remote build
without retrieving its result, something akin to:

  GUIX_DAEMON_SOCKET=ssh://build-machine.example.org \
  guix build /gnu/store/…-foo.drv

In addition, the build machine would publish its result via ‘guix
publish’, which the master could then simply mirror and cache with
nginx.

There’s the issue of signatures, but perhaps we could have a more
sophisticated PKI and have the master delegate to build machines…

Then there are other issues such as that of synchronizing the TTL of a
narinfo and its corresponding nar, which ‘guix publish --cache’
addresses.

Tricky!

> Ludovic has pointed out that we cannot do this because Hydra must add
> its digital signature, and that this digital signature is stored within
> the compressed NAR.  Therefore, we cannot avoid having the master
> machine decompress and recompress every NAR that is delivered to users.
>
> In my opinion, we should change the way we sign NARs.  Signatures should
> be external to the NARs, not internal.  Not only would this allow us to
> decentralize production of our NARs, but more importantly, it would
> enable a community of independent builders to add their signatures to a
> common pool of NARs.  Having a common pool of NARs enables us to store
> these NARs in a shared distribution network without duplication.  We
> cannot even have a common pool of NARs if they contain
> build-farm-specific data such as signatures.

Currently the signature is in the narinfos, not in nars proper¹.  So we
can already add signatures on an externally provided nar, for instance.

There’s a silly limitation currently, which is that the signature is
computed over all the fields of the narinfo.  That’s silly because it
means that if you change, say, the compression format or the URL of the
nar, then the signature becomes invalid.  We should fix that at some
point.
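
To illustrate the limitation, here is a minimal Python sketch, not the
actual ‘guix publish’ code.  It assumes an HMAC as a stand-in for the
real public-key signature, and the key, the store path, the hash value,
and the choice of "content" fields are all hypothetical.  It compares
signing every narinfo field with signing only the fields that describe
the store item itself.

```python
import hashlib
import hmac

# Hypothetical stand-in key; real narinfos carry a public-key signature,
# not an HMAC.  This only models "a signature over some fields".
KEY = b"demo-key"

# Fields describing the store item rather than how the nar is served.
# This selection is an assumption made for illustration.
CONTENT_FIELDS = ("StorePath", "NarHash", "NarSize", "References")


def sign(narinfo, fields=None):
    """Return a hex MAC over the selected fields (all fields if None)."""
    names = sorted(narinfo) if fields is None else fields
    payload = "".join(f"{name}: {narinfo[name]}\n" for name in names)
    return hmac.new(KEY, payload.encode(), hashlib.sha256).hexdigest()


# Made-up narinfo; the path and hash are placeholders, not real values.
narinfo = {
    "StorePath": "/gnu/store/example-foo",
    "URL": "nar/gzip/example-foo",
    "Compression": "gzip",
    "NarHash": "sha256:0000",
    "NarSize": "1234",
    "References": "",
}

# A mirror serves the same nar under another URL and compression format.
mirrored = dict(narinfo, URL="nar/example-foo", Compression="none")

# Signing all fields: the mirror's narinfo no longer matches.
all_fields_match = sign(narinfo) == sign(mirrored)

# Signing only content fields: the signature survives the change.
content_match = sign(narinfo, CONTENT_FIELDS) == sign(mirrored, CONTENT_FIELDS)
```

With a signature restricted to the content fields, a narinfo could be
re-served with a different URL or compression without being re-signed.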

Ludo’.

¹ For ‘guix publish’.  ‘guix archive --export’ appends a signature to
  the nar set.




