guix-devel

Re: issues with offloading


From: Ludovic Courtès
Subject: Re: issues with offloading
Date: Thu, 05 Feb 2015 23:39:02 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4 (gnu/linux)

Ricardo Wurmus <address@hidden> writes:

> * lsh required
>
>   The manual does not appear to mention that for offloading lsh is
>   expected to be installed on the submitting host.  Since I only had
>   OpenSSH installed (on the local workstation and the remote server) I
>   decided to redefine %lsh-command and %lshg-command:
>
>     (define %lsh-command "ssh")
>     (define %lshg-command "ssh")

That won’t work because the command-line options that are passed are
lsh-specific.

>   When the command in these variables does not exist there is no error
>   message at all.  I only discovered the issue because machine-load
>   returned +inf.0 for every machine in the list (defined in
>   /etc/guix/machines.scm) and looped indefinitely to find a suitable
>   machine.
>
>   Here are some recommendations:
>
>   - make %lsh-command and %lshg-command configurable or mention in the
>     documentation that lsh must be available in the PATH.

Yes.

>   - print an error message when "remote-pipe" fails due to not finding
>     the command specified in %lsh-command / %lshg-command

Done.
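
For illustration only, here is a minimal sketch of that kind of check in plain Guile. It is not the code that went into the offload script; the procedure names `command-in-path?` and `check-remote-command` and the wording of the message are assumptions made up for this example. It only tests that a file with that name exists in $PATH, not that it is executable.

```scheme
;; Hypothetical helpers, not taken from Guix.

(define (command-in-path? command)
  "Return #t if a file named COMMAND exists in one of the $PATH directories."
  (let ((directories (string-split (or (getenv "PATH") "") #\:)))
    (->bool (search-path directories command))))

(define (check-remote-command command)
  "Return #t if COMMAND is available; otherwise print a diagnostic on the
current error port and return #f."
  (or (command-in-path? command)
      (begin
        (format (current-error-port)
                "offload: command '~a' not found in PATH~%" command)
        #f)))
```

A caller could run `(check-remote-command "lsh")` before opening the pipe and give up early when it returns #f, instead of treating the machine as infinitely loaded.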

However, there’s a wip-guile-ssh branch, which ideally is the future: it
uses the Guile-SSH library instead of invoking lsh.  This should improve
integration and error handling.

There were issues with old versions of Guile-SSH that have been
addressed since, so we should rebase the branch and see how well it works.

>   - only run once over the machines given in /etc/guix/machines.scm
>     instead of looping indefinitely, or alternatively print the reason
>     for skipping a machine (e.g. by stating that machine-load is +inf.0)

Yes.
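
A rough sketch of the second option, assuming a `machine-load` procedure that returns a number and +inf.0 for a machine that could not be queried. The name `usable-machines` and the message format are hypothetical; this is not how the offload script is currently written.

```scheme
;; Hypothetical sketch: one pass over the machine list, reporting every
;; machine that is skipped because its load is infinite.

(define (usable-machines machines machine-load)
  "Return the elements of MACHINES whose load, as computed by the
MACHINE-LOAD procedure, is finite.  Print the reason for each machine
that is left out."
  (filter (lambda (machine)
            (let ((load (machine-load machine)))
              (or (not (inf? load))
                  (begin
                    (format (current-error-port)
                            "offload: skipping '~a': load is ~a~%"
                            machine load)
                    #f))))
          machines))
```

When the result is the empty list, the caller can fall back to a local build right away rather than going over the list again.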

> * does not work with unprivileged user

[...]

>   This is a problem with register-gc-root, for example.  It creates a
>   directory in %state-directory where an unprivileged user likely has no
>   write permissions.  This mkdir fails silently because register-gc-root
>   does not bother checking the result of
>
>     (false-if-exception (mkdir root-directory))
>
>   When the root-directory (e.g. /var/guix/gcroots/tmp) cannot be created
>   by the remote user running the guile script, the following (symlink
>   ...) fails.

The idea was that /var/guix/gcroots/tmp would be created by the
administrator and made world-writable (similarly,
/var/guix/gcroots/profiles/per-user/$USER is writable by $USER.)

However, this is not documented and does not happen automatically.

I think this could be worked around by doing everything in a single
process on the remote side: we would run a single program there that
would take care of reporting missing store items, importing them,
performing the build, and writing the result.  That way, we would no
longer need the special directory for GC roots.

Needs some more thought.

>   Recommendations:
>
>   - instead of sending a script to be executed by a remote Guile process
>     running as the unprivileged SSH user it may make sense to bake this
>     feature into the daemon.  The daemon has permissions on
>     %state-directory anyway, while a regular user probably shouldn't.

I don’t think this is a good idea.

>   - check the return value of (false-if-exception (mkdir
>     root-directory)), or do not use false-if-exception at all to fail
>     right there when the directory should be created rather than failing
>     when the symlink to a non-existing directory cannot be created.
>     This would arguably result in a clearer error message.

I’ve improved that.
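
To make the idea concrete, here is a sketch of one way to do it, not necessarily what the actual change looks like: tolerate only the "directory already exists" case and let every other error propagate, so that a permission problem is reported at the mkdir and not at the later symlink. The procedure name `make-directory-if-missing` is made up for this example.

```scheme
;; Hypothetical replacement for (false-if-exception (mkdir root-directory)).

(define (make-directory-if-missing directory)
  "Create DIRECTORY.  Do nothing if it already exists; re-throw any other
'system-error', such as EACCES or ENOENT."
  (catch 'system-error
    (lambda ()
      (mkdir directory))
    (lambda args
      ;; ARGS is the whole exception, key included, which is what both
      ;; 'system-error-errno' and 'throw' expect.
      (unless (= EEXIST (system-error-errno args))
        (apply throw args)))))
```

With this, an unprivileged user who cannot write to the state directory gets a "Permission denied" error that names the mkdir call.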

I realize there are several ways all this could be improved, most
notably: a) one remote process, b) Guile-SSH.  Let’s see what we can
do.

Thanks for your feedback!

Ludo’.
