guix-devel

Workflow management with GNU Guix


From: Roel Janssen
Subject: Workflow management with GNU Guix
Date: Thu, 12 May 2016 10:43:09 +0200
User-agent: mu4e 0.9.17; emacs 25.1.50.5

Dear Guix,

With GNU Guix we are able to install programs on our machines with an amazing
level of control over the dependency graph of the programs.  We can now know
what code will run when we invoke a program.  We can now know what the impact
of an upgrade will be.  And we can now safely roll back to previous states.

A common practice in research involving data analysis is running multiple
programs in a chain to transform raw data into more specific results.  This is
often referred to as a "pipeline" or a "workflow".  Because data sets
can be quite large in comparison to the computing power of our laptops, the
data analysis is performed on computing clusters instead of single machines.

Using a pipeline/workflow is somewhat different from package construction,
because we want to run the same sequence of commands on different data
sets (as opposed to running it on the same source code).  Plus, I would like to
integrate it with existing computing clusters that have a job scheduling system
in place.

The reason I think this should be possible with Guix is that it has
everything in place to do software deployment and run-time isolation
(containers).  From there it is a small step to executing programs in an
automated way.

So, I would like to propose a new Guix subcommand and an extension to
the package management language to add workflow management features.

Would this be a feature you are interested in adding to GNU Guix?

I'm currently working on a proof-of-concept implementation that has three
record types/levels of abstraction:
<workflow>:  Describes which <process>es should be run, and concerns itself with
             the order of execution.

<process>:   Describes what packages are needed to run the programs involved,
             and its relationship to other processes.  Processes take input and
             generate output much like the package construction process.

<script>:    Short and simple imperative instructions to perform a task.  They
             are part of a <process>.  Currently, my implementation generates a
             shell script that can be either Guile, Sh, Perl or Python.
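
A minimal sketch of how these three levels could fit together, written in
Python purely for illustration.  It is not the proof-of-concept implementation:
every field name, the `depends_on` mapping, and the example processes and
package names are assumptions made up for this sketch.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass(frozen=True)
class Script:
    """Short imperative instructions for one interpreter."""
    language: str  # e.g. "guile", "sh", "perl" or "python"
    source: str


@dataclass(frozen=True)
class Process:
    """One step: the packages it needs and the script it runs."""
    name: str
    packages: tuple  # hypothetical package names
    script: Script


@dataclass
class Workflow:
    """Which processes to run, and which must finish before which."""
    name: str
    processes: list
    # Maps a process name to the names of the processes it depends on.
    depends_on: dict = field(default_factory=dict)

    def execution_order(self):
        """Return process names in an order that respects depends_on."""
        graph = {p.name: set(self.depends_on.get(p.name, ()))
                 for p in self.processes}
        return list(TopologicalSorter(graph).static_order())


# Hypothetical three-step chain: fetch -> transform -> report.
fetch = Process("fetch", ("coreutils",), Script("sh", "echo fetching"))
transform = Process("transform", ("python",), Script("python", "print('transforming')"))
report = Process("report", ("perl",), Script("perl", "print qq(reporting\\n);"))

workflow = Workflow(
    name="example",
    processes=[report, transform, fetch],  # listed out of order on purpose
    depends_on={"transform": ["fetch"], "report": ["transform"]},
)

print(workflow.execution_order())
```

The point of the sketch is only the separation of concerns: a <script> knows
nothing about packages, a <process> knows nothing about ordering, and the
<workflow> alone decides the order of execution.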

The subcommand I envision is:
  guix workflow

With primarily:
  guix workflow --run=<name-of-workflow-definition>

If you are interested in adding any form of workflow management to GNU Guix, I
can elaborate on my proof-of-concept implementation, so we can work from there.
(or throw everything out of the window and start from scratch ;-))

Thanks again for your time.

Kind regards,
Roel Janssen


