guix-devel


Re: [rb-general] Paper preprint: Reproducible genomics analysis pipelines with GNU Guix


From: Ludovic Courtès
Subject: Re: [rb-general] Paper preprint: Reproducible genomics analysis pipelines with GNU Guix
Date: Mon, 23 Apr 2018 10:20:26 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.3 (gnu/linux)

Hello Ricardo & all!

Ricardo Wurmus <address@hidden> writes:

> I’m happy to announce that the group I’m working with has released a
> preprint of a paper on reproducibility with the title:
>
>     Reproducible genomics analysis pipelines with GNU Guix
>     https://www.biorxiv.org/content/early/2018/04/11/298653
>
> We built a collection of bioinformatics pipelines and packaged them with
> GNU Guix, and then looked at the degree to which the software achieves
> bit-reproducibility (spoiler: ~98%), analysed sources of non-determinism
> (e.g. time stamps), discussed experimental reproducibility at runtime
> (e.g. random number generators, kernel+glibc interface, etc) and
> commented on the idea of using “containers” (or application bundles)
> instead.

Very impressive piece of work!  I think it’s important to stress that
reproducible builds are a crucial foundation for reproducible
computational experiments, and this paper does a great job at this.

Also nice that you show you can have these bit-reproducible pipelines
formalized in Guix *and* produce a ready-to-use “container image.”

Hopefully we can soon address the remaining sources of non-determinism
shown in Table 3 (I think you already addressed some of them in the
meantime, didn’t you?).

The bit I’m less comfortable with is Autotools.  I do understand how it
helps capture configure-time dependencies, and how it generally helps
people package and use the software; I think it’s one of the best tools
for the job.  However, it’s also hard to learn and, whether it’s
justified or not, it’s considered “scary.”

Given the intended audience, I wonder how we could provide a simpler
path to achieve the same goal.  It could be a set of Autoconf macros
leading to high-level ‘configure.ac’ files without any line of shell
code, or it could be Guix interpreting a top-level .scm or JSON file,
both of which would ideally be easier to write for bioinformaticians.

What are your thoughts on this?

Anyway, kudos on this, thank you!

Ludo’.


