[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] gnu: Add minced.
From: |
Marius Bakke |
Subject: |
Re: [PATCH] gnu: Add minced. |
Date: |
Tue, 16 Aug 2016 13:34:29 +0100 |
User-agent: |
Notmuch/0.22 (https://notmuchmail.org) Emacs/24.5.2 (x86_64-unknown-linux-gnu) |
Ben Woodcroft <address@hidden> writes:
> Hi Marius,
>
> Excellent to see others interested in packaging microbial bioinformatics
> tools.
You may want to look here before packaging other microbio tools:
https://github.com/MRC-CLIMB/guix-climb
I'm currently working on upstreaming most of these :)
> I tried this in a container and it seems that it calls out to a few
> programs preventing it from working:
>
> [env]# minced -h
> /gnu/store/8asw2i54x53rrwdr8qw4j2rpbkc9fqzz-profile/bin/minced: line 7:
> dirname: command not found
> /gnu/store/8asw2i54x53rrwdr8qw4j2rpbkc9fqzz-profile/bin/minced: line 9:
> dirname: command not found
> /gnu/store/8asw2i54x53rrwdr8qw4j2rpbkc9fqzz-profile/bin/minced: line 11:
> basename: command not found
>
> So I think that (in order of preference), the source files themselves
> should be patched with the absolute paths to these tools, the binary
> should be wrapped, or coreutils should be propagated. For the first two
> options, coreutils should be an input.
Good catch. Since the "minced" executable is just a wrapper script, I
opted to build my own wrapper to avoid the coreutils dependency.
> Now some small comments on the patch itself.
>
>> + (arguments
>> + `(#:test-target "test"
>> + #:phases
>> + (modify-phases %standard-phases
>> + (delete 'configure)
>> + (add-before 'check 'fix-test
>> + (lambda _
>> + ;; Fix test for latest version.
>> + (substitute* "t/Aquifex_aeolicus_VF5.expected"
>> + (("minced:0.1.6") "minced:0.2.0"))))
> It might be more future-proof to use '(string-append "minced:"
> ,version)' instead of hard-coding 0.2.0.
I don't think this will be a problem in future releases. And it can also
cause problems, in case a user sets version to e.g. a commit hash.
> Also, this phase (and the next two) should end in #t since the return
> value of substitute* is undefined.
>
>
>> + (add-before 'install 'qualify-java-path
>> + (lambda* (#:key inputs #:allow-other-keys)
>> + (substitute* "minced"
>> + ;; Set full path to java binary in wrapper script.
>> + (("^java") (string-append (assoc-ref inputs "jre")
>> + "/bin/java")))))
>> + (replace 'install
>> + ;; No install target.
>> + (lambda* (#:key outputs #:allow-other-keys)
>> + (let* ((out (assoc-ref outputs "out"))
>> + (bin (string-append out "/bin")))
>> + (for-each (lambda (file)
>> + (install-file file bin))
>> + (list "minced" "minced.jar"))))))))
>> + (native-inputs
>> + `(("jdk", icedtea "jdk")))
>> + (inputs
>> + `(("jre", icedtea)))
> The commas should go after the space.
Oops, I should stop submitting patches late in the night!
>> + (home-page"https://github.com/ctSkennerton/minced")
>> + (synopsis "Mining CRISPRs in Environmental Datasets")
>> + (description
>> + "MinCED is a program to find Clustered Regularly Interspaced Short
>> +Palindromic Repeats (CRISPRs) in full genomes or environmental datasets such
>> +as metagenomes, in which sequence size can be anywhere from 100 to 800 bp.")
> That description which you took from the README is a little dated at the
> end. How about this?
>
> "MinCED is a program to find Clustered Regularly Interspaced Short
> Palindromic Repeats (CRISPRs) in both full genomes and shorter metagenomic
> sequences."
>
> The rest LGTM. Thanks. Can you send an updated patch please?
Please find updated patch below. Thanks for the feedback!
>From 53602b0908956d122146baafdc1a541596df151d Mon Sep 17 00:00:00 2001
From: Marius Bakke <address@hidden>
Date: Mon, 15 Aug 2016 16:06:37 +0100
Subject: [PATCH] gnu: Add minced.
* gnu/packages/bioinformatics.scm (minced): New variable.
---
gnu/packages/bioinformatics.scm | 55 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 55 insertions(+)
diff --git a/gnu/packages/bioinformatics.scm b/gnu/packages/bioinformatics.scm
index dd69e39..4869d56 100644
--- a/gnu/packages/bioinformatics.scm
+++ b/gnu/packages/bioinformatics.scm
@@ -5,6 +5,7 @@
;;; Copyright © 2015 Andreas Enge <address@hidden>
;;; Copyright © 2016 Roel Janssen <address@hidden>
;;; Copyright © 2016 Efraim Flashner <address@hidden>
+;;; Copyright © 2016 Marius Bakke <address@hidden>
;;;
;;; This file is part of GNU Guix.
;;;
@@ -3118,6 +3119,60 @@ probabilistic distances of genome abundance and
tetranucleotide frequency.")
(license (license:non-copyleft "file://license.txt"
"See license.txt in the distribution."))))
+(define-public minced
+ (package
+ (name "minced")
+ (version "0.2.0")
+ (source (origin
+ (method url-fetch)
+ (uri (string-append
+ "https://github.com/ctSkennerton/minced/archive/"
+ version ".tar.gz"))
+ (file-name (string-append name "-" version ".tar.gz"))
+ (sha256
+ (base32
+ "0wxmlsapxfpxfd3ps9636h7i2xy6la8i42mwh0j2lsky63h63jp1"))))
+ (build-system gnu-build-system)
+ (arguments
+ `(#:test-target "test"
+ #:phases
+ (modify-phases %standard-phases
+ (delete 'configure)
+ (add-before 'check 'fix-test
+ (lambda _
+ ;; Fix test for latest version.
+ (substitute* "t/Aquifex_aeolicus_VF5.expected"
+ (("minced:0.1.6") "minced:0.2.0"))
+ #t))
+ (replace 'install ; No install target.
+ (lambda* (#:key inputs outputs #:allow-other-keys)
+ (let* ((out (assoc-ref outputs "out"))
+ (bin (string-append out "/bin"))
+ (wrapper (string-append bin "minced")))
+ ;; Minced comes with a wrapper script that tries to figure out
where
+ ;; it is located before running the JAR. Since these paths are
known
+ ;; to us, we build our own wrapper to avoid coreutils
dependency.
+ (install-file "minced.jar" bin)
+ (with-output-to-file wrapper
+ (lambda _
+ (display
+ (string-append
+ "#!" (which bash) "\n\n"
+ (which java) " -jar " bin "/minced.jar
\"address@hidden"\n"))))
+ (chmod "minced" #o555)))))))
+ (native-inputs
+ `(("jdk" ,icedtea "jdk")))
+ (inputs
+ `(("jre" ,icedtea)))
+ (home-page "https://github.com/ctSkennerton/minced")
+ (synopsis "Mining CRISPRs in Environmental Datasets")
+ (description
+ "MinCED is a program to find Clustered Regularly Interspaced Short
+Palindromic Repeats (CRISPRs) in DNA sequences. It can be used for unassembled
+metagenomic reads, but is mainly designed for full genomes and assembled
+metagenomic sequence.")
+ (license license:gpl3+)))
+
(define-public miso
(package
(name "miso")
--
2.9.2