guix-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: hackage importer


From: Ludovic Courtès
Subject: Re: hackage importer
Date: Thu, 26 Mar 2015 14:09:55 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4 (gnu/linux)

Federico Beffa <address@hidden> skribis:

> I've created a test for hackage.  In doing so I've had an issue with the
> locale and would like an advice: When I run "guix import hackage
> package-name" from the shell, or when I invoke a REPL with Geiser
> everything works fine.  When I initially did run the test-suite with
>
> make check TESTS=tests/hackage.scm
>
> the conditionals conversion looked wrong.  I've found that this is
> related to the choice of locale.  To make the test work as desired I
> added '(setlocale LC_ALL "en_US.UTF-8")' to the file declaring the
> tests.

What’s the exact error you were getting?  Character classes in regexps
are locale sensitive, I think, which could be one reason things behave
differently.  There’s also the problem that, when running in the C
locale, ‘regexp-exec’ (and other C-implemented procedures) cannot be
passed any string that is not strictly ASCII.

Could you post the actual backtrace you get (?) when running the program
with LC_ALL=C?

[...]

>> I would expect the conversion of conditional expressions to sexps to be
>> done in the parsing phase above, such that ‘read-cabal’ returns an
>> object with some sort of an AST for those conditionals.
>>
>> Then this part would focus on the evaluation of those conditionals,
>> like:
>>
>>   ;; Evaluate the conditionals in CABAL according to FLAGS.  Return an
>>   ;; evaluated Cabal object.
>>   (eval-cabal cabal flags)
>>
>> WDYT?
>
> I think it's better to keep the parsing phase to a bare minimum and work
> with layers to keep each function as manageable and simple as possible.
>
> The way it works is as follows:
>
> 1. 'strip-cabal' reads the file, strips comment and empty lines and
> return a list.  Each element of the list is a line from the Cabal file.
>
> 2. 'read-cabal' takes the above list and does the parsing required to
> read the indentation based structure of the file. It returns a list
> composed of list pairs. The pair is composed by a list of keys and a
> list of values. For example, the following snippet
>
> name: foo
> version: 1.0
> ...
>
> is returned as
>
> ((("name")    ("foo"))
>  (("version") ("1.0"))
>  ...)

OK.  I would rather have ‘read-cabal’ take an input port (like Scheme’s
‘read’) and return the list above; this would be the least surprising,
more idiomatic approach.  ‘strip-cabal’ (or
‘strip-insignificant-lines’?) would be an internal procedure used only
by ‘read-cabal’.

WDYT?

> This is enough for all the information that we need to build a Guix
> package, but dependencies.
>
> Dependencies are listed in 'Library' and 'Executable cabal' sections of
> the Cabal file as in the following example snippet:
>
> executable cabal
>   build-depends:
>     HTTP       >= 4000.2.5 && < 4000.3
> ...
>
> and may include conditionals as in (continuing from above with the
> proper indentation)
>
>     if os(windows)
>       build-depends: Win32 >= 2 && < 3
>     else
>       build-depends: unix >= 2.0 && < 2.8
> ...
>
> Now, to make sense of the indentation based structure I need to keep a
> state indicating how many indentation levels we currently have and the
> name of the various sub-sections. For this reason I keep a one to one
> correspondence between indentation levels and section titles. That means
> that the above snipped is encoded as follows in my Cabal object:
>
> ((("executable cabal" "build-depends:")                  ("HTTP..."))
>  (("executable cabal" "if os(windows)" "build-depends:") ("Win32 ..."))
>  (("executable cabal" "else"           "build-depends:") ("unix ..."))
>  ...)
>
> If I split 'if' from the predicate 'os(windows)' then the level of
> indentation and the corresponding section header loose synchronization
> (different length).

Would it be possible for ‘read-cabal’ to instead return a
tree-structured sexp like:

  (if (os windows)
      (build-depends (Win32 >= 2 && < 3))
      (build-depends (unix >= 2.0 && < 2.8)))

That would use a variant of ‘conditional->sexp-like’, essentially.

(Of course the achieve that the parser must keep track of indentation
levels and everything, as you already implemented; I’m just commenting
on the interface here.)

Then, if I could imagine:

  (eval-cabal '(("name" "foo")
                ("version" "1.0"
                ("executable cabal" (if (os windows) ...)))
  => #<cabal-package name: "foo" dependencies: '(unix)>

This way the structure of the Cabal file would be preserved, only
converted to sexp form, which is easier to work with.

Does that make sense?

> For this reason I only convert the predicates from Cabal syntax to
> Scheme syntax when I take the list of dependencies and convert the list
> in the form expected in a guix package.
>
> I hope to have clarified the strategy.
>
>>> +(define (guix-name name)
>>
>> Rather ‘hackage-name->package-name’?
>
> OK
>
>>> +;; Split the comma separated list of dependencies coming from the cabal 
>>> file
>>> +;; and return a list with inputs suitable for the GUIX package.  Currently 
>>> the
>>> +;; version information is discarded.
>>
>> s/GUIX/Guix/
>
> OK
>
>>> +(define (split-dependencies ls)
>>> +  (define (split-at-comma d)
>>> +    (map
>>> +     (lambda (m)
>>> +       (let ((name (guix-name (match:substring m 1))))
>>> +         (list name (list 'unquote (string->symbol name)))))
>>> +     (list-matches dependencies-rx d)))
>>
>> I think it could use simply:
>>
>>   (map string-trim-both
>>        (string-tokenize d (char-set-complement (char-set #\,))))
>
> Actually, this wouldn't do.  On top of splitting at commas, the function also
> translates the name of the package from hackage-name to guix-name.

Right, but then it’s best to have two separate procedures and two
compose them:

  (map (compose hackage-name->package-name string-trim-both)
       (string-tokenize d (char-set-complement (char-set #\,))))

> From 231ea64519505f84e252a4b3ab14d3857a9374c2 Mon Sep 17 00:00:00 2001
> From: Federico Beffa <address@hidden>
> Date: Sat, 7 Mar 2015 17:23:14 +0100
> Subject: [PATCH 1/5] import: Add hackage importer.
> 
> * guix/scripts/import.scm (importers): Add hackage.

[...]

> From ad79b3b0bc8b08ec98f49b665b32ce9259516713 Mon Sep 17 00:00:00 2001
> From: Federico Beffa <address@hidden>
> Date: Sat, 7 Mar 2015 17:30:17 +0100
> Subject: [PATCH 2/5] import: Add hackage importer.
> 
> * guix/scripts/import/hackage.scm: New file.

These two should be one patch, along with the modification of
POTFILES.in, because these 3 things correspond to a single logical
change–the addition of a sub-command and associated infrastructure.  A
fourth part is also needed: the addition of a couple of paragraphs in
guix.texi.

> From 447c6b8f1ee6cc1d4edba44cabef1b28b75c7608 Mon Sep 17 00:00:00 2001
> From: Federico Beffa <address@hidden>
> Date: Sun, 8 Mar 2015 07:48:38 +0100
> Subject: [PATCH 3/5] import: Add hackage importer.
> 
> * guix/import/hackage.scm: New file.

Please add tests/hackage.test as part of the same patch.

> +;; (use-modules (ice-9 match))
> +;; (use-modules (ice-9 regex))
> +;; (use-modules (ice-9 rdelim))
> +;; (use-modules (ice-9 receive))
> +;; (use-modules (ice-9 pretty-print))
> +;; (use-modules (srfi srfi-26))
> +;; (use-modules (srfi srfi-11))
> +;; (use-modules (srfi srfi-1))

Debugging leftover?

Nothing to add to what I wrote above.


[...]

> +(define test-cabal-1
> +  "name: foo
> +version: 1.0.0
> +homepage: http://test.org
> +synopsis: synopsis
> +description: description
> +license: BSD3
> +executable cabal
> +  build-depends:
> +    HTTP       >= 4000.2.5 && < 4000.3,
> +    mtl        >= 2.0      && < 3
> +")
> +
> +(define test-cond-1
> +  "(os(darwin) || !(flag(debug))) && flag(cips)")
> +
> +(define read-cabal
> +  (@@ (guix import hackage) read-cabal))
> +
> +(define strip-cabal
> +  (@@ (guix import hackage) strip-cabal))
> +
> +(define eval-tests
> +  (@@ (guix import hackage) eval-tests))
> +
> +(define eval-impl
> +  (@@ (guix import hackage) eval-impl))
> +
> +(define eval-flags
> +  (@@ (guix import hackage) eval-flags))
> +
> +(define conditional->sexp-like
> +  (@@ (guix import hackage) conditional->sexp-like))

I think it would be fine to export ‘read-cabal’ and ‘eval-cabal’ once
these two have the desired interface.

> +(test-begin "hackage")
> +
> +(test-assert "hackage->guix-package"
> +  ;; Replace network resources with sample data.
> +  (mock
> +   ((guix import hackage) hackage-fetch
> +    (lambda (name-version)
> +      (read-cabal
> +       (call-with-input-string test-cabal-1
> +         strip-cabal))))
> +    (match (hackage->guix-package "foo")
> +      (('package
> +         ('name "ghc-foo")
> +         ('version "1.0.0")
> +         ('source
> +          ('origin
> +            ('method 'url-fetch)
> +            ('uri ('string-append
> +                  "http://hackage.haskell.org/package/foo/foo-";
> +                  'version
> +                  ".tar.gz"))
> +            ('sha256
> +             ('base32
> +              (? string? hash)))))
> +         ('build-system 'haskell-build-system)
> +         ('inputs
> +          ('quasiquote
> +           (("ghc-http" ('unquote 'ghc-http))
> +            ("ghc-mtl" ('unquote 'ghc-mtl)))))
> +         ('home-page "http://test.org";)
> +         ('synopsis (? string?))
> +         ('description (? string?))
> +         ('license 'bsd-3))
> +        #t)

Nice.

With the adjusted semantics, there could be lower-level tests:

  (test-equal "read-cabal"
    '(("name" "foo") ...)
    (call-with-input-string test-cabal-1 read-cabal))

and:

  (test-assert "eval-cabal, conditions"
    (let ((p (eval-cabal '(("name" "foo")
                           ("executable cabal" (if (os windows) ...))))))
      (and (cabal-package? p)
           (string=? "foo" (cabal-package-name p))
           (equal? '(unix) (cabal-package-dependencies p)))))

Looks like we’re almost there.  I hope the above makes sense!

Thank you,
Ludo’.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]