06/14: gpce-2017: Tweak some more.


From: Ludovic Courtès
Subject: 06/14: gpce-2017: Tweak some more.
Date: Fri, 1 Sep 2017 11:57:54 -0400 (EDT)

civodul pushed a commit to branch master
in repository maintenance.

commit ee3a74c6e2a25956342157d555826626c00de3b0
Author: Ludovic Courtès <address@hidden>
Date:   Fri Jul 7 11:48:54 2017 +0200

    gpce-2017: Tweak some more.
---
 doc/gpce-2017/code/system-test.scm |   2 +-
 doc/gpce-2017/gpce.skb             | 154 ++++++++++++++++++-------------------
 doc/gpce-2017/staging.sbib         |  36 +++++++++
 3 files changed, 113 insertions(+), 79 deletions(-)

diff --git a/doc/gpce-2017/code/system-test.scm b/doc/gpce-2017/code/system-test.scm
index 6a086b7..4ea879b 100644
--- a/doc/gpce-2017/code/system-test.scm
+++ b/doc/gpce-2017/code/system-test.scm
@@ -3,7 +3,7 @@
                  (srfi srfi-64) (ice-9 match))
 
     ;; Spawn the VM that runs the declared OS.
-    (define marionette (make-marionette (list #$run)))
+    (define marionette (make-marionette (list #$vm)))
 
     (test-begin "basic")
     (test-assert "uname"
diff --git a/doc/gpce-2017/gpce.skb b/doc/gpce-2017/gpce.skb
index bbbbb14..1cc9ba4 100644
--- a/doc/gpce-2017/gpce.skb
+++ b/doc/gpce-2017/gpce.skb
@@ -292,7 +292,7 @@ homoiconicity—the fact that code has a direct representation as a data
 structure using the same syntax.  “S-expressions” or “sexps”, Lisp’s
 parenthecal expressions, thus look like they lend themselves to code
 staging.
-In this section we show how we this early experience made it clear that
+In this section we show how our early experience made it clear that
 we needed an ,(emph [augmented]) version of sexps.])
 
       (section :title [Staging Build Expressions]
@@ -312,7 +312,7 @@ which relied solely on Lisp quotation ,(ref :bib
 'bawden1999:quasiquotation).  Figure ,(ref :figure "fig-build-sexp")
 shows an example that creates a derivation that, when built, converts
 the input image to JPEG, using the ,(tt [convert]) program from the
-ImageMagick package—this is equivalent to a three-line makefile, but
+ImageMagick package—this is equivalent to a three-line makefile rule, but
 referentially transparent.  In this example, variable ,(tt [store])
 represents the connection to the build daemon.  The ,(tt
 [package-derivation]) function takes the ,(tt [imagemagick]) package
@@ -367,7 +367,7 @@ file name.])))
    (chapter :title [G-Expressions]
       :ident "gexps"
        
-      (p [We devised “G-expressions” as a mechanism to address
+      (p [We devised “G-expressions” to address
 these shortcomings.  This section describes the design and implementation of
 G-expressions, as well as extensions we added to address new use
 cases.])
@@ -383,7 +383,8 @@ cases.])
                  :start ";!begin-imagemagick-gexp"
                  :stop ";!end-imagemagick-gexp")))
 
-        (p [In essence, a gexp bundles an sexp and its inputs
+        (p [G-expressions ,(emph [bind software deployment to staging]).
+A gexp bundles an sexp and its inputs
 and outputs, and it can be serialized with ,(tt [/gnu/store]) file
 names substituted as needed.  We first define two operators:
 
@@ -513,17 +514,7 @@ as illustrated by Figure ,(ref :figure "fig-gexp-hygiene").  The
 implementation is similar to MetaScheme ,(ref :bib
 'kiselyov2008:metascheme) and to that described by Rhiger ,(ref :bib
 'rhiger2012:hygienic), with caveats discussed in ,(numref :text
-[Section] :ident "limitations").  Unlike the examples usually given in
-the literature, identifiers must be generated in a ,(emph
-[deterministic]) fashion: if they were not, we would produce different
-derivations at each run, which in turn would trigger full rebuilds of
-the package graph.  Thus, instead of relying on ,(tt [gensym]) and
-,(tt [generate-temporaries]), we generate identifiers as a function of
-the hash of
-the input expression and of the lexical nesting level of
-the identifier—these are the two components we can see in the generated
-identifiers of Figure ,(ref
-:figure "fig-gexp-hygiene").])
+[Section] :ident "limitations").])
     (item [The second pass ,(emph [collects the escape forms]) (,(tt
 [ungexp]) variants) in the input source.  The list of escape forms is
 needed to construct the list of inputs stored in the gexp
@@ -533,7 +524,19 @@ generation function shown in Figure ,(ref :figure
     (item [The third pass ,(emph [substitutes escape forms]) with
 references to the corresponding formal arguments of the code
 generation function.  This leads to the sexp-construction expression
-shown in Figure ,(ref :figure "fig-gexp-expansion").]))])
+shown in Figure ,(ref :figure "fig-gexp-expansion").]))
+
+Unlike the examples usually given in
+the literature, our renaming pass must generate identifiers in a ,(emph
+[deterministic]) fashion: if they were not, we would produce different
+derivations at each run, which in turn would trigger full rebuilds of
+the package graph.  Thus, instead of relying on ,(tt [gensym]) and
+,(tt [generate-temporaries]), we generate identifiers as a function of
+the hash of
+the input expression and of the lexical nesting level of
+the identifier—these are the two components we can see in the generated
+identifiers of Figure ,(ref
+:figure "fig-gexp-hygiene").])
         
         (figure
            :legend [The gexp compilers for package objects and for
@@ -638,6 +641,7 @@ run anyway.  Thus, we write ,(tt [#+imagemagick]) rather than ,(tt
       
       (p [Guix and GuixSD are used in production by individuals and
 organizations to deploy software on laptops, servers, and clusters.
+Deploying GuixSD involves staging hundreds of gexps.
 Introducing a new core mechanism in such a project can be both fruitful
 and challenging.  This section reports on our experience using gexps in
 Guix.])
@@ -719,13 +723,16 @@ into the Shepherd configuration file.])
 build linux-container)]) module to create Linux ,(emph [containers])
 (isolated execution environments), we were able to reuse this
 container module within the Shepherd ,(ref :bib
-'courtes2017:servicecontainers).  Essentially, the only thing we had
+'courtes2017:servicecontainers).  The only thing we had
 to do to achieve this was to (1) wrap our ,(tt [start]) gexp in ,(tt
 [with-imported-modules]) so that it has access to the container
 functionality, and (2) use our start-process-in-container function
 lieu of the Shepherd’s own start-process function.  This is a good
 example of cross-stage code sharing, where the second stage in this
-case is the operating system’s run-time environment.]))
+case is the operating system’s run-time environment.])
+        (p [Another system service implemented in Scheme is GNU mcron,
+which handles scheduled job execution.  Its configuration also consists
+of Scheme snippets, which GuixSD OS definitions can include as gexps.]))
       
       (section :title [System Tests]
         
@@ -742,11 +749,12 @@ verifies that the system running in the VM matches some of the settings.  The
 guest OS is instrumented with a Scheme interpreter that evaluates
 expressions sent by the host OS—we call it “marionette”.])
         (p [Whole-system tests are derivations whose build programs are
-gexps that resemble that of Figure ,(ref :figure "fig-system-test").
-The build program passes ,(tt [run]), the script to spawn the VM, to the
+gexps like that of Figure ,(ref :figure "fig-system-test").
+The build program passes ,(tt [vm]), the script to spawn the VM, to the
 instrumentation tool.  The test then uses ,(tt [marionette-eval]) to
-call the ,(tt [uname]) function: an ,(emph [additional code stage]) is
-introduced here, this time using ,(tt [quote]).  The test matches the
+call the ,(tt [uname]) function in the guest: an ,(emph [additional code stage]) is
+introduced here, this time using ,(tt [quote]) since gexps are currently
+limited to contexts with a connection to the build daemon.  The test matches the
 return value of ,(tt [uname]) against the expected vector, and makes
 sure the information corresponds to the various bits declared in ,(tt
 [os]), our OS definition.])))
@@ -762,8 +770,7 @@ well-documented approach to the problem ,(ref :bib
 implementation handles a single binding construct (,(tt [lambda])) and
 MetaScheme handles a couple more constructs, but ours has
 to deal with more binding constructs: R6RS defines around ten
-binding constructs (including binding constructs for syntactic
-keywords such as ,(tt [let-syntax])), and Guile adds a couple more.])
+binding constructs, and Guile adds a couple more.])
       (p [Hygiene in multi-stage programs relies on identifying binding
 constructs.  This turns out to be hard to achieve in Scheme because
 macros can define ,(emph [new]) bindings constructs.
@@ -773,7 +780,7 @@ macro expander, of course, does this and more already, so it would be
 tempting to reuse it rather than duplicate part of its work.  However,
 we do not want to macro-expand staged code; instead, macro expansion
 should be performed “the normal way”, by the Guile program that
-compiles or evaluate the staged code.  Again, this ensures
+compiles or evaluates the staged code.  Again, this ensures
 reproducibility across Guix installations since we control precisely
 the Guile variant used in derivations whereas we do not control the
 Guile variant used to evaluate “host-side” code.  How we could hook
@@ -807,18 +814,17 @@ in scope at the macro definition point.  How to achieve something
 similar with gexp, which lack the big picture that a macro expander has,
 remains an open question.])
       (p [,(bold [Cross-stage debugging.]) ,(tt [gexp->derivation])
-emits build programs as sexps in a file in ,(tt [/gnu/store]), using
-Scheme ,(tt [write]), which writes the whole sexp as one line.  When
+emits build programs as sexps in a file in ,(tt [/gnu/store]).  When
 an error occurs during the execution of these programs, Guile prints a
 backtrace that refers to source code locations ,(emph [inside the
 generated code]).  What we would like, instead, is for the backtrace
 to refer to the location ,(emph [of the gexp itself]).  C has ,(tt
 [#line]) directives, which code generators insert in generated code to
-,(emph [map]) generated code to its source.  Assuming a similar
+,(emph [map]) generated code to its source.  If a similar
 feature was available in Scheme, it would be unsuitable: moving the
 source code where a gexp appears would lead to a different derivation,
-in turn triggering a rebuild of everything that depends on it, which
-is undesirable.  Instead we would need a way to pass source code
+in turn triggering a rebuild of everything that depends on it.
+Instead we would need a way to pass source code
 mapping information ,(emph [out-of-band]), in a way that does not affect
 the derivation that is produced.  We are investigating ways to
 achieve that.]))
@@ -826,73 +832,65 @@ achieve that.]))
    (chapter :title [Related Work]
       :ident "related"
       
-      (p [Nix shares the same concerns as Guix: its language must be
+      (p [Like Guix, Nix must be
 able to include references to store items (derivation results) in
-generated code while not keeping track of derivations this generated
-code depends on.  However, Nix is a single-stage language, only used
-on the “host side”, which describes package derivations and their
-composition, while the “build side” is left to other languages such as
-Bash or Perl.  Nix provides a ,(emph [string interpolation]) mechanism
-that allows users to splice arbitrary Nix expressions in strings ,(ref
-:bib 'dolstra2010:nixos); when such an expression refers to a
-derivation, the Nix interpreter records this dependency in the string
+generated code while keeping track of derivations this generated
+code depends on.  However, Nix is a single-stage language:
+the “build side” is left to other languages such as
+Bash or Perl.  Users can splice arbitrary Nix expressions in
+strings thanks to ,(emph [string interpolation]) 
+,(ref :bib 'dolstra2010:nixos); when such an expression refers to a
+derivation, the interpreter records this dependency in the string
 context and substitutes the reference with the output file name of the
 derivation.])
-      (p [Because Nix views this generated code as mere strings, it
-does not provide any guarantee on the generated code (notably syntactic
-correctness).  The string interpolation syntax (,(tt [${])…,(tt [}])
-sequences), often clashes with the target’s language syntax (e.g.,
+      (p [Nix views staged code as mere strings and thus
+does not provide any guarantee on the generated code.
+The string interpolation syntax (,(tt [${])…,(tt [}])
+sequences) often clashes with the target’s language syntax (e.g.,
 Bash uses dollar-brace syntax to reference variables), which can lead
 to subtle errors and constrain developers to resort to non-trivial
 escaping syntax.  The “code-as-string” paradigm also has other side
 effects: comments and whitespace in those strings is preserved, and
 changing those triggers a rebuild of the derivation, which is
 inconvenient.])
-      (p [Code staging in Scheme has been studied in the context of
-,(emph [hygienic macros])—i.e., macros that generate
-well-scoped code, without unintended capture of variables ,(ref :bib '(kohlbecker1986:hygienic dybvig1992:syntax-case))—which later
-made it into the Sixth Report on Scheme (R6RS).  MacroML achieves
-something similar in the context of ML, which is statically-typed
-,(ref :bib 'ganz2001:macroml).  Both tools allow users to define new
-binding constructs; the macro expander recognizes those bindings
-constructs, which allows it to track bindings and preserve hygiene,
-notably by ,(symbol "alpha")-renaming introduced bindings.])
+      (p [Code staging is often studied in the context of optimized code
+generation ,(ref :bib '(rompf2012:lms wang2002:s2 aktemur2013:shonan)),
+or that of hygienic macros ,(ref :bib '(kohlbecker1986:hygienic
+dybvig1992:syntax-case ganz2001:macroml)).  Gexps appear to be the first
+use of staging in the context of software deployment.  Apart from LMS,
+which relies on types ,(ref :bib 'rompf2012:lms), most approaches to
+staging rely on syntactic annotations similar to ,(tt [bracket]) or ,(tt
+[gexp]).  Scheme’s ,(emph [hygienic macros]), now part of the R5RS and
+R6RS standards, as well as MacroML ,(ref :bib 'ganz2001:macroml) support
+user-defined binding constructs; the macro expander recognizes those
+bindings constructs, which allows it to track bindings and preserve
+hygiene, notably by ,(symbol "alpha")-renaming introduced bindings.])
       (p [MetaScheme is a translation of MetaOCaml’s staging
 primitives, ,(tt [bracket]), ,(tt [escape]), and ,(tt [lift]) ,(ref
-:bib 'kiselyov2008:metascheme).  The beauty of MetaScheme is that it
-extends Scheme through a set of macros and does not necessitate any
-modification to the host Scheme implementation.  MetaScheme inspired
-the ,(symbol "alpha")-renaming pass described in ,(numref :text
-[Section] :ident "implementation").  However, it only considers a few
+:bib 'kiselyov2008:metascheme) implemented as a macro that expands to an
+sexp.  It considers only a few
 core binding constructs and does not address hygiene in the presence
-of user-defined binding constructs (macros).  This strategy is
-appropriate in a macro-less language with a fixed set of binding
-constructs like OCaml, but we have seen that languages such as Scheme
-that support user-defined binding constructs create additional
-challenges.
+of user-defined binding constructs introduced by macros.
 Rhiger’s work ,(ref :bib 'rhiger2012:hygienic) follows a similar
-approach but chooses to redefine Scheme’s quasiquotation rather than
-introduce new constructs.])
-      (p [Staged Scheme, or S,(sup [2]), also improved on Lisp
-quasiquotations by providing bracket, escape, and lift forms separate
+approach but redefines Scheme’s quasiquotation instead of
+introducing new constructs.])
+      (p [Staged Scheme, or S,(sup [2]), provides bracket, escape, and lift forms separate
 from ,(tt [quasiquote]) and ,(tt [unquote]) ,(ref :bib 'wang2002:s2).
-Therefore, as with ,(tt [syntax-case]) and gexps, quoted code has a
-disjoint type as opposed to being a list.
+As with ,(tt [syntax-case]) ,(ref :bib 'dybvig1992:syntax-case) and ,(tt
+[gexp]), staged code has a disjoint type, as opposed to being a list.
 S,(sup [2])’s focus is on programs with
-possibly more than two stages, whereas gexp are, in practice, used for
+possibly more than two stages, whereas gexps are, in practice, used for
 two-stage programs.  The article discusses ,(emph [code regeneration])
 at run time; gexps have a similar requirement here: at run time a
-given gexp may be instantiated for different system types, for
+given gexp may be instantiated for different systems, for
 instance ,(tt [x86_64-linux]) and ,(tt [i686-linux]).])
-      (p [While Guix uses ,(emph [homogeneous]) staging, where the
-source and staged language are the same, Hop instead performs ,(emph
+      (p [Hop performs ,(emph
 [heterogenous staging]): the source language is Scheme, but the
-generated code is JavaScript ,(ref :bib 'serrano2010:multitier).  Hop
-has a ,(tt [~]) (tilde) form to introduce staged expressions, and a
-,(tt [$]) (dollar) form to escape to unstaged code.  Hop involves two
-code stages: server-side code and client-side code.  Unlike
-G-expressions, support for tilde forms is built in the Hop compiler,
-and tilde forms are not first-class objects.  Hop comes with useful
+generated code is JavaScript ,(ref :bib 'serrano2010:multitier).  In Hop
+,(tt [~]) introduces staged client-side expressions and
+,(tt [$]) escapes to unstaged server-side code.  Unlike
+gexps, support for ,(tt [~]) forms is built in the Hop compiler,
+and ,(tt [~]) forms are not first-class objects.  Hop comes with useful
 multi-stage debugging facilities not found in Guix, such as the
 ability to display cross-stage stack traces with correct source
 location information.  It also has a way to express modules in scope for
diff --git a/doc/gpce-2017/staging.sbib b/doc/gpce-2017/staging.sbib
index d4e0fd0..33b277d 100644
--- a/doc/gpce-2017/staging.sbib
+++ b/doc/gpce-2017/staging.sbib
@@ -111,6 +111,42 @@ Evaluation and Semantics-Based Program Manipulation (PEPM 1999)")
   (address "New York, NY, USA")
  (keywords "hygiene, lexical scope, program generation, quasiquotation, types"))
 
+(article rompf2012:lms
+  (author "Tiark Rompf and Martin Odersky")
+  (title "Lightweight Modular Staging: A Pragmatic Approach to Runtime Code Generation and Compiled DSLs")
+  (journal "Commun. ACM")
+  (issue_date "June 2012")
+  (volume "55")
+  (number "6")
+  (month "June")
+  (year "2012")
+  (issn "0001-0782")
+  (pages "121--130")
+  (numpages "10")
+  (url "http://doi.acm.org/10.1145/2184319.2184345")
+  (doi "10.1145/2184319.2184345")
+  (acmid "2184345")
+  (publisher "ACM")
+  (address "New York, NY, USA"))
+
+(inproceedings aktemur2013:shonan
+  (author "Baris Aktemur, Yukiyoshi Kameyama, Oleg Kiselyov, and Chung-chieh Shan")
+  (title "Shonan Challenge for Generative Programming: Short Position Paper")
+  (booktitle "Proceedings of the ACM SIGPLAN 2013 Workshop on Partial Evaluation and Program Manipulation")
+  (series "PEPM '13")
+  (year "2013")
+  (isbn "978-1-4503-1842-6")
+  (location "Rome, Italy")
+  (pages "147--154")
+  (numpages "8")
+  (url "http://doi.acm.org/10.1145/2426890.2426917")
+  (doi "10.1145/2426890.2426917")
+  (acmid "2426917")
+  (publisher "ACM")
+  (address "New York, NY, USA")
+  (keywords "code generation, domain-specific languages, generative programming, high-performance computing, staging"))
+
+
 #|
 (defun skr-from-bibtex ()
   "Vaguely convert the BibTeX snippets after POINT to SBibTeX."


