[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Gnu-arch-users] Round II -- new language, arch, furth, etc.
From: |
Tom Lord |
Subject: |
[Gnu-arch-users] Round II -- new language, arch, furth, etc. |
Date: |
Tue, 20 Jul 2004 13:57:45 -0700 (PDT) |
* Review:
So far we have "xl", a very simple configuration language in which
names can be bound to values. Values can be numbers, strings,
symbols, sequences of other values, or associative mappings of names
to other values. A typical program might look like:
(define my-id "Tom Lord <address@hidden>")
(define my-revlib-path '("/v/big/lord/{revlib}"
"~/{revlib}"
"/v/big/shared/{revlib}"))
We've (to a useful-enough level of detail, for now) seen the grammar
for xl (so far) and the semantic rules that define correct programs
(such as no multiply-defined names and undefined variables being
bound to #undefined). The informal definition we've given could be
made mathematically precise pretty easily. We've sketched a trivial
C API for this language and the run-time representation of values.
So far we have no way to cause side-effects in xl, no way to run
loops, and, indeed, only constant expressions and top-level
definitions.
xl-so-far has an interesting property: all xl programs can be
executed in a finite number of steps (specifically, 0 steps).
I posit that this is an essential characteristic of anything that
we'd wish to call a "configuration language".
xl-so-far has another interesting property that we haven't seen
much use for so far (other than in the API): xl programs can be
represented as xl data. It remains to be shown if/how that will
be useful.
* Call that "xl0"
We can dub the language-so-far "xl0" and it is, in and of itself,
a perfectly good configuration language. Above all, it's boringly
simple and does little more than correct the ambiguities in, for
example, the "RFC822 header language".
Someone could now, pretty much, go off an implement xl0. There
are some open questions that would have to be nailed down, ideally
in the form of a formal spec. For example: what is the extact
syntax of strings? What characters are permitted in strings? What
happens if an association contains two definitions for one name?
I'm not aware of any such "open questions" that are hard to answer.
I've left out detailed answers for them precisely because, in my
judgement, they'll be easy to fill in later and getting bogged down
in the details now would just obscure the presentation.
So, that's xl0.
* More Expressiveness?
I've noticed that, fairly often, people wind up supplementing
static configuration languages (like xl0) with an additional
layer adding some degree of programmability.
For example: the X11 consortium programmers augmented "make"
with the C pre-processor, giving them the ability to define
and use macros in their makefiles.
Whenever that kind of programmability is added as an afterthought
the result is awkward. For example, programs reading the config
file might not know they have to run /bin/cpp. Or the syntax
of macros in the front-end language might interfere with the syntax
of the underlying language.
_If_ it turns out to be a good idea to add a degree of
programmability to xl0, probably we will want to "build that in"
rather than relying on something ad hoc like trying to use /bin/m4.
At the same time, if programmability is added, will we wind up with
a turing complete language? It's clearly not necessary (e.g,.
CPP itself is not turing complete). So I decided on this rule:
If programmability is added to xl, then:
a) it must remain true that all programs execute in a
finite number of steps
b) the implementation of an interpreter for xl must
remain very small and simple
** Is it Needed?
Whether or not more expressiveness than xl0 is needed is essentially
a matter of opinion. If more expressiveness is provided, surely
it will find some interesting uses (see below). If it isn't
provided, people will likely wind up working around that, providing
programmability in other ways.
The reason to build-in programmability rather than work-around the
absense of programmability is to help programmers better cooperate.
If everyone has their own solution for programmability of arch
configurations then one person's programs won't be useful to another
person. If everyone uses the same solution, then people can trade
little xl programs and use xl to define cooperative infrastructures.
Here is one example of how programmability might be used:
In ~/.arch-params, the user wants to define the variables:
my-default-archive
my-default-category
my-default-branch
my-default-version
my-default-fq-category
my-default-fq-branch
my-default-fq-version
and wants these to all be related. For example, the default
fully qualified category should be the default archive plus
the default category.
One way (using imaginary xl code) to accomplish this would
be using expressions:
(define my-default-fq-vesion
"address@hidden/tla--devo--1.3")
(define my-default-archive
(archive-of my-default-fq-vesion))
(define my-default-category
(archive-of my-default-fq-vesion))
[...etc...]
Here is another example of how programmability might be used:
The user wants to define some version aliases:
upstream -- the upstream version
patch-queue -- the patch queue for submissions
however, the definitions of these depends on whether
the user is a maintainer or not. Maintainers should
use one upstream version and patch queue, others should
use a different upstream version and patch queue.
This (hypothetical) program might do the trick:
(define maintainers '("joe" "jane" "fred"))
(define upstream
(if (member my-user-name maintainers)
"address@hidden/proj--devo--1.3"
"address@hidden/proj--public--1.3"))
(define patch-queue
(if (member my-user-name maintainers)
"address@hidden"
"address@hidden"))
If repeated instances of that conditional test are
undesirable, this would work just as well:
(define maintainers '("joe" "jane" "fred"))
(define alias-info
(if (member my-user-name maintainers)
'{
version => "address@hidden/proj--devo--1.3"
queue => "address@hidden"
}
'{
version => "address@hidden/proj--devo--1.3"
queue => "address@hidden"
}))
(define upstream (ref alias-info 'version))
(define patch-queue (ref alias-info 'queue))
These examples illustrate a simple "expression language" with
function calls (such as to "archive-of" in the first example)
and conditionals (such as "if" in the last example). One
definition can apparently reference others (such as
`upstream' being defined in terms of `alias-info' in the last
example).
* A Killer App for xl Expressions: Boilerplate Configurations
The examples above show some potentially interesting uses for
expressions in xl but may not be terribly compelling.
All that they do, really, is save a little typing.
But the examples suggest a similar use that I, at least, find
more significant: boilerplate configurations. This comes
down to "support for modularity": helping programmers to divide
up their work and then combine the pieces.
Suppose that I send you (or build into tla, as a default) a
template like:
(define my-id "<PUT YOUR ID HERE>")
(define my-default-fq-version "<YOUR-ARCHIVE/YOUR VERSION>")
and the rest of the program expressed entirely in terms of those:
(define my-default-archive (archive-of my-default-fq-version))
(define my-default-category (category-of ...))
...
I've captured a set of rules that define the most common usage
pattern for these variables. You can apply that common-case usage
to your environment just by fixing up the definitions of `my-id'
and `my-default-fq-version'. Your also free to change the
definitions and use something besides the common-case rules -- but
if you want the common case, it's been captured in a convenient way.
Better still, the common case rules aren't "hard-wired" into arch
in any deep way. They can be revised just by changing the
template xl program.
The only alternative here would be lots of detailed instructions
for how to edit the file. Something like:
; my-default-fq-version -- your default version
;
; Normally, after updating this, you will want to make
; corresponding changes to these other variables:
;
; my-default-fq-branch
; my-default-fq-category
; my-default-archive
; [...etc...]
(define my-id "<PUT YOUR ID HERE>")
What if you had to read through and follow the instructions in
such comments everytime you want to do something that _should_
be very simple (like instantiate a new master archive and
patch queue manager for a new project)?
Expressions seem to me like an idea worth exploring.
* The Problems with Defining Expressions
Adding expressions to xl raises a substantial number of language
design and language implementation issues.
First, as noted earlier, we have ensure that, even with expressions,
an xl program can always be executed in a finite number of steps.
Second, we have lots of little questions that need answering.
For example, is this is a valid program and, if so, what is
the value of `x':
; Does the order of definitions matter?
; Is `x' 42 or #undefined?
;
(define x y)
(define y 42)
Another example:
; When is syntax checked? Can this
; program be evaluated to produce a value
; for `x' assuming we don't care about the
; value for `y'?
;
(define x 42)
(define y (if mistake))
Another example:
; Is this a valid program?
;
; It doesn't _really_ contain an infinite recursion
;
(define x (if (odd? 3) 7 x))
how about:
(define x (if (odd? 3) x 7))
And there are always banal questions about error handling during
execution:
; What is the value of `y'?
; (What does division-by-0 produce?)
;
(define y (/ 1 0))
; What is the value of `y'?
; (What does out-of-range access produce?)
;
(define a-sequence '(a b c))
(define y (ref a-sequence 100)) ; the 100th element?!?
* The Problems with Implementing Expressions
One of our goals is to keep xl implementations very tiny and very
simple. Another goal is to make sure that xl is well-enough
defined that implementations are interchangable. Finally,
another goal is to preserve the property that all programs execute
in a finite number of steps.
These raise the question: how can we define expressions in xl in
such a way that we are:
1. confident all xl programs terminate
2. confident that a simple implementation is practical
3. confident that we have explained with precision
how a correct implementation behaves
* Solving Those Problems
Lots of very advanced techniques exist for defining languages in
ways that provide clear and provable answers for the "Problems
Defining Expressions" and "Problems Implementing Expressions".
We don't need anything terribly advanced precisely because our
language is so "weak" (i.e., not turing complete). Many of the hard
language-theory problems that show up once we allow
possibly-infinite loops don't come up in xl.
An easy way to define xl while addressing all of these issues will
be to define it _operationally_. In short, I'll write what is
in essense a mathematical description of an xl interpreter -- but
you'll see that the mathematical description translates very
directly into actual running code. (Perhaps, time permitting,
I'll just go ahead and write the description _as_ running code.)
We'll see, when that's done, that it isn't hard to prove things such
as the termination property of xl programs.
We'll also see, when that's done, that yes, in fact, a tiny and
simple interpreter is possible.
* Sneak Preview
Roughly speaking, the definition of xl-with-expressions will be
expressed as a virtual machine, furth, extended with some new
primitive primitive instructions. Loading an xl program (in the
form of data) into one of the registers of this VM, followed by
running the VM for some number of cycles, will, as a side effect,
evaluate xl expressions and store the result back in a VM register.
We'll be able to infer, by induction on the process of evaluation,
what values are produced by expressions.
The operational definition will turn out to be slightly
_overspecified_: it will do more than just define the values
produced by expressions. In addition, the operational definition
will (somewhat by "accident") define the _order_of_evaluation_ of
expressions, letting us reason about what order various
subexpressions are evaluated in under various circumstances.
The order of evaluation is superfluous, at first glance. None of
the expressions in the examples above depend on the order of
evaluation at all. For example, in this:
(define x (+ (* 3 3) (* 4 4)))
it doesn't matter whether we evaluate `(* 3 3)' first or `(* 4 4)'
first.
However, I'll then consider a class of built-in functions for which
order-of-evaluation _does_ matter: functions that perform side
effects on the environment. I'll argue that the order of evaluation
rules for xl expressions make for a very nice way to control the
order of side effects. I'll demonstrate ways in which, without
adding anything more to xl than expressions and side-effecting
functions, we get not only a very expressive configuration language,
but a language with many interesting applications....
-t
- [Gnu-arch-users] Round II -- new language, arch, furth, etc.,
Tom Lord <=
- Re: [Gnu-arch-users] Round II -- new language, arch, furth, etc., Colin Walters, 2004/07/20
- Re: [Gnu-arch-users] Round II -- new language, arch, furth, etc., Tom Lord, 2004/07/20
- [Gnu-arch-users] Re: Round II -- new language, arch, furth, etc., Miles Bader, 2004/07/20
- Re: [Gnu-arch-users] Re: Round II -- new language, arch, furth, etc., Colin Walters, 2004/07/20
- Re: [Gnu-arch-users] Re: Round II -- new language, arch, furth, etc., Tom Lord, 2004/07/20
- [Gnu-arch-users] Re: Round II -- new language, arch, furth, etc., Miles Bader, 2004/07/20
- Re: [Gnu-arch-users] Re: Round II -- new language, arch, furth, etc., Matthieu Moy, 2004/07/21
- Re: [Gnu-arch-users] Re: Round II -- new language, arch, furth, etc., James Blackwell, 2004/07/21
- Re: [Gnu-arch-users] Re: Round II -- new language, arch, furth, etc., Andrew Suffield, 2004/07/21
- Re: [Gnu-arch-users] Re: Round II -- new language, arch, furth, etc., Matthieu Moy, 2004/07/21