[Guile-commits] GNU Guile branch, stable-2.0, updated. v2.0.7-64-ga14b6e


From: Andy Wingo
Subject: [Guile-commits] GNU Guile branch, stable-2.0, updated. v2.0.7-64-ga14b6e1
Date: Mon, 28 Jan 2013 11:03:06 +0000

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU Guile".

http://git.savannah.gnu.org/cgit/guile.git/commit/?id=a14b6e18259bcc860ecc7bd3bf320d3adca9ea47

The branch, stable-2.0 has been updated
       via  a14b6e18259bcc860ecc7bd3bf320d3adca9ea47 (commit)
       via  1488753a66d499cab55edee8ee7e2b2ea5a64717 (commit)
       via  bb0615d0157facb67ee1489a9764866dcd97eb20 (commit)
       via  3e31e75a462fc05f425b887105ccd6607a56ca3b (commit)
       via  2b6fcf5b1f6f3cf8d94cada4f00885b275f1a7c5 (commit)
      from  25645a0ac9158916667588b76cd541ee9dc05132 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit a14b6e18259bcc860ecc7bd3bf320d3adca9ea47
Author: Andy Wingo <address@hidden>
Date:   Mon Jan 28 12:01:16 2013 +0100

    xml->sxml argument can be a port or a string
    
    * module/sxml/simple.scm (xml->sxml): Allow the optional arg to be a
      port or a string.
    * doc/ref/sxml.texi (Reading and Writing XML): Update docs.

commit 1488753a66d499cab55edee8ee7e2b2ea5a64717
Author: Andy Wingo <address@hidden>
Date:   Sun Jan 27 21:56:07 2013 +0100

    make (sxml simple)'s xml->sxml more capable
    
    * module/sxml/simple.scm (xml->sxml): Add #:namespaces,
      #:declare-namespaces?, #:entities, #:default-entity-handler, and
      #:trim-whitespace? arguments.
    
    * doc/ref/sxml.texi (Reading and Writing XML): Document the new
      options.

commit bb0615d0157facb67ee1489a9764866dcd97eb20
Author: Andy Wingo <address@hidden>
Date:   Sun Jan 27 22:20:02 2013 +0100

    ssax: treat *DEFAULT* as a fallback handler in entity alist
    
    * module/sxml/upstream/SSAX.scm (ssax:handle-parsed-entity):
      Interpret *DEFAULT* as being a default handler procedure for parsed
      entities.  Includes test.

commit 3e31e75a462fc05f425b887105ccd6607a56ca3b
Author: Andy Wingo <address@hidden>
Date:   Mon Jan 28 10:48:42 2013 +0100

    begin rewriting SXML docs
    
    * doc/ref/sxml.texi (SXML): Reorder and begin rewriting.  Fix formatting
      throughout, provide a new introduction, and the beginning of proper
      SSAX documentation.
    
    * doc/ref/sxml-match.texi:
    * doc/ref/texinfo.texi:
    * doc/ref/web.texi: Update references to new node names.

commit 2b6fcf5b1f6f3cf8d94cada4f00885b275f1a7c5
Author: Andy Wingo <address@hidden>
Date:   Sun Jan 27 21:53:49 2013 +0100

    current-ssax-error-port is a parameter
    
    * module/sxml/ssax.scm (current-ssax-error-port): Change to be a
      parameter.

-----------------------------------------------------------------------
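
The two xml->sxml commits in the log above can be exercised roughly as
follows. This is a minimal sketch based on the examples in the updated
sxml.texi, assuming a Guile build that already contains these commits;
the variable names are illustrative only.

```scheme
(use-modules (sxml simple))

;; a14b6e1: the optional argument may now be a string ...
(define from-string (xml->sxml "<foo>text</foo>"))

;; ... or a port, as before.
(define from-port
  (call-with-input-string "<foo>text</foo>" xml->sxml))

;; 1488753: #:namespaces maps a namespace URI to a chosen prefix.
(define with-prefix
  (xml->sxml "<foo xmlns=\"http://example.org/ns1\">text</foo>"
             #:namespaces '((ns1 . "http://example.org/ns1"))))

;; 1488753: #:trim-whitespace? drops whitespace-only text around and
;; between elements; whitespace inside text fragments is kept.
(define trimmed
  (xml->sxml "<foo>\n<bar> Alfie the parrot! </bar>\n</foo>"
             #:trim-whitespace? #t))
```

Going by the documentation in this push, from-string and from-port
should both be (*TOP* (foo "text")), with-prefix should be
(*TOP* (ns1:foo "text")), and trimmed should keep only the bar element
with its text intact.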

Summary of changes:
 doc/ref/sxml-match.texi       |   10 +-
 doc/ref/sxml.texi             | 1040 +++++++++++++++++++++++------------------
 doc/ref/texinfo.texi          |   14 +-
 doc/ref/web.texi              |    6 +-
 module/sxml/simple.scm        |  158 ++++++-
 module/sxml/ssax.scm          |   12 +-
 module/sxml/upstream/SSAX.scm |   31 ++-
 7 files changed, 798 insertions(+), 473 deletions(-)
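
The changes to module/sxml/upstream/SSAX.scm and module/sxml/ssax.scm
listed above are the least visible from the summary, so here is a hedged
sketch of how they might be used. It assumes the *DEFAULT* fallback is
reached through the #:default-entity-handler argument of xml->sxml, as
the new documentation suggests; entity-fallback, with-fallback and
captured are hypothetical names.

```scheme
(use-modules (sxml simple) (sxml ssax))

;; bb0615d: a fallback for entities that are not otherwise defined.
;; The handler receives the input port and the entity name (a symbol)
;; and returns the replacement text.
(define (entity-fallback port name)
  (symbol->string name))

(define with-fallback
  (xml->sxml "<foo>&bar;</foo>"
             #:default-entity-handler entity-fallback))

;; 2b6fcf5: current-ssax-error-port is now a parameter, so it can be
;; rebound for a dynamic extent with parameterize.
(define captured
  (call-with-output-string
    (lambda (port)
      (parameterize ((current-ssax-error-port port))
        (display "diagnostics go here" (current-ssax-error-port))))))
```

By analogy with the entity example in the new docs, with-fallback would
be expected to come out as (*TOP* (foo "bar")). captured simply shows
that anything written to the error port inside the parameterize form
lands in the string port.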

diff --git a/doc/ref/sxml-match.texi b/doc/ref/sxml-match.texi
index 7a1a9ac..d2795a5 100644
--- a/doc/ref/sxml-match.texi
+++ b/doc/ref/sxml-match.texi
@@ -1,6 +1,6 @@
 @c -*-texinfo-*-
 @c This is part of the GNU Guile Reference Manual.
-@c Copyright (C) 2010  Free Software Foundation, Inc.
+@c Copyright (C) 2010, 2013  Free Software Foundation, Inc.
 @c See the file guile.texi for copying conditions.
 @c
 @c Based on the documentation at
@@ -16,10 +16,10 @@
 @cindex pattern matching (SXML)
 @cindex SXML pattern matching
 
-The @code{(sxml match)} module provides syntactic forms for pattern matching of
-SXML trees, in a ``by example'' style reminiscent of the pattern matching of the
-@code{syntax-rules} and @code{syntax-case} macro systems.  @xref{sxml simple,
-the @code{(sxml simple)} module}, for more information on SXML.
+The @code{(sxml match)} module provides syntactic forms for pattern
+matching of SXML trees, in a ``by example'' style reminiscent of the
+pattern matching of the @code{syntax-rules} and @code{syntax-case} macro
+systems.  @xref{SXML}, for more information on SXML.
 
 The following address@hidden example is taken from a paper by
 Krishnamurthi et al.  Their paper was the first to show the usefulness of the
diff --git a/doc/ref/sxml.texi b/doc/ref/sxml.texi
index 3ce6062..66584bf 100644
--- a/doc/ref/sxml.texi
+++ b/doc/ref/sxml.texi
@@ -6,257 +6,256 @@
 @node SXML
 @section SXML
 
address@hidden
-* sxml apply-templates::  A more XSLT-like approach to SXML transformations
-* sxml fold::            Fold-based SXML transformation operators
-* sxml simple::          Convenient XML parsing and serializing
-* sxml ssax::            Functional-style XML parsing for Scheme
-* sxml ssax input-parse::  The SSAX tokenizer, optimized for Guile
-* sxml transform::       A higher-order SXML transformation operator, @code{pre-post-order}
-* sxml xpath::           XPath for SXML
address@hidden menu
-
address@hidden sxml apply-templates
address@hidden (sxml apply-templates)
address@hidden Overview
-Pre-order traversal of a tree and creation of a new tree:
-
address@hidden 
-       apply-templates:: tree x <templates> -> <new-tree>
address@hidden smallexample
-
-where
-
address@hidden 
- <templates> ::= (<template> ...)
- <template>  ::= (<node-test> <node-test> ... <node-test> . <handler>)
- <node-test> ::= an argument to node-typeof? above
- <handler>   ::= <tree> -> <new-tree>
address@hidden smallexample
-
-This procedure does a @emph{normal}, pre-order traversal of an SXML
-tree. It walks the tree, checking at each node against the list of
-matching templates.
-
-If the match is found (which must be unique, i.e., unambiguous), the
-corresponding handler is invoked and given the current node as an
-argument. The result from the handler, which must be a @code{<tree>},
-takes place of the current node in the resulting tree. The name of the
-function is not accidental: it resembles rather closely an
address@hidden function of XSLT.
-
address@hidden Usage
address@hidden apply-templates address@hidden apply-templates tree templates
address@hidden defun
-
address@hidden sxml fold
address@hidden (sxml fold)
address@hidden Overview
address@hidden(sxml fold)} defines a number of variants of the @dfn{fold}
-algorithm for use in transforming SXML trees. Additionally it defines
-the layout operator, @code{fold-layout}, which might be described as a
-context-passing variant of SSAX's @code{pre-post-order}.
-
address@hidden Usage
address@hidden fold address@hidden foldt fup fhere tree
-The standard multithreaded tree fold.
-
address@hidden is of type [a] -> a. @var{fhere} is of type object -> a.
-
address@hidden defun
-
address@hidden fold address@hidden foldts fdown fup fhere seed tree
-The single-threaded tree fold originally defined in SSAX. @xref{sxml
-ssax,,(sxml ssax)}, for more information.
+SXML is a native representation of XML in terms of standard Scheme data
+types: lists, symbols, and strings.  For example, the simple XML
+fragment:
 
address@hidden defun
-
address@hidden fold address@hidden foldts* fdown fup fhere seed tree
-A variant of @ref{sxml fold foldts,,foldts} that allows pre-order tree
-rewrites. Originally defined in Andy Wingo's 2007 paper,
address@hidden of fold to XML transformation}.
-
address@hidden defun
-
address@hidden fold address@hidden fold-values proc list . seeds
-A variant of @ref{SRFI-1 Fold and Map,fold} that allows multi-valued
-seeds. Note that the order of the arguments differs from that of
address@hidden
-
address@hidden defun
-
address@hidden fold address@hidden foldts*-values fdown fup fhere tree . seeds
-A variant of @ref{sxml fold foldts*,,foldts*} that allows multi-valued
-seeds. Originally defined in Andy Wingo's 2007 paper, @emph{Applications
-of fold to XML transformation}.
-
address@hidden defun
-
address@hidden fold address@hidden fold-layout tree bindings params layout stylesheet
-A traversal combinator in the spirit of SSAX's @ref{sxml transform
-pre-post-order,,pre-post-order}.
-
address@hidden was originally presented in Andy Wingo's 2007 paper,
address@hidden of fold to XML transformation}.
-
address@hidden 
-bindings := (<binding>...)
-binding  := (<tag> <bandler-pair>...)
-          | (*default* . <post-handler>)
-          | (*text* . <text-handler>)
-tag      := <symbol>
-handler-pair := (pre-layout . <pre-layout-handler>)
-          | (post . <post-handler>)
-          | (bindings . <bindings>)
-          | (pre . <pre-handler>)
-          | (macro . <macro-handler>)
address@hidden
+<parrot type="African Grey"><name>Alfie</name></parrot>
 @end example
 
address@hidden @var
address@hidden pre-layout-handler
-A function of three arguments:
-
address@hidden @var
address@hidden kids
-the kids of the current node, before traversal
-
address@hidden params
-the params of the current node
-
address@hidden layout
-the layout coming into this node
-
address@hidden table
-
address@hidden is expected to use this information to return a
-layout to pass to the kids. The default implementation returns the
-layout given in the arguments.
-
address@hidden post-handler
-A function of five arguments:
-
address@hidden @var
address@hidden tag
-the current tag being processed
-
address@hidden params
-the params of the current node
+may be represented with the following SXML:
 
address@hidden layout
-the layout coming into the current node, before any kids were processed
-
address@hidden klayout
-the layout after processing all of the children
address@hidden
+(parrot (@@ (type "African Grey")) (name "Alfie"))
address@hidden example
 
address@hidden kids
-the already-processed child nodes
+SXML is very general, and is capable of representing all of XML.
+Formally, this means that SXML is a conforming implementation of the
address@hidden Information Set,http://www.w3.org/TR/xml-infoset/} standard.
 
address@hidden table
+Guile includes several facilities for working with XML and SXML:
+parsers, serializers, and transformers.
 
address@hidden should return two values, the layout to pass to the
-next node and the final tree.
address@hidden
+* SXML Overview::              XML, as it was meant to be
+* Reading and Writing XML::    Convenient XML parsing and serializing
+* SSAX::                       Custom functional-style XML parsers
+* Transforming SXML::          Munging SXML with @code{pre-post-order}
+* SXML Tree Fold::             Fold-based SXML transformations
+* SXPath::                     XPath for SXML
+* sxml apply-templates::       A more XSLT-like approach to SXML transformations
+* sxml ssax input-parse::      The SSAX tokenizer, optimized for Guile
address@hidden menu
 
address@hidden text-handler
address@hidden is a function of three arguments:
address@hidden SXML Overview
address@hidden SXML Overview
 
address@hidden @var
address@hidden text
-the string
+(This section needs to be written; volunteers welcome.)
 
address@hidden params
-the current params
 
address@hidden layout
-the current layout
address@hidden Reading and Writing XML
address@hidden Reading and Writing XML
 
address@hidden table
+The @code{(sxml simple)} module presents a basic interface for parsing
+XML from a port into the Scheme SXML format, and for serializing it back
+to text.
 
address@hidden should return two values, the layout to pass to the
-next node and the value to which the string should transform.
address@hidden
+(use-modules (sxml simple))
address@hidden example
 
address@hidden table
address@hidden {Scheme Procedure} xml->sxml [string-or-port] [#:namespaces='()] @
+       [#:declare-namespaces?=#t] [#:trim-whitespace?=#f] @
+       [#:entities='()] [#:default-entity-handler=#f]
+Use SSAX to parse an XML document into SXML. Takes one optional
+argument, @var{string-or-port}, which defaults to the current input
+port.  Returns the resulting SXML document.  If @var{string-or-port} is
+a port, it will be left pointing at the next available character in the
+port.
address@hidden deffn
+
+As is normal in SXML, XML elements parse as tagged lists.  Attributes,
+if any, are placed after the tag, within an @code{@@} element.  The root
+of the resulting SXML will be contained in a special tag, @code{*TOP*}.
+This tag will contain the root element of the XML, but also any prior
+processing instructions.
+
address@hidden
+(xml->sxml "<foo/>")
address@hidden (*TOP* (foo))
+(xml->sxml "<foo>text</foo>")
address@hidden (*TOP* (foo "text"))
+(xml->sxml "<foo kind=\"bar\">text</foo>")
address@hidden (*TOP* (foo (@@ (kind "bar")) "text"))
+(xml->sxml "<?xml version=\"1.0\"?><foo/>")
address@hidden (*TOP* (*PI* xml "version=\"1.0\"") (foo))
address@hidden example
 
address@hidden defun
+All namespaces in the XML document must be declared, via @code{xmlns}
+attributes.  SXML elements built from non-default namespaces will have
+their tags prefixed with their URI.  Users can specify custom prefixes
+for certain namespaces with the @code{#:namespaces} keyword argument to
address@hidden>sxml}.
+
address@hidden
+(xml->sxml "<foo xmlns=\"http://example.org/ns1\">text</foo>")
address@hidden (*TOP* (http://example.org/ns1:foo "text"))
+(xml->sxml "<foo xmlns=\"http://example.org/ns1\">text</foo>"
+           #:namespaces '((ns1 . "http://example.org/ns1")))
address@hidden (*TOP* (ns1:foo "text"))
+(xml->sxml "<foo xmlns:bar=\"http://example.org/ns2\"><bar:baz/></foo>"
+           #:namespaces '((ns2 . "http://example.org/ns2")))
address@hidden (*TOP* (foo (ns2:baz)))
address@hidden example
 
address@hidden sxml simple
address@hidden (sxml simple)
address@hidden Overview
-A simple interface to XML parsing and serialization.
+Passing a true @code{#:declare-namespaces?} argument will cause the
+user-given @code{#:namespaces} to be treated as if they were declared on
+the root element.
+
address@hidden
+(xml->sxml "<foo><ns2:baz/></foo>"
+           #:namespaces '((ns2 . "http://example.org/ns2")))
address@hidden error: undeclared namespace: `ns2'
+(xml->sxml "<foo><ns2:baz/></foo>"
+           #:namespaces '((ns2 . "http://example.org/ns2"))
+           #:declare-namespaces? #t)
address@hidden (*TOP* (foo (ns2:baz)))
address@hidden example
 
address@hidden Usage
address@hidden simple xml->address@hidden xml->sxml [port]
-Use SSAX to parse an XML document into SXML. Takes one optional
-argument, @var{port}, which defaults to the current input port.
+By default, all whitespace in XML is significant.  Passing the
+@code{#:trim-whitespace?} keyword argument to @code{xml->sxml} will trim
+whitespace before, after, and between elements, treating it as
+``insignificant''.  Whitespace in text fragments is left alone.
+
address@hidden
+(xml->sxml "<foo>\n<bar> Alfie the parrot! </bar>\n</foo>")
address@hidden (*TOP* (foo "\n" (bar " Alfie the parrot! ") "\n"))
+(xml->sxml "<foo>\n<bar> Alfie the parrot! </bar>\n</foo>"
+           #:trim-whitespace? #t)
address@hidden (*TOP* (foo (bar " Alfie the parrot! ")))
address@hidden example
 
address@hidden defun
+Parsed entities may be declared with the @code{#:entities} keyword
+argument, or handled with the @code{#:default-entity-handler}.  By
+default, only the standard @code{&lt;}, @code{&gt;}, @code{&amp;},
address@hidden&apos;} and @code{&quot;} entities are defined, as well as the
address@hidden&address@hidden;} and @code{&address@hidden;} (decimal and hexadecimal)
+numeric character entities.
+
address@hidden
+(xml->sxml "<foo>&amp;</foo>")
address@hidden (*TOP* (foo "&"))
+(xml->sxml "<foo>&nbsp;</foo>")
address@hidden error: undefined entity: nbsp
+(xml->sxml "<foo>&#xA0;</foo>")
address@hidden (*TOP* (foo "\xa0"))
+(xml->sxml "<foo>&nbsp;</foo>"
+           #:entities '((nbsp . "\xa0")))
address@hidden (*TOP* (foo "\xa0"))
+(xml->sxml "<foo>&nbsp; &foo;</foo>"
+           #:default-entity-handler
+           (lambda (port name)
+             (case name
+               ((nbsp) "\xa0")
+               (else
+                (format (current-warning-port)
+                        "~a:~a:~a: undefined entity: ~a\n"
+                        (or (port-filename port) "<unknown file>")
+                        (port-line port) (port-column port)
+                        name)
+                (symbol->string name)))))
address@hidden <unknown file>:0:17: undefined entity: foo
address@hidden (*TOP* (foo "\xa0 foo"))
address@hidden example
 
address@hidden simple sxml->address@hidden sxml->xml tree [port]
-Serialize the sxml tree @var{tree} as XML. The output will be written to
address@hidden {Scheme Procedure} sxml->xml tree [port]
+Serialize the SXML tree @var{tree} as XML. The output will be written to
 the current output port, unless the optional argument @var{port} is
 present.
address@hidden deffn
 
address@hidden defun
-
address@hidden simple sxml->address@hidden sxml->string sxml
address@hidden {Scheme Procedure} sxml->string sxml
 Detag an sxml tree @var{sxml} into a string. Does not perform any
 formatting.
-
address@hidden defun
-
address@hidden sxml ssax
address@hidden (sxml ssax)
address@hidden Overview
address@hidden Functional XML parsing framework
address@hidden SAX/DOM and SXML parsers with support for XML Namespaces and validation
-This is a package of low-to-high level lexing and parsing procedures
-that can be combined to yield a SAX, a DOM, a validating parser, or a
-parser intended for a particular document type. The procedures in the
-package can be used separately to tokenize or parse various pieces of
-XML documents. The package supports XML Namespaces, internal and
-external parsed entities, user-controlled handling of whitespace, and
-validation. This module therefore is intended to be a framework, a set
-of "Lego blocks" you can use to build a parser following any discipline
-and performing validation to any degree. As an example of the parser
-construction, this file includes a semi-validating SXML parser.
-
-The present XML framework has a "sequential" feel of SAX yet a
-"functional style" of DOM. Like a SAX parser, the framework scans the
-document only once and permits incremental processing. An application
-that handles document elements in order can run as efficiently as
-possible. @emph{Unlike} a SAX parser, the framework does not require an
-application register stateful callbacks and surrender control to the
-parser. Rather, it is the application that can drive the framework --
-calling its functions to get the current lexical or syntax element.
-These functions do not maintain or mutate any state save the input port.
-Therefore, the framework permits parsing of XML in a pure functional
-style, with the input port being a monad (or a linear, read-once
-parameter).
-
-Besides the @var{port}, there is another monad -- @var{seed}. Most of
address@hidden deffn
+
address@hidden SSAX
address@hidden SSAX: A Functional XML Parsing Toolkit
+
+Guile's XML parser is based on Oleg Kiselyov's powerful XML parsing
+toolkit, SSAX.
+
address@hidden History
+
+Back in the 1990s, when the world was young again and XML was the
+solution to all of its problems, there were basically two kinds of XML
+parsers out there: DOM parsers and SAX parsers.
+
+A DOM parser reads through an entire XML document, building up a tree of
+``DOM objects'' representing the document structure.  They are very easy
+to use, but sometimes you don't actually want all of the information in
+a document; building an object tree is not necessary if all you want to
+do is to count word frequencies in a document, for example.
+
+SAX parsers were created to give the programmer more control over the
+parsing process.  A programmer gives the SAX parser a number of
+``callbacks'': functions that will be called on various features of the
+XML stream as they are encountered.  SAX parsers are more efficient, but
+much harder to use, as users typically have to manually maintain a
+stack of open elements.
+
+Kiselyov realized that the SAX programming model could be made much
+simpler if the callbacks were formulated not as a linear fold across the
+features of the XML stream, but as a @emph{tree fold} over the structure
+implicit in the XML.  In this way, the user has a very convenient,
+functional-style interface that can still generate optimal parsers.
+
+The @code{xml->sxml} interface from the @code{(sxml simple)} module is a
+DOM-style parser built using SSAX, though it returns SXML instead of DOM
+objects.
+
address@hidden Implementation
+
+@code{(sxml ssax)} is a package of low-to-high level lexing and parsing
+procedures that can be combined to yield a SAX, a DOM, a validating
+parser, or a parser intended for a particular document type.  The
+procedures in the package can be used separately to tokenize or parse
+various pieces of XML documents.  The package supports XML Namespaces,
+internal and external parsed entities, user-controlled handling of
+whitespace, and validation.  This module therefore is intended to be a
+framework, a set of ``Lego blocks'' you can use to build a parser
+following any discipline and performing validation to any degree.  As an
+example of the parser construction, this file includes a semi-validating
+SXML parser.
+
+SSAX has the ``sequential'' feel of SAX yet the ``functional style'' of
+DOM.  Like a SAX parser, the framework scans the document only once and
+permits incremental processing.  An application that handles document
+elements in order can run as efficiently as possible.  @emph{Unlike} a
+SAX parser, the framework does not require an application to register
+stateful callbacks and surrender control to the parser.  Rather, it is
+the application that drives the framework, calling its functions to
+get the current lexical or syntax element.  These functions do not
+maintain or mutate any state save the input port.  Therefore, the
+framework permits parsing of XML in a pure functional style, with the
+input port being a monad (or a linear, read-once parameter).
+
+Besides the @var{port}, there is another monad -- @var{seed}.  Most of
 the middle- and high-level parsers are single-threaded through the
address@hidden The functions of this framework do not process or affect the
address@hidden in any way: they simply pass it around as an instance of an
-opaque datatype. User functions, on the other hand, can use the seed to
-maintain user's state, to accumulate parsing results, etc. A user can
-freely mix his own functions with those of the framework. On the other
-hand, the user may wish to instantiate a high-level parser:
address@hidden:make-elem-parser} or @code{SSAX:make-parser}. In the latter
+@var{seed}.  The functions of this framework do not process or affect
+the @var{seed} in any way: they simply pass it around as an instance of
+an opaque datatype.  User functions, on the other hand, can use the seed
+to maintain the user's state, to accumulate parsing results, etc.  A user
+can freely mix their own functions with those of the framework.  On the
+other hand, the user may wish to instantiate a high-level parser:
+@code{SSAX:make-elem-parser} or @code{SSAX:make-parser}.  In the latter
 case, the user must provide functions of specific signatures, which are
 called at predictable moments during the parsing: to handle character
-data, element data, or processing instructions (PI). The functions are
+data, element data, or processing instructions (PI).  The functions are
 always given the @var{seed}, among other parameters, and must return the
 new @var{seed}.
 
 From a functional point of view, XML parsing is a combined
-pre-post-order traversal of a "tree" that is the XML document itself.
+pre-post-order traversal of a ``tree'' that is the XML document itself.
 This down-and-up traversal tells the user about an element when its
-start tag is encountered. The user is notified about the element once
-more, after all element's children have been handled. The process of XML
-parsing therefore is a fold over the raw XML document. Unlike a fold
-over trees defined in [1], the parser is necessarily single-threaded --
-obviously as elements in a text XML document are laid down sequentially.
-The parser therefore is a tree fold that has been transformed to accept
-an accumulating parameter [1,2].
+start tag is encountered.  The user is notified about the element once
+more, after all of the element's children have been handled.  The process of
+XML parsing therefore is a fold over the raw XML document.  Unlike a
+fold over trees defined in [1], the parser is necessarily
+single-threaded -- obviously as elements in a text XML document are laid
+down sequentially.  The parser therefore is a tree fold that has been
+transformed to accept an accumulating parameter [1,2].
 
 Formally, the denotational semantics of the parser can be expressed as
 
@@ -287,20 +286,22 @@ The real parser created by @code{SSAX:make-parser} is slightly more
 complicated, to account for processing instructions, entity references,
 namespaces, processing of document type declaration, etc.
 
-The XML standard document referred to in this module
-@uref{http://www.w3.org/TR/1998/REC-xml-19980210.html}
+The XML standard document referred to in this module is
+@uref{http://www.w3.org/TR/1998/REC-xml-19980210.html}
 
 The present file also defines a procedure that parses the text of an XML
 document or of a separate element into SXML, an S-expression-based model
-of an XML Information Set. SXML is also an Abstract Syntax Tree of an
-XML document. SXML is similar but not identical to DOM; SXML is
+of an XML Information Set.  SXML is also an Abstract Syntax Tree of an
+XML document.  SXML is similar but not identical to DOM; SXML is
 particularly suitable for Scheme-based XML/HTML authoring, SXPath
-queries, and tree transformations. See SXML.html for more details. SXML
-is a term implementation of evaluation of the XML document [3]. The
-other implementation is context-passing.
+queries, and tree transformations.  See SXML.html for more details.
+SXML is a term implementation of evaluation of the XML document [3].
+The other implementation is context-passing.
 
-The present frameworks fully supports the XML Namespaces
-Recommendation:@uref{http://www.w3.org/TR/REC-xml-names/} Other links:
+The present framework fully supports the XML Namespaces Recommendation:
+@uref{http://www.w3.org/TR/REC-xml-names/}.
+
+Other links:
 
 @table @asis
 @item [1]
@@ -319,175 +320,109 @@ Pearl. Proc ICFP'00, pp. 186-197.
 @end table
 
 @subsubsection Usage
address@hidden ssax address@hidden current-ssax-error-port 
address@hidden defun
address@hidden {Scheme Procedure} current-ssax-error-port 
address@hidden deffn
 
address@hidden ssax address@hidden with-ssax-error-to-port port thunk
address@hidden defun
address@hidden {Scheme Procedure} with-ssax-error-to-port port thunk
address@hidden deffn
 
address@hidden ssax address@hidden xml-token? _
address@hidden {Scheme Procedure} xml-token? _
 @verbatim 
  -- Scheme Procedure: pair? x
      Return `#t' if X is a pair; otherwise return `#f'.
 
  
 @end verbatim
address@hidden deffn
 
address@hidden defun
-
address@hidden ssax address@hidden xml-token-kind token
address@hidden defspec
address@hidden {Scheme Syntax} xml-token-kind token
address@hidden deffn
 
address@hidden ssax address@hidden xml-token-head token
address@hidden defspec
address@hidden {Scheme Syntax} xml-token-head token
address@hidden deffn
 
address@hidden ssax address@hidden make-empty-attlist 
address@hidden defun
address@hidden {Scheme Procedure} make-empty-attlist 
address@hidden deffn
 
address@hidden ssax address@hidden attlist-add attlist name-value
address@hidden defun
address@hidden {Scheme Procedure} attlist-add attlist name-value
address@hidden deffn
 
address@hidden ssax address@hidden attlist-null? _
address@hidden {Scheme Procedure} attlist-null? _
 @verbatim 
  -- Scheme Procedure: null? x
      Return `#t' iff X is the empty list, else `#f'.
 
  
 @end verbatim
address@hidden deffn
 
address@hidden defun
address@hidden {Scheme Procedure} attlist-remove-top attlist
address@hidden deffn
 
address@hidden ssax address@hidden attlist-remove-top attlist
address@hidden defun
address@hidden {Scheme Procedure} attlist->alist attlist
address@hidden deffn
 
address@hidden ssax attlist->address@hidden attlist->alist attlist
address@hidden defun
address@hidden {Scheme Procedure} attlist-fold kons knil lis1
address@hidden deffn
 
address@hidden ssax address@hidden attlist-fold kons knil lis1
address@hidden defun
-
address@hidden ssax address@hidden define-parsed-entity! entity str
-Define a new parsed entity. @var{entity} should be a symbol.
address@hidden {Scheme Procedure} define-parsed-entity! entity str
+Define a new parsed entity.  @var{entity} should be a symbol.
 
 Instances of &@var{entity}; in XML text will be replaced with the string
 @var{str}, which will then be parsed.
address@hidden deffn
 
address@hidden defun
-
address@hidden ssax address@hidden reset-parsed-entity-definitions! 
address@hidden {Scheme Procedure} reset-parsed-entity-definitions! 
 Restore the set of parsed entity definitions to its initial state.
address@hidden deffn
 
address@hidden defun
-
address@hidden ssax ssax:uri-string->address@hidden ssax:uri-string->symbol uri-str
address@hidden defun
-
address@hidden ssax ssax:address@hidden ssax:skip-internal-dtd port
address@hidden defun
-
address@hidden ssax ssax:address@hidden ssax:read-pi-body-as-string port
address@hidden defun
-
address@hidden ssax ssax:address@hidden ssax:reverse-collect-str-drop-ws fragments
address@hidden defun
-
address@hidden ssax ssax:address@hidden ssax:read-markup-token port
address@hidden defun
-
address@hidden ssax ssax:address@hidden ssax:read-cdata-body port str-handler seed
address@hidden defun
-
address@hidden ssax ssax:address@hidden ssax:read-char-ref port
address@hidden defun
-
address@hidden ssax ssax:address@hidden ssax:read-attributes port entities
address@hidden defun
address@hidden {Scheme Procedure} ssax:uri-string->symbol uri-str
address@hidden deffn
 
address@hidden ssax ssax:address@hidden ssax:complete-start-tag tag-head port elems entities namespaces
address@hidden defun
address@hidden {Scheme Procedure} ssax:skip-internal-dtd port
address@hidden deffn
 
address@hidden ssax ssax:address@hidden ssax:read-external-id port
address@hidden defun
-
address@hidden ssax ssax:address@hidden ssax:read-char-data port expect-eof? str-handler seed
address@hidden defun
-
address@hidden ssax ssax:xml->address@hidden ssax:xml->sxml port namespace-prefix-assig
address@hidden defun
-
address@hidden ssax ssax:address@hidden ssax:make-parser  . kw-val-pairs
address@hidden defspec
-
address@hidden ssax ssax:address@hidden ssax:make-pi-parser orig-handlers
address@hidden defspec
-
address@hidden ssax ssax:address@hidden ssax:make-elem-parser my-new-level-seed my-finish-element my-char-data-handler my-pi-handlers
address@hidden defspec
-
address@hidden sxml ssax input-parse
address@hidden (sxml ssax input-parse)
address@hidden Overview
-A simple lexer.
-
-The procedures in this module surprisingly often suffice to parse an
-input stream. They either skip, or build and return tokens, according to
-inclusion or delimiting semantics. The list of characters to expect,
-include, or to break at may vary from one invocation of a function to
-another. This allows the functions to easily parse even
-context-sensitive languages.
-
-EOF is generally frowned on, and thrown up upon if encountered.
-Exceptions are mentioned specifically. The list of expected characters
-(characters to skip until, or break-characters) may include an EOF
-"character", which is to be coded as the symbol, @code{*eof*}.
-
-The input stream to parse is specified as a @dfn{port}, which is usually
-the last (and optional) argument. It defaults to the current input port
-if omitted.
address@hidden {Scheme Procedure} ssax:read-pi-body-as-string port
address@hidden deffn
 
-If the parser encounters an error, it will throw an exception to the key
address@hidden The arguments will be of the form @code{(@var{port}
address@hidden @var{specialising-msg}*)}.
address@hidden {Scheme Procedure} ssax:reverse-collect-str-drop-ws fragments
address@hidden deffn
 
-The first argument is a port, which typically points to the offending
-character or its neighborhood. You can then use @code{port-column} and
address@hidden to query the current position. @var{message} is the
-description of the error. Other arguments supply more details about the
-problem.
address@hidden {Scheme Procedure} ssax:read-markup-token port
address@hidden deffn
 
address@hidden Usage
address@hidden ssax input-parse address@hidden peek-next-char [port]
address@hidden defun
address@hidden {Scheme Procedure} ssax:read-cdata-body port str-handler seed
address@hidden deffn
 
address@hidden ssax input-parse address@hidden assert-curr-char expected-chars comment [port]
address@hidden defun
address@hidden {Scheme Procedure} ssax:read-char-ref port
address@hidden deffn
 
address@hidden ssax input-parse address@hidden skip-until arg [port]
address@hidden defun
address@hidden {Scheme Procedure} ssax:read-attributes port entities
address@hidden deffn
 
address@hidden ssax input-parse address@hidden skip-while skip-chars [port]
address@hidden defun
address@hidden {Scheme Procedure} ssax:complete-start-tag tag-head port elems entities namespaces
address@hidden deffn
 
address@hidden ssax input-parse address@hidden next-token prefix-skipped-chars break-chars [comment] [port]
address@hidden defun
address@hidden {Scheme Procedure} ssax:read-external-id port
address@hidden deffn
 
address@hidden ssax input-parse address@hidden next-token-of incl-list/pred [port]
address@hidden defun
address@hidden {Scheme Procedure} ssax:read-char-data port expect-eof? str-handler seed
address@hidden deffn
 
address@hidden ssax input-parse address@hidden read-text-line [port]
address@hidden defun
address@hidden {Scheme Procedure} ssax:xml->sxml port namespace-prefix-assig
address@hidden deffn
 
address@hidden ssax input-parse address@hidden read-string n [port]
address@hidden defun
address@hidden {Scheme Syntax} ssax:make-parser . kw-val-pairs
address@hidden deffn
 
address@hidden ssax input-parse address@hidden find-string-from-port? _ _ . _
-Looks for @var{str} in @var{<input-port>}, optionally within the first
address@hidden characters.
address@hidden {Scheme Syntax} ssax:make-pi-parser orig-handlers
address@hidden deffn
 
address@hidden defun
address@hidden {Scheme Syntax} ssax:make-elem-parser my-new-level-seed my-finish-element my-char-data-handler my-pi-handlers
address@hidden deffn
 
address@hidden sxml transform
address@hidden (sxml transform)
address@hidden Transforming SXML
address@hidden Transforming SXML
 @subsubsection Overview
 @heading SXML expression tree transformers
 @subheading Pre-Post-order traversal of a tree and creation of a new tree
@@ -508,11 +443,11 @@ where
 @end smallexample
 
 The pre-post-order function visits the nodes and nodelists
-pre-post-order (depth-first). For each @code{<Node>} of the form
+pre-post-order (depth-first).  For each @code{<Node>} of the form
 @code{(@var{name} <Node> ...)}, it looks up an association with the
-given @var{name} among its @var{<bindings>}. If failed,
address@hidden tries to locate a @code{*default*} binding. It's
-an error if the latter attempt fails as well. Having found a binding,
+given @var{name} among its @var{<bindings>}.  If failed,
address@hidden tries to locate a @code{*default*} binding.  It's
+an error if the latter attempt fails as well.  Having found a binding,
 the @code{pre-post-order} function first checks to see if the binding is
 of the form
 
@@ -520,14 +455,14 @@ of the form
        (<trigger-symbol> *preorder* . <handler>)
 @end smallexample
 
-If it is, the handler is 'applied' to the current node. Otherwise, the
+If it is, the handler is 'applied' to the current node.  Otherwise, the
 pre-post-order function first calls itself recursively for each child of
 the current node, with @var{<new-bindings>} prepended to the
address@hidden<bindings>} in effect. The result of these calls is passed to the
address@hidden<handler>} (along with the head of the current @var{<Node>}). To be
address@hidden<bindings>} in effect.  The result of these calls is passed to the
address@hidden<handler>} (along with the head of the current @var{<Node>}).  To be
 more precise, the handler is _applied_ to the head of the current node
-and its processed children. The result of the handler, which should also
-be a @code{<tree>}, replaces the current @var{<Node>}. If the current
+and its processed children.  The result of the handler, which should also
+be a @code{<tree>}, replaces the current @var{<Node>}.  If the current
 @var{<Node>} is a text string or other atom, a special binding with a
 symbol @code{*text*} is looked up.
 
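As an illustration of the traversal described in this hunk, here is a
minimal sketch, assuming Guile's (sxml transform) module is available.
The input tree and the three handlers are made up for this example and
are not part of the commit.

```scheme
(use-modules (sxml transform))

;; Hypothetical stylesheet: rewrite every `b' element to `strong',
;; rebuild every other element unchanged, and pass text through.
(define bindings
  `((b . ,(lambda (tag . kids) `(strong ,@kids)))
    (*default* . ,(lambda (tag . kids) (cons tag kids)))
    (*text* . ,(lambda (trigger text) text))))

;; Each handler receives the node's head followed by its already
;; processed children, as the documentation above describes.
(define result
  (pre-post-order '(p "hello " (b "world")) bindings))
;; result is (p "hello " (strong "world"))
```

Because the handlers run post-order here, the `b' handler sees its
children after the *text* handler has already been applied to them.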
@@ -537,60 +472,182 @@ A binding can also be of a form
        (<trigger-symbol> *macro* . <handler>)
 @end smallexample
 
-This is equivalent to @code{*preorder*} described above. However, the
+This is equivalent to @code{*preorder*} described above.  However, the
 result is re-processed again, with the current stylesheet.
 
 @subsubsection Usage
address@hidden transform SRV:address@hidden SRV:send-reply . fragments
address@hidden {Scheme Procedure} SRV:send-reply . fragments
 Output the @var{fragments} to the current output port.
 
 The fragments are a list of strings, characters, numbers, thunks,
address@hidden, @code{#t} -- and other fragments. The function traverses the
address@hidden, @code{#t} -- and other fragments.  The function traverses the
 tree depth-first, writes out strings and characters, executes thunks,
-and ignores @code{#f} and @code{'()}. The function returns @code{#t} if
+and ignores @code{#f} and @code{'()}.  The function returns @code{#t} if
 anything was written at all; otherwise the result is @code{#f}.  If
 @code{#t} occurs among the fragments, it is not written out but causes
 the result of @code{SRV:send-reply} to be @code{#t}.
address@hidden deffn
+
address@hidden {Scheme Procedure} foldts fdown fup fhere seed tree
address@hidden deffn
+
address@hidden {Scheme Procedure} post-order tree bindings
address@hidden deffn
+
address@hidden {Scheme Procedure} pre-post-order tree bindings
address@hidden deffn
+
address@hidden {Scheme Procedure} replace-range beg-pred end-pred forest
address@hidden deffn
+
address@hidden SXML Tree Fold
address@hidden SXML Tree Fold
address@hidden Overview
address@hidden(sxml fold)} defines a number of variants of the @dfn{fold}
+algorithm for use in transforming SXML trees.  Additionally it defines
+the layout operator, @code{fold-layout}, which might be described as a
+context-passing variant of SSAX's @code{pre-post-order}.
+
address@hidden Usage
address@hidden {Scheme Procedure} foldt fup fhere tree
+The standard multithreaded tree fold.
+
address@hidden is of type [a] -> a. @var{fhere} is of type object -> a.
address@hidden deffn
+
address@hidden {Scheme Procedure} foldts fdown fup fhere seed tree
+The single-threaded tree fold originally defined in SSAX.  @xref{SSAX},
+for more information.
address@hidden deffn
+
address@hidden {Scheme Procedure} foldts* fdown fup fhere seed tree
+A variant of @code{foldts} that allows pre-order tree
+rewrites.  Originally defined in Andy Wingo's 2007 paper,
address@hidden of fold to XML transformation}.
address@hidden deffn
+
address@hidden {Scheme Procedure} fold-values proc list . seeds
+A variant of @code{fold} that allows multi-valued seeds.  Note that the
+order of the arguments differs from that of @code{fold}.  @xref{SRFI-1
+Fold and Map}.
address@hidden deffn
+
address@hidden {Scheme Procedure} foldts*-values fdown fup fhere tree . seeds
+A variant of @code{foldts*} that allows multi-valued
+seeds.  Originally defined in Andy Wingo's 2007 paper, @emph{Applications
+of fold to XML transformation}.
address@hidden deffn
+
address@hidden {Scheme Procedure} fold-layout tree bindings params layout stylesheet
+A traversal combinator in the spirit of @code{pre-post-order}.
address@hidden SXML}.
+
address@hidden was originally presented in Andy Wingo's 2007 paper,
address@hidden of fold to XML transformation}.
+
address@hidden 
+bindings := (<binding>...)
+binding  := (<tag> <handler-pair>...)
+          | (*default* . <post-handler>)
+          | (*text* . <text-handler>)
+tag      := <symbol>
+handler-pair := (pre-layout . <pre-layout-handler>)
+          | (post . <post-handler>)
+          | (bindings . <bindings>)
+          | (pre . <pre-handler>)
+          | (macro . <macro-handler>)
address@hidden example
+
address@hidden @var
address@hidden pre-layout-handler
+A function of three arguments:
+
address@hidden @var
address@hidden kids
+the kids of the current node, before traversal
+
address@hidden params
+the params of the current node
+
address@hidden layout
+the layout coming into this node
+
address@hidden table
+
address@hidden is expected to use this information to return a
+layout to pass to the kids.  The default implementation returns the
+layout given in the arguments.
+
address@hidden post-handler
+A function of five arguments:
+
address@hidden @var
address@hidden tag
+the current tag being processed
+
address@hidden params
+the params of the current node
+
address@hidden layout
+the layout coming into the current node, before any kids were processed
+
address@hidden klayout
+the layout after processing all of the children
+
address@hidden kids
+the already-processed child nodes
 
address@hidden defun
address@hidden table
 
address@hidden transform address@hidden foldts fdown fup fhere seed tree
address@hidden defun
address@hidden should return two values, the layout to pass to the
+next node and the final tree.
 
address@hidden transform address@hidden post-order tree bindings
address@hidden defun
address@hidden text-handler
address@hidden is a function of three arguments:
 
address@hidden transform address@hidden pre-post-order tree bindings
address@hidden defun
address@hidden @var
address@hidden text
+the string
 
address@hidden transform address@hidden replace-range beg-pred end-pred forest
address@hidden defun
address@hidden params
+the current params
 
address@hidden sxml xpath
address@hidden (sxml xpath)
address@hidden layout
+the current layout
+
address@hidden table
+
address@hidden should return two values, the layout to pass to the
+next node and the value to which the string should transform.
+
address@hidden table
address@hidden deffn
+
address@hidden SXPath
address@hidden SXPath
 @subsubsection Overview
 @heading SXPath: SXML Query Language
 SXPath is a query language for SXML, an instance of XML Information set
-(Infoset) in the form of s-expressions. See @code{(sxml ssax)} for the
-definition of SXML and more details. SXPath is also a translation into
+(Infoset) in the form of s-expressions.  See @code{(sxml ssax)} for the
+definition of SXML and more details.  SXPath is also a translation into
 Scheme of an XML Path Language, @uref{http://www.w3.org/TR/xpath,XPath}.
 XPath and SXPath describe means of selecting a set of Infoset's items or
 their properties.
 
 To facilitate queries, XPath maps the XML Infoset into an explicit tree,
 and introduces important notions of a location path and a current,
-context node. A location path denotes a selection of a set of nodes
-relative to a context node. Any XPath tree has a distinguished, root
+context node.  A location path denotes a selection of a set of nodes
+relative to a context node.  Any XPath tree has a distinguished, root
 node -- which serves as the context node for absolute location paths.
 Location path is recursively defined as a location step joined with a
-location path. A location step is a simple query of the database
-relative to a context node. A step may include expressions that further
-filter the selected set. Each node in the resulting set is used as a
-context node for the adjoining location path. The result of the step is
+location path.  A location step is a simple query of the database
+relative to a context node.  A step may include expressions that further
+filter the selected set.  Each node in the resulting set is used as a
+context node for the adjoining location path.  The result of the step is
 a union of the sets returned by the latter location paths.
 
 The SXML representation of the XML Infoset (see SSAX.scm) is rather
-suitable for querying as it is. Bowing to the XPath specification, we
+suitable for querying as it is.  Bowing to the XPath specification, we
 will refer to SXML information items as 'Nodes':
 
 @example 
@@ -610,124 +667,217 @@ An (ordered) set of nodes is just a list of the constituent nodes:
        <Nodeset> ::= (<Node> ...)
 @end example
 
-Nodesets, and Nodes other than text strings are both lists. A <Nodeset>
-however is either an empty list, or a list whose head is not a symbol. A
+Nodesets, and Nodes other than text strings are both lists.  A <Nodeset>
+however is either an empty list, or a list whose head is not a symbol.  A
 symbol at the head of a node is either an XML name (in which case it's a
-tag of an XML element), or an administrative name such as '@@'. This
+tag of an XML element), or an administrative name such as '@@'.  This
 uniform list representation makes processing rather simple and elegant,
-while avoiding confusion. The multi-branch tree structure formed by the
+while avoiding confusion.  The multi-branch tree structure formed by the
 mutually-recursive datatypes <Node> and <Nodeset> lends itself well to
 processing by functional languages.
 
 A location path is in fact a composite query over an XPath tree or its
-branch. A singe step is a combination of a projection, selection or a
-transitive closure. Multiple steps are combined via join and union
-operations. This insight allows us to @emph{elegantly} implement XPath
+branch.  A single step is a combination of a projection, selection or a
+transitive closure.  Multiple steps are combined via join and union
+operations.  This insight allows us to @emph{elegantly} implement XPath
 as a sequence of projection and filtering primitives -- converters --
-joined by @dfn{combinators}. Each converter takes a node and returns a
+joined by @dfn{combinators}.  Each converter takes a node and returns a
 nodeset which is the result of the corresponding query relative to that
-node. A converter can also be called on a set of nodes. In that case it
+node.  A converter can also be called on a set of nodes.  In that case it
 returns a union of the corresponding queries over each node in the set.
 The union is easily implemented as a list append operation as all nodes
-in a SXML tree are considered distinct, by XPath conventions. We also
-preserve the order of the members in the union. Query combinators are
+in a SXML tree are considered distinct, by XPath conventions.  We also
+preserve the order of the members in the union.  Query combinators are
 high-order functions: they take converter(s) (which is a Node|Nodeset ->
-Nodeset function) and compose or otherwise combine them. We will be
+Nodeset function) and compose or otherwise combine them.  We will be
 concerned with only relative location paths [XPath]: an absolute
 location path is a relative path applied to the root node.
 
 Similarly to XPath, SXPath defines full and abbreviated notations for
-location paths. In both cases, the abbreviated notation can be
-mechanically expanded into the full form by simple rewriting rules. In
+location paths.  In both cases, the abbreviated notation can be
+mechanically expanded into the full form by simple rewriting rules.  In
 case of SXPath the corresponding rules are given as comments to a sxpath
-function, below. The regression test suite at the end of this file shows
+function, below.  The regression test suite at the end of this file shows
 a representative sample of SXPaths in both notations, juxtaposed with
-the corresponding XPath expressions. Most of the samples are borrowed
+the corresponding XPath expressions.  Most of the samples are borrowed
 literally from the XPath specification, while the others are adjusted
 for our running example, tree1.
 
 @subsubsection Usage
address@hidden xpath address@hidden nodeset? x
address@hidden defun
address@hidden {Scheme Procedure} nodeset? x
address@hidden deffn
 
address@hidden xpath address@hidden node-typeof? crit
address@hidden defun
address@hidden {Scheme Procedure} node-typeof? crit
address@hidden deffn
 
address@hidden xpath address@hidden node-eq? other
address@hidden defun
address@hidden {Scheme Procedure} node-eq? other
address@hidden deffn
 
address@hidden xpath address@hidden node-equal? other
address@hidden defun
address@hidden {Scheme Procedure} node-equal? other
address@hidden deffn
 
address@hidden xpath address@hidden node-pos n
address@hidden defun
address@hidden {Scheme Procedure} node-pos n
address@hidden deffn
 
address@hidden xpath address@hidden filter pred?
address@hidden {Scheme Procedure} filter pred?
 @verbatim 
  -- Scheme Procedure: filter pred list
      Return all the elements of 2nd arg LIST that satisfy predicate
      PRED.  The list is not disordered - elements that appear in the
      result list occur in the same order as they occur in the argument
-     list. The returned list may share a common tail with the argument
-     list. The dynamic order in which the various applications of pred
+     list.  The returned list may share a common tail with the argument
+     list.  The dynamic order in which the various applications of pred
      are made is not specified.
 
           (filter even? '(0 7 8 8 43 -4)) => (0 8 8 -4)
 
  
 @end verbatim
address@hidden deffn
 
address@hidden defun
-
address@hidden xpath address@hidden take-until pred?
address@hidden defun
address@hidden {Scheme Procedure} take-until pred?
address@hidden deffn
 
address@hidden xpath address@hidden take-after pred?
address@hidden defun
address@hidden {Scheme Procedure} take-after pred?
address@hidden deffn
 
address@hidden xpath address@hidden map-union proc lst
address@hidden defun
address@hidden {Scheme Procedure} map-union proc lst
address@hidden deffn
 
address@hidden xpath address@hidden node-reverse node-or-nodeset
address@hidden defun
address@hidden {Scheme Procedure} node-reverse node-or-nodeset
address@hidden deffn
 
address@hidden xpath address@hidden node-trace title
address@hidden defun
address@hidden {Scheme Procedure} node-trace title
address@hidden deffn
 
address@hidden xpath address@hidden select-kids test-pred?
address@hidden defun
address@hidden {Scheme Procedure} select-kids test-pred?
address@hidden deffn
 
address@hidden xpath address@hidden node-self pred?
address@hidden {Scheme Procedure} node-self pred?
 @verbatim 
  -- Scheme Procedure: filter pred list
      Return all the elements of 2nd arg LIST that satisfy predicate
      PRED.  The list is not disordered - elements that appear in the
      result list occur in the same order as they occur in the argument
-     list. The returned list may share a common tail with the argument
-     list. The dynamic order in which the various applications of pred
+     list.  The returned list may share a common tail with the argument
+     list.  The dynamic order in which the various applications of pred
      are made is not specified.
 
           (filter even? '(0 7 8 8 43 -4)) => (0 8 8 -4)
 
  
 @end verbatim
address@hidden deffn
 
address@hidden defun
address@hidden {Scheme Procedure} node-join . selectors
address@hidden deffn
 
address@hidden xpath address@hidden node-join . selectors
address@hidden defun
address@hidden {Scheme Procedure} node-reduce . converters
address@hidden deffn
 
address@hidden xpath address@hidden node-reduce . converters
address@hidden defun
address@hidden {Scheme Procedure} node-or . converters
address@hidden deffn
 
address@hidden xpath address@hidden node-or . converters
address@hidden defun
address@hidden {Scheme Procedure} node-closure test-pred?
address@hidden deffn
 
address@hidden xpath address@hidden node-closure test-pred?
address@hidden defun
address@hidden {Scheme Procedure} node-parent rootnode
address@hidden deffn
 
address@hidden xpath address@hidden node-parent rootnode
address@hidden defun
address@hidden {Scheme Procedure} sxpath path
address@hidden deffn
+
address@hidden sxml ssax input-parse
address@hidden (sxml ssax input-parse)
address@hidden Overview
+A simple lexer.
+
+The procedures in this module surprisingly often suffice to parse an
+input stream.  They either skip, or build and return tokens, according to
+inclusion or delimiting semantics.  The list of characters to expect,
+include, or to break at may vary from one invocation of a function to
+another.  This allows the functions to easily parse even
+context-sensitive languages.
+
+EOF is generally frowned on, and thrown up upon if encountered.
+Exceptions are mentioned specifically.  The list of expected characters
+(characters to skip until, or break-characters) may include an EOF
+"character", which is to be coded as the symbol, @code{*eof*}.
+
+The input stream to parse is specified as a @dfn{port}, which is usually
+the last (and optional) argument.  It defaults to the current input port
+if omitted.
+
+If the parser encounters an error, it will throw an exception to the key
address@hidden  The arguments will be of the form @code{(@var{port}
address@hidden @var{specialising-msg}*)}.
+
+The first argument is a port, which typically points to the offending
+character or its neighborhood.  You can then use @code{port-column} and
address@hidden to query the current position.  @var{message} is the
+description of the error.  Other arguments supply more details about the
+problem.
+
address@hidden Usage
address@hidden {Scheme Procedure} peek-next-char [port]
address@hidden deffn
+
address@hidden {Scheme Procedure} assert-curr-char expected-chars comment [port]
address@hidden deffn
+
address@hidden {Scheme Procedure} skip-until arg [port]
address@hidden deffn
+
address@hidden {Scheme Procedure} skip-while skip-chars [port]
address@hidden deffn
+
address@hidden {Scheme Procedure} next-token prefix-skipped-chars break-chars [comment] [port]
address@hidden deffn
+
address@hidden {Scheme Procedure} next-token-of incl-list/pred [port]
address@hidden deffn
+
address@hidden {Scheme Procedure} read-text-line [port]
address@hidden deffn
+
address@hidden {Scheme Procedure} read-string n [port]
address@hidden deffn
+
address@hidden {Scheme Procedure} find-string-from-port? _ _ . _
+Looks for @var{str} in @var{<input-port>}, optionally within the first
address@hidden characters.
address@hidden deffn
+
address@hidden sxml apply-templates
address@hidden (sxml apply-templates)
address@hidden Overview
+Pre-order traversal of a tree and creation of a new tree:
+
address@hidden 
+       apply-templates:: tree x <templates> -> <new-tree>
address@hidden smallexample
+
+where
+
address@hidden 
+ <templates> ::= (<template> ...)
+ <template>  ::= (<node-test> <node-test> ... <node-test> . <handler>)
+ <node-test> ::= an argument to node-typeof? above
+ <handler>   ::= <tree> -> <new-tree>
address@hidden smallexample
+
+This procedure does a @emph{normal}, pre-order traversal of an SXML
+tree.  It walks the tree, checking at each node against the list of
+matching templates.
+
+If the match is found (which must be unique, i.e., unambiguous), the
+corresponding handler is invoked and given the current node as an
+argument.  The result from the handler, which must be a @code{<tree>},
+takes place of the current node in the resulting tree.  The name of the
+function is not accidental: it resembles rather closely an
address@hidden function of XSLT.
+
address@hidden Usage
address@hidden {Scheme Procedure} apply-templates tree templates
address@hidden deffn
 
address@hidden xpath address@hidden sxpath path
address@hidden defun
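The abbreviated SXPath notation documented in the sxml.texi changes
above can be tried with a short sketch like the following, assuming
Guile's (sxml xpath) module.  The sample tree is hypothetical and only
serves to show that a path of symbols selects child elements step by
step.

```scheme
(use-modules (sxml xpath))

;; Hypothetical document in SXML form.
(define tree
  '(*TOP* (html (body (p "one") (div (p "nested")) (p "two")))))

;; Each symbol in the path is a location step that selects the child
;; elements of that name, relative to the nodes found by the step
;; before it.
(define paragraphs ((sxpath '(html body p)) tree))
;; paragraphs is ((p "one") (p "two"))
```

The `p' inside `div' is not returned, because every step here only looks
at direct children of its context node.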
diff --git a/doc/ref/texinfo.texi b/doc/ref/texinfo.texi
index b2947fc..b5ef393 100644
--- a/doc/ref/texinfo.texi
+++ b/doc/ref/texinfo.texi
@@ -152,8 +152,8 @@ interested in @code{replace-titles} and @code{filter-empty-elements}.
 @xref{texinfo docbook replace-titles,,replace-titles}, and @ref{texinfo
 docbook filter-empty-elements,,filter-empty-elements}.
 
-Returns a nodeset, as described in @ref{sxml xpath}. That is to say,
-this function returns an untagged list of stexi elements.
+Returns a nodeset; that is to say, an untagged list of stexi elements.
address@hidden, for the definition of a nodeset.
 
 @end defun
 
@@ -184,10 +184,12 @@ For example:
 This module implements transformation from @code{stexi} to HTML. Note
 that the output of @code{stexi->shtml} is actually SXML with the HTML
 vocabulary. This means that the output can be further processed, and
-that it must eventually be serialized by @ref{sxml simple
-sxml->xml,sxml->xml}. References (i.e., the @code{@@ref} family of
-commands) are resolved by a @dfn{ref-resolver}. @xref{texinfo html
-add-ref-resolver!,add-ref-resolver!}, for more information.
+that it must eventually be serialized by @code{sxml->xml}.
address@hidden and Writing XML}.
+
+References (i.e., the @code{@@ref} family of commands) are resolved by a
address@hidden  @xref{texinfo html
+add-ref-resolver!,add-ref-resolver!}.
 
 @subsubsection Usage
 @anchor{texinfo html address@hidden add-ref-resolver! proc
diff --git a/doc/ref/web.texi b/doc/ref/web.texi
index 0f69089..6c33f32 100644
--- a/doc/ref/web.texi
+++ b/doc/ref/web.texi
@@ -127,8 +127,8 @@ basic idea is that HTML is either text, represented by a string, or an
 element, represented as a tagged list.  So @samp{foo} becomes
 @samp{"foo"}, and @samp{<b>foo</b>} becomes @samp{(b "foo")}.
 Attributes, if present, go in a tagged list headed by @samp{@@}, like
address@hidden(img (@@ (src "http://example.com/foo.png")))}.  @xref{sxml
-simple}, for more information.
address@hidden(img (@@ (src "http://example.com/foo.png")))}.  @xref{SXML}, for
+more information.
 
 The good thing about SXML is that HTML elements cannot be confused with
 text.  Let's make a new definition of @code{para}:
@@ -1769,7 +1769,7 @@ message body is long enough.)
 The web handler interface is a common baseline that all kinds of Guile
 web applications can use.  You will usually want to build something on
 top of it, however, especially when producing HTML.  Here is a simple
-example that builds up HTML output using SXML (@pxref{sxml simple}).
+example that builds up HTML output using SXML (@pxref{SXML}).
 
 First, load up the modules:
 
diff --git a/module/sxml/simple.scm b/module/sxml/simple.scm
index dcef3b2..606975d 100644
--- a/module/sxml/simple.scm
+++ b/module/sxml/simple.scm
@@ -1,6 +1,6 @@
 ;;;; (sxml simple) -- a simple interface to the SSAX parser
 ;;;;
-;;;;   Copyright (C) 2009, 2010  Free Software Foundation, Inc.
+;;;;   Copyright (C) 2009, 2010, 2013  Free Software Foundation, Inc.
 ;;;;    Modified 2004 by Andy Wingo <wingo at pobox dot com>.
 ;;;;    Originally written by Oleg Kiselyov <oleg at pobox dot com> as SXML-to-HTML.scm.
 ;;;; 
@@ -28,14 +28,162 @@
 (define-module (sxml simple)
   #:use-module (sxml ssax)
   #:use-module (sxml transform)
-  #:use-module (ice-9 optargs)
+  #:use-module (ice-9 match)
   #:use-module (srfi srfi-13)
   #:export (xml->sxml sxml->xml sxml->string))
 
-(define* (xml->sxml #:optional (port (current-input-port)))
+;; Helpers from upstream/SSAX.scm.
+;;
+
+(define (ssax:warn port msg . args)
+  (format (current-ssax-error-port)
+          ";;; SSAX warning: ~a ~a\n" msg args))
+
+;     ssax:reverse-collect-str LIST-OF-FRAGS -> LIST-OF-FRAGS
+; given the list of fragments (some of which are text strings)
+; reverse the list and concatenate adjacent text strings.
+; We can prove from the general case below that if LIST-OF-FRAGS
+; has zero or one element, the result of the procedure is equal?
+; to its argument. This fact justifies the shortcut evaluation below.
+(define (ssax:reverse-collect-str fragments)
+  (cond
+    ((null? fragments) '())    ; a shortcut
+    ((null? (cdr fragments)) fragments) ; see the comment above
+    (else
+      (let loop ((fragments fragments) (result '()) (strs '()))
+       (cond
+         ((null? fragments)
+           (if (null? strs) result
+             (cons (string-concatenate/shared strs) result)))
+         ((string? (car fragments))
+           (loop (cdr fragments) result (cons (car fragments) strs)))
+         (else
+           (loop (cdr fragments)
+             (cons
+               (car fragments)
+               (if (null? strs) result
+                 (cons (string-concatenate/shared strs) result)))
+             '())))))))
+
+;; Ideas for the future for this interface:
+;;
+;;  * Allow doctypes to provide parsed entities
+;;
+;;  * Allow validation (the ELEMENTS value from the DOCTYPE handler
+;;    below)
+;;
+;;  * Parse internal DTDs
+;;
+;;  * Parse external DTDs
+;;
+(define* (xml->sxml #:optional (string-or-port (current-input-port)) #:key
+                    (namespaces '())
+                    (declare-namespaces? #t)
+                    (trim-whitespace? #f)
+                    (entities '())
+                    (default-entity-handler #f))
   "Use SSAX to parse an XML document into SXML. Takes one optional
-argument, @var{port}, which defaults to the current input port."
-  (ssax:xml->sxml port '()))
+argument, @var{string-or-port}, which defaults to the current input
+port."
+  ;; NAMESPACES: alist of PREFIX -> URI.  Specifies the symbol prefix
+  ;; that the user wants on elements of a given namespace in the
+  ;; resulting SXML, regardless of the abbreviated namespaces defined in
+  ;; the document by xmlns attributes.  If DECLARE-NAMESPACES? is true,
+  ;; these namespaces are treated as if they were declared in the DTD.
+
+  ;; ENTITIES: alist of SYMBOL -> STRING.
+
+  ;; NAMESPACES: list of (DOC-PREFIX . (USER-PREFIX . URI)).
+  ;; A DOC-PREFIX of #f indicates that it comes from the user.
+  ;; Otherwise, prefixes are symbols.
+  (define (user-namespaces)
+    (map (lambda (el)
+           (match el
+             ((prefix . uri-string)
+              (cons* (and declare-namespaces? prefix)
+                     prefix
+                     (ssax:uri-string->symbol uri-string)))))
+         namespaces))
+
+  (define (user-entities)
+    (if (and default-entity-handler
+             (not (assq '*DEFAULT* entities)))
+        (acons '*DEFAULT* default-entity-handler entities)
+        entities))
+
+  (define (name->sxml name)
+    (match name
+      ((prefix . local-part)
+       (symbol-append prefix (string->symbol ":") local-part))
+      (_ name)))
+
+  ;; The SEED in this parser is the SXML: initialized to '() at each new
+  ;; level by the fdown handlers; built in reverse by the fhere parsers;
+  ;; and reverse-collected by the fup handlers.
+  (define parser
+    (ssax:make-parser
+     NEW-LEVEL-SEED ; fdown
+     (lambda (elem-gi attributes namespaces expected-content seed)
+       '())
+   
+     FINISH-ELEMENT ; fup
+     (lambda (elem-gi attributes namespaces parent-seed seed)
+       (let ((seed (if trim-whitespace?
+                       (ssax:reverse-collect-str-drop-ws seed)
+                       (ssax:reverse-collect-str seed)))
+             (attrs (attlist-fold
+                     (lambda (attr accum)
+                       (cons (list (name->sxml (car attr)) (cdr attr))
+                             accum))
+                     '() attributes)))
+         (acons (name->sxml elem-gi)
+                (if (null? attrs)
+                    seed
+                    (cons (cons '@ attrs) seed))
+                parent-seed)))
+
+     CHAR-DATA-HANDLER ; fhere
+     (lambda (string1 string2 seed)
+       (if (string-null? string2)
+           (cons string1 seed)
+           (cons* string2 string1 seed)))
+
+     DOCTYPE
+     ;; -> ELEMS ENTITIES NAMESPACES SEED
+     ;;
+     ;; ELEMS is for validation and currently unused.
+     ;;
+     ;; ENTITIES is an alist of parsed entities (symbol -> string).
+     ;;
+     ;; NAMESPACES is as above.
+     ;;
+     ;; SEED builds up the content.
+     (lambda (port docname systemid internal-subset? seed)
+       (when internal-subset?
+         (ssax:warn port "Internal DTD subset is not currently handled ")
+         (ssax:skip-internal-dtd port))
+       (ssax:warn port "DOCTYPE DECL " docname " "
+                  systemid " found and skipped")
+       (values #f (user-entities) (user-namespaces) seed))
+
+     UNDECL-ROOT
+     ;; This is like the DOCTYPE handler, but for documents that do not
+     ;; have a <!DOCTYPE> declaration.
+     (lambda (elem-gi seed)
+       (values #f (user-entities) (user-namespaces) seed))
+
+     PI
+     ((*DEFAULT*
+       . (lambda (port pi-tag seed)
+           (cons
+            (list '*PI* pi-tag (ssax:read-pi-body-as-string port))
+            seed))))))
+
+  (let* ((port (if (string? string-or-port)
+                   (open-input-string string-or-port)
+                   string-or-port))
+         (elements (reverse (parser port '()))))
+    `(*TOP* ,@elements)))
 
 (define check-name
   (let ((*good-cache* (make-hash-table)))
diff --git a/module/sxml/ssax.scm b/module/sxml/ssax.scm
index a4de0e3..474247b 100644
--- a/module/sxml/ssax.scm
+++ b/module/sxml/ssax.scm
@@ -1,6 +1,6 @@
 ;;;; (sxml ssax) -- the SSAX parser
 ;;;;
-;;;;   Copyright (C) 2009, 2010,2012  Free Software Foundation, Inc.
+;;;;   Copyright (C) 2009, 2010,2012,2013  Free Software Foundation, Inc.
 ;;;;    Modified 2004 by Andy Wingo <wingo at pobox dot com>.
 ;;;;    Written 2001,2002,2003,2004 by Oleg Kiselyov <oleg at pobox dot com> as SSAX.scm.
 ;;;; 
@@ -170,12 +170,14 @@
 (define ascii->char integer->char)
 (define char->ascii char->integer)
 
-(define *current-ssax-error-port* (make-fluid))
-(define (current-ssax-error-port)
-  (fluid-ref *current-ssax-error-port*))
+(define current-ssax-error-port
+  (make-parameter (current-error-port)))
+
+(define *current-ssax-error-port*
+  (parameter-fluid current-ssax-error-port))
 
 (define (with-ssax-error-to-port port thunk)
-  (with-fluids ((*current-ssax-error-port* port))
+  (parameterize ((current-ssax-error-port port))
     (thunk)))
 
 (define (ssax:warn port msg . args)
diff --git a/module/sxml/upstream/SSAX.scm b/module/sxml/upstream/SSAX.scm
index 776e311..d2b8fd9 100644
--- a/module/sxml/upstream/SSAX.scm
+++ b/module/sxml/upstream/SSAX.scm
@@ -442,6 +442,11 @@
 ;      named-entity-name is currently being expanded. A reference to
 ;      this named-entity-name will be an error: violation of the
 ;      WFC nonrecursion.
+;
+;       As an extension to the original SSAX, Guile allows a
+;       named-entity-name of *DEFAULT* to indicate a fallback procedure,
+;       called as (FALLBACK PORT NAME).  The procedure should return a
+;       string.
 
 ; XML-TOKEN -- a record
 
@@ -1095,10 +1100,20 @@
             (close-input-port port))))
         (else
          (parser-error port "[norecursion] broken for " name))))))
-    ((assq name ssax:predefined-parsed-entities)
-     => (lambda (decl-entity)
-         (str-handler (cdr decl-entity) "" seed)))
-    (else (parser-error port "[wf-entdeclared] broken for " name))))
+   ((assq name ssax:predefined-parsed-entities)
+    => (lambda (decl-entity)
+         (str-handler (cdr decl-entity) "" seed)))
+   ((assq '*DEFAULT* entities) =>
+    (lambda (decl-entity)
+      (let ((fallback (cdr decl-entity))
+           (new-entities (cons (cons name #f) entities)))
+       (cond
+        ((procedure? fallback)
+          (call-with-input-string (fallback port name)
+            (lambda (port) (content-handler port new-entities seed))))
+        (else
+         (parser-error port "[norecursion] broken for " name))))))
+   (else (parser-error port "[wf-entdeclared] broken for " name))))
 
 
 
@@ -1267,6 +1282,14 @@
          '((ent . "&lt;&ent1;T;&gt;") (ent1 . "&amp;"))
          `((,(string->symbol "Abc") . ,(unesc-string "<&>%n"))
            (,(string->symbol "Next") . "12<&T;>34")))
+    (test "%tAbc='&lt;&amp;&gt;&#x0A;'%nNext='12&ent;34' />" 
+         `((*DEFAULT* . ,(lambda (port name)
+                            (case name
+                              ((ent) "&lt;&ent1;T;&gt;")
+                              ((ent1) "&amp;")
+                              (else (error "unrecognized" name))))))
+         `((,(string->symbol "Abc") . ,(unesc-string "<&>%n"))
+           (,(string->symbol "Next") . "12<&T;>34")))
     (assert (failed?
        (test "%tAbc='&lt;&amp;&gt;&#x0A;'%nNext='12&ent;34' />" 
          '((ent . "<&ent1;T;&gt;") (ent1 . "&amp;")) '())))

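The change to module/sxml/ssax.scm turns current-ssax-error-port into a parameter, so it can be rebound either with with-ssax-error-to-port or directly with parameterize. A minimal sketch, assuming both names are exported from (sxml ssax); the helper capture-ssax-diagnostics and the "diagnostic" text are hypothetical:

```scheme
(use-modules (sxml ssax))

;; Hypothetical helper: run THUNK with SSAX diagnostics redirected to a
;; string port, and return whatever was written there.
(define (capture-ssax-diagnostics thunk)
  (call-with-output-string
    (lambda (port)
      (with-ssax-error-to-port port thunk))))

(define captured
  (capture-ssax-diagnostics
   (lambda ()
     ;; Stand-in for a parser warning: write to the current SSAX error port.
     (display "diagnostic" (current-ssax-error-port)))))

;; The same rebinding, written with parameterize directly.
(define captured-with-parameterize
  (call-with-output-string
    (lambda (port)
      (parameterize ((current-ssax-error-port port))
        (display "diagnostic" (current-ssax-error-port))))))

(write (list captured captured-with-parameterize))
(newline)
```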

hooks/post-receive
-- 
GNU Guile


