[Chicken-users] Namespace issues in SXML


From: Matt Gushee
Subject: [Chicken-users] Namespace issues in SXML
Date: Mon, 25 Mar 2013 13:58:48 -0600

Hello, list--

I am currently developing a project that uses SXML, and I have been
encountering some problems with namespace handling in SXML. At least
they are problems for me--I am not necessarily claiming that the
behavior I'm seeing is incorrect. And perhaps the most troubling
aspect of this is the inconsistency among the APIs of the different
SXML-related eggs.

And by the way, I am aware that namespaces are a big headache to
implement. Though I have never personally written low-level namespace
handling code, I have worked on a team that did, and ... suffice it to
say that it was quite hard to determine the right approach. So I
respect the efforts of anyone who has gotten namespace handling even
halfway right.

First issue (SXPath):

As the code below shows, expressions that should be equivalent in
sxpath and txpath syntaxes produce different results with prefixed
elements. The sxpath expression returns elements with the same
prefixes as in the source document, while the txpath version returns
elements with the full NSURI prepended. It is not entirely clear from
the docs what results we should expect, but I can't see a reason for
this discrepancy.

The following code applies two sxpath expressions and their txpath
equivalents to two versions of the same SXML document--one with
prefixes and a *NAMESPACES* node, the other without. The first test
expression selects prefixed nodes, the second unprefixed nodes. The
expression without prefixes shows consistent behavior between sxpath
and txpath; the sxpath expression with prefixes works only for the
document with prefixes, while the txpath version works only for the
document without prefixes.

    (use ssax sxpath txpath)

    (define nsmap1 '((#f . "http://www.w3.org/1999/xhtml")
                     (fubar . "http://fu.bar.com/ns")))

    ;; This document is just for reference. Reading 'xmldoc' with
    ;; ssax:xml->sxml, using nsmap1, should produce the following sxml tree.
    (define doc1
      '(*TOP*
         (@ (*NAMESPACES* (#f "http://www.w3.org/1999/xhtml")
                          (fubar "http://fu.bar.com/ns")))
         (html
           (head
             (title "My Test Document")
             (fubar:meta
               (fubar:created-date "2013-03-23")
               (fubar:license "Creative Commons")))
           (body
             (@ (class "fubar"))
             (h1 "Hi there!")
             (p
               (@ (class "boilerplate"))
               "Lorem Ipsum etc. etc.")))))

    ;; This document is just for reference. Reading 'xmldoc' with
    ;; ssax:xml->sxml with an empty namespace prefix declaration should
    ;; produce the following sxml tree.
    (define doc3
      '(*TOP*
         (http://www.w3.org/1999/xhtml:html
           (http://www.w3.org/1999/xhtml:head
             (http://www.w3.org/1999/xhtml:title "My Test Document")
             (http://fu.bar.com/ns:meta
               (http://fu.bar.com/ns:created-date "2013-03-23")
               (http://fu.bar.com/ns:license "Creative Commons")))
           (http://www.w3.org/1999/xhtml:body
             (@ (class "fubar"))
             (http://www.w3.org/1999/xhtml:h1 "Hi there!")
             (http://www.w3.org/1999/xhtml:p
               (@ (class "boilerplate"))
               "Lorem Ipsum etc. etc.")))))

    (define xmldoc #<<XMLDOC
    <html xmlns="http://www.w3.org/1999/xhtml"
          xmlns:fubar="http://fu.bar.com/ns">
      <head>
        <title>My Test Document</title>
        <fubar:meta>
          <fubar:created-date>2013-03-23</fubar:created-date>
          <fubar:license>Creative Commons</fubar:license>
        </fubar:meta>
      </head>
      <body class="fubar">
        <h1>Hi there!</h1>
        <p class="boilerplate">Lorem Ipsum etc. etc.</p>
      </body>
    </html>
    XMLDOC
    )

    (define doc2
      (with-input-from-string xmldoc
        (lambda ()
          (ssax:xml->sxml (current-input-port) nsmap1))))

    (define doc4
      (with-input-from-string xmldoc
        (lambda ()
          (ssax:xml->sxml (current-input-port) '()))))

    (define nsmap2 (list (cons '*default* (cdar nsmap1)) (cadr nsmap1)))

    ;; Select a prefixed element.
    (define sx1 (sxpath '(// fubar:license) nsmap2))

    ;; sx1 and tx1 should be equivalent
    (define tx1 (txpath "//fubar:license" nsmap2))

    ;; Select an attribute of a non-prefixed element.
    (define sx2 (sxpath '(// p @ class) nsmap2))

    ;; sx2 and tx2 should be equivalent
    (define tx2 (txpath "//p/@class" nsmap2))


    (if (equal? doc1 doc2)
      (print "doc1 = doc2")
      (print "doc1 != doc2"))

    (if (equal? doc3 doc4)
      (print "doc3 = doc4")
      (print "doc3 != doc4"))


    (print "DOC 2/TEST 1")
    (print "sxpath:")
    (pp (sx1 doc2))
    (print "txpath:")
    (pp (tx1 doc2))

    (print "DOC 2/TEST 2")
    (print "sxpath:")
    (pp (sx2 doc2))
    (print "txpath:")
    (pp (tx2 doc2))

    (print "DOC 4/TEST 1")
    (print "sxpath:")
    (pp (sx1 doc4))
    (print "txpath:")
    (pp (tx1 doc4))

    (print "DOC 4/TEST 2")
    (print "sxpath:")
    (pp (sx2 doc4))
    (print "txpath:")
    (pp (tx2 doc4))
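
For anyone who would rather see a one-line verdict per test than compare the pretty-printed output by eye, here is a minimal sketch. The names same-result? and report-pair are hypothetical helpers, not part of any egg; they only assume that each argument is a procedure taking a document and returning a nodeset, as the compiled sxpath and txpath procedures above do.

```scheme
;; Hypothetical helper: #t when both path procedures return
;; structurally equal results for the same document.
(define (same-result? sx tx doc)
  (equal? (sx doc) (tx doc)))

;; Hypothetical helper: print a one-line verdict for a labelled test.
(define (report-pair label sx tx doc)
  (display label)
  (display (if (same-result? sx tx doc) ": same" ": DIFFERENT"))
  (newline))
```

With the definitions from the test script loaded, a call such as (report-pair "DOC 2/TEST 1" sx1 tx1 doc2) would replace one of the four print/pp groups.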


Second issue:

I note that when you provide namespace bindings for ssax:xml->sxml,
the default namespace is indicated with #f, e.g.
'((#f . "http://www.w3.org/1999/xhtml")), whereas sxpath and
sxml-serializer require the symbol *default*. I'm not sure who should
fix what, but it would be nice if we could use the same convention for
all relevant APIs.
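
Until the conventions agree, one workaround is to translate the bindings. The following is only a sketch in plain Scheme; ssax-nsmap->sxpath-nsmap is a hypothetical name of my own, and it assumes the alist shape used for nsmap1 in the test script, i.e. (prefix . uri) pairs with #f marking the default namespace.

```scheme
;; Hypothetical helper: rewrite an ssax-style namespace alist, where
;; the default namespace has the key #f, into the form that uses the
;; symbol *default* instead. Other bindings are passed through as-is.
(define (ssax-nsmap->sxpath-nsmap nsmap)
  (map (lambda (binding)
         (if (car binding)
             binding
             (cons '*default* (cdr binding))))
       nsmap))
```

This generalizes the hand-built nsmap2 definition in the test script: the order of the bindings does not matter, and a map without a default namespace comes back unchanged.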

Third issue:

At this stage I'm mostly just curious, but I noticed that Jim Ursetto
has written somewhere that sxml-serializer may sometimes output
namespace declarations locally, repeating them wherever a prefix
occurs in the document. And indeed, that seems to be true ... and I
note that in a large, namespace-heavy document, that could result in
a good deal of bloat. I'm just wondering if there are major
technical obstacles to, say, using a *NAMESPACES* node when present,
and writing the prefix declarations derived from it on the root
element? That does not appear to happen at present.
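
To make the first half of that idea concrete, here is a rough sketch of pulling the bindings out of a *NAMESPACES* node. The name top-namespaces is hypothetical, and the code assumes the document shape shown in doc1 of the test script: a *TOP* node whose first child is an (@ ...) annotation list containing (*NAMESPACES* (prefix uri) ...). How the serializer would consume the result is left open.

```scheme
;; Hypothetical helper: return the namespace bindings recorded on the
;; *TOP* node as an alist of (prefix . uri) pairs, or '() when the
;; document carries no *NAMESPACES* annotation.
(define (top-namespaces doc)
  (let* ((attrs (and (pair? doc)
                     (pair? (cdr doc))
                     (pair? (cadr doc))
                     (eq? (car (cadr doc)) '@)
                     (cadr doc)))
         (ns (and attrs (assq '*NAMESPACES* (cdr attrs)))))
    (if ns
        (map (lambda (b) (cons (car b) (cadr b))) (cdr ns))
        '())))
```

Applied to doc1, this should give back the same bindings as nsmap1, which is the information a serializer would need in order to declare every prefix once on the root element.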

Finally:

First of all, I am grateful for the hard work of Messrs. Kiselyov,
Lissovsky, and Lizorkin, and those who have adapted their code to
Chicken Scheme. SXML itself is clearly a good thing, and the tools
seem to work very well. And yet, looking at the APIs, there are these
maddening inconsistencies, and even looking at each egg in isolation
... for example, what's up with the name 'SRV:send-reply'? How does
that relate to anything? So I'm wondering if it wouldn't be a good
idea at some point for those who work with sxml-tools and its
descendants to get together and design a more unified and usable
API (or APIs). I know sxml-fu is intended to address that need, but it
seems kind of ad hoc and limited in scope. Just a thought.

Best regards,
Matt Gushee


