guile-commits
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Guile-commits] 01/01: web: add support for URI-reference


From: Andy Wingo
Subject: [Guile-commits] 01/01: web: add support for URI-reference
Date: Sun, 21 May 2017 08:01:30 -0400 (EDT)

wingo pushed a commit to branch stable-2.2
in repository guile.

commit 7095a536f32d08efbd6578cb26fc2a4367ad16bb
Author: Andy Wingo <address@hidden>
Date:   Sun May 21 11:56:59 2017 +0200

    web: add support for URI-reference
    
    Based on a patch by Daniel Hartwig <address@hidden>.
    
    * NEWS: Update.
    * doc/ref/web.texi (URIs): Fragments are properly part of a URI, so
      remove the incorrect note.  Add documentation on URI subtypes.
    * module/web/uri.scm (uri-reference?): New base type predicate.
      (uri?, relative-ref?): Specific predicates.
      (validate-uri-reference): Strict validation.
      (validate-uri, validate-relative-ref): Specific validators.
      (build-uri-reference, build-relative-ref): New constructors.
      (string->uri-reference): Rename from string->uri.
      (string->uri, string->relative-ref): Specific constructors.
      (uri->string): Add #:include-fragment? keyword argument.
    * module/web/http.scm (parse-request-uri): Use `build-uri-reference',
      and result is a URI-reference, not URI, object.  No longer infer an
      absent `uri-scheme' is `http'.
      (write-uri): Just use `uri->string'.
      (declare-uri-header!): Remove unused function.
      (declare-uri-reference-header!): Update.  Rename from
      `declare-relative-uri-header!'.
    * test-suite/tests/web-uri.test ("build-uri-reference"):
      ("string->uri-reference"): Add.
      ("uri->string"): Also tests for relative-refs.
    * test-suite/tests/web-http.test ("read-request-line"):
      ("write-request-line"): Update for no scheme in some URIs.
      ("entity headers", "request headers"): Content-location, Referer, and
      Location should also parse relative-URIs.
    * test-suite/tests/web-request.test ("example-1"): Expect URI-reference
      with no scheme.
---
 NEWS                              |  18 +++++
 doc/ref/web.texi                  | 134 +++++++++++++++++++++------------
 module/web/client.scm             |  12 +--
 module/web/http.scm               |  48 +++---------
 module/web/request.scm            |   2 +-
 module/web/uri.scm                | 152 ++++++++++++++++++++++++++++++--------
 test-suite/tests/web-http.test    |  51 ++++++++-----
 test-suite/tests/web-request.test |   5 +-
 test-suite/tests/web-uri.test     |  62 +++++++++++++++-
 9 files changed, 338 insertions(+), 146 deletions(-)

diff --git a/NEWS b/NEWS
index 6d7e58e..7ce583b 100644
--- a/NEWS
+++ b/NEWS
@@ -8,6 +8,24 @@ Please send Guile bug reports to address@hidden
 
 Changes in 2.2.3 (since 2.2.2):
 
+* New interfaces
+
+** (web uri) module has better support for RFC 3986
+
+The URI standard, RFC 3986, defines additional "relative-ref" and
+"URI-reference" data types.  Thanks to Daniel Hartwig, Guile's support
+for these URI subtypes has been improved.  See "Universal Resource
+Identifiers" in the manual, for more.
+
+* New deprecations
+
+** Using `uri?' as a predicate on relative-refs deprecated
+
+If you don't care whether the URI is a relative-ref or not, use
+`uri-reference?'.  If you do, use `uri-reference?' and `relative-ref?'.
+In the future `uri?' will return a true value only for URIs that specify
+a scheme.
+
 * Bug fixes
 
 ** Enable GNU Readline 7.0's support for "bracketed paste".
diff --git a/doc/ref/web.texi b/doc/ref/web.texi
index c0a7bdd..7c6a954 100644
--- a/doc/ref/web.texi
+++ b/doc/ref/web.texi
@@ -173,23 +173,13 @@ Guile provides a standard data type for Universal 
Resource Identifiers
 The generic URI syntax is as follows:
 
 @example
-URI := scheme ":" ["//" [userinfo "@@"] host [":" port]] path \
-       [ "?" query ] [ "#" fragment ]
+URI-reference := [scheme ":"] ["//" [userinfo "@@"] host [":" port]] path \
+                 [ "?" query ] [ "#" fragment ]
 @end example
 
 For example, in the URI, @indicateurl{http://www.gnu.org/help/}, the
 scheme is @code{http}, the host is @code{www.gnu.org}, the path is
address@hidden/help/}, and there is no userinfo, port, query, or fragment.  All
-URIs have a scheme and a path (though the path might be empty).  Some
-URIs have a host, and some of those have ports and userinfo.  Any URI
-might have a query part or a fragment.
-
-There is also a ``URI-reference'' data type, which is the same as a URI
-but where the scheme is optional.  In this case, the scheme is taken to
-be relative to some other related URI.  A common use of URI references
-is when you want to be vague regarding the choice of HTTP or HTTPS --
-serving a web page referring to @code{/foo.css} will use HTTPS if loaded
-over HTTPS, or HTTP otherwise.
address@hidden/help/}, and there is no userinfo, port, query, or fragment.
 
 Userinfo is something of an abstraction, as some legacy URI schemes
 allowed userinfo of the form @address@hidden:@var{passwd}}.  But
@@ -197,14 +187,6 @@ since passwords do not belong in URIs, the RFC does not 
want to condone
 this practice, so it calls anything before the @code{@@} sign
 @dfn{userinfo}.
 
-Properly speaking, a fragment is not part of a URI.  For example, when a
-web browser follows a link to @indicateurl{http://example.com/#foo}, it
-sends a request for @indicateurl{http://example.com/}, then looks in the
-resulting page for the fragment identified @code{foo} reference.  A
-fragment identifies a part of a resource, not the resource itself.  But
-it is useful to have a fragment field in the URI record itself, so we
-hope you will forgive the inconsistency.
-
 @example
 (use-modules (web uri))
 @end example
@@ -213,40 +195,36 @@ The following procedures can be found in the @code{(web 
uri)}
 module. Load it into your Guile, using a form like the above, to have
 access to them.
 
+The most common way to build a URI from Scheme is with the
address@hidden function.
+
 @deffn {Scheme Procedure} build-uri scheme @
        [#:address@hidden [#:address@hidden [#:address@hidden @
        [#:address@hidden""}] [#:address@hidden [#:address@hidden @
        [#:address@hidden
-Construct a URI object.  @var{scheme} should be a symbol, @var{port}
-either a positive, exact integer or @code{#f}, and the rest of the
-fields are either strings or @code{#f}.  If @var{validate?} is true,
-also run some consistency checks to make sure that the constructed URI
-is valid.
+Construct a URI.  @var{scheme} should be a symbol, @var{port} either a
+positive, exact integer or @code{#f}, and the rest of the fields are
+either strings or @code{#f}.  If @var{validate?} is true, also run some
+consistency checks to make sure that the constructed URI is valid.
 @end deffn
-
address@hidden {Scheme Procedure} build-uri-reference [#:address@hidden@
-       [#:address@hidden [#:address@hidden [#:address@hidden @
-       [#:address@hidden""}] [#:address@hidden [#:address@hidden @
-       [#:address@hidden
-Like @code{build-uri}, but with an optional scheme.
address@hidden {Scheme Procedure} uri? obj
+Return @code{#t} if @var{obj} is a URI.
 @end deffn
 
-In Guile, both URI and URI reference data types are represented in the
-same way, as URI objects.
+Guile, URIs are represented as URI records, with a number of associated
+accessors.
 
address@hidden {Scheme Procedure} uri? obj
address@hidden {Scheme Procedure} uri-scheme uri
address@hidden {Scheme Procedure} uri-scheme uri
 @deffnx {Scheme Procedure} uri-userinfo uri
 @deffnx {Scheme Procedure} uri-host uri
 @deffnx {Scheme Procedure} uri-port uri
 @deffnx {Scheme Procedure} uri-path uri
 @deffnx {Scheme Procedure} uri-query uri
 @deffnx {Scheme Procedure} uri-fragment uri
-A predicate and field accessors for the URI record type.  The URI scheme
-will be a symbol, or @code{#f} if the object is a URI reference but not
-a URI.  The port will be either a positive, exact integer or @code{#f},
-and the rest of the fields will be either strings or @code{#f} if not
-present.
+Field accessors for the URI record type.  The URI scheme will be a
+symbol, or @code{#f} if the object is a relative-ref (see below).  The
+port will be either a positive, exact integer or @code{#f}, and the rest
+of the fields will be either strings or @code{#f} if not present.
 @end deffn
 
 @deffn {Scheme Procedure} string->uri string
@@ -254,15 +232,11 @@ Parse @var{string} into a URI object.  Return @code{#f} 
if the string
 could not be parsed.
 @end deffn
 
address@hidden {Scheme Procedure} string->uri-reference string
-Parse @var{string} into a URI object, while not requiring a scheme.
-Return @code{#f} if the string could not be parsed.
address@hidden deffn
-
address@hidden {Scheme Procedure} uri->string uri
address@hidden {Scheme Procedure} uri->string uri [#:address@hidden
 Serialize @var{uri} to a string.  If the URI has a port that is the
 default port for its scheme, the port is not included in the
-serialization.
+serialization.  If @var{include-fragment?} is given as false, the
+resulting string will omit the fragment (if any).
 @end deffn
 
 @deffn {Scheme Procedure} declare-default-port! scheme port
@@ -323,6 +297,70 @@ For example, the list @code{("scrambled eggs" 
"biscuits&gravy")} encodes
 as @code{"scrambled%20eggs/biscuits%26gravy"}.
 @end deffn
 
address@hidden Subtypes of URI
+
+As we noted above, not all URI objects have a scheme.  You might have
+noted in the ``generic URI syntax'' example that the left-hand side of
+that grammar definition was URI-reference, not URI.  A
address@hidden is a generalization of a URI where the scheme is
+optional.  If no scheme is specified, it is taken to be relative to some
+other related URI.  A common use of URI references is when you want to
+be vague regarding the choice of HTTP or HTTPS -- serving a web page
+referring to @code{/foo.css} will use HTTPS if loaded over HTTPS, or
+HTTP otherwise.
+
address@hidden {Scheme Procedure} build-uri-reference [#:address@hidden@
+       [#:address@hidden [#:address@hidden [#:address@hidden @
+       [#:address@hidden""}] [#:address@hidden [#:address@hidden @
+       [#:address@hidden
+Like @code{build-uri}, but with an optional scheme.
address@hidden deffn
address@hidden {Scheme Procedure} uri-reference? obj
+Return @code{#t} if @var{obj} is a URI-reference.  This is the most
+general URI predicate, as it includes not only full URIs that have
+schemes (those that match @code{uri?}) but also URIs without schemes.
address@hidden deffn
+
+It's also possible to build a @dfn{relative-ref}: a URI-reference that
+explicitly lacks a scheme.
+
address@hidden {Scheme Procedure} build-relative-ref @
+       [#:address@hidden [#:address@hidden [#:address@hidden @
+       [#:address@hidden""}] [#:address@hidden [#:address@hidden @
+       [#:address@hidden
+Like @code{build-uri}, but with no scheme.
address@hidden deffn
address@hidden {Scheme Procedure} relative-ref? obj
+Return @code{#t} if @var{obj} is a ``relative-ref'': a URI-reference
+that has no scheme.  Every URI-reference will either match @code{uri?}
+or @code{relative-ref?} (but not both).
address@hidden deffn
+
+In case it's not clear from the above, the most general of these URI
+types is the URI-reference, with @code{build-uri-reference} as the most
+general constructor.  @code{build-uri} and @code{build-relative-ref}
+enforce enforce specific restrictions on the URI-reference.  The most
+generic URI parser is then @code{string->uri-reference}, and there is
+also a parser for when you know that you want a relative-ref.
+
address@hidden {Scheme Procedure} string->uri-reference string
+Parse @var{string} into a URI object, while not requiring a scheme.
+Return @code{#f} if the string could not be parsed.
address@hidden deffn
+
address@hidden {Scheme Procedure} string->relative-ref string
+Parse @var{string} into a URI object, while asserting that no scheme is
+present.  Return @code{#f} if the string could not be parsed.
address@hidden deffn
+
+For compatibility reasons, note that @code{uri?} will return @code{#t}
+for all URI objects, even relative-refs.  In contrast, @code{build-uri}
+and @code{string->uri} require that the resulting URI not be a
+relative-ref.  As a predicate to distinguish relative-refs from proper
+URIs (in the language of RFC 3986), use something like @code{(and
+(uri-reference? @var{x}) (not (relative-ref?  @var{x})))}.
+
+
 @node HTTP
 @subsection The Hyper-Text Transfer Protocol
 
diff --git a/module/web/client.scm b/module/web/client.scm
index c30fa99..3b7ea51 100644
--- a/module/web/client.scm
+++ b/module/web/client.scm
@@ -164,16 +164,16 @@ host name without trailing dot."
                                             get-position set-position!
                                             close))))
 
-(define (ensure-uri uri-or-string)
+(define (ensure-uri-reference uri-or-string)
   (cond
-   ((string? uri-or-string) (string->uri uri-or-string))
-   ((uri? uri-or-string) uri-or-string)
-   (else (error "Invalid URI" uri-or-string))))
+   ((string? uri-or-string) (string->uri-reference uri-or-string))
+   ((uri-reference? uri-or-string) uri-or-string)
+   (else (error "Invalid URI-reference" uri-or-string))))
 
 (define (open-socket-for-uri uri-or-string)
   "Return an open input/output port for a connection to URI."
   (define http-proxy (current-http-proxy))
-  (define uri (ensure-uri (or http-proxy uri-or-string)))
+  (define uri (ensure-uri-reference (or http-proxy uri-or-string)))
   (define addresses
     (let ((port (uri-port uri)))
       (delete-duplicates
@@ -344,7 +344,7 @@ as is the case by default with a request returned by 
`build-request'."
                   (streaming? #f)
                   (request
                    (build-request
-                    (ensure-uri uri)
+                    (ensure-uri-reference uri)
                     #:method method
                     #:version version
                     #:headers (if keep-alive?
diff --git a/module/web/http.scm b/module/web/http.scm
index 1f208f4..993b50e 100644
--- a/module/web/http.scm
+++ b/module/web/http.scm
@@ -1112,7 +1112,8 @@ symbol, like ‘GET’."
 
 (define* (parse-request-uri str #:optional (start 0) (end (string-length str)))
   "Parse a URI from an HTTP request line.  Note that URIs in requests do
-not have to have a scheme or host name.  The result is a URI object."
+not have to have a scheme or host name.  The result is a URI-reference
+object."
   (cond
    ((= start end)
     (bad-request "Missing Request-URI"))
@@ -1122,10 +1123,10 @@ not have to have a scheme or host name.  The result is 
a URI object."
     (let* ((q (string-index str #\? start end))
            (f (string-index str #\# start end))
            (q (and q (or (not f) (< q f)) q)))
-      (build-uri 'http
-                 #:path (substring str start (or q f end))
-                 #:query (and q (substring str (1+ q) (or f end)))
-                 #:fragment (and f (substring str (1+ f) end)))))
+      (build-uri-reference
+       #:path (substring str start (or q f end))
+       #:query (and q (substring str (1+ q) (or f end)))
+       #:fragment (and f (substring str (1+ f) end)))))
    (else
     (or (string->uri (substring str start end))
         (bad-request "Invalid URI: ~a" (substring str start end))))))
@@ -1143,31 +1144,7 @@ three values: the method, the URI, and the version."
             (parse-http-version line (1+ d1) (string-length line)))))
 
 (define (write-uri uri port)
-  (when (uri-host uri)
-    (when (uri-scheme uri)
-      (put-symbol port (uri-scheme uri))
-      (put-char port #\:))
-    (put-string port "//")
-    (when (uri-userinfo uri)
-      (put-string port (uri-userinfo uri))
-      (put-char port #\@))
-    (put-string port (uri-host uri))
-    (let ((p (uri-port uri)))
-      (when (and p (not (eqv? p 80)))
-        (put-char port #\:)
-        (put-non-negative-integer port p))))
-  (let* ((path (uri-path uri))
-         (len (string-length path)))
-    (cond
-     ((and (> len 0) (not (eqv? (string-ref path 0) #\/)))
-      (bad-request "Non-absolute URI path: ~s" path))
-     ((and (zero? len) (not (uri-host uri)))
-      (bad-request "Empty path and no host for URI: ~s" uri))
-     (else
-      (put-string port path))))
-  (when (uri-query uri)
-    (put-char port #\?)
-    (put-string port (uri-query uri))))
+  (put-string port (uri->string uri #:include-fragment? #f)))
 
 (define (write-request-line method uri version port)
   "Write the first line of an HTTP request to PORT."
@@ -1272,20 +1249,13 @@ treated specially, and is just returned as a plain 
string."
     parse-non-negative-integer non-negative-integer?
     (lambda (val port) (put-non-negative-integer port val))))
 
-;; emacs: (put 'declare-uri-header! 'scheme-indent-function 1)
-(define (declare-uri-header! name)
-  (declare-header! name
-    (lambda (str) (or (string->uri str) (bad-header-component 'uri str)))
-    (@@ (web uri) absolute-uri?)
-    write-uri))
-
 ;; emacs: (put 'declare-uri-reference-header! 'scheme-indent-function 1)
 (define (declare-uri-reference-header! name)
   (declare-header! name
     (lambda (str)
       (or (string->uri-reference str)
-          (bad-header-component 'uri str)))
-    uri?
+          (bad-header-component 'uri-reference str)))
+    uri-reference?
     write-uri))
 
 ;; emacs: (put 'declare-quality-list-header! 'scheme-indent-function 1)
diff --git a/module/web/request.scm b/module/web/request.scm
index c9f1dc1..eea32e9 100644
--- a/module/web/request.scm
+++ b/module/web/request.scm
@@ -170,7 +170,7 @@ the headers are each run through their respective 
validators."
                 (non-negative-integer? (car version))
                 (non-negative-integer? (cdr version))))
       (bad-request "Bad version: ~a" version))
-     ((not (uri? uri))
+     ((not (uri-reference? uri))
       (bad-request "Bad uri: ~a" uri))
      ((and (not port) (memq method '(POST PUT)))
       (bad-request "Missing port for message ~a" method))
diff --git a/module/web/uri.scm b/module/web/uri.scm
index 848d500..5b01aa4 100644
--- a/module/web/uri.scm
+++ b/module/web/uri.scm
@@ -42,11 +42,15 @@
             uri->string
             uri-decode uri-encode
             split-and-decode-uri-path
-            encode-and-join-uri-path))
+            encode-and-join-uri-path
+
+            uri-reference? relative-ref?
+            build-uri-reference build-relative-ref
+            string->uri-reference string->relative-ref))
 
 (define-record-type <uri>
   (make-uri scheme userinfo host port path query fragment)
-  uri?
+  uri-reference?
   (scheme uri-scheme)
   (userinfo uri-userinfo)
   (host uri-host)
@@ -55,8 +59,49 @@
   (query uri-query)
   (fragment uri-fragment))
 
-(define (absolute-uri? obj)
-  (and (uri? obj) (uri-scheme obj) #t))
+;;;
+;;; Predicates.
+;;;
+;;; These are quick, and assume rigid validation at construction time.
+
+;;; RFC 3986, #3.
+;;;
+;;;   URI         = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
+;;;
+;;;   hier-part   = "//" authority path-abempty
+;;;               / path-absolute
+;;;               / path-rootless
+;;;               / path-empty
+
+(define (uri? obj)
+  (and (uri-reference? obj)
+       (if (include-deprecated-features)
+           (begin
+             (unless (uri-scheme obj)
+               (issue-deprecation-warning
+                "Use uri-reference? instead of uri?; in the future, uri?
+will require that the object not be a relative-ref."))
+             #t)
+           (uri-scheme obj))
+       #t))
+
+;;; RFC 3986, #4.2.
+;;;
+;;;   relative-ref  = relative-part [ "?" query ] [ "#" fragment ]
+;;;
+;;;   relative-part = "//" authority path-abempty
+;;;                 / path-absolute
+;;;                 / path-noscheme
+;;;                 / path-empty
+
+(define (relative-ref? obj)
+  (and (uri-reference? obj)
+       (not (uri-scheme obj))))
+
+
+;;;
+;;; Constructors.
+;;;
 
 (define (uri-error message . args)
   (throw 'uri-error message args))
@@ -64,10 +109,9 @@
 (define (positive-exact-integer? port)
   (and (number? port) (exact? port) (integer? port) (positive? port)))
 
-(define* (validate-uri scheme userinfo host port path query fragment
-                       #:key reference?)
+(define (validate-uri-reference scheme userinfo host port path query fragment)
   (cond
-   ((and (not reference?) (not (symbol? scheme)))
+   ((and scheme (not (symbol? scheme)))
     (uri-error "Expected a symbol for the URI scheme: ~s" scheme))
    ((and (or userinfo port) (not host))
     (uri-error "Expected a host, given userinfo or port"))
@@ -79,32 +123,65 @@
     (uri-error "Expected string for userinfo: ~s" userinfo))
    ((not (string? path))
     (uri-error "Expected string for path: ~s" path))
-   ((and host (not (string-null? path))
-         (not (eqv? (string-ref path 0) #\/)))
-    (uri-error "Expected path of absolute URI to start with a /: ~a" path))))
+   ((and query (not (string? query)))
+    (uri-error "Expected string for query: ~s" query))
+   ((and fragment (not (string? fragment)))
+    (uri-error "Expected string for fragment: ~s" fragment))
+   ;; Strict validation of allowed paths, based on other components.
+   ;; Refer to RFC 3986 for the details.
+   ((not (string-null? path))
+    (if host
+        (cond
+         ((not (eqv? (string-ref path 0) #\/))
+          (uri-error
+           "Expected absolute path starting with \"/\": ~a" path)))
+        (cond
+         ((string-prefix? "//" path)
+          (uri-error
+           "Expected path not starting with \"//\" (no host): ~a" path))
+         ((and (not scheme)
+               (not (eqv? (string-ref path 0) #\/))
+               (let ((colon (string-index path #\:)))
+                 (and colon (not (string-index path #\/ 0 colon)))))
+          (uri-error
+           "Expected relative path's first segment without \":\": ~a"
+           path)))))))
 
 (define* (build-uri scheme #:key userinfo host port (path "") query fragment
                     (validate? #t))
   "Construct a URI object.  SCHEME should be a symbol, PORT either a
 positive, exact integer or ‘#f’, and the rest of the fields are either
 strings or ‘#f’.  If VALIDATE? is true, also run some consistency checks
-to make sure that the constructed object is a valid absolute URI."
-  (if validate?
-      (validate-uri scheme userinfo host port path query fragment))
+to make sure that the constructed object is a valid URI."
+  (when validate?
+    (unless scheme (uri-error "Missing URI scheme"))
+    (validate-uri-reference scheme userinfo host port path query fragment))
   (make-uri scheme userinfo host port path query fragment))
 
 (define* (build-uri-reference #:key scheme userinfo host port (path "") query
                               fragment (validate? #t))
-  "Construct a URI object.  SCHEME should be a symbol or ‘#f’, PORT
-either a positive, exact integer or ‘#f’, and the rest of the fields
-are either strings or ‘#f’.  If VALIDATE? is true, also run some
+  "Construct a URI-reference object.  SCHEME should be a symbol or ‘#f’,
+PORT either a positive, exact integer or ‘#f’, and the rest of the
+fields are either strings or ‘#f’.  If VALIDATE? is true, also run some
 consistency checks to make sure that the constructed URI is a valid URI
-reference (either an absolute URI or a relative reference)."
-  (if validate?
-      (validate-uri scheme userinfo host port path query fragment
-                    #:reference? #t))
+reference."
+  (when validate?
+    (validate-uri-reference scheme userinfo host port path query fragment))
   (make-uri scheme userinfo host port path query fragment))
 
+(define* (build-relative-ref #:key userinfo host port (path "") query fragment
+                             (validate? #t))
+  "Construct a relative-ref URI object.  The arguments are the same as
+for ‘build-uri’ except there is no scheme."
+  (when validate?
+    (validate-uri-reference #f userinfo host port path query fragment))
+  (make-uri #f userinfo host port path query fragment))
+
+
+;;;
+;;; Converters.
+;;;
+
 ;; See RFC 3986 #3.2.2 for comments on percent-encodings, IDNA (RFC
 ;; 3490), and non-ASCII host names.
 ;;
@@ -192,16 +269,24 @@ reference (either an absolute URI or a relative 
reference)."
   (make-regexp uri-pat))
 
 (define (string->uri-reference string)
-  "Parse the URI reference written as STRING into a URI object.  Return
-‘#f’ if the string could not be parsed."
+  "Parse STRING into a URI-reference object.  Return ‘#f’ if the string
+could not be parsed."
   (% (let ((m (regexp-exec uri-regexp string)))
-       (if (not m) (abort))
+       (unless m (abort))
        (let ((scheme (let ((str (match:substring m 2)))
                        (and str (string->symbol (string-downcase str)))))
              (authority (match:substring m 3))
              (path (match:substring m 4))
              (query (match:substring m 6))
              (fragment (match:substring m 8)))
+         ;; The regular expression already ensures all of the validation
+         ;; requirements for URI-references, except the one that the
+         ;; first component of a relative-ref's path can't contain a
+         ;; colon.
+         (unless scheme
+           (let ((colon (string-index path #\:)))
+             (when (and colon (not (string-index path #\/ 0 colon)))
+               (abort))))
          (call-with-values
              (lambda ()
                (if authority
@@ -213,10 +298,19 @@ reference (either an absolute URI or a relative 
reference)."
        #f)))
 
 (define (string->uri string)
-  "Parse STRING into an absolute URI object.  Return ‘#f’ if the string
-could not be parsed."
-  (let ((uri (string->uri-reference string)))
-    (and uri (uri-scheme uri) uri)))
+  "Parse STRING into a URI object.  Return ‘#f’ if the string could not
+be parsed.  Note that this procedure will require that the URI have a
+scheme."
+  (let ((uri-reference (string->uri-reference string)))
+    (and (not (relative-ref? uri-reference))
+         uri-reference)))
+
+(define (string->relative-ref string)
+  "Parse STRING into a relative-ref URI object.  Return ‘#f’ if the
+string could not be parsed."
+  (let ((uri-reference (string->uri-reference string)))
+    (and (relative-ref? uri-reference)
+         uri-reference)))
 
 (define *default-ports* (make-hash-table))
 
@@ -231,7 +325,7 @@ could not be parsed."
 (declare-default-port! 'http 80)
 (declare-default-port! 'https 443)
 
-(define (uri->string uri)
+(define* (uri->string uri #:key (include-fragment? #t))
   "Serialize URI to a string.  If the URI has a port that is the
 default port for its scheme, the port is not included in the
 serialization."
@@ -261,7 +355,7 @@ serialization."
      (if query
          (string-append "?" query)
          "")
-     (if fragment
+     (if (and fragment include-fragment?)
          (string-append "#" fragment)
          ""))))
 
diff --git a/test-suite/tests/web-http.test b/test-suite/tests/web-http.test
index da00ec3..6337734 100644
--- a/test-suite/tests/web-http.test
+++ b/test-suite/tests/web-http.test
@@ -1,6 +1,6 @@
 ;;;; web-http.test --- HTTP library        -*- mode: scheme; coding: utf-8; -*-
 ;;;;
-;;;;   Copyright (C) 2010-2011, 2014-2016 Free Software Foundation, Inc.
+;;;;   Copyright (C) 2010-2011, 2014-2017 Free Software Foundation, Inc.
 ;;;;
 ;;;; This library is free software; you can redistribute it and/or
 ;;;; modify it under the terms of the GNU Lesser General Public
@@ -150,32 +150,33 @@
 (with-test-prefix "read-request-line"
   (pass-if-read-request-line "GET / HTTP/1.1"
                              GET
-                             (build-uri 'http
-                                        #:path "/")
+                             (build-uri-reference
+                              #:path "/")
                              (1 . 1))
   (pass-if-read-request-line "GET http://www.w3.org/pub/WWW/TheProject.html 
HTTP/1.1"
                              GET
-                             (build-uri 'http
-                                        #:host "www.w3.org"
-                                        #:path "/pub/WWW/TheProject.html")
+                             (build-uri-reference
+                              #:scheme 'http
+                              #:host "www.w3.org"
+                              #:path "/pub/WWW/TheProject.html")
                              (1 . 1))
   (pass-if-read-request-line "GET /pub/WWW/TheProject.html HTTP/1.1"
                              GET
-                             (build-uri 'http
-                                        #:path "/pub/WWW/TheProject.html")
+                             (build-uri-reference
+                              #:path "/pub/WWW/TheProject.html")
                              (1 . 1))
   (pass-if-read-request-line "HEAD /etc/hosts?foo=bar HTTP/1.1"
                              HEAD
-                             (build-uri 'http
-                                        #:path "/etc/hosts"
-                                        #:query "foo=bar")
+                             (build-uri-reference
+                              #:path "/etc/hosts"
+                              #:query "foo=bar")
                              (1 . 1)))
 
 (with-test-prefix "write-request-line"
   (pass-if-write-request-line "GET / HTTP/1.1"
                               GET
-                              (build-uri 'http
-                                         #:path "/")
+                              (build-uri-reference
+                               #:path "/")
                               (1 . 1))
   ;;; FIXME: Test fails due to scheme, host always being removed.
   ;;; However, it should be supported to request these be present, and
@@ -188,8 +189,8 @@
   ;;                             (1 . 1))
   (pass-if-write-request-line "GET /pub/WWW/TheProject.html HTTP/1.1"
                               GET
-                              (build-uri 'http
-                                         #:path "/pub/WWW/TheProject.html")
+                              (build-uri-reference
+                               #:path "/pub/WWW/TheProject.html")
                               (1 . 1))
   (pass-if-write-request-line "GET /?foo HTTP/1.1"
                               GET
@@ -197,9 +198,9 @@
                               (1 . 1))
   (pass-if-write-request-line "HEAD /etc/hosts?foo=bar HTTP/1.1"
                               HEAD
-                              (build-uri 'http
-                                         #:path "/etc/hosts"
-                                         #:query "foo=bar")
+                              (build-uri-reference
+                               #:path "/etc/hosts"
+                               #:query "foo=bar")
                               (1 . 1)))
 
 (with-test-prefix "read-response-line"
@@ -298,6 +299,12 @@
   (pass-if-parse content-length "010" 10)
   (pass-if-parse content-location "http://foo/";
                  (build-uri 'http #:host "foo" #:path "/"))
+  (pass-if-parse content-location "//foo/"
+                 (build-uri-reference #:host "foo" #:path "/"))
+  (pass-if-parse content-location "/etc/foo"
+                 (build-uri-reference #:path "/etc/foo"))
+  (pass-if-parse content-location "foo"
+                 (build-uri-reference #:path "foo"))
   (pass-if-parse content-range "bytes 10-20/*" '(bytes (10 . 20) *))
   (pass-if-parse content-range "bytes */*" '(bytes * *))
   (pass-if-parse content-range "bytes */30" '(bytes * 30))
@@ -370,6 +377,14 @@
   (pass-if-parse range "bytes=-20,-30" '(bytes (#f . 20) (#f . 30)))
   (pass-if-parse referer "http://foo/bar?baz";
                  (build-uri 'http #:host "foo" #:path "/bar" #:query "baz"))
+  (pass-if-parse referer "//foo/bar?baz"
+                 (build-uri-reference #:host "foo"
+                                      #:path "/bar"
+                                      #:query "baz"))
+  (pass-if-parse referer "/etc/foo"
+                 (build-uri-reference #:path "/etc/foo"))
+  (pass-if-parse referer "foo"
+                 (build-uri-reference #:path "foo"))
   (pass-if-parse te "trailers" '((trailers)))
   (pass-if-parse te "trailers,foo" '((trailers) (foo)))
   (pass-if-parse user-agent "guile" "guile"))
diff --git a/test-suite/tests/web-request.test 
b/test-suite/tests/web-request.test
index 8cf1c2e..68721d3 100644
--- a/test-suite/tests/web-request.test
+++ b/test-suite/tests/web-request.test
@@ -1,6 +1,6 @@
 ;;;; web-request.test --- HTTP requests       -*- mode: scheme; coding: utf-8; 
-*-
 ;;;;
-;;;;   Copyright (C) 2010, 2011 Free Software Foundation, Inc.
+;;;;   Copyright (C) 2010, 2011, 2013 Free Software Foundation, Inc.
 ;;;;
 ;;;; This library is free software; you can redistribute it and/or
 ;;;; modify it under the terms of the GNU Lesser General Public
@@ -53,7 +53,8 @@ Accept-Language: en-gb, en;q=0.9\r
     
     (pass-if (equal? (request-method r) 'GET))
     
-    (pass-if (equal? (request-uri r) (build-uri 'http #:path "/qux")))
+    (pass-if (equal? (request-uri r)
+                     (build-uri-reference #:path "/qux")))
     
     (pass-if (equal? (read-request-body r) #f))
 
diff --git a/test-suite/tests/web-uri.test b/test-suite/tests/web-uri.test
index ad56f6f..7339189 100644
--- a/test-suite/tests/web-uri.test
+++ b/test-suite/tests/web-uri.test
@@ -1,6 +1,6 @@
 ;;;; web-uri.test --- URI library          -*- mode: scheme; coding: utf-8; -*-
 ;;;;
-;;;;   Copyright (C) 2010, 2011, 2012, 2014 Free Software Foundation, Inc.
+;;;;   Copyright (C) 2010-2012, 2014, 2017 Free Software Foundation, Inc.
 ;;;;
 ;;;; This library is free software; you can redistribute it and/or
 ;;;; modify it under the terms of the GNU Lesser General Public
@@ -27,7 +27,7 @@
 
 
 (define* (uri=? uri #:key scheme userinfo host port path query fragment)
-  (and (uri? uri)
+  (and (uri-reference? uri)
        (equal? (uri-scheme uri) scheme)
        (equal? (uri-userinfo uri) userinfo)
        (equal? (uri-host uri) host)
@@ -123,6 +123,22 @@
                          "Expected.*host"
                          (build-uri 'http #:userinfo "foo")))
 
+(with-test-prefix "build-uri-reference"
+  (pass-if "//host/etc/foo"
+    (uri=? (build-uri-reference #:host "host"
+                                #:path "/etc/foo")
+           #:host "host"
+           #:path "/etc/foo"))
+
+  (pass-if "/path/to/some/foo?query"
+    (uri=? (build-uri-reference #:path "/path/to/some/foo"
+                                #:query "query")
+           #:path "/path/to/some/foo"
+           #:query "query"))
+
+  (pass-if "nextdoc/foo"
+    (uri=? (build-uri-reference #:path "nextdoc/foo")
+           #:path "nextdoc/foo")))
 
 (with-test-prefix "string->uri"
   (pass-if "ftp:"
@@ -503,6 +519,30 @@
            #:query "q"
            #:fragment "bar")))
 
+(with-test-prefix "string->uri-reference"
+  (pass-if "/"
+    (uri=? (string->uri-reference "/")
+           #:path "/"))
+
+  (pass-if "/path/to/foo"
+    (uri=? (string->uri-reference "/path/to/foo")
+           #:path "/path/to/foo"))
+
+  (pass-if "//example.org"
+    (uri=? (string->uri-reference "//example.org")
+           #:host "example.org"
+           #:path ""))
+
+  (pass-if "//address@hidden/path/to/foo"
+    (uri=? (string->uri-reference "//address@hidden/path/to/foo")
+           #:userinfo "bar"
+           #:host "example.org"
+           #:path "/path/to/foo"))
+
+  (pass-if "nextdoc/foo"
+    (uri=? (string->uri-reference "nextdoc/foo")
+           #:path "nextdoc/foo")))
+
 (with-test-prefix "uri->string"
   (pass-if "ftp:"
     (equal? "ftp:"
@@ -587,7 +627,23 @@
 
   (pass-if "foo/?bar#baz"
     (equal? "foo/?bar#baz"
-            (uri->string (string->uri-reference "foo/?bar#baz")))))
+            (uri->string (string->uri-reference "foo/?bar#baz"))))
+
+  (pass-if "/path/to/foo"
+    (equal? "/path/to/foo"
+            (uri->string (string->uri-reference "/path/to/foo"))))
+
+  (pass-if "//example.org"
+    (equal? "//example.org"
+            (uri->string (string->uri-reference "//example.org"))))
+
+  (pass-if "//address@hidden/path/to/foo"
+    (equal? "//address@hidden/path/to/foo"
+            (uri->string (string->uri-reference 
"//address@hidden/path/to/foo"))))
+
+  (pass-if "nextdoc/foo"
+    (equal? "nextdoc/foo"
+            (uri->string (string->uri-reference "nextdoc/foo")))))
 
 (with-test-prefix "decode"
   (pass-if "foo%20bar"



reply via email to

[Prev in Thread] Current Thread [Next in Thread]