[Guile-commits] 01/01: Fix uri-decode behavior for "+"


From: Andy Wingo
Subject: [Guile-commits] 01/01: Fix uri-decode behavior for "+"
Date: Mon, 20 Jun 2016 12:48:28 +0000 (UTC)

wingo pushed a commit to branch master
in repository guile.

commit 687d393e2c9dbc57fa1d0290fbf3b2c93cbfcdf6
Author: Andy Wingo <address@hidden>
Date:   Mon Jun 20 14:34:19 2016 +0200

    Fix uri-decode behavior for "+"
    
    * module/web/uri.scm (uri-decode): Add #:decode-plus-to-space? keyword
      argument.
      (split-and-decode-uri-path): Don't decode plus to space.
    * doc/ref/web.texi (URIs): Update documentation.
    * test-suite/tests/web-uri.test ("decode"): Add tests.
    * NEWS: Add entry.
    
    Based on a patch by Brent <address@hidden>.
---
 NEWS                          |    7 +++++++
 doc/ref/web.texi              |    7 ++++++-
 module/web/uri.scm            |   11 ++++++++---
 test-suite/tests/web-uri.test |    5 ++++-
 4 files changed, 25 insertions(+), 5 deletions(-)
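
A brief usage sketch of the behavior described above, assuming a Guile
build that includes this commit.  The sample strings are illustrative
only; the expected values follow from the patched uri-decode and from
the test added to web-uri.test.

```scheme
(use-modules (web uri))

;; Default behavior is unchanged: "+" decodes to a space, which is
;; the application/x-www-form-urlencoded convention.
(uri-decode "foo+bar")                               ; => "foo bar"

;; With the new keyword set to #f, "+" is left as a literal plus.
(uri-decode "foo+bar" #:decode-plus-to-space? #f)    ; => "foo+bar"

;; Percent-escapes are still decoded either way.
(uri-decode "foo%20bar" #:decode-plus-to-space? #f)  ; => "foo bar"

;; Path splitting now passes #:decode-plus-to-space? #f internally,
;; so a "+" in a path segment survives.
(split-and-decode-uri-path "/foo+bar/baz%20qux/")    ; => ("foo+bar" "baz qux")
```

Callers that relied on split-and-decode-uri-path turning "+" into a
space will see different results after this change.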

diff --git a/NEWS b/NEWS
index a54aa1d..651d0d7 100644
--- a/NEWS
+++ b/NEWS
@@ -6,6 +6,13 @@ Please send Guile bug reports to address@hidden
 
 
 
+Changes in 2.1.4 (changes since the 2.1.3 alpha release):
+
+* Bug fixes
+** Don't replace + with space when splitting and decoding URI paths
+
+
+[TODO: Fold into generic 2.2 release notes.]
 Changes in 2.1.3 (changes since the 2.1.2 alpha release):
 
 * Notable changes
diff --git a/doc/ref/web.texi b/doc/ref/web.texi
index b078929..becdc28 100644
--- a/doc/ref/web.texi
+++ b/doc/ref/web.texi
@@ -269,7 +269,7 @@ serialization.
 Declare a default port for the given URI scheme.
 @end deffn
 
-@deffn {Scheme Procedure} uri-decode str [#:encoding=@code{"utf-8"}]
+@deffn {Scheme Procedure} uri-decode str [#:encoding=@code{"utf-8"}] [#:decode-plus-to-space? #t]
 Percent-decode the given @var{str}, according to @var{encoding}, which
 should be the name of a character encoding.
 
@@ -286,6 +286,11 @@ decoded bytes are not valid for the given encoding. Pass @code{#f} for
 @xref{Ports, @code{set-port-encoding!}}, for more information on
 character encodings.
 
+If @var{decode-plus-to-space?} is true, which is the default, also
+replace instances of the plus character @samp{+} with a space character.
+This is needed when parsing @code{application/x-www-form-urlencoded}
+data.
+
 Returns a string of the decoded characters, or a bytevector if
 @var{encoding} was @code{#f}.
 @end deffn
diff --git a/module/web/uri.scm b/module/web/uri.scm
index e1c8b39..848d500 100644
--- a/module/web/uri.scm
+++ b/module/web/uri.scm
@@ -322,7 +322,7 @@ serialization."
 (define hex-chars
   (string->char-set "0123456789abcdefABCDEF"))
 
-(define* (uri-decode str #:key (encoding "utf-8"))
+(define* (uri-decode str #:key (encoding "utf-8") (decode-plus-to-space? #t))
   "Percent-decode the given STR, according to ENCODING,
 which should be the name of a character encoding.
 
@@ -338,6 +338,10 @@ bytes are not valid for the given encoding. Pass ‘#f’ for ENCODING if
 you want decoded bytes as a bytevector directly.  ‘set-port-encoding!’,
 for more information on character encodings.
 
+If DECODE-PLUS-TO-SPACE? is true, which is the default, also replace
+instances of the plus character (+) with a space character.  This is
+needed when parsing application/x-www-form-urlencoded data.
+
 Returns a string of the decoded characters, or a bytevector if
 ENCODING was ‘#f’."
   (let* ((len (string-length str))
@@ -348,7 +352,7 @@ ENCODING was ‘#f’."
                (if (< i len)
                    (let ((ch (string-ref str i)))
                      (cond
-                      ((eqv? ch #\+)
+                      ((and (eqv? ch #\+) decode-plus-to-space?)
                        (put-u8 port (char->integer #\space))
                        (lp (1+ i)))
                       ((and (< (+ i 2) len) (eqv? ch #\%)
@@ -431,7 +435,8 @@ removing empty components.
 For example, ‘\"/foo/bar%20baz/\"’ decodes to the two-element list,
 ‘(\"foo\" \"bar baz\")’."
   (filter (lambda (x) (not (string-null? x)))
-          (map uri-decode (string-split path #\/))))
+          (map (lambda (s) (uri-decode s #:decode-plus-to-space? #f))
+               (string-split path #\/))))
 
 (define (encode-and-join-uri-path parts)
   "URI-encode each element of PARTS, which should be a list of
diff --git a/test-suite/tests/web-uri.test b/test-suite/tests/web-uri.test
index 4873d7f..ad56f6f 100644
--- a/test-suite/tests/web-uri.test
+++ b/test-suite/tests/web-uri.test
@@ -594,7 +594,10 @@
     (equal? "foo bar" (uri-decode "foo%20bar")))
 
   (pass-if "foo+bar"
-    (equal? "foo bar" (uri-decode "foo+bar"))))
+    (equal? "foo bar" (uri-decode "foo+bar")))
+
+  (pass-if "foo+bar"
+    (equal? '("foo+bar") (split-and-decode-uri-path "foo+bar"))))
 
 (with-test-prefix "encode"
   (pass-if (equal? "foo%20bar" (uri-encode "foo bar")))
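
The updated documentation notes that plus-to-space decoding is needed
for application/x-www-form-urlencoded data.  As a minimal sketch of
that use case, the helper below splits a form-encoded string into
key/value pairs and decodes each side with the default uri-decode.
The name parse-form-data is hypothetical and not part of (web uri);
the sketch only handles "&" separators and treats a key without "="
as having an empty value.

```scheme
(use-modules (web uri))

;; Hypothetical helper, not part of (web uri): parse a string such as
;; "q=hello+world&lang=en%2Bus" into an association list.
(define (parse-form-data str)
  (map (lambda (pair)
         (let ((idx (string-index pair #\=)))
           (if idx
               ;; Decode key and value separately; the default
               ;; #:decode-plus-to-space? #t turns "+" into a space.
               (cons (uri-decode (substring pair 0 idx))
                     (uri-decode (substring pair (1+ idx))))
               ;; No "=" present: treat the value as empty.
               (cons (uri-decode pair) ""))))
       ;; Skip empty fields produced by "&&" or a trailing "&".
       (filter (lambda (s) (not (string-null? s)))
               (string-split str #\&))))

(parse-form-data "q=hello+world&lang=en%2Bus")
;; => (("q" . "hello world") ("lang" . "en+us"))
```

A literal plus in form data has to be sent as %2B, which is why the
second value above keeps its "+" while the first gets a space.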


