
Re: [Chicken-users] [Q] uri-common has problem with UTF-8 uri.


From: Alex Shinn
Subject: Re: [Chicken-users] [Q] uri-common has problem with UTF-8 uri.
Date: Wed, 16 Jan 2013 00:39:16 +0900

On Tue, Jan 15, 2013 at 7:48 PM, Peter Bex <address@hidden> wrote:

These special characters are called "reserved" in the BNF.  As you can
see, the question mark, equals sign and ampersand are in there.
For urlencoded query strings, these *cannot* be decoded, because
then you can't distinguish between

http://calc.example.com?bool-expr=x%26y%3D1
and
http://calc.example.com?bool-expr=x&y=1

The former should be decoded in uri-common to the alist
((bool-expr . "x&y=1")) and the latter to ((bool-expr . "x") (y . "1")).
By fully decoding all reserved characters in uri-generic, you drop
important information.

The internal representation is either decoded, or it is encoded.
Either can be made to work.

In this case, the decoded uri-common representation of the former is:

  ((bool-expr . "x&y=1"))

and the decoded representation of the latter is:

  ((bool-expr . "x") (y . "1"))

just as you say, so this is how they are stored in the URI object.

In uri-generic, both get parsed to:

  ((bool-expr . "x&y=1"))
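To make the two orderings concrete, here is a minimal sketch using Python's standard urllib.parse rather than uri-common or uri-generic, so it only illustrates the principle and says nothing about how either egg is implemented. The helper names split_then_decode and fully_decoded_query are made up for this example, and the encoded URL is written with a trailing "1" so that it decodes to "x&y=1".

```python
from urllib.parse import parse_qsl, unquote, urlsplit

def split_then_decode(url):
    """Split the query on '&' and '=' first, then percent-decode each piece."""
    return parse_qsl(urlsplit(url).query)

def fully_decoded_query(url):
    """Percent-decode the whole query string without splitting it."""
    return unquote(urlsplit(url).query)

encoded = "http://calc.example.com?bool-expr=x%26y%3D1"
plain = "http://calc.example.com?bool-expr=x&y=1"

# Splitting before decoding keeps the two URLs distinct.
print(split_then_decode(encoded))  # [('bool-expr', 'x&y=1')]
print(split_then_decode(plain))    # [('bool-expr', 'x'), ('y', '1')]

# Decoding the whole query first makes them indistinguishable.
print(fully_decoded_query(encoded) == fully_decoded_query(plain))  # True
```

Whether the second outcome counts as lost information depends on whether & and = are treated as delimiters inside the query, which is the point under discussion.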

As the RFC states:

   Because the percent ("%") character serves as the indicator for
   percent-encoded octets, it must be percent-encoded as "%25" for that
   octet to be used as data within a URI.

Therefore, if you intended the raw URI data to include a "%",
then the correct representation (for either common or generic)
would have been:

  http://calc.example.com?bool-expr=x%2526y%253D1

So assuming & is _not_ special to the query (as is the case
with uri-generic), escaping & as %26 or not produces the
same result.
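The "%25" rule can be checked with a short sketch, again using Python's urllib.parse purely as an illustration and not as a description of either egg. The variable names are hypothetical, and the value carries a trailing "1" so that it fully decodes to "x&y=1".

```python
from urllib.parse import quote, unquote

# Data that is meant to contain literal '%' characters.
raw = "x%26y%3D1"

# Percent-encoding it turns every '%' into '%25'.
wire = quote(raw, safe="")
print(wire)                    # x%2526y%253D1

# One round of decoding recovers the raw data, '%' signs included.
print(unquote(wire))           # x%26y%3D1

# A second round would decode the data itself, which is not what was sent.
print(unquote(unquote(wire)))  # x&y=1
```

Decoding exactly once is what keeps "%26" as data distinct from a percent-encoded "&".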

-- 
Alex

