Re: [Chicken-janitors] #998: uri->string / make-uri path encoding incons
From: Chicken Trac
Subject: Re: [Chicken-janitors] #998: uri->string / make-uri path encoding inconsistencies
Date: Thu, 14 Mar 2013 21:12:33 -0000
#998: uri->string / make-uri path encoding inconsistencies
----------------------+-----------------------------------------------------
Reporter: andyjpb | Owner: sjamaan
Type: defect | Status: accepted
Priority: major | Milestone: someday
Component: unknown | Version: 4.8.x
Resolution: | Keywords:
----------------------+-----------------------------------------------------
Comment(by sjamaan):
I'm not certain, but this behaviour appears to be correct. The fact that
the original string survives a read/write round trip unchanged is a
deliberate feature: it lets non-HTTP URIs keep their exact encoding, which
makes it easier for applications to extract the original "generic" URI
from the object in unmodified form.
When a URI is generated (or one of its components is updated), these
characters get encoded. The spec is *extremely* unclear about what should
happen in this case.
According to RFC3986 (URI), section 2.2:
"URI producing applications should percent-encode data octets that
correspond to characters in the reserved set unless these characters
are specifically allowed by the URI scheme to represent data in that
component."
and section 2.3:
"URIs that differ in the replacement of an unreserved character with
its corresponding percent-encoded US-ASCII octet are equivalent:
they identify the same resource."
Coupled with RFC2616 (HTTP/1.1) section 3.2.3:
"Characters other than those in the "reserved" and "unsafe" sets (see
RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding."
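To make the section 2.3 rule concrete, here is a small Python sketch (not
code from uri-common or uri-generic; the helper names are made up for
illustration). It undoes percent-encoding only for unreserved characters,
so that two strings compare equal exactly when they differ in nothing but
that. Escapes for reserved characters such as ":" are left alone.

```python
import re

# Unreserved characters per RFC 3986 section 2.3:
# ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

_PCT = re.compile(r"%([0-9A-Fa-f]{2})")

def normalize_unreserved(s):
    """Decode %XX only when it stands for an unreserved character.

    Escapes that are kept get uppercase hex digits, since RFC 3986
    section 2.1 treats "%3a" and "%3A" as equivalent.
    """
    def fix(match):
        ch = chr(int(match.group(1), 16))
        return ch if ch in UNRESERVED else "%" + match.group(1).upper()
    return _PCT.sub(fix, s)

def equivalent(a, b):
    """Hypothetical helper: equal up to encoding of unreserved characters."""
    return normalize_unreserved(a) == normalize_unreserved(b)

print(equivalent("%7Euser", "~user"))   # "~" is unreserved
print(equivalent("5%3A123", "5:123"))   # ":" is reserved
```

Under this reading "%7Euser" and "~user" are the same, while "5%3A123" and
"5:123" are not.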
Apart from the fact that "unsafe" is not even defined in RFC 2396 (the
predecessor of RFC 3986), I interpret this to mean that special characters
are to be treated as special, and that implementations should be as
conservative as possible and percent-encode all these other characters.
This means that "./5:123", "5%3A123" and "./5%3A123" are all distinct URIs
which should be differentiated on the server side. There's no sane choice
to be made except to just encode everything that isn't 100% safe.
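As a rough illustration of the "encode everything that isn't 100% safe"
policy, here is a Python sketch using the standard urllib.parse module. It
is an analogy only, not how uri-common is implemented, and encode_segment
and encode_path are hypothetical names.

```python
from urllib.parse import quote

def encode_segment(segment):
    """Percent-encode everything except unreserved characters.

    With safe="" nothing beyond letters, digits and "-._~" is left
    as-is, so ":" and "/" inside a segment are both escaped.
    """
    return quote(segment, safe="")

def encode_path(segments):
    """Join already-split path segments, encoding each conservatively."""
    return "/".join(encode_segment(s) for s in segments)

# The three references from the text are distinct as raw strings.
paths = ["./5:123", "5%3A123", "./5%3A123"]
print(len(set(paths)))

print(encode_path(["5:123"]))
print(encode_path([".", "5:123"]))
```

With this policy the segment list (".", "5:123") serializes to "./5%3A123"
and ("5:123") to "5%3A123"; a literal "./5:123" is never produced.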
uri-generic, on the other hand, does _not_ encode anything except the
slash, because it explicitly puts more control into the user's hands: the
user decides which of the three paths ("./5:123", "5%3A123" or
"./5%3A123") is wanted. In that sense, uri-generic is more low-level and
therefore allows more fine-grained control.
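For contrast, a slash-only policy like the one described for uri-generic
could be sketched in Python as below. This is an assumption-laden analogy,
not uri-generic's actual code: the function name is invented, and the
treatment of a literal "%" is left out because the message does not
specify it.

```python
def join_segments_slash_only(segments):
    """Join path segments, escaping only a "/" that occurs inside one.

    Everything else, including ":" and existing "%XX" escapes, is
    passed through untouched, so the caller controls the encoding.
    """
    return "/".join(seg.replace("/", "%2F") for seg in segments)

print(join_segments_slash_only([".", "5:123"]))
print(join_segments_slash_only(["5%3A123"]))
print(join_segments_slash_only([".", "5%3A123"]))
```

Here each of the three paths comes out exactly as the caller supplied it,
which is the extra control the low-level interface offers.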
--
Ticket URL: <http://bugs.call-cc.org/ticket/998#comment:2>
Chicken Scheme <http://www.call-with-current-continuation.org/>
Chicken Scheme is a compiler for the Scheme programming language.