
Re: [Chicken-users] [Q] uri-common has problem with UTF-8 uri.


From: Sungjin Chun
Subject: Re: [Chicken-users] [Q] uri-common has problem with UTF-8 uri.
Date: Mon, 14 Jan 2013 13:36:35 +0900

As far as I know, the revised RFC permits UTF-8 characters in a URL without encoding. Am I wrong here?
Even Solr (the search engine) permits them.


On Mon, Jan 14, 2013 at 1:26 PM, Alex Shinn <address@hidden> wrote:
Hi,

On Mon, Jan 14, 2013 at 12:52 PM, Sungjin Chun <address@hidden> wrote:
First, I may be looking in the wrong place, but...

It seems that the main source of my problem is this part of uri-generic.scm:

(define char-set:uri-unreserved
  (char-set-union char-set:letter+digit (string->char-set "-_.~")))

If I change this part to:

(define char-set:uri-unreserved
  (char-set-union char-set:letter+digit (string->char-set "-_.~") char-set:hangul))

then URIs/URLs with Korean characters work. How can I make this part more generic?

I believe the ASCII definition is correct even for Unicode URLs.
You need to represent the URL in utf8 and then use percent
escapes on the utf8 bytes, which is what would happen naturally
here.

-- 
Alex
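
A minimal sketch of the approach Alex describes, for illustration only: keep the unreserved set ASCII-only and percent-escape every other byte of the UTF-8 representation. It assumes CHICKEN's default byte-oriented strings (no utf8 egg loaded), so that string->list yields one char per byte. The names unreserved?, pct-byte and pct-encode are hypothetical helpers and are not part of uri-generic or uri-common.

```scheme
;; Sketch only: assumes strings are sequences of bytes (CHICKEN's default
;; without the utf8 egg). All names below are hypothetical helpers.

(define unreserved-punctuation (string->list "-_.~"))

;; True for the ASCII unreserved characters: letters, digits and "-_.~".
(define (unreserved? c)
  (let ((n (char->integer c)))
    (or (and (>= n 65) (<= n 90))     ; A-Z
        (and (>= n 97) (<= n 122))    ; a-z
        (and (>= n 48) (<= n 57))     ; 0-9
        (memv c unreserved-punctuation))))

(define hex-digits "0123456789ABCDEF")

;; Render one byte (0-255) as a percent escape such as "%ED".
(define (pct-byte n)
  (string #\%
          (string-ref hex-digits (quotient n 16))
          (string-ref hex-digits (remainder n 16))))

;; Percent-escape every byte that is not an ASCII unreserved character.
(define (pct-encode str)
  (apply string-append
         (map (lambda (c)
                (if (unreserved? c)
                    (string c)
                    (pct-byte (char->integer c))))
              (string->list str))))

;; With a UTF-8 source file and byte-oriented strings:
;; (pct-encode "한글") => "%ED%95%9C%EA%B8%80"
```

Under these assumptions the Korean text never needs to be in the unreserved set: each of its UTF-8 bytes falls outside the ASCII ranges and is escaped on its own.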


