Re: [libmicrohttpd] Missing Feature: Custom HTTP decoding


From: Christian Grothoff
Subject: Re: [libmicrohttpd] Missing Feature: Custom HTTP decoding
Date: Fri, 20 Aug 2010 20:47:58 +0200
User-agent: KMail/1.13.2 (Linux/2.6.32-24-generic; KDE/4.4.2; i686; ; )

On Friday, August 20, 2010 04:14:34 pm Gerrit Telkamp wrote:
> I would like to know your opinion about a feature we are currently missing
> in libmicrohttpd.
> 
> It concerns the URL decoding, implemented in function MHD_http_unescape()
> (file "internal.c"). This function decodes each character encoded as %HH
> into a single byte. It is used e.g. to convert a URL like
> "http://hello%20world.com", received from the browser.
> 
> If you are using libmicrohttpd for an AJAX application, MHD_http_unescape()
> will not be enough. The JavaScript code running in the browser might send
> a string containing different characters than those used for URLs. Some
> browsers are sending 16 bit codes, encoded as "%uXXXX" (see
> also http://www.w3.org/International/O-URL-code.html). And not all browsers
> running on all operating systems are using the same character encoding -
> we have seen that some browsers are using ISO8859-1, others are using
> UTF-8. A German umlaut "ü" for example is encoded as "%FC" (ISO8859-1)
> and not as "%C3%BC" (UTF-8).
> 
> A good solution might be to support a custom character decoder
> that is called by libmicrohttpd instead of its internal
> MHD_http_unescape(). This custom decoder should be provided as a
> callback function by the user. If it is not defined, the internal
> MHD_http_unescape() will be used.
> 
> The decoder should get a pointer to the input buffer, and might return a
> new pointer to the output buffer. This would be useful if the decoded
> characters in the output buffer need more memory space than what is
> available in the input buffer.
> 
> What is your opinion?

I think your argument for a custom decoder makes sense; however, I'm not sure 
about allowing the custom decoder to return a new pointer.  That would require 
it to do memory allocation (at least as an option, and if it is optional the 
interface will be even messier), and then we'd have to handle failures of that 
(yuck). 

This is especially critical given that your points do not seem to justify any 
need for an unescape function to return a string that is longer than the 
original input.  If there is such a case, please describe it.
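
To make the length question concrete, here is a minimal sketch of an in-place 
decoder for the two cases mentioned above.  It is not MHD code: the name 
unescape_to_utf8() and its helpers are hypothetical, and it assumes that "%HH" 
carries an ISO8859-1 byte, that "%uXXXX" carries one 16-bit code unit, and that 
the application wants UTF-8 as output.  Under those assumptions "%HH" (3 bytes) 
becomes at most 2 bytes and "%uXXXX" (6 bytes) at most 3 bytes, so the result 
always fits in the input buffer.

```c
#include <stddef.h>

/* Return the value of hex digit c, or -1 if c is not a hex digit. */
static int hex_val(char c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Parse exactly n hex digits at s; return the value, or -1 if malformed.
   Stops at the first non-hex character, so it never reads past the
   terminating NUL. */
static long parse_hex(const char *s, int n)
{
  long v = 0;
  for (int i = 0; i < n; i++)
  {
    int d = hex_val(s[i]);
    if (d < 0) return -1;
    v = v * 16 + d;
  }
  return v;
}

/* Write code point cp (<= 0xFFFF) as UTF-8 at out; return bytes written. */
static size_t put_utf8(char *out, unsigned long cp)
{
  if (cp < 0x80)
  {
    out[0] = (char) cp;
    return 1;
  }
  if (cp < 0x800)
  {
    out[0] = (char) (0xC0 | (cp >> 6));
    out[1] = (char) (0x80 | (cp & 0x3F));
    return 2;
  }
  out[0] = (char) (0xE0 | (cp >> 12));
  out[1] = (char) (0x80 | ((cp >> 6) & 0x3F));
  out[2] = (char) (0x80 | (cp & 0x3F));
  return 3;
}

/* Hypothetical decoder: rewrites s in place and returns the new length.
   Malformed escapes are copied through unchanged.  Surrogate pairs in
   "%uXXXX" form are NOT combined here (a simplification of this sketch). */
size_t unescape_to_utf8(char *s)
{
  char *r = s;  /* read position */
  char *w = s;  /* write position, never ahead of r */

  while (*r != '\0')
  {
    long v;
    if (r[0] == '%' && (r[1] == 'u' || r[1] == 'U')
        && (v = parse_hex(r + 2, 4)) >= 0)
    {
      w += put_utf8(w, (unsigned long) v);  /* 6 bytes in, at most 3 out */
      r += 6;
    }
    else if (r[0] == '%' && (v = parse_hex(r + 1, 2)) >= 0)
    {
      w += put_utf8(w, (unsigned long) v);  /* 3 bytes in, at most 2 out */
      r += 3;
    }
    else
    {
      *w++ = *r++;
    }
  }
  *w = '\0';
  return (size_t) (w - s);
}
```

Called on a writable buffer holding "%FC", this leaves the two UTF-8 bytes 
0xC3 0xBC in the buffer and returns 2.  A decoder of this shape needs no 
allocation and has no failure path to report.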

Thanks!

Christian


