emacs-devel

Re: eww doesn't decode %AA%BB%CC URL names


From: Lars Ingebrigtsen
Subject: Re: eww doesn't decode %AA%BB%CC URL names
Date: Thu, 24 Dec 2015 18:40:59 +0100
User-agent: Gnus/5.130014 (Ma Gnus v0.14) Emacs/25.1.50 (gnu/linux)

Eli Zaretskii <address@hidden> writes:

> When I visit a URL in eww and press 'd' on a link like this:
>
>   https://ru.wikipedia.org/wiki/%D0%A1%D0%B5%D1%80%D0%B4%D1%86%D0%B5
>
> Emacs creates a file whose name is made of those hex-encoded
> characters as you see them in this mail.  Shouldn't we decode them?
> Firefox does.

We should.  Let's see...

(url-unhex-string "%D0%A1%D0%B5%D1%80%D0%B4%D1%86%D0%B5")
=> "\320\241\320\265\321\200\320\264\321\206\320\265"

Uhm...

(decode-coding-string
 (url-unhex-string "%D0%A1%D0%B5%D1%80%D0%B4%D1%86%D0%B5")
 'utf-8)
=> "Сердце"

Right.  What charset do we choose?  I guess using the charset of the
document we're in doesn't make much sense (because it's linking to
something off-site which may be in a different charset)...

Perhaps just run a `detect-coding-string' on it?

Or!  We've just downloaded the file, after all, and the charset of the
file itself may tell us what the charset of the name is...  On the other
hand, probably not.  (For instance, a PDF with a Cyrillic name would
probably still just be reported by the web server as being binary.)

`detect-coding-string' it is, I guess, unless anybody has a better idea?
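
A minimal sketch of what that could look like. The helper name
`my-decode-url-file-name' is made up for illustration and is not part of
eww or url.el; the guess made by `detect-coding-string' depends on the
current language environment and coding-system priorities, so short or
ambiguous names may well be decoded wrongly.

```elisp
(require 'url-util)

;; Hypothetical helper for illustration only; not an existing eww function.
(defun my-decode-url-file-name (name)
  "Unhex NAME and decode the resulting bytes with a guessed coding system."
  (let* ((bytes (url-unhex-string name))
         ;; A non-nil second argument makes `detect-coding-string'
         ;; return only the most likely coding system, not a list.
         (coding (detect-coding-string bytes t)))
    (decode-coding-string bytes coding)))
```

Something like (my-decode-url-file-name "%D0%A1%D0%B5%D1%80%D0%B4%D1%86%D0%B5")
would then be used in place of the raw name when saving, assuming the
detection picks utf-8 for those bytes.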

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no


