Re: [Bug-wget] URL encoding issues (Was: GNU wget 1.17.1 released)

From:	Eli Zaretskii
Subject:	Re: [Bug-wget] URL encoding issues (Was: GNU wget 1.17.1 released)
Date:	Mon, 14 Dec 2015 18:33:38 +0200

> Date: Sun, 13 Dec 2015 20:04:31 +0100
> From: "Andries E. Brouwer" <address@hidden>
> Cc: "Andries E. Brouwer" <address@hidden>, address@hidden
> 
> On Sun, Dec 13, 2015 at 08:01:27PM +0200, Eli Zaretskii wrote:
> 
> > If no one is going to pick up the gauntlet, I will sit down and do it
> > myself, although I'm terribly busy with Emacs 25.1 release.
> 
> Good!

While working on this, I bumped into 2 related issues:

 1. The functions that call 'iconv' (in iri.c) don't make a point of
    flushing the last portion of the converted URL after 'iconv'
    returns successfully having converted the input string in its
    entirety.  IME, you need then to call 'iconv' one last time with
    either the 2nd or the 3rd argument set to NULL, otherwise
    sometimes the last converted character doesn't get output.  In my
    case, some URLs converted from CP1255 to UTF-8 lost their last
    character.  It sounds like no one has actually used this
    conversion in iri.c, except for trivially converting UTF-8 to
    itself.  Is that possible/reasonable?

 2. Wget assumes that the URL given on its command line is encoded in
    the locale's encoding.  This is a good assumption when the user
    herself types the URL at the shell prompt, but not when the URL is
    copy-pasted from a browser's address bar.  In the latter case, the
    URL tends to be in UTF-8 (sometimes hex-encoded).  At least that's
    what I get from Firefox.  We don't seem to have in wget any
    facilities to specify a separate (3rd) encoding for the URLs on
    the command line, do we?
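
For illustration, the flush described in item 1 could look roughly like
the sketch below.  This is not the code in iri.c: the helper name
'convert_and_flush' and its signature are made up for this example, and
it assumes a caller-supplied output buffer that is large enough.  After
the normal conversion call succeeds, 'iconv' is called once more with a
NULL input buffer and a valid output buffer, which is the form POSIX
documents for writing out any pending output and resetting the
conversion state.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of wget.  Converts the NUL-terminated
   string IN from encoding FROM to encoding TO, storing a NUL-terminated
   result in OUT (capacity OUTSIZE bytes).  Returns the number of bytes
   written, not counting the terminator, or (size_t)-1 on any failure,
   including an output buffer that is too small.  */
size_t convert_and_flush(const char *from, const char *to,
                         const char *in, char *out, size_t outsize)
{
    assert(from != NULL && to != NULL && in != NULL && out != NULL);
    if (outsize == 0)
        return (size_t)-1;

    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *inp = (char *)in;        /* iconv takes a non-const pointer */
    size_t inleft = strlen(in);
    char *outp = out;
    size_t outleft = outsize - 1;  /* keep one byte for the terminator */
    size_t result = (size_t)-1;

    /* Step 1: convert the whole input string.
       Step 2: call iconv again with a NULL input buffer so that any
       output still held back by the converter is written to OUT.  */
    if (iconv(cd, &inp, &inleft, &outp, &outleft) != (size_t)-1
        && iconv(cd, NULL, NULL, &outp, &outleft) != (size_t)-1)
    {
        result = (outsize - 1) - outleft;
        out[result] = '\0';
    }

    iconv_close(cd);
    return result;
}
```

A caller would pass the source and target encoding names, e.g.
convert_and_flush ("CP1255", "UTF-8", url, buf, sizeof buf), and treat
a return value of (size_t)-1 as a failed conversion.  Whether the extra
call changes the output depends on the 'iconv' implementation and on
the encodings involved; for many stateless conversions it writes
nothing.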

Thanks.
