bug-wget

Re: Semicolon not allowed in userinfo


From: Tim Rühsen
Subject: Re: Semicolon not allowed in userinfo
Date: Tue, 3 Oct 2023 14:36:25 +0200
User-agent: Mozilla Thunderbird

Hi,

On 10/2/23 10:55, Bachir Bendrissou wrote:
> Hi,
>
> The following url example contains a semicolon in the userinfo segment:
>
> *http://a <http://a>;b:c@xyz*
>
> Wget rejects this url with the following error message:
>
> *http://a <http://a>;b:c@xyz: Bad port number.*
>
> It seems that Wget sees "c" as a port number. When "c" is replaced by a
> digit, Wget accepts the url and attempts to resolve "xyz".

Wget doesn't strictly follow the current specs; its parsing is deliberately lenient so that it accepts some kinds of badly formatted URLs seen in the wild.

But we should possibly become stricter and more compliant with the current specs.


> It's worth noting that curl and aria2 both accept the url example.

My version of curl (8.3.0) doesn't accept it:

curl -vvv 'http://a <http://a>;b:c@xyz'
* URL rejected: Malformed input to a URL function
* Closing connection
curl: (3) URL rejected: Malformed input to a URL function

All the URL parsers are slightly different when it comes to edge cases.
I'd consider curl as a good reference.
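
As one more data point on how parsers differ, here is a small sketch using Python's standard urllib.parse. It assumes the intended URL is http://a;b:c@xyz, that is, without the "<http://a>" that appears in the mail text; that reading is an assumption. It only shows what Python's parser does and says nothing about Wget's or curl's internals.

```python
from urllib.parse import urlsplit

# Assumed intended URL (the "<http://a>" in the mail is taken to be
# an artifact of the mail client, which is an assumption).
parts = urlsplit("http://a;b:c@xyz")

# urlsplit cuts the authority at the last "@", then splits the
# userinfo at its first ":" into user name and password.
print(parts.username)  # a;b
print(parts.password)  # c
print(parts.hostname)  # xyz
print(parts.port)      # None
```

Here "c" ends up as the password rather than as a port number.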

> Why is the semicolon not allowed in userinfo, despite that other special
> characters are allowed?

First of all, userinfo does not allow spaces at all (see RFC 3986, https://datatracker.ietf.org/doc/html/rfc3986):
  userinfo    = *( unreserved / pct-encoded / sub-delims / ":" )
  unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
  sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
              / "*" / "+" / "," / ";" / "="
  pct-encoded = "%" HEXDIG HEXDIG
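
The rules above can be turned into a small checker. This is only a sketch in Python built from the quoted grammar; the name is_valid_userinfo is made up for illustration and is not taken from Wget or curl. Note that ";" is one of the sub-delims, so the grammar itself permits it in userinfo, while a space is not covered by any of the rules.

```python
import re

# Character sets taken from the RFC 3986 rules quoted above.
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="
PCT_ENCODED = r"%[0-9A-Fa-f]{2}"

# userinfo = *( unreserved / pct-encoded / sub-delims / ":" )
USERINFO_RE = re.compile(
    rf"(?:[{UNRESERVED}{SUB_DELIMS}:]|{PCT_ENCODED})*"
)


def is_valid_userinfo(userinfo: str) -> bool:
    """Return True if the whole string matches the userinfo rule."""
    return USERINFO_RE.fullmatch(userinfo) is not None


print(is_valid_userinfo("a;b:c"))   # True: ";" is a sub-delim
print(is_valid_userinfo("a b:c"))   # False: space is not allowed
print(is_valid_userinfo("a%3Bb"))   # True: percent-encoded ";"
```

A strict parser could run such a check on the part before the last "@" and reject the URL when it fails.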


> Thank you,
> Bachir

Regards, Tim
